manzan.dev

Python plugin for protoc (part 2)

01.02.2021 — python, protoc, software, how-to — 2 min read

Welcome back, and a late happy new year! 🎉

In part 1 we created a basic plugin for protoc, Google's official compiler for protobuf. The plugin is an executable Python script, to be specified when protoc is executed via the command line. Our overall aim is to extract useful information from a given protobuf definition (.proto) file and write that data into a JSON file (see protoc-gen-bq-schema for a great real-life example of why something like this could be useful). We haven't produced any JSON output yet, since the method generating the response hasn't been fully implemented. Exciting... this is the goal for today!

The complete code and setup instructions can be found on GitHub.

A quick recap: We started by setting up a local environment, activating it, and most importantly installing the compiler itself. Then we created the Python script plugin.py, made it executable, and created a small test protobuf file example.proto. We ran protoc example.proto --plugin=protoc-gen-custom-plugin=./plugin.py --custom-plugin_out=. and saw… darkness. We just wrote an empty CodeGeneratorResponse to stdout.

❯ Step 1 — complete the plugin code:

Let's see how we can use the CodeGeneratorRequest to extract some meaningful information from our protobuf definitions this time!

First, let's complete the process function:

plugin.py

def process(
    request: plugin.CodeGeneratorRequest, response: plugin.CodeGeneratorResponse
) -> None:
    for proto_file in request.proto_file:
        process_proto_file(proto_file, response)

This allows us to process each proto file in the CodeGeneratorRequest individually.

Important: Note that the request contains not only the FileDescriptorProto of our example protobuf file, but also those of all its imports, in this case google/protobuf/empty.proto. Hence, we'll process two proto files in total!
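If you only want output for the files named on the command line, the request also carries a file_to_generate list with exactly those names. Below is a minimal sketch of such a filter. The helper name files_to_process and the SimpleNamespace stand-ins are hypothetical, used only so the snippet runs without protobuf installed; a real CodeGeneratorRequest exposes the same two fields.

```python
from types import SimpleNamespace


def files_to_process(request):
    """Return only the proto files that were passed to protoc explicitly.

    request.file_to_generate holds the names given on the command line,
    while request.proto_file also includes every imported file.
    """
    requested = set(request.file_to_generate)
    return [f for f in request.proto_file if f.name in requested]


# Stand-in for a CodeGeneratorRequest (assumption: illustration only).
fake_request = SimpleNamespace(
    file_to_generate=["example.proto"],
    proto_file=[
        SimpleNamespace(name="google/protobuf/empty.proto"),
        SimpleNamespace(name="example.proto"),
    ],
)

selected = [f.name for f in files_to_process(fake_request)]
print(selected)
```

With a filter like this, the plugin would skip google/protobuf/empty.proto and only emit example.proto.json. In this post we deliberately process every file to see what the request contains.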

plugin.py

# Needs `import json` and `from google.protobuf.descriptor_pb2 import FileDescriptorProto`
# at the top of plugin.py.
def process_proto_file(
    proto_file: FileDescriptorProto, response: plugin.CodeGeneratorResponse
) -> None:
    logger.info(f"Processing proto_file: {proto_file.name}")

    # Create dict of options from their text representation, e.g.
    # 'java_package: "com.example.foo"' becomes {"java_package": "com.example.foo"}
    options = str(proto_file.options).strip().replace("\n", ", ").replace('"', "")
    # Split each item only once so that values containing ": " stay intact
    options_dict = dict(item.split(": ", 1) for item in options.split(", ") if options)

    # Create list of dependencies
    dependencies_list = list(proto_file.dependency)

    data = {
        "package": proto_file.package,
        "filename": proto_file.name,
        "dependencies": dependencies_list,
        "options": options_dict,
    }

    # Register a new output file in the response and fill it with the JSON data
    file = response.file.add()
    file.name = proto_file.name + ".json"
    logger.info(f"Creating new file: {file.name}")
    file.content = json.dumps(data, indent=2) + "\r\n"

Note how we can now easily access different attributes of the proto file, such as proto_file.name or proto_file.package via its FileDescriptorProto. I would recommend playing around with it a bit more yourself – check out the docs for a complete list of available attributes and methods and if you wanna dig a bit deeper… of course Google defined the FileDescriptorProto structure itself as protobuf in descriptor.proto.
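The options handling in process_proto_file works on the text representation of proto_file.options, which looks like one `key: "value"` pair per line. To see what the string juggling does, here is a small standalone sketch of the same flattening steps. The function name parse_options_text and the sample string are made up for illustration, and splitting each pair only once is a choice of this sketch.

```python
def parse_options_text(text: str) -> dict:
    """Turn protobuf text-format options into a flat dict of strings."""
    # Join the lines with ", " and drop the quotes around string values.
    flat = text.strip().replace("\n", ", ").replace('"', "")
    # Split each "key: value" pair once; an empty input yields an empty dict.
    return dict(item.split(": ", 1) for item in flat.split(", ") if flat)


# Hypothetical sample of what str(proto_file.options) might look like.
sample = 'java_package: "com.example.foo"\njava_multiple_files: true\n'

parsed = parse_options_text(sample)
print(parsed)
```

Note that every value ends up as a string (for example "true" rather than a boolean), and a value that itself contains ", " would still break the parsing. If that matters for your use case, MessageToDict from google.protobuf.json_format is probably the more robust route, at the cost of slightly different output.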

So how do we write the results into a JSON file? We simply create a dictionary with some useful data and add an output file to the CodeGeneratorResponse using response.file.add(). Lastly, we give our output file a name and dump the dict as JSON into it. That's all!
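One way to keep this easy to test is to separate "collect the data" from "write it into the response". The following is only a sketch of that idea: the helper summarize and the SimpleNamespace stand-in are hypothetical and exist so the snippet runs without protobuf installed. It uses the same attribute names as a FileDescriptorProto (package, name, dependency).

```python
import json
from types import SimpleNamespace


def summarize(proto_file) -> dict:
    """Collect a few attributes of a proto file into a plain dict."""
    return {
        "package": proto_file.package,
        "filename": proto_file.name,
        "dependencies": list(proto_file.dependency),
    }


# Stand-in for a FileDescriptorProto (assumption: illustration only).
fake_file = SimpleNamespace(
    package="test_package",
    name="example.proto",
    dependency=["google/protobuf/empty.proto"],
)

# The resulting string is what would be assigned to file.content in the plugin.
content = json.dumps(summarize(fake_file), indent=2)
print(content)
```

In the plugin itself, the returned dict would simply be passed to json.dumps and assigned to the content of the file created via response.file.add().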

❯ Step 2 — run the plugin:

Now we can run protoc example.proto --plugin=protoc-gen-custom-plugin=./plugin.py --custom-plugin_out=. again, and this time we'll see some useful log messages:

2021-02-01 22:01:47,185 - __main__ - INFO - Processing proto_file: google/protobuf/empty.proto
2021-02-01 22:01:47,185 - __main__ - INFO - Creating new file: google/protobuf/empty.proto.json
2021-02-01 22:01:47,185 - __main__ - INFO - Processing proto_file: example.proto
2021-02-01 22:01:47,186 - __main__ - INFO - Creating new file: example.proto.json

As expected, we've processed both google/protobuf/empty.proto and example.proto. The output file for our example proto file should look like this:

{
  "package": "test_package",
  "filename": "example.proto",
  "dependencies": [
    "google/protobuf/empty.proto"
  ],
  "options": {
    "java_package": "com.example.foo"
  }
}

Note that protoc cares about the (relative) path of the proto files: because we name each output file after its proto file, empty.proto.json is written to a newly created folder google/protobuf/ inside the output directory.
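To make the path handling concrete, here is a tiny sketch of how the output location follows from the file name we set in the response. The helper output_path is hypothetical and only mimics the naming; protoc itself does the actual writing.

```python
from pathlib import PurePosixPath


def output_path(out_dir: str, proto_name: str) -> PurePosixPath:
    """Combine the --custom-plugin_out directory with the name set in the response."""
    return PurePosixPath(out_dir) / (proto_name + ".json")


imported = output_path(".", "google/protobuf/empty.proto")
example = output_path(".", "example.proto")
print(imported)
print(example)
```

If you would rather have all JSON files in one flat folder, you could set the output name from PurePosixPath(proto_file.name).name instead, keeping in mind that two proto files with the same base name would then collide.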

I hope you found this helpful — for any feedback, comments or questions, please reach out.

~ manzan