manzan.dev

Python plugin for protoc (part 1)

19.12.2020 — python, protoc, software, how-to — 2 min read

A while ago, I tried to figure out how to write a Python plugin for protoc, Google's official compiler for protobuf (aka Protocol Buffers). Protocol Buffers is "Google's data interchange format" and lets you serialize structured data. My overall aim was to parse a set of Protocol Buffers definition (.proto) files and store the parsing results in JSON format (see protoc-gen-bq-schema for a real-life application of this idea).

The official protobuf library supports automatic code generation for a limited number of use-cases and languages. Luckily, protoc can easily be extended with custom code generators via plugins!

From Google's docs:

A plugin is just a program which reads a CodeGeneratorRequest protocol buffer from standard input and then writes a CodeGeneratorResponse protocol buffer to standard output. These message types are defined in plugin.proto. We recommend that all third-party code generators be written as plugins, as this allows all generators to provide a consistent interface and share a single parser implementation.

I did some research but couldn't find much information. Most plugins out there seem to be written in Go, and there is no official documentation or tutorial for a Python plugin. Hence, I want to provide a beginner-friendly introduction to working with protoc and writing a simple, custom plugin in Python. The plugin we'll write in this example processes Protocol Buffers files and writes high-level information about them, such as imports or options, into JSON files.

❯ Step 1 — setup:

Note: All setup instructions and the complete example code can be found on GitHub.

You'll need Python 3 and protoc installed on your machine. I'm using a Mac, so I install protoc with Homebrew:

brew install protobuf

Validate your installation with:

protoc --version

The output should be libprotoc 3.14.0, or similar.

Furthermore, make sure to pip install protobuf==3.14.0 in your Python environment. The version of the Python package should match the version of protoc.
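If you'd rather check the match programmatically, here is a minimal sketch. It assumes that protoc --version prints a line of the form libprotoc X.Y.Z (as shown above) and that the Python package uses the same version string. The helper names parse_protoc_version and versions_match are made up for this example.

```python
import re
from typing import Optional


def parse_protoc_version(output: str) -> Optional[str]:
    """Extract 'X.Y.Z' from text such as 'libprotoc 3.14.0'; None if not found."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def versions_match(protoc_output: str, runtime_version: str) -> bool:
    """Return True if protoc and the Python package report the same version.

    Assumption: both sides use the same numbering scheme, as in this post.
    """
    compiler_version = parse_protoc_version(protoc_output)
    return compiler_version is not None and compiler_version == runtime_version.strip()


# Possible usage on a machine with protoc and the protobuf package installed:
#
#   import subprocess
#   import google.protobuf
#
#   output = subprocess.run(
#       ["protoc", "--version"], stdout=subprocess.PIPE, check=True
#   ).stdout.decode()
#   print(versions_match(output, google.protobuf.__version__))
```

The comparison is deliberately strict (exact string equality). If your setup tolerates differing patch versions, you could loosen it.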

❯ Step 2 — write the plugin:

First, create a file plugin.py. The most basic thing a plugin can do is read a CodeGeneratorRequest from stdin and write an empty CodeGeneratorResponse to stdout.

This can be accomplished as follows:

plugin.py

#!/usr/bin/env python

import sys

from google.protobuf.compiler import plugin_pb2 as plugin


def process(
    request: plugin.CodeGeneratorRequest, response: plugin.CodeGeneratorResponse
) -> None:
    pass


def main() -> None:
    # Load the request from stdin
    request = plugin.CodeGeneratorRequest.FromString(sys.stdin.buffer.read())

    # Create a response
    response = plugin.CodeGeneratorResponse()

    process(request, response)

    # Serialize response and write to stdout
    sys.stdout.buffer.write(response.SerializeToString())


if __name__ == "__main__":
    main()
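Since the whole contract is "bytes in on stdin, bytes out on stdout", you can drive a plugin by hand without protoc. The following is only a sketch: run_plugin and ECHO_COMMAND are names I made up, and the echo process is a stand-in that copies its input, not a real plugin.

```python
import subprocess
import sys
from typing import Sequence


def run_plugin(command: Sequence[str], request_bytes: bytes) -> bytes:
    """Feed request_bytes to the process' stdin and return what it writes to stdout."""
    completed = subprocess.run(
        list(command),
        input=request_bytes,
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the process exits non-zero
    )
    return completed.stdout


# Hypothetical stand-in for a plugin: a child Python process that copies
# stdin to stdout unchanged. It only demonstrates the byte-level transport.
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

echoed = run_plugin(ECHO_COMMAND, b"\x0a\x03abc")
print(len(echoed))
```

With plugin.py in the current directory and the protobuf package installed, run_plugin([sys.executable, "plugin.py"], b"") should return b"": zero bytes parse as an empty CodeGeneratorRequest, and an empty CodeGeneratorResponse serializes to zero bytes as well.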

❯ Step 3 — run the plugin:

Now make the script executable with chmod +x plugin.py. Finally, we can try it out!

Create a test Protocol Buffer file example.proto and invoke the compiler by running:

protoc example.proto --plugin=protoc-gen-custom-plugin=./plugin.py --custom-plugin_out=.

Confusing? Custom plugin names always start with protoc-gen-. Note that the term custom-plugin is both the last part of the plugin name and the first part of the output flag (--custom-plugin_out, which sets the path where output files are written). This term could be anything, as long as you follow the naming convention. Also note that in this example I specified the path to our plugin with =./plugin.py. Alternatively, you can make your plugin available on your PATH as a program called protoc-gen-custom-plugin (for example), in which case the --plugin flag can be omitted.
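To make the naming convention concrete, here is a small sketch that assembles the command from its parts. The function build_protoc_command and the contents of EXAMPLE_PROTO are my own illustrations, not something protoc ships with. Any valid .proto file would do as a test input.

```python
from typing import List, Sequence

PLUGIN_PREFIX = "protoc-gen-"

# Hypothetical minimal test file; the message and package names are arbitrary.
EXAMPLE_PROTO = """syntax = "proto3";

package example;

message Greeting {
  string text = 1;
}
"""


def build_protoc_command(
    suffix: str, plugin_path: str, out_dir: str, proto_files: Sequence[str]
) -> List[str]:
    """Assemble a protoc invocation for a plugin named protoc-gen-<suffix>.

    The same suffix appears twice: in the plugin name and in the
    --<suffix>_out flag that tells protoc where to write the output.
    """
    return [
        "protoc",
        *proto_files,
        f"--plugin={PLUGIN_PREFIX}{suffix}={plugin_path}",
        f"--{suffix}_out={out_dir}",
    ]


command = build_protoc_command("custom-plugin", "./plugin.py", ".", ["example.proto"])
print(" ".join(command))
```

To actually run it, you could write the file with pathlib.Path("example.proto").write_text(EXAMPLE_PROTO) and pass the list to subprocess.run(command, check=True).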

Executing the full command will not produce any output files, since we are not writing anything into the response yet. In part 2, we'll look into generating some useful output and writing it into JSON files.

Stay tuned and Merry Christmas! 🎄

I hope you found this helpful — for any feedback, comments or questions, please reach out.

~ manzan

P.S.: Part 2 is now online!