Today I'm writing about Python + Protocol Buffers, a change of pace from my usual statistics and machine learning topics.
Protocol Buffers is a convenient serialization format. To use it from Python 3, at the time of writing you need the latest code from GitHub.
Python 2 support has been extended until 2020, and the news suggests it is fine to stay on the 2.x series, but as an engineer who follows the latest technology, I suspect many of us are firmly determined to write nothing but Python 3.x.
To set up the environment, clone the Protocol Buffers source from GitHub, build it, and install the Python bindings as follows.
git clone git://github.com/openx/python3-protobuf.git
cd python3-protobuf
./autogen.sh
./configure --prefix=$PREFIX  # specify the installation destination of protobuf
make
make check
sudo make install
cd python  # Python bindings
python setup.py build
python setup.py test
sudo python setup.py install
You can now use Protocol Buffers from Python 3. It's easy.
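To double-check that the bindings are importable (assuming the install path is on your PYTHONPATH; the package lives under the google.protobuf namespace):

python -c "import google.protobuf; print('ok')"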
Protocol Buffers defines the data structure in a file with a .proto extension. Compared to JSON, each key is replaced by a numeric field tag, so the same data can be exchanged at a smaller size.
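For illustration, here is a minimal schema.proto that matches the fields used in the reading example later in this post; the message name nb_event and the field types are my assumptions, not necessarily the original schema:

// schema.proto: a hypothetical event schema (proto2 syntax)
// each field gets a numeric tag, which replaces the JSON key on the wire
message nb_event {
  optional string type      = 1;
  optional int64  seq       = 2;
  optional int64  timestamp = 3;
  optional string op        = 4;
}

Compile the schema with protoc to generate a Python module: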
protoc -I=. --python_out=. schema.proto
A file named schema_pb2.py will be generated; import it from whatever script needs it.
Here is a sketch of code that actually reads the data.
import base64
import json

import schema_pb2  # the module generated by protoc

# value is assumed to hold a Base64-encoded, serialized nb_event message,
# e.g. read from a log file or a message queue
event = schema_pb2.nb_event()
event.ParseFromString(base64.b64decode(value))

# individual fields are accessed as attributes
ts = event.timestamp

# Try converting to JSON: copy the fields into a plain dictionary first
obj = {}
obj['event_type'] = event.type
obj['seq'] = event.seq
obj['timestamp'] = event.timestamp
obj['op'] = event.op

# Serialize the dictionary to JSON
json_dump = json.dumps(obj, ensure_ascii=False)
print(json_dump)
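For completeness, here is a sketch of the writing side that produces the Base64 string consumed above; the field values are made up, and the field types follow the hypothetical schema sketched earlier:

import base64

import schema_pb2

# populate a message and serialize it to the compact binary wire format
event = schema_pb2.nb_event()
event.type = 'click'
event.seq = 1
event.timestamp = 1400000000
event.op = 'insert'

# Base64-encode the bytes so they can travel through text-based channels
value = base64.b64encode(event.SerializeToString())
print(value)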