Moment For Technology

Protocol Buffers, a serialization framework that is 100 times faster than XML

Posted on Dec. 3, 2022, 8:26 a.m. by 袁佳穎

We're used to JSON, XML, and other data formats, but many of us haven't heard of Protocol Buffers (Protobuf). Protobuf is an open-source, language-independent, platform-independent data serialization format from Google. Its compact encoding, efficiency, and compatibility-friendly design have made it widely used, and it generally performs much better than JSON and XML.

Moreover, with the popularity of microservice architectures, RPC frameworks have become an important part of service frameworks. Protobuf is one of the most widely used high-performance codec technologies in RPC designs.

In other words, learning how Protobuf is used and how it works helps you understand the underlying implementation of RPC in a microservices architecture and design efficient transport, serialization, encoding, and decoding.

Protobuf overview

Protobuf is an object serialization framework (or codec framework). It provides two things: a way to define structured data (the data storage structure) and serialization/deserialization of that data.

The function of data storage structure is similar to XML and JSON. Serialization and deserialization work in a similar way to Java's own serialization, Facebook's Thrift, and JBoss Marshalling.
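For reference, here is a minimal sketch of the Java built-in serialization mentioned above, using only the JDK. The class `JavaSerializationBaseline` and its nested `Person` are hypothetical names introduced for illustration; they are not part of Protobuf or of this article's project. The sketch round-trips a two-field object and prints the serialized size, which gives a rough baseline for judging more compact encodings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical baseline: JDK built-in serialization of a small two-field object.
final class JavaSerializationBaseline {

    // Illustrative stand-in for a simple data object; not generated by protoc.
    static final class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Integer id;

        Person(String name, Integer id) {
            this.name = name;
            this.id = id;
        }
    }

    // Serializes the object with ObjectOutputStream and returns the raw bytes.
    static byte[] serialize(Person person) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(person);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back from the bytes produced by serialize().
    static Person deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Person) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Person("Tom", 123));
        System.out.println("Java serialization size: " + data.length + " bytes");
        System.out.println("Round-trip name: " + deserialize(data).name);
    }
}
```

The built-in format embeds class metadata alongside the field values, which is one reason its output tends to be larger than a schema-based binary encoding of the same two fields.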

In summary: by defining structured data and providing serialization and deserialization for it, Protobuf serves as a format for data storage and RPC data exchange.

It features:

  • Language independent and platform independent
  • Concise
  • High performance (fast serialization and small serialized size)
  • Good compatibility

For an intuitive look at the serialization response times of the different frameworks:

In this comparison, Protobuf performs considerably better than the other frameworks.

Protobuf usage process

The features above tell us what Protobuf offers, but not how to work with it. Many articles online jump straight into code or into an analysis of the message format, which often leaves beginners confused.

So, let's go through the steps of using Protobuf.

Using Protobuf involves four steps:

  • Step 1, set up the environment: to use Protobuf you define the data structures used for communication and compile them into code for a given programming language. This requires a compiler.
  • Step 2, build the data: Protobuf transfers data, so decide what the data contains, what its fields are, and what the overall hierarchy looks like, then define that structure in Protobuf syntax.
  • Step 3, project integration: add the POM dependencies (using Java as an example) and add the Java classes compiled from the proto files.
  • Step 4, actual use: build messages and assign values with the integrated Java classes, serialize them with Protobuf, and deserialize them on the receiving side.

With these steps in mind, the rest of the article walks through each one in a hands-on demonstration.

This demo is based on macOS and the Java programming language. If you use another operating system or programming language, the basic idea is the same, and you can look up the specific operations for each step.

Install Protocol Buffers

Protobuf is installed so that you can define data structures and generate the corresponding code for your programming language. There are usually two ways: a local installation and IDE plug-ins. Let's start with a local installation.

Protobuf code is hosted on GitHub at github.com/protocolbuf... .

Click the release link on the right side of the project to see the corresponding version: github.com/protocolbuf... .

Packages for various programming languages and environments are included. This article uses protobuf-java-3.17.3.zip.

On the Mac operating system, you need to install the dependencies before you can compile and install protobuf.

Install dependent components:

# Install the Protocol Buffers build dependencies
# Protocol Buffers depends on autoconf, automake, libtool, and curl
brew install autoconf automake libtool curl

Decompress protobuf-java-3.17.3.zip, go to the root directory, and run the following command:

# Run the configure script
./configure
# Compile the sources
make
# Run the test suite to check the build
make check
# Install Protocol Buffers
make install

After installation, verify the version:

$ protoc --version
libprotoc 3.14.0

If the version information is displayed, the installation is successful.

The protoc command is the Protocol Buffers compiler. It compiles .proto files into source code for the target language (for example, header and source files for C++ or classes for Java).

Another way is to install the IDE plug-in. Here, use IDEA as an example to search for the plug-in:

There are a lot of plugins for Protobuf, so choose what suits you.

The official gRPC documentation also recommends a more elegant approach that is handled by Maven (it works together with the "Protobuf Support" plug-in mentioned above): add a few gRPC components as dependencies, then configure the Maven build to compile the proto files into Java code. This approach is not covered in detail here; see the project integration section and the source code.

Build the data

In Java, if we want to transfer data through JSON, we first define an object, using Person as an example:

public class Person {
    private String name;
    private Integer id;
    // ... getter/setter
}

So, what would the data structure look like if you defined the Person object with Protobuf?

Create a person.proto file and define the following structure:

// Use Protobuf 3 syntax
syntax = "proto3";

// Define the file package
package tutorial;

// Declare the Java package of the generated message class
option java_package = "com.choupangxia.protobuf.message";
// Declare the name of the generated Java outer class
option java_outer_classname = "Person";

message PersonProto {
  string name = 1;
  int32 id = 2;
}

See the comments section for a detailed description of each of these syntax items. Of course, the structure of Person could be much richer, but this is just a simple example for demonstration purposes. See the official documentation for more syntax.

Compile the proto file

Once the definition is complete, we can generate the target Java classes in two ways. We use the locally installed compiler first.

Before running the protoc command, you can pass the -h option to view its usage instructions.

protoc -h

Go to the directory where the person.proto file is located and run the following command to compile it:

protoc --java_out=../java ./person.proto

The --java_out argument specifies the output path for the generated Java classes, and the second argument specifies the file to compile, person.proto in the current directory.

After running the command, you will find a class called Person in the com.choupangxia.protobuf.message package. Note that the message name defined in the proto file must not be the same as the Java outer class name; otherwise the command fails.

The generated Person class is fairly complex and verbose, and parts of it could be tidied up if needed, although manual edits are lost whenever the class is regenerated.

For example, the generated getName() method declares its return type as the fully qualified java.lang.String, which could be written simply as String.

Project integration

Project integration simply means putting the generated Person code into the project. If the protobuf dependency has not been added, the generated code will not compile.

Add protobuf dependencies to the Maven project POM file:

<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.17.3</version>
</dependency>

If you want to compile the proto file directly in IDEA, you need to install the "Protobuf Support" plug-in and also add the gRPC dependencies. The complete configuration is as follows:

<properties>
    <!-- grpc.version, referenced below, must also be defined here -->
    <protobuf.version>3.17.3</protobuf.version>
</properties>

<dependencies>
    <dependency>
        <groupId>com.google.protobuf</groupId>
        <artifactId>protobuf-java</artifactId>
        <version>${protobuf.version}</version>
    </dependency>
    <!-- gRPC -->
    <dependency>
        <groupId>io.grpc</groupId>
        <artifactId>grpc-netty</artifactId>
        <version>${grpc.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>io.grpc</groupId>
        <artifactId>grpc-protobuf</artifactId>
        <version>${grpc.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>io.grpc</groupId>
        <artifactId>grpc-stub</artifactId>
        <version>${grpc.version}</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

<build>
    <extensions>
        <extension>
            <groupId>kr.motd.maven</groupId>
            <artifactId>os-maven-plugin</artifactId>
            <version>1.5.0.Final</version>
        </extension>
    </extensions>
    <plugins>
        <plugin>
            <groupId>org.xolstice.maven.plugins</groupId>
            <artifactId>protobuf-maven-plugin</artifactId>
            <version>0.5.0</version>
            <configuration>
                <protocArtifact>com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
                <pluginId>grpc-java</pluginId>
                <pluginArtifact>io.grpc:protoc-gen-grpc-java:${grpc.version}:exe:${os.detected.classifier}</pluginArtifact>
            </configuration>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>compile-custom</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

Before running the Maven compile command, place the proto files to be compiled in the src/main/proto directory, at the same level as src/main/java.

At this point, you can copy the generated Java class into the corresponding package.

Business applications

With everything in place, it's time to write an example using the corresponding code.

import com.choupangxia.protobuf.message.Person;
import com.google.protobuf.InvalidProtocolBufferException;

import java.util.Arrays;

public class Test {

    public static void main(String[] args) throws InvalidProtocolBufferException {
        // Build the message with the generated Builder
        Person.PersonProto sourcePersonProto = Person.PersonProto.newBuilder()
                .setId(123)
                .setName("Tom")
                .build();

        // Serialize
        byte[] binaryInfo = sourcePersonProto.toByteArray();
        System.out.println("Serialized byte content: " + Arrays.toString(binaryInfo));
        System.out.println("Serialized byte length: " + binaryInfo.length);

        System.out.println("---------- the receiver deserializes below ----------");

        // Deserialize
        Person.PersonProto targetPersonProto = Person.PersonProto.parseFrom(binaryInfo);
        System.out.println("Deserialization result: " + targetPersonProto.toString());
    }
}

The code above shows basic use of the generated Person class. The sender populates the message through the inner PersonProto class and its Builder, then serializes it by calling toByteArray. The receiver, which has the same generated code, calls Person.PersonProto.parseFrom to deserialize the bytes back into a PersonProto object.
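To see what toByteArray produces for such a small message, the following is a hand-rolled sketch of the Protobuf wire format for a message with `string name = 1` and `int32 id = 2`, written with the JDK only. `PersonWireSketch` and its methods are hypothetical helpers for illustration, not part of the Protobuf library; real code should keep using the generated class.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the wire format; not the protobuf-java API.
final class PersonWireSketch {

    // Writes a base-128 varint: 7 payload bits per byte, low group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes name as field 1 (length-delimited) and id as field 2 (varint).
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // proto3 omits fields that hold their default value
        if (!name.isEmpty()) {
            byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
            writeVarint(out, (1 << 3) | 2);      // tag: field number 1, wire type 2
            writeVarint(out, nameBytes.length);  // length of the UTF-8 payload
            out.write(nameBytes, 0, nameBytes.length);
        }
        if (id != 0) {
            writeVarint(out, (2 << 3) | 0);      // tag: field number 2, wire type 0
            writeVarint(out, id);                // int32 is sign-extended to 64 bits
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodePerson("Tom", 123);
        System.out.println(Arrays.toString(bytes)); // [10, 3, 84, 111, 109, 16, 123]
        System.out.println(bytes.length);           // 7
    }
}
```

Under the standard wire format, a message with name "Tom" and id 123 therefore fits in seven bytes: two tag bytes, one length byte, three bytes of text, and one byte for the number.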

Why is Protobuf efficient

This analysis looks only at the size of the serialized data. Compared with text protocols such as XML and JSON, Protobuf encodes data as T-(L)-V (tag-length-value) and does not need separators such as ", {, }, and : to structure the information. Protobuf also uses varint compression at the encoding level, so the same information serializes to far fewer bytes and consumes less network traffic in transit. That makes Protobuf a good choice for scenarios where network resources are tight and performance requirements are very high.
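The varint idea can be sketched in a few lines of plain Java. `VarintSketch` below is a hypothetical helper written for illustration, not the Protobuf library's own implementation: each byte carries 7 bits of the value, and the high bit signals whether another byte follows, so small numbers take fewer bytes.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of base-128 varint encoding and decoding.
final class VarintSketch {

    // Encodes the value (treated as an unsigned 64-bit number) as a varint.
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // more bytes follow
            value >>>= 7;
        }
        out.write((int) value);                       // last byte, high bit clear
        return out.toByteArray();
    }

    // Decodes a single varint that starts at the beginning of the array.
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(1).length);                 // 1
        System.out.println(encode(300).length);               // 2
        System.out.println(encode(Integer.MAX_VALUE).length); // 5
        System.out.println(decode(encode(300)));              // 300
    }
}
```

A fixed-width int always costs four bytes, whereas values below 128 cost one byte as a varint; the trade-off is that very large values can cost more than their fixed-width size.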

Here's a simple, intuitive example:

{"id":1,"firstName":"Chris","lastName":"Richardson","email":[{"type":"PROFESSIONAL","email":"[email protected]"}]}

The JSON data above takes 118 bytes when serialized as JSON and 48 bytes when serialized with Protobuf. With larger data volumes and more complex hierarchies, the gap becomes even more noticeable.

In terms of speed, Protobuf serializes and deserializes faster than XML and JSON, by a factor of 20 to 100 compared with XML.

Protobuf, however, is a binary protocol, and the encoded data is not human-readable. Without the IDL (.proto) file, the binary data stream cannot be interpreted, which makes it less friendly to debug.
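The sketch below illustrates that limitation with plain Java. `RawFieldDump` is a hypothetical helper, not a Protobuf API, and it handles only the varint and length-delimited wire types. It walks the seven bytes that a message with `string name = 1` set to "Tom" and `int32 id = 2` set to 123 encodes to under the standard wire format, and shows that only field numbers and wire types can be recovered, not the names `name` and `id`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical schema-less walker over Protobuf-encoded bytes.
final class RawFieldDump {
    private final byte[] data;
    private int pos;

    RawFieldDump(byte[] data) {
        this.data = data;
    }

    // Reads one base-128 varint at the current position.
    private long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Lists each field as "field <number>: <what the wire type reveals>".
    List<String> dump() {
        List<String> fields = new ArrayList<>();
        while (pos < data.length) {
            long tag = readVarint();
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 0x7);
            if (wireType == 0) {
                fields.add("field " + fieldNumber + ": varint " + readVarint());
            } else if (wireType == 2) {
                int length = (int) readVarint();
                fields.add("field " + fieldNumber + ": " + length + " bytes");
                pos += length; // payload could be a string, bytes, or a nested message
            } else {
                throw new IllegalArgumentException("wire type " + wireType + " not handled in this sketch");
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        byte[] person = {10, 3, 84, 111, 109, 16, 123};
        new RawFieldDump(person).dump().forEach(System.out::println);
        // field 1: 3 bytes
        // field 2: varint 123
    }
}
```

Whether the three bytes of field 1 are text, raw bytes, or a nested message is exactly the information that lives in the .proto file rather than in the data.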

Summary

This article walks through the steps of using Protobuf from 0 to 1. Many articles are confusing because they never make the core workflow clear. Once you know how to set up the environment, write the data structures, compile them, and integrate and use them in your project, the rest of Protobuf can be picked up gradually in practice.

As microservices continue to evolve, RPC frameworks built for efficient communication will increasingly rely on frameworks like Protobuf. Knowing Protobuf is also necessary for a solid grasp of the foundations of microservices architecture.

This article source: github.com/secbr/proto...

Author of SpringBoot Tech Insider, loves to delve into technology and write about it.

