
Introduction

Protocol Buffers (Protobuf) is a Google-developed way to serialize structured data. It is popular because the encoded data is small and fast to transfer. Protobuf is platform-independent and language-independent: a single Protobuf definition file can be compiled into implementations for multiple languages.

Today I’m going to introduce you to the basic use of Protobuf and a concrete case study with Java.

Why use Protobuf

Data travels over the network in binary form, usually expressed as bytes of eight bits each. To send an object over the network, we first have to serialize it, that is, convert it into a byte array. When the receiver gets the byte array, it deserializes it and turns it back into an object, in our case a Java object.

There are several possible ways to serialize a Java object:

  1. Use the JDK’s built-in object serialization. It has known problems, and it only works between Java programs, not with programs written in other languages such as PHP or Go.
  2. Define a custom serialization protocol. This is flexible, but it is not universal, it can be complex to implement, and it is prone to unexpected problems.
  3. Convert the data to XML or JSON for transmission. The advantage of XML and JSON is that both have delimiters that mark where an object starts and ends, so an entire object can be read by locating those delimiters. The downside is that the converted data is large, and deserialization also consumes more resources.
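To make the size difference between these options concrete, here is a minimal sketch using only the JDK. StudentPojo and SerializationSizes are hypothetical names invented for this comparison, and the hand-built JSON and the compact layout are simple stand-ins, not Protobuf itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical POJO used only for this comparison; it is not a Protobuf class.
class StudentPojo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int id;

    StudentPojo(String name, int id) {
        this.name = name;
        this.id = id;
    }
}

class SerializationSizes {
    // Option 1: JDK built-in serialization (Java-only format that embeds class metadata).
    static byte[] jdkSerialize(StudentPojo s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static StudentPojo jdkDeserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (StudentPojo) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Option 2: a custom binary layout: 2-byte length prefix + UTF bytes, then a 4-byte int.
    static byte[] toCompactBinary(StudentPojo s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s.name);
            out.writeInt(s.id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Option 3: a hand-built JSON string (no escaping; good enough for this fixed sample).
    static byte[] toJson(StudentPojo s) {
        String json = "{\"name\":\"" + s.name + "\",\"id\":" + s.id + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StudentPojo s = new StudentPojo("xiaoming", 1234);
        System.out.println("JDK:     " + jdkSerialize(s).length + " bytes");
        System.out.println("JSON:    " + toJson(s).length + " bytes");     // 29 bytes
        System.out.println("compact: " + toCompactBinary(s).length + " bytes"); // 14 bytes
    }
}
```

The JDK output is the largest of the three because it carries the class name and field descriptors along with the values, which is also why only another Java program can read it.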

So we need a different approach to serialization. Protobuf is one: a flexible, efficient, and automated solution.

You write a .proto file that defines the data structure and then run the protobuf compiler on it. The compiler generates classes that automatically encode and parse Protobuf data in an efficient binary format. The generated classes provide getters and setters for the fields in the definition file and take care of the details of reading and writing. Importantly, a Protobuf format can be extended over time in a compatible way, so code built from the latest definition can still read older binary data.

Defining .proto files

The .proto file defines the message objects that you will serialize. Let’s take a basic student.proto file, which defines the basic properties of a student object.

Here is the complete file:

syntax = "proto3";

package com.flydean;

option java_multiple_files = true;
option java_package = "com.flydean.tutorial.protos";
option java_outer_classname = "StudentListProtos";

message Student {
  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
  }

  message PhoneNumber {
    optional string number = 1;
    optional PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;
}

message StudentList {
  repeated Student student = 1;
}

The first line declares which Protobuf syntax version the file uses. If it is omitted, the compiler assumes proto2. Since proto3 is the latest version, we use proto3 in this example.

Next comes the package declaration. It is a Protobuf namespace, and although we set java_package later, package should still be defined to avoid name conflicts when the file is used from non-Java languages.

Then come three options specific to Java: java_multiple_files, java_package, and java_outer_classname.

java_multiple_files controls how the generated Java code is split into files. If true, each top-level message is generated as its own class in its own .java file; if false, all generated classes are nested inside the single wrapper class file.

java_package specifies the Java package name that the generated classes should use. If it is not set explicitly, the value of the package declaration is used.

java_outer_classname defines the class name of the wrapper class that represents this file. If it is not set, the name is derived by converting the file name to upper camel case. For example, “student.proto” would map to “Student” by default; because that clashes with the Student message defined in the same file, protoc appends “OuterClass” to the wrapper name.

The next part is the message definitions. For simple fields you can use scalar types such as bool, int32, float, double, and string.

The example above also uses composite fields: a nested message type (PhoneNumber) and an enum (PhoneType).

Each field is assigned a number, which is the unique “tag” used in the binary encoding. Tag numbers 1-15 take one byte to encode, while tag numbers 16 and above take at least two. As an optimization, tags 1-15 are therefore usually reserved for frequently used or repeated elements, and tags 16 and higher are used for less frequently used optional elements.
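The one-byte limit follows from how Protobuf writes the key in front of each field: the field number is shifted left by three bits, combined with a 3-bit wire type, and stored as a varint that holds 7 payload bits per byte. The sketch below computes the key size under that rule; TagSize and its method names are made up for this illustration.

```java
class TagSize {
    // Wire types from the Protobuf encoding: 0 = varint, 2 = length-delimited.
    static final int WIRETYPE_VARINT = 0;
    static final int WIRETYPE_LENGTH_DELIMITED = 2;

    // The key written before each field is (fieldNumber << 3) | wireType.
    static int key(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    // A varint stores 7 payload bits per byte, so count the 7-bit groups.
    static int varintSize(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    static int tagSize(int fieldNumber, int wireType) {
        return varintSize(key(fieldNumber, wireType));
    }

    public static void main(String[] args) {
        // Prints 1, 1, 2, 2, 3 byte(s) for these field numbers.
        for (int field : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("field " + field + " -> "
                    + tagSize(field, WIRETYPE_VARINT) + " byte(s)");
        }
    }
}
```

With three bits taken by the wire type, only four bits of the first byte are left for the field number, which is why 15 is the largest one-byte tag.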

Now look at the field modifiers. There are three: optional, repeated, and required.

optional means the field may or may not be set. If no value is set, the default value is used. In proto2 you can declare a custom default for simple types; proto3 always uses the system defaults: 0 for numbers, the empty string for strings, and false for booleans.

repeated means the field can appear any number of times, which is essentially an array structure.

required means the field must be set; if it has no value, the message is considered uninitialized. Attempting to build an uninitialized message throws a RuntimeException, and parsing an uninitialized message throws an IOException.

Note that required is not supported in proto3.

Compiling protocol files

Once the .proto file is defined, it can be compiled with the protoc command.

protoc is the protobuf compiler. It can be downloaded from the project’s GitHub releases page. If you don’t want to download a prebuilt binary, or if the version you need is not published there, you can build it from source.

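The protoc command is as follows:

protoc --experimental_allow_proto3_optional -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/student.proto

Because the file uses optional fields with proto3 syntax, protoc versions 3.12 through 3.14 need the --experimental_allow_proto3_optional flag. From 3.15 onward, proto3 optional is enabled by default and the flag can be dropped.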

Run the command above and you will find five files generated in the com.flydean.tutorial.protos package:

Student.java              
StudentList.java          
StudentListOrBuilder.java 
StudentListProtos.java    
StudentOrBuilder.java

StudentListOrBuilder and StudentOrBuilder are interfaces; StudentList and Student are the classes that implement them. StudentListProtos is the wrapper class for the file.

The generated files

Take Student as an example. The generated code contains two classes, Student and its nested Builder:

public final class Student extends
    com.google.protobuf.GeneratedMessageV3 implements
    StudentOrBuilder

  public static final class Builder extends
      com.google.protobuf.GeneratedMessageV3.Builder<Builder> implements
      com.flydean.tutorial.protos.StudentOrBuilder

Both implement the same StudentOrBuilder interface, so they expose the same read methods. The Builder is essentially a mutable wrapper around the message: every field of Student can be set through it.

For fields in Student, the Student class only has get methods for those fields, whereas the Builder has both get and set methods.

For Student, the methods for fields are:

// optional string name = 1;
public boolean hasName();
public String getName();

// optional int32 id = 2;
public boolean hasId();
public int getId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();

// repeated Student.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);

For Builder, there are two more methods for each property:

// optional string name = 1;
public boolean hasName();
public java.lang.String getName();
public Builder setName(String value);
public Builder clearName();

// optional int32 id = 2;
public boolean hasId();
public int getId();
public Builder setId(int value);
public Builder clearId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();
public Builder setEmail(String value);
public Builder clearEmail();

// repeated Student.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);
public Builder setPhones(int index, PhoneNumber value);
public Builder addPhones(PhoneNumber value);
public Builder addAllPhones(Iterable<PhoneNumber> value);
public Builder clearPhones();

The two additional methods are set and clear. clear resets a field to its initial, unset state.

We also define an enumerated PhoneType:

  public enum PhoneType
      implements com.google.protobuf.ProtocolMessageEnum

The implementation of this class is not much different from a normal enumerated class.

Builders and Messages

As you saw in the previous section, a message class only exposes get and has methods, so it is immutable. Once a message object has been constructed, it cannot be modified. To build a message, you first create a builder, set the fields you need, and then call the builder’s build() method.

Each setter on the builder returns a Builder. This is the same builder object the method was called on, not a new one; returning it simply makes it convenient to chain calls.

Here’s how to create a Student instance:

Student xiaoming = Student.newBuilder()
    .setId(1234)
    .setName("xiaoming")
    .setEmail("[email protected]")
    .addPhones(
        Student.PhoneNumber.newBuilder()
            .setNumber("010-1234567")
            .setType(Student.PhoneType.HOME))
    .build();
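The split between an immutable message and a mutable, chainable builder can be sketched by hand in a few lines. MiniStudent below is a hypothetical, heavily simplified stand-in written for this article; it is not the code protoc generates, and it leaves out has methods, repeated fields, and serialization.

```java
// Hand-written sketch of the message/builder split; NOT the code protoc generates.
final class MiniStudent {
    private final String name;
    private final int id;

    // Only the builder can create instances, and the fields are final,
    // so a MiniStudent can never change after build().
    private MiniStudent(Builder b) {
        this.name = b.name;
        this.id = b.id;
    }

    String getName() { return name; }
    int getId() { return id; }

    static Builder newBuilder() { return new Builder(); }

    static final class Builder {
        private String name = ""; // mirrors the proto3 default for string
        private int id = 0;       // mirrors the proto3 default for int32

        // Each setter returns this same builder so calls can be chained.
        Builder setName(String value) { this.name = value; return this; }
        Builder setId(int value) { this.id = value; return this; }
        Builder clearName() { this.name = ""; return this; }

        // build() copies the current values into a new immutable object.
        MiniStudent build() { return new MiniStudent(this); }
    }

    public static void main(String[] args) {
        MiniStudent s = MiniStudent.newBuilder().setId(1234).setName("xiaoming").build();
        System.out.println(s.getName() + " " + s.getId()); // prints "xiaoming 1234"
    }
}
```

Because build() copies the values, changing the builder afterwards does not affect messages that were already built.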

Student also provides some common methods, such as isInitialized(), which checks whether all required fields are set, and toString(), which converts the object to a string. Its Builder can additionally call clear() to reset everything that has been set, and mergeFrom(Message other) to merge another message into it.

Serialization and deserialization

Serialization and deserialization methods are provided in the generated object, and we just need to call them when needed:

  • byte[] toByteArray(); : Serializes the message and returns a byte array containing its raw bytes.
  • static Student parseFrom(byte[] data); : Parses a message from the given byte array.
  • void writeTo(OutputStream output); : Serializes the message and writes it to an OutputStream.
  • static Student parseFrom(InputStream input); : Reads and parses a message from an InputStream.

Using the above methods, you can easily serialize and deserialize objects.
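To give a feel for what the serialized bytes look like, the sketch below hand-encodes the name and id fields following the Protobuf wire format: a key, then a length-prefixed UTF-8 string for field 1, and a key plus a varint for field 2. WireSketch is a made-up helper that only assumes non-negative ids; with the real library you would simply call toByteArray() on the generated Student, which should produce the same bytes for a message with only these two fields set.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class WireSketch {
    // Writes an unsigned varint: 7 bits per byte, high bit set on all but the last byte.
    // Assumption: value is non-negative (real Protobuf widens negative int32 to 10 bytes).
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes name as field 1 (wire type 2, length-delimited)
    // and id as field 2 (wire type 0, varint).
    static byte[] encode(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | 2);  // key for field 1
        writeVarint(out, utf8.length);   // string length
        out.write(utf8, 0, utf8.length); // string bytes
        writeVarint(out, (2 << 3) | 0);  // key for field 2
        writeVarint(out, id);            // value
        return out.toByteArray();
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints "0a 08 78 69 61 6f 6d 69 6e 67 10 d2 09"
        System.out.println(toHex(encode("xiaoming", 1234)));
    }
}
```

The whole message takes 13 bytes: one key byte and one length byte in front of the eight name bytes, then one key byte and two value bytes for the id.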

Protocol extensions

After a .proto file has been defined, we may want to modify it later while keeping the new protocol compatible with existing data. To do that, follow these rules:

  1. You cannot change the ID number of an existing field.
  2. You cannot add or remove any required fields.
  3. Optional or repeated fields can be removed.
  4. You can add new optional or repeated fields, but they must use new ID numbers.
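These rules work because every field on the wire carries its own number and wire type, so a reader can step over fields it does not know. The sketch below is a hand-written reader that only understands fields 1 (name) and 2 (id) and is given data that also contains a field 5. OldReader, the sample bytes, and field 5 itself are hypothetical; the real Protobuf runtime does this inside the generated parsing code.

```java
import java.nio.charset.StandardCharsets;

class OldReader {
    // name = "xiaoming" (field 1), id = 1234 (field 2),
    // plus a hypothetical newer string field 5 = "3-B".
    static final byte[] NEW_DATA = {
        0x0a, 0x08, 'x', 'i', 'a', 'o', 'm', 'i', 'n', 'g',
        0x10, (byte) 0xd2, 0x09,
        0x2a, 0x03, '3', '-', 'B'
    };

    private final byte[] data;
    private int pos = 0;
    String name = "";
    int id = 0;

    private OldReader(byte[] data) { this.data = data; }

    private long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            int b = data[pos++] & 0xFF;
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    // Skips a field this reader does not know, based only on its wire type.
    private void skip(int wireType) {
        switch (wireType) {
            case 0: readVarint(); break;  // varint
            case 1: pos += 8; break;      // fixed 64-bit
            case 2:                       // length-delimited
                int len = (int) readVarint();
                pos += len;
                break;
            case 5: pos += 4; break;      // fixed 32-bit
            default:
                throw new IllegalArgumentException("unsupported wire type " + wireType);
        }
    }

    // Reads fields 1 and 2; every other field number is skipped.
    static OldReader parse(byte[] data) {
        OldReader r = new OldReader(data);
        while (r.pos < data.length) {
            int key = (int) r.readVarint();
            int fieldNumber = key >>> 3;
            int wireType = key & 0x7;
            if (fieldNumber == 1 && wireType == 2) {
                int len = (int) r.readVarint();
                r.name = new String(data, r.pos, len, StandardCharsets.UTF_8);
                r.pos += len;
            } else if (fieldNumber == 2 && wireType == 0) {
                r.id = (int) r.readVarint();
            } else {
                r.skip(wireType);
            }
        }
        return r;
    }

    public static void main(String[] args) {
        OldReader r = parse(NEW_DATA);
        System.out.println(r.name + " " + r.id); // prints "xiaoming 1234"
    }
}
```

This also shows why an existing ID number must never be reused with a different meaning: the reader has nothing but the number and wire type to decide how to interpret the bytes.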

Conclusion

That covers the basic use of Protocol Buffers with Java.

The example for this article is available in learn-java-base-9-to-20.

This article is available at www.flydean.com/01-protocol…
