RPC

What is an RPC framework? Simply put, service A calls service B without explicitly making an HTTP or TCP request: it just calls a local function, and that function makes the call over HTTP or TCP and returns the data. Service A does not care how the network call is implemented inside the function. Implementing an RPC framework therefore requires the following (a minimal code sketch follows the list):

  1. There are many services and many functions: service A may call service B’s F1 or F2. In general, each func corresponds to an ID, unique among the funcs that service B provides. The ID can be a numeric mapping or a string.
  2. The arguments the func receives have to be passed across the wire, which requires serialization. The ID is serialized together with the parameters, for example as JSON or XML.
  3. The serialized data is transmitted over the network, using either HTTP or raw TCP.
  4. Service B listens on the socket and receives the packet.
  5. Service B deserializes it according to the pre-agreed serialization scheme.
  6. Deserialization yields the ID and params of the func to execute, and the ID mapping is used to find the actual func.
  7. The params are injected into the func and it is executed.
  8. The result is serialized and returned over the network.
  9. Service A parses the response data and obtains the result in the same way.
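
Here is a minimal sketch of these steps in Go. The IDs, struct names, and the JSON-over-bytes wire format are all invented for illustration; a real framework would hide this behind generated stubs and an actual network transport.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is what travels over the network: the function ID plus its arguments.
type request struct {
	ID     string          `json:"id"`
	Params json.RawMessage `json:"params"`
}

type response struct {
	Result interface{} `json:"result"`
}

// registry maps a function ID to the actual function on service B (steps 1 and 6).
var registry = map[string]func(json.RawMessage) (interface{}, error){
	"B.Add": func(raw json.RawMessage) (interface{}, error) {
		var p struct{ A, B int }
		if err := json.Unmarshal(raw, &p); err != nil { // steps 5 and 7: deserialize and inject params
			return nil, err
		}
		return p.A + p.B, nil
	},
}

// handle plays the role of service B after it has read a packet off the socket (steps 4-8).
func handle(packet []byte) ([]byte, error) {
	var req request
	if err := json.Unmarshal(packet, &req); err != nil {
		return nil, err
	}
	fn, ok := registry[req.ID]
	if !ok {
		return nil, fmt.Errorf("unknown function id %q", req.ID)
	}
	result, err := fn(req.Params)
	if err != nil {
		return nil, err
	}
	return json.Marshal(response{Result: result}) // step 8: serialize the result
}

func main() {
	// Service A side: serialize the ID and params (steps 2-3); the transport is omitted here.
	packet, _ := json.Marshal(request{ID: "B.Add", Params: json.RawMessage(`{"A":1,"B":2}`)})
	out, _ := handle(packet)
	fmt.Println(string(out)) // {"result":3}
}
```

A production framework would replace JSON with a more compact codec and hide all of this behind generated client stubs.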

The benefit of RPC is that it wraps a remote call up as if it were a local call, hiding the underlying communication details. Developers only care about the input and output of their business functions and do not need to care about how the data is transmitted.

GRPC

GRPC was originally developed by Google as a language-neutral, platform-neutral, open-source remote procedure call (RPC) system. GRPC is just a particular RPC framework; compared with a conventional RPC framework it performs better, mainly thanks to two things:

  • http2.0
  • protobuf

The benefits of HTTP/2.0 are binary framing, multiplexing, header compression, and server push. I won’t go into the details here; see any post on HTTP/2.0 for why it moves packets across the network so quickly, which largely comes down to saving bandwidth. GRPC uses Protobuf for serialization. Protobuf is far more compact than JSON and XML; its drawback is that, unlike JSON, it is not human-readable. Within the same bandwidth, Protobuf can transmit the same payload more times, and the smaller the data, the faster serialization is. Protobuf is also cross-platform and language-independent: you can use the proto tooling to generate code for each language, which is very convenient.

  • JSON is a text format, and text is stored character by character.
  • Protobuf is binary-encoded, based on the values themselves.

For example, the number 123 takes 3 bytes in JSON (the characters ‘1’, ‘2’, ‘3’), while in a binary encoding 123 fits in an int8 and takes a single byte. For integer and floating-point values, JSON takes up more space than Protobuf most of the time (small values such as 1 or 2 need only 1 byte in Protobuf).
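
A quick check of this size difference using Go’s standard library (this only compares the value encodings; an actual Protobuf field also adds a tag byte, covered below):

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

func main() {
	// JSON stores 123 as the characters '1', '2', '3' -> 3 bytes.
	j, _ := json.Marshal(123)
	fmt.Println(len(j), string(j)) // 3 123

	// A varint stores 123 in a single byte, because 123 < 128.
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, 123)
	fmt.Println(n, buf[:n]) // 1 [123]
}
```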

Protobuf serializes a message as a sequence of fields: the arguments we pass might be field1, field2, … Each field is stored as a key (tag) and a value packed tightly together, the only difference being that some field types also need a length. These key-value pairs are concatenated one after another to form the entire message data stream.

For the key, i.e. the tag part, the structure looks like this:

  • The number is the field number assigned to each field when we define the proto file.
  • The wire_type occupies the lower 3 bits, so it can represent 8 values in total, each standing for a different encoding type.
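
As a sketch of how the key is packed, assuming the standard layout of the field number above the 3-bit wire_type (the helper name makeTag is invented here):

```go
package main

import "fmt"

// makeTag packs the field number and the 3-bit wire_type into a single key,
// as described above: tag = (field_number << 3) | wire_type.
func makeTag(fieldNumber, wireType uint32) uint32 {
	return fieldNumber<<3 | wireType
}

func main() {
	// Field number 1 with wire_type 0 (varint) gives the tag byte 0x08.
	tag := makeTag(1, 0)
	fmt.Printf("tag = 0x%02x, field = %d, wire_type = %d\n", tag, tag>>3, tag&0x7)
}
```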

Here’s an example:



How does Protobuf squeeze the int64 value 1 into less space?

  1. First, the lower 3 bits of the tag are 000, i.e. wire_type 0, which means varint encoding.
  2. The encoded value is only 8 bits, i.e. 1 byte.

An int64 normally occupies 8 bytes for its 64 bits, but here the value takes only 1 byte.
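
A minimal sketch of this, hand-assembling the bytes rather than using the Protobuf runtime, assuming field number 1 declared as int64 and set to 1:

```go
package main

import "fmt"

func main() {
	const fieldNumber = 1
	const wireTypeVarint = 0 // wire_type 0 = varint

	tag := byte(fieldNumber<<3 | wireTypeVarint) // 0x08: field 1, varint
	value := byte(1)                             // varint(1) fits in a single byte: 0x01

	encoded := []byte{tag, value}
	fmt.Printf("% x\n", encoded) // 08 01 -> the whole field is 2 bytes, the value just 1
}
```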

varint

Coding process:

  1. Take the number’s two’s complement representation.
  2. Starting from the low end of the two’s complement, take 7 bits at a time; the groups are spliced in order, lowest group first.
  3. During splicing, if more data follows the current group, set its highest bit (the MSB) to 1; otherwise set it to 0.

Take int32 251 as an example:

The two’s complement of a positive number is itself. Take the lower 7 bits and call them segment A.

Since one bit (a single 1) is left over, we take another 7 bits and call them segment B. Now splice: segment A is placed before segment B in the output. Because segment A is followed by more data, its highest bit is set to 1, while segment B’s highest bit is set to 0. The final result is:

The varint encoding is complete: what originally needed 4 bytes of storage now needs only 2. The highest bit of each byte carries no value of its own; it only marks whether more data follows.
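
Here is the process written out in Go (the name encodeVarint is just for this sketch), cross-checked against the standard library’s unsigned varint:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarint emits the value 7 bits at a time, lowest group first, setting
// the MSB of every byte that has more data after it (the splicing rule above).
func encodeVarint(v uint64) []byte {
	var out []byte
	for v >= 0x80 {
		out = append(out, byte(v&0x7F)|0x80) // low 7 bits, MSB = 1: more follows
		v >>= 7
	}
	return append(out, byte(v)) // final group, MSB stays 0
}

func main() {
	fmt.Printf("% x\n", encodeVarint(251)) // fb 01 -> 2 bytes instead of 4

	// Cross-check against the standard library's implementation.
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, 251)
	fmt.Printf("% x\n", buf[:n]) // fb 01
}
```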

Decoding is simply the reverse process. Varint encoding has its drawbacks: for int32 data, every byte spends one bit on the MSB, so 4 bytes can represent at most 2^28 values, and numbers between 2^28 and 2^32 need an extra byte. But the larger the number, the less likely it is to occur.
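
A quick check of that boundary with Go’s binary.PutUvarint:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	buf := make([]byte, binary.MaxVarintLen64)

	// 2^28 - 1 still fits in 4 varint bytes (4 * 7 = 28 payload bits)...
	fmt.Println(binary.PutUvarint(buf, 1<<28-1)) // 4

	// ...but 2^28 needs a fifth byte, as does everything up to 2^32 - 1.
	fmt.Println(binary.PutUvarint(buf, 1<<28))   // 5
	fmt.Println(binary.PutUvarint(buf, 1<<32-1)) // 5
}
```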

zigzag

Varint yields a big space saving for positive numbers, but for negative numbers it has the opposite effect. Take int32’s -1 as an example:

The highest bit of a negative number is always 1, which means that in two’s complement a negative number has significant bits all the way up to the top, filling every byte. If varint encoding is applied anyway:

First take the two’s complement (keep the sign bit, invert the remaining bits, then add 1). Then take 7 bits at a time and add the MSBs:

In the end, a simple -1 comes out one byte larger (5 bytes instead of 4) after varint encoding.
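
Checking that claim with the standard library, feeding -1’s 32-bit two’s complement pattern 0xFFFFFFFF through an unsigned varint:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	// int32 -1 in two's complement is 0xFFFFFFFF; pushed through varint as-is,
	// its 32 significant bits need 5 bytes instead of the original 4.
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(uint32(0xFFFFFFFF)))
	fmt.Println(n)               // 5
	fmt.Printf("% x\n", buf[:n]) // ff ff ff ff 0f
}
```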

Note: Protobuf has no int8 or int16 types because it supports many languages, and some of them do not have int8 or int16.

Zigzag’s idea is to map negative numbers onto positive numbers, and then let varint encode those positive numbers to save space. To understand the bit manipulation, first recall why 1 + (-1) seems to give -2 in sign-magnitude form:

Adding the sign-magnitude (“original code”) representation of 1 to that of -1 yields -2, which obviously defies common sense. That is why two’s complement was designed: the sum of the two’s complement of 1 and the two’s complement of -1 is 0, and computers carry out arithmetic using two’s complement.

Now look at the zigzag encoding process for -1 (int32):

  1. The two’s complement of int32 -1 is 1111…1111 (all 32 bits set).
  2. Shift it left by one bit: 1111…1110.
  3. Arithmetically shift the original value right by 31 bits: 1111…1111.
  4. XOR the two results: 0000…0001.

In this way, -1 is converted to 1; varint compression then strips the high zero bits, and the 4-byte -1 ends up as the 1-byte 1.

Zigzag core codec algorithm:

Encode: (n << 1) ^ (n >> 31)
Decode: ((unsigned int) n >> 1) ^ -(n & 1)
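
The same codec written out in Go (32-bit version; the encode relies on n >> 31 being an arithmetic shift on a signed int32, and the function names are just for this sketch):

```go
package main

import "fmt"

// zigzagEncode maps signed integers to unsigned ones so that values close to
// zero, positive or negative, stay small: 0->0, -1->1, 1->2, -2->3, ...
func zigzagEncode(n int32) uint32 {
	return uint32((n << 1) ^ (n >> 31)) // n >> 31 is all 0s or all 1s (arithmetic shift)
}

// zigzagDecode reverses the mapping: logical shift right by one, then flip all
// bits back when the lowest bit (the original sign) was set.
func zigzagDecode(u uint32) int32 {
	return int32(u>>1) ^ -int32(u&1)
}

func main() {
	for _, n := range []int32{0, -1, 1, -2, 2, -64, 63} {
		e := zigzagEncode(n)
		fmt.Printf("%d -> %d -> %d\n", n, e, zigzagDecode(e))
	}
}
```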