Most of the application layer protocols we deal with every day, such as HTTP, WebSocket, IMAP/POP and the MySQL protocol, are built on top of TCP (TCP is not the only transport layer protocol, but it is the one these use). I don't know what mental model you have of an application layer protocol, but mine is: 1. it defines the format of a message (how it starts, how it ends, and the rules for serialization and deserialization); 2. it implements 1 on top of the transport layer interface exposed through system calls. Today I came across Alibaba's Node.js implementation of a communication layer protocol, Bolt, and I wanted to learn from its source code how Bolt defines its own protocol. I list the questions I had before reading, so this is a reading guided by those questions (and by my own biases) rather than a complete analysis. I also ran into a few key Node.js APIs and modules during the analysis, which are briefly described here.

Build a communication service based on NodeJS

I used to have a misconception: that an application layer protocol had to deal with TCP's own packet format, a pile of raw 0s and 1s. That is not the case. The TCP layer is already encapsulated for us, and Node.js exposes it through the net module (built on the libuv core library). Even without any application protocol, messages can be sent and received normally; an application layer protocol only has to standardize what the two sides say to each other. Let's try building a TCP service in Node.js:

    // server side (its own file)
    const net = require('net');
    const HOST = ''; // the host value is missing in the source
    const PORT = 3002;

    // Create a TCP service instance
    const server = net.createServer();

    // Listen on PORT
    server.listen(PORT, HOST);

    server.on('listening', () => {
      console.log(`Service started on ${HOST}:${PORT}`);
    });

    server.on('connection', socket => {
      // The 'data' event delivers what the client sent
      socket.on('data', buffer => {
        const msg = buffer.toString();
        console.log(msg);
        // Write data back to the client
        socket.write(Buffer.from('hello ' + msg));
      });
    });

    // client side (its own file)
    const net = require('net');
    // the host value is missing in the source
    const client = net.createConnection({ host: '', port: 3002 });

    client.on('connect', () => {
      // Send data to the server
      client.write('Nodejs technology stack');
    });

    client.on('data', buffer => {
      console.log(buffer.toString());
    });

Bingo: we get an actual Buffer object, and the toString method that Buffer provides gives us the string the client sent. In other words, an application layer protocol built on TCP does not have to handle most of the tedious details of the communication itself; it only has to standardize the bytes that TCP carries.
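
One thing a protocol has to standardize is where a message ends, because TCP delivers a byte stream: a single 'data' event may carry part of a message or several messages at once. The sketch below assumes a made-up framing (a 4-byte big-endian length followed by the payload). It is not Bolt's format, and encodeFrame and FrameDecoder are hypothetical names used only for illustration.

```javascript
'use strict';

// Hypothetical framing: a 4-byte big-endian length, then the payload.
function encodeFrame(payload) {
  const body = Buffer.from(payload);
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Collects incoming chunks and returns every complete frame seen so far.
class FrameDecoder {
  constructor() {
    this.pending = Buffer.alloc(0);
  }

  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const frames = [];
    while (this.pending.length >= 4) {
      const size = this.pending.readUInt32BE(0);
      if (this.pending.length < 4 + size) break; // wait for the rest
      frames.push(this.pending.subarray(4, 4 + size));
      this.pending = this.pending.subarray(4 + size);
    }
    return frames;
  }
}

// Two messages arrive split at an arbitrary point, as TCP may deliver them.
const wire = Buffer.concat([encodeFrame('hello'), encodeFrame('bolt')]);
const frameDecoder = new FrameDecoder();
const first = frameDecoder.push(wire.subarray(0, 6)); // incomplete frame
const second = frameDecoder.push(wire.subarray(6));   // the remainder
console.log(first.length, second.map(frame => frame.toString()));
```

In a real service, push would be called from the socket's 'data' handler, and each returned frame would be handed to the deserializer.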


The Bolt protocol is much simpler than HTTP (Bolt is mostly used for RPC calls), which also makes its source code much easier for me to read. Its fields are:

    proto: protocol identifier; Bolt v1 is 0x01, Bolt v2 is 0x02
    ver1: version of the Bolt protocol
    type: request / response / request oneway
    cmdcode: request / response / heartbeat; overlaps with type
    ver2: version of the application layer protocol (currently unused)
    requestId: unique ID of the packet
    codec: body serialization; hessian / hessian2 / protobuf are currently supported
    switch: whether the CRC32 check is enabled
    headerLen: length of the custom header
    contentLen: length of the content
    CRC32: CRC32 value computed over the entire packet (supported when ver1 > 1)
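
To make the mapping between fields and bytes concrete, here is a minimal sketch that reads a Bolt v1 request header with Node's Buffer methods. The field widths are taken from the encode excerpt quoted later in this article (1, 1, 2, 1, 4, 1, 4 bytes, then 2 + 2 + 4 bytes of lengths). Big-endian byte order and unsigned lengths are assumptions on my part, and decodeRequestHeader is a hypothetical helper, not part of the Bolt library.

```javascript
'use strict';

// Field widths follow the encoder excerpt in this article (Bolt v1 request).
// Byte order is assumed to be big-endian.
function decodeRequestHeader(buf) {
  return {
    proto: buf.readUInt8(0),
    type: buf.readUInt8(1),
    cmdcode: buf.readUInt16BE(2),
    ver2: buf.readUInt8(4),
    requestId: buf.readInt32BE(5),
    codec: buf.readUInt8(9),
    timeout: buf.readInt32BE(10),
    classLen: buf.readUInt16BE(14),
    headerLen: buf.readUInt16BE(16),
    contentLen: buf.readInt32BE(18),
  };
}

// Build a sample header by hand; every value here is illustrative only.
const sample = Buffer.alloc(22);
sample.writeUInt8(0x01, 0);     // proto
sample.writeUInt8(1, 1);        // type
sample.writeUInt16BE(1, 2);     // cmdcode
sample.writeUInt8(0x01, 4);     // ver2
sample.writeInt32BE(42, 5);     // requestId
sample.writeUInt8(1, 9);        // codec
sample.writeInt32BE(3000, 10);  // timeout
sample.writeUInt16BE(10, 14);   // classLen
sample.writeUInt16BE(0, 16);    // headerLen
sample.writeInt32BE(128, 18);   // contentLen

const header = decodeRequestHeader(sample);
console.log(header);
```

Once the three lengths are known, the class name, custom header and content can be sliced out of the bytes that follow the fixed part.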

The purpose of this article is to analyze how the specification is implemented, so the content of the specification itself is not the focus here; see the official specification for the details.


Here I mainly run the demo to follow the complete execution path through the source code and analyze how Bolt implements the specification. Bolt mainly solves two problems: 1. mapping between Buffer objects and the specification; 2. serialization of the body. Bolt's client code is example/client.js:

    const net = require('net');
    const pump = require('pump');
    const protocol = require('../lib');

    // Note: `options` is not defined in this excerpt

    // Create the TCP connection (the host value is missing in the source)
    const socket = net.connect(12200, '');

    // Instantiate the encoder and decoder objects
    const encoder = protocol.encoder(options);
    const decoder = protocol.decoder(options);

    // Pipe encoder -> socket -> decoder
    pump(encoder, socket, decoder, err => {
      console.log(err);
    });

    // Send a request
    encoder.writeRequest(1, {
      args: [{
        $class: 'java.lang.String',
        $: 'Peter',
      }],
      serverSignature: 'com.alipay.sofa.rpc.quickstart.HelloService:1.0',
      methodName: 'sayHello',
      timeout: 3000,
    });

What is hard to grasp at first is that net is stream-based: once streams are piped together, data moves from one stream to the next automatically, with no additional event registration. Sending a request calls writeRequest in lib/encoder.js:

    writeRequest(id, req, callback) {
      this._writePacket({
        packetId: id,
        packetType: 'request',
        req,
        meta: this._createMeta(this.encodeOptions),
      }, callback);
    }

    _writePacket(packet, callback = noop) {
      // Call the encode method for the packet type (four types:
      // 'request', 'response', 'heartbeat', 'heartbeatAck'),
      // which returns the serialized Buffer
      const buf = this['_' + packet.packetType + 'Encode'](packet);
      // this.write streams the serialized Buffer into the socket,
      // which triggers socket.write to send it
      this._limited = !this.write(buf, err => {
        callback(err, packet);
      });
    }

_writePacket first calls the encode method, which writes the data into a Buffer object in the order the specification defines, using the methods of the byte library:

    exports.encode = (obj, options) => {
      byteBuffer.reset();
      byteBuffer.put(0x01); // bp=1 means bolt
      byteBuffer.put(options.rpcType);
      byteBuffer.putShort(options.cmdCode);
      byteBuffer.put(0x01);
      byteBuffer.putInt(/* argument missing in the source */);
      byteBuffer.put(Constants.codecName2Code[options.codecType]);
      if (options.rpcType === RpcCommandType.RESPONSE) {
        byteBuffer.putShort(obj.responseStatus);
      } else {
        byteBuffer.putInt(obj.timeout || 0);
      }

      // Reserve 8 bytes for the three lengths; they are filled in below
      const offset = byteBuffer.position();
      byteBuffer.skip(8);

      let start = byteBuffer.position();
      obj.serializeClazz(byteBuffer);
      byteBuffer.putShort(offset, byteBuffer.position() - start);

      start = byteBuffer.position();
      obj.serializeHeader(byteBuffer);
      byteBuffer.putShort(offset + 2, byteBuffer.position() - start);

      start = byteBuffer.position();
      obj.serializeContent(byteBuffer);
      byteBuffer.putInt(offset + 4, byteBuffer.position() - start);

      return byteBuffer.array();
    };

This completes the sending path. One important question remains: how does the byte library insert and extract protocol data according to the specification? (See the description of byte below.)


pump is a small Node.js module that pipes streams together and destroys all of them when one of them closes or errors; an optional callback runs when the pipeline has finished. Note: a plain pipe only forwards data from one stream to the next, and if the destination closes or errors, the source is not destroyed automatically. pump watches every stream in the chain so that the whole chain is torn down together.
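
The teardown behaviour can be tried without installing anything. This sketch uses Node's built-in stream.pipeline, which behaves much like pump, rather than pump itself; the three PassThrough streams are stand-ins for the encoder, the socket and the decoder, and the `result` object exists only to record what the callback received.

```javascript
'use strict';
const { PassThrough, pipeline } = require('stream');

// Stand-ins for encoder, socket and decoder.
const source = new PassThrough();
const middle = new PassThrough();
const sink = new PassThrough();

const result = { error: null };

pipeline(source, middle, sink, err => {
  // Called once, when the pipeline has finished or failed.
  result.error = err;
  console.log('pipeline ended:', err ? err.message : 'ok');
  console.log('source destroyed:', source.destroyed);
  console.log('sink destroyed:', sink.destroyed);
});

// Simulate a failure in the middle of the chain, e.g. a dropped connection.
middle.destroy(new Error('connection lost'));
```

With a bare source.pipe(middle).pipe(sink), the same failure would leave source open, which is the gap pump was written to close.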


The byte library keeps an internal offset. Each put or get call made without an explicit index reads or writes at that offset and then moves it forward by the size of the type:

    // numbers is the object defined in number.js
    Object.keys(numbers).forEach(function(type) {
      const putMethod = 'put' + type;
      const getMethod = 'get' + type;
      const handles = numbers[type];
      const size = handles.size;

      ByteBuffer.prototype[putMethod] = function(index, value) {
        // Called as put(index, value) or as put(value)
        if (value === undefined) {
          // Only a value was passed: write at the current offset and advance it
          value = index;
          index = this._offset;
          this._offset += size;
          this._checkSize(this._offset);
        }
        const handle = this._order === BIG_ENDIAN
          ? handles.writeBE
          : handles.writeLE;
        this._bytes[handle](value, index);
        return this;
      };

      ByteBuffer.prototype[getMethod] = function(index) {
        if (typeof index !== 'number') {
          // No index was passed: read at the current offset and advance it
          index = this._offset;
          this._offset += size;
        }
        const handle = this._order === BIG_ENDIAN
          ? handles.readBE
          : handles.readLE;
        return this._bytes[handle](index);
      };
    });
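
The two call shapes (write at the cursor, or write at a given index) are what allow the encoder to reserve room for a length, serialize the body, and then go back and fill the length in. A minimal sketch of that pattern follows. MiniByteBuffer is a hypothetical stand-in written for this article, not the byte library's real API surface, and it hard-codes big-endian order.

```javascript
'use strict';

// A stripped-down stand-in for the byte library, for illustration only.
class MiniByteBuffer {
  constructor(size = 64) {
    this._bytes = Buffer.alloc(size);
    this._offset = 0;
  }

  // Write one byte at the cursor and advance it.
  put(value) {
    this._bytes.writeUInt8(value, this._offset);
    this._offset += 1;
    return this;
  }

  // One argument: write at the cursor and advance it.
  // Two arguments: write at `index` and leave the cursor alone.
  putShort(index, value) {
    if (value === undefined) {
      value = index;
      index = this._offset;
      this._offset += 2;
    }
    this._bytes.writeInt16BE(value, index);
    return this;
  }

  skip(count) {
    this._offset += count;
    return this;
  }

  position() {
    return this._offset;
  }

  // Return only the bytes written so far.
  array() {
    return this._bytes.subarray(0, this._offset);
  }
}

const buf = new MiniByteBuffer();
buf.put(0x01);                      // a one-byte marker
const lengthSlot = buf.position();
buf.skip(2);                        // reserve room for a length not yet known
const start = buf.position();
Buffer.from('peter').forEach(byte => buf.put(byte)); // the "body"
buf.putShort(lengthSlot, buf.position() - start);    // backfill the length

console.log(buf.array());
```

The body length (5) ends up in the two bytes right after the marker, even though it was only known after the body had been written.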

Source code address:…

Questions recorded

1. Bolt never explicitly calls net's send interface after creating the socket on the client side, so when is the data actually sent? This question was painful, and I had no clue for a long time. The answer is that net is stream-based: when data flows from the encoder into the socket through the pipe, the socket's write is called automatically.
2. Why does pump bind three nodes (when the client receives data, it should not pass through the encoder)? The socket is a duplex stream: outgoing data only travels from the encoder into the socket's writable side, and incoming data travels from the socket's readable side to the decoder.
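
Both answers can be reproduced with core streams alone. In this sketch a hand-written Duplex stands in for the socket and two PassThrough streams stand in for Bolt's encoder and decoder; fakeSocket, sent and received are hypothetical names, and the canned reply only imitates a server.

```javascript
'use strict';
const { Duplex, PassThrough } = require('stream');

const sent = [];
const received = [];

// Stand-in for the socket: the writable side records outgoing chunks,
// and the readable side emits a canned reply for each of them.
const fakeSocket = new Duplex({
  write(chunk, encoding, callback) {
    sent.push(chunk.toString());
    this.push('reply to ' + chunk.toString());
    callback();
  },
  read() {
    // Data is pushed from write(); nothing to do on demand.
  },
  final(callback) {
    this.push(null); // end the readable side once writing is finished
    callback();
  },
});

const encoder = new PassThrough();
const decoder = new PassThrough();
decoder.on('data', chunk => received.push(chunk.toString()));
decoder.on('end', () => console.log({ sent, received }));

// encoder -> socket (writable side), socket (readable side) -> decoder
encoder.pipe(fakeSocket).pipe(decoder);

// No explicit fakeSocket.write() call: the pipe performs it.
encoder.write('request-1');
encoder.end();
```

The request never touches the decoder and the reply never touches the encoder, even though all three streams sit in one chain.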

References

1. Source repository:…
2. Getting started with the Node.js net module to build TCP network services:…