This article is shared under a Creative Commons Attribution 4.0 International license, BY Troland.

For updates to this series, see the GitHub repository here.

This is chapter 5 of How JavaScript Works.

We will now delve into the world of communication protocols, mapping and discussing their characteristics and internals. We’ll give a quick comparison of WebSockets and HTTP/2. At the end of this article, we will share some insights on how to choose the right network protocol.

Introduction

Complex web applications are now rich in functionality, thanks to the dynamic interaction of the web. And that’s not surprising – it’s been a long time since the Internet was born.

The web was not originally designed to support such dynamic and complex applications. It was conceived as a large collection of HTML pages linked to one another, which is where the very idea of a “web” of information comes from. Everything was built around the HTTP request/response pattern: the client loads a page, and then nothing happens until the user clicks a link and navigates to the next page.

AJAX was introduced around 2005, and many people started exploring ways to make communication between client and server two-way. Still, all HTTP communication was initiated by the client, so new data could be loaded from the server only after a user action or by polling periodically.

Making HTTP support two-way communication

Technologies that allow the server to push data to the client proactively have been around for some time, for example “Push” and “Comet”.

Long polling is one of the most common hacks for creating the illusion that the server is sending data to the client. With long polling, the client opens an HTTP connection to the server, and the server keeps it open until it has a response to send. Whenever the server has new data, it sends it as the response.

Let’s take a look at a simple long polling snippet:

(function poll() {
  setTimeout(function() {
    $.ajax({
      url: 'https://api.example.com/endpoint',
      success: function(data) {
        // Process `data`
        // ...

        // Recursively schedule the next poll
        poll();
      },
      dataType: 'json'
    });
  }, 10000);
})();

This is basically a self-executing function that runs automatically the first time. It waits ten seconds, makes an asynchronous Ajax call to the server, and calls itself again from the success callback once the request completes.
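The snippet above asks the server on a fixed timer. In a stricter long-polling setup, the server holds each request open until it has something to say, and the client simply asks again as soon as an answer arrives. Here is a minimal sketch of that loop. The names `longPoll`, `request`, `onData`, and `shouldContinue` are made up for this example, and `request` stands in for whatever HTTP call you use.

```javascript
// Hypothetical helper, not part of any library.
// `request` must return a Promise that settles when the server answers.
async function longPoll(request, onData, shouldContinue, retryDelayMs = 1000) {
  while (shouldContinue()) {
    try {
      // The server is assumed to keep this request pending until it has news.
      const data = await request();
      onData(data);
    } catch (err) {
      // Wait a little before retrying so a failing server is not hammered.
      await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
    }
  }
}
```

In a browser, `request` could be something like `() => fetch(url).then((res) => res.json())`, assuming the endpoint returns JSON.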

Other techniques involve Flash, XHR multipart requests, and so-called htmlfiles.

All of these workarounds share one problem: they carry the overhead of HTTP, which makes them a poor fit for low-latency applications. Think of a first-person shooter in the browser or any other online game with a real-time component.

The emergence of WebSockets

The WebSocket specification defines an API for establishing a “socket” connection between a web browser and a server. In plain terms, there is a persistent connection between the client and the server, and both sides can start sending data at any time.

The client creates a WebSocket connection through the WebSocket handshake process. In this process, the client first makes a regular HTTP request to the server. The request contains an Upgrade header notifying the server that the client wants to establish a WebSocket connection.

Let’s look at how to create a WebSocket connection on the client side:

var socket = new WebSocket('ws://websocket.example.com');

WebSocket URLs use the ws scheme. There is also wss for secure WebSocket connections, which is the equivalent of HTTPS.

This line only starts the process of opening a WebSocket connection to websocket.example.com.

Here is a simplified example of the initial request headers:

GET ws://websocket.example.com/ HTTP/1.1
Origin: http://example.com
Connection: Upgrade
Host: websocket.example.com
Upgrade: websocket

If the server supports the WebSocket protocol, it agrees to the upgrade and signals this by returning an Upgrade header in the response.

Let’s take a look at a Node.js implementation:

// We'll use https://github.com/theturtle32/WebSocket-Node for the WebSocket implementation
var WebSocketServer = require('websocket').server;
var http = require('http');

var server = http.createServer(function(request, response) {
  // Process HTTP requests
});
server.listen(1337, function() {});

// Create the WebSocket server on top of the HTTP server
var wsServer = new WebSocketServer({
  httpServer: server
});

// WebSocket server
wsServer.on('request', function(request) {
  var connection = request.accept(null, request.origin);

  // This is the most important callback: all messages from users are handled here
  connection.on('message', function(message) {
    // Handle WebSocket messages
  });

  connection.on('close', function(connection) {
    // The connection was closed
  });
});

After the connection is established, the server replies with an upgrade:

HTTP/1.1 101 Switching Protocols
Date: Wed, 25 Oct 2017 10:07:34 GMT
Connection: Upgrade
Upgrade: WebSocket

Once the connection is established, the client WebSocket instance’s open event is triggered.

var socket = new WebSocket('ws://websocket.example.com');

// Log a message when the WebSocket connection is opened
socket.onopen = function(event) {
  console.log('WebSocket is connected.');
};

Now that the handshake is over, the original HTTP connection is replaced with a WebSocket connection that uses the same TCP/IP connection underneath. Now both sides can start sending data.
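The simplified headers shown earlier leave out one detail of the real handshake. Per RFC 6455, the client also sends a Sec-WebSocket-Key header, and the server shows that it understood the request by returning a Sec-WebSocket-Accept header derived from that key. A library such as WebSocket-Node takes care of this for you. The sketch below only illustrates the derivation with Node's built-in crypto module, and the function name `computeAcceptValue` is made up for this example.

```javascript
const nodeCrypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: derives the Sec-WebSocket-Accept value
// from the Sec-WebSocket-Key header sent by the client.
function computeAcceptValue(secWebSocketKey) {
  return nodeCrypto
    .createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

// Sample key taken from the handshake example in RFC 6455.
console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
```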

With WebSockets, you can send as much data as you like without the overhead associated with traditional HTTP requests. Data is transferred over a WebSocket as messages, and each message consists of one or more frames containing the data (the payload) you are sending. To ensure that the message can be reassembled properly when it arrives, each frame is prefixed with a short header describing the payload (2 to 14 bytes, depending on the payload length and on whether the frame is masked). This frame-based messaging system helps reduce the amount of non-payload data transferred, which significantly reduces latency.

Note: the client is notified of a new message only once all of its frames have been received and the original message payload has been reassembled.

WebSocket URLs

We mentioned briefly that WebSockets introduce a new URL scheme. In fact, they introduce two: ws:// and wss://.

URLs have scheme-specific syntax. WebSocket URLs are special in that they do not support anchors (#sample_anchor).

Otherwise, the same rules apply to WebSocket URLs as to HTTP URLs. ws is unencrypted and defaults to port 80, while wss requires TLS encryption and defaults to port 443.
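These rules can be checked with the standard URL class, which is available as a global in both browsers and Node.js. The following is a small sketch under that assumption. `describeWebSocketUrl` is a hypothetical helper written for this article, not part of the WebSocket API.

```javascript
// Hypothetical helper built on the standard URL class.
function describeWebSocketUrl(address) {
  const url = new URL(address);
  if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
    throw new Error('Not a WebSocket URL: ' + address);
  }
  if (url.hash !== '') {
    throw new Error('WebSocket URLs must not contain an anchor: ' + address);
  }
  const secure = url.protocol === 'wss:';
  // url.port is an empty string when the default port of the scheme is used.
  const port = url.port === '' ? (secure ? 443 : 80) : Number(url.port);
  return { secure, host: url.hostname, port, path: url.pathname + url.search };
}

console.log(describeWebSocketUrl('wss://websocket.example.com/chat'));
```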

The framing protocol

Let’s take a closer look at the frame protocol. Here’s what the RFC offers:

      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-------+-+-------------+-------------------------------+
     |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
     |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
     |N|V|V|V|       |S|             |   (if payload len==126/127)   |
     | |1|2|3|       |K|             |                               |
     +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
     |     Extended payload length continued, if payload len == 127  |
     + - - - - - - - - - - - - - - - +-------------------------------+
     |                               |Masking-key, if MASK set to 1  |
     +-------------------------------+-------------------------------+
     | Masking-key (continued)       |          Payload Data         |
     +-------------------------------- - - - - - - - - - - - - - - - +
     :                     Payload Data continued ...                :
     + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
     |                     Payload Data continued ...                |
     +---------------------------------------------------------------+

As of the WebSocket version specified by the RFC, there is only a header in front of each packet. It is quite a complex header, however. Here are its building blocks:

  • fin (1 bit): indicates whether this is the final frame of the message. Most of the time the message fits into a single frame, so this bit is usually set. Experiments show that Firefox starts a second frame after 32K.

  • rsv1, rsv2, rsv3 (1 bit each): must be 0 unless a negotiated extension defines a meaning for non-zero values. If a non-zero value is received and no negotiated extension defines its meaning, the receiving endpoint must fail the connection.

  • opcode (4 bits): says what the frame represents. The following values are currently in use:

    0x00: This frame continues the payload of the previous frame.

    0x01: This frame contains text data.

    0x02: This frame contains binary data.

    0x08: This frame terminates the connection.

    0x09: This frame is a ping.

    0x0A: This frame is a pong.

    (As you can see, quite a few values are unused; they are reserved for future use.)

  • mask (1 bit): indicates whether the payload is masked. As it stands right now, every message sent from the client to the server must be masked, and the specification requires the connection to be terminated if it is not.

  • payload_len (7 bits): the length of the payload. WebSocket frames come in the following length brackets:

    0-125 is the length of the payload itself. 126 means the following two bytes hold the length, and 127 means the following eight bytes hold the length. So the payload length is encoded in roughly 7, 16, or 64 bits.

  • masking-key (32 bits): every frame sent from the client to the server is masked with a 32-bit value that is contained in the frame.

  • payload: the actual data, which is most likely masked. Its length is given by payload_len.
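To make the header layout concrete, here is a rough sketch of a parser for a single frame, written against Node's Buffer. It is an illustration only: `parseFrame` is a name invented for this article, it assumes the whole frame is already in memory, and it does not validate the reserved bits or the opcode.

```javascript
// Hypothetical helper: parses one complete WebSocket frame held in a Buffer.
function parseFrame(buf) {
  const fin = (buf[0] & 0x80) !== 0;     // first bit of byte 0
  const opcode = buf[0] & 0x0f;          // low 4 bits of byte 0
  const masked = (buf[1] & 0x80) !== 0;  // first bit of byte 1
  let payloadLength = buf[1] & 0x7f;     // low 7 bits of byte 1
  let offset = 2;

  if (payloadLength === 126) {
    // The next two bytes hold the real length.
    payloadLength = buf.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    // The next eight bytes hold the real length. Converting to a Number
    // is only exact below 2^53, which is plenty for a sketch.
    payloadLength = Number(buf.readBigUInt64BE(offset));
    offset += 8;
  }

  let payload;
  if (masked) {
    const maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
    // Copy the payload, then undo the mask by XOR-ing with the 4-byte key.
    payload = Buffer.from(buf.subarray(offset, offset + payloadLength));
    for (let i = 0; i < payload.length; i++) {
      payload[i] ^= maskingKey[i % 4];
    }
  } else {
    payload = buf.subarray(offset, offset + payloadLength);
  }

  return { fin, opcode, masked, payloadLength, payload };
}

// A masked text frame carrying "Hello" (sample bytes from RFC 6455).
const helloFrame = Buffer.from([
  0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58,
]);
console.log(parseFrame(helloFrame).payload.toString('utf8')); // Hello
```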

Why is WebSocket frame-based and not stream-based? I’m as confused as you are, and I’d love to learn more. If you have any ideas, feel free to add comments and resources in the comments section below. In addition, HackerNews has a discussion about this.

Data on the frames

As mentioned above, data can be fragmented into multiple frames. The first frame that carries the data has an opcode indicating what kind of data is being transmitted. This is necessary because JavaScript had practically no support for binary data when work on the specification started. 0x01 indicates UTF-8 encoded text data, and 0x02 indicates binary data. Most people transfer JSON, in which case you would choose the text opcode. Binary data is represented in the browser as a Blob.

The API for transferring data via WebSocket is very simple:

var socket = new WebSocket('ws://websocket.example.com');
socket.onopen = function(event) {
  socket.send('Some message'); // Send data to the server
};

When the WebSocket receives data (on the client side), a message event is fired. The event has a data property that holds the contents of the message.

// Handle messages sent by the server
socket.onmessage = function(event) {
  var message = event.data;
  console.log(message);
};

You can easily inspect the data in each frame of a WebSocket connection using the Network tab in Chrome DevTools.
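Since JSON over text frames is the common case, a thin pair of helpers can keep the send and receive sides symmetric. This is only a sketch: `sendJson` and `readMessage` are hypothetical names, and it assumes every text message on the connection carries JSON.

```javascript
// Hypothetical helpers; `socket` is anything with a send() method,
// such as a browser WebSocket.
function sendJson(socket, value) {
  // Passing a string to send() produces a text frame (opcode 0x01).
  socket.send(JSON.stringify(value));
}

function readMessage(event) {
  if (typeof event.data === 'string') {
    // Text frame: assumed here to carry JSON.
    return { kind: 'json', value: JSON.parse(event.data) };
  }
  // Binary frame: a Blob by default, or an ArrayBuffer
  // if socket.binaryType was set to 'arraybuffer'.
  return { kind: 'binary', value: event.data };
}
```

With a real socket, you would call `readMessage(event)` inside the `onmessage` handler and branch on `kind`.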

Data fragmentation

Payload data can be split into multiple individual frames. The receiving end is supposed to buffer them until the fin bit is set. So you could transmit the string “Hello World” in 11 packets of 6 (header length) + 1 bytes each. Fragmentation is not allowed for control frames. However, the specification wants you to be able to handle control frames interleaved with the fragments of a message, so that a ping, for example, does not have to wait behind a large message.

The general logic for joining frames is as follows:

  • Receive the first frame
  • Remember its opcode
  • Concatenate the frame payloads until the fin bit is set
  • Assert that the opcode of every subsequent frame is 0
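The steps above can be sketched roughly as follows. The function name `reassembleMessage` and the frame shape `{ fin, opcode, payload }` are assumptions made for this example, and the payloads are expected to be unmasked Buffers already.

```javascript
// Hypothetical helper. Each frame is assumed to look like
// { fin: boolean, opcode: number, payload: Buffer }.
function reassembleMessage(frames) {
  let opcode = null;         // opcode remembered from the first data frame
  const parts = [];
  const controlFrames = [];  // ping, pong, and close frames seen along the way

  for (const frame of frames) {
    if (frame.opcode >= 0x8) {
      // Control frames may be interleaved with the fragments of a message.
      controlFrames.push(frame);
      continue;
    }
    if (opcode === null) {
      if (frame.opcode === 0x0) {
        throw new Error('Continuation frame without a first frame');
      }
      opcode = frame.opcode;
    } else if (frame.opcode !== 0x0) {
      throw new Error('Expected a continuation frame (opcode 0)');
    }
    parts.push(frame.payload);
    if (frame.fin) {
      return { opcode, payload: Buffer.concat(parts), controlFrames };
    }
  }
  throw new Error('Message is incomplete: no frame had the fin bit set');
}
```

Collecting the control frames separately is just one possible design. A real implementation would more likely react to them immediately, for instance by answering a ping.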

The main purpose of fragmentation is to allow sending a message whose size is unknown when the message is started. With fragmentation, a server can pick a buffer of reasonable size and write a fragment to the network whenever the buffer is full. A secondary use case for fragmentation is multiplexing: it is undesirable for a large message on one logical channel to monopolize the whole output channel, so the multiplexing layer needs to be free to split the message into smaller fragments to share the output channel more fairly.
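Going the other way, splitting a message into fragments mostly comes down to setting the opcode and the fin bit correctly. Below is a small sketch that leaves headers and masking out. `fragmentMessage` is a name made up for this article, and the frame objects are the same illustrative shape as before, not a real library type.

```javascript
// Hypothetical helper: splits a payload into frame descriptors of at most
// `maxSize` bytes each.
function fragmentMessage(payload, opcode, maxSize) {
  if (maxSize < 1) throw new Error('maxSize must be at least 1');
  const frames = [];
  for (let start = 0; start < payload.length; start += maxSize) {
    const end = Math.min(start + maxSize, payload.length);
    frames.push({
      fin: end === payload.length,         // only the last fragment sets fin
      opcode: start === 0 ? opcode : 0x0,  // later fragments are continuations
      payload: payload.subarray(start, end),
    });
  }
  // An empty message is still sent as one final frame.
  if (frames.length === 0) frames.push({ fin: true, opcode, payload });
  return frames;
}

// The "Hello World" case mentioned earlier: one byte per frame.
const helloWorldFrames = fragmentMessage(Buffer.from('Hello World'), 0x1, 1);
console.log(helloWorldFrames.length); // 11
```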

Heartbeats

At any point after the handshake, either the client or the server can choose to send a ping to the other party. When a ping is received, the recipient must send back a pong as soon as possible. This is a heartbeat. You can use it to make sure that the client is still connected.

A ping or a pong is just a regular frame, but it is a control frame. Pings have an opcode of 0x9, and pongs have an opcode of 0xA. When you get a ping, send back a pong with exactly the same payload data as the ping (for pings and pongs, the maximum payload length is 125 bytes). You might also get a pong without ever having sent a ping. Ignore it if that happens.

Heartbeats can be very useful. Some services, such as load balancers, terminate idle connections. In addition, the receiving side cannot tell whether the remote side has gone away; only on the next send would you realize that something went wrong.
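To make the ping/pong rules concrete, here is a sketch that builds the raw bytes of a control frame as a server would send it (unmasked). The helper names are invented for this example. In practice a WebSocket library builds these frames for you.

```javascript
const OPCODE_PING = 0x9;
const OPCODE_PONG = 0xA;

// Hypothetical helper: builds an unmasked control frame.
function buildControlFrame(opcode, payload) {
  if (payload.length > 125) {
    throw new Error('Control frame payloads are limited to 125 bytes');
  }
  // Byte 0: fin bit set (control frames are never fragmented) plus the opcode.
  // Byte 1: mask bit clear, followed by the 7-bit payload length.
  const header = Buffer.from([0x80 | opcode, payload.length]);
  return Buffer.concat([header, payload]);
}

// Hypothetical helper: answers a ping with a pong that echoes its payload.
function pongFor(pingPayload) {
  return buildControlFrame(OPCODE_PONG, pingPayload);
}

console.log(pongFor(Buffer.from('hi')).toString('hex')); // 8a026869
```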

Error handling

You can handle errors by listening for error events.

Like this:

var socket = new WebSocket('ws://websocket.example.com');

// Handle any errors that occur
socket.onerror = function(error) {
  // Pass the event as a separate argument so it is not stringified to "[object Event]"
  console.log('WebSocket Error:', error);
};

Closing the connection

To close the connection, either the client or the server sends a control frame with opcode 0x8. Upon receiving such a frame, the other peer sends a close frame in response. The first peer then closes the connection. Any further data received after the connection has been closed is discarded.

This is how you initiate the closing of a WebSocket connection from the client:

// Close the connection if it is open
if (socket.readyState === WebSocket.OPEN) {
    socket.close();
}

Similarly, to run any cleanup after closing the connection, you can add an event listener to the close event:

// Run any necessary cleanup
socket.onclose = function(event) {
  console.log('Disconnected from WebSocket.');
};

The server has to listen for the close event in order to handle it if needed:

connection.on('close', function(reasonCode, description) {
  // The connection was closed
});
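The reasonCode and description arguments presumably mirror what RFC 6455 allows in the body of a close frame: an optional 2-byte status code followed by an optional UTF-8 reason. As a sketch of that layout, the hypothetical helper below decodes such a payload. A library normally does this before your callback runs.

```javascript
// Hypothetical helper: decodes the payload of a close frame (opcode 0x8).
function parseClosePayload(payload) {
  if (payload.length === 0) {
    // RFC 6455 reserves 1005 to mean "no status code was present".
    return { code: 1005, reason: '' };
  }
  if (payload.length === 1) {
    throw new Error('A close payload cannot be a single byte');
  }
  return {
    code: payload.readUInt16BE(0),                // 2-byte status code, big-endian
    reason: payload.subarray(2).toString('utf8'), // optional UTF-8 reason text
  };
}

// 0x03e8 is 1000, the status code for a normal closure.
const sampleClose = Buffer.concat([Buffer.from([0x03, 0xe8]), Buffer.from('bye')]);
console.log(parseClosePayload(sampleClose));
```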

WebSockets versus HTTP/2

Although HTTP/2 has a lot to offer, it does not completely replace existing push/streaming technologies.

The first important thing to note about HTTP/2 is that it is not a replacement for all of HTTP. The verbs, status codes, and most of the headers remain the same as today. HTTP/2 is about making the transfer of data on the wire more efficient.

Now, if we compare WebSockets to HTTP/2, we’ll see a lot of similarities.

HTTP/2 introduces Server Push, which lets the server proactively send resources to the client cache. However, it does not allow data to be pushed to the client application itself. Server pushes are processed only by the browser and never reach the application code, meaning there is no API through which the application can be notified of these events.

This is where Server-Sent Events (SSE) come in handy. SSE is a mechanism that lets the server push data to the client asynchronously once the client-server connection has been established. From then on, the server can send data whenever a new chunk becomes available. Think of it as a one-way publish-subscribe model. It also comes with a standard JavaScript client API named EventSource, implemented in most modern browsers as part of the HTML5 standard by the W3C. Note that browsers that do not support the EventSource API natively can be polyfilled.

Because SSE is based on HTTP, it is a natural fit for HTTP/2, and the two can be combined to get the best of both: HTTP/2 provides an efficient transport layer based on multiplexed streams, while SSE provides the API that lets applications receive data pushed by the server.
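On the wire, SSE is just a long-lived HTTP response with the content type text/event-stream, in which each event is a few "field: value" lines followed by a blank line. The sketch below shows how a server might serialize one event. `formatServerSentEvent` is a hypothetical helper, not part of any framework.

```javascript
// Hypothetical helper: serializes one event in the text/event-stream format.
function formatServerSentEvent({ event, id, data }) {
  const lines = [];
  if (event !== undefined) lines.push('event: ' + event);
  if (id !== undefined) lines.push('id: ' + id);
  // Each line of the data gets its own "data:" field.
  for (const line of String(data).split('\n')) {
    lines.push('data: ' + line);
  }
  // A blank line marks the end of the event.
  return lines.join('\n') + '\n\n';
}

process.stdout.write(formatServerSentEvent({ event: 'price', id: 7, data: '{"symbol":"ABC"}' }));
```

On the client, an `EventSource` pointed at the streaming URL would receive this through `addEventListener('price', handler)`, with the JSON string available as `event.data`. The event name and payload here are placeholders.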

To fully understand streams and multiplexing, let’s look at the IETF definition: a “stream” is an independent, bidirectional sequence of frames exchanged between the client and the server within an HTTP/2 connection. One of its main characteristics is that a single HTTP/2 connection can contain multiple concurrently open streams, with either endpoint interleaving frames from multiple streams.

It is important to remember that SSE is based on HTTP. This means that with HTTP/2, not only can several SSE streams be interleaved on a single TCP connection, but so can a combination of several SSE streams (server to client) and several client requests (client to server). Thanks to HTTP/2 and SSE, we now have a pure HTTP two-way connection with a simple API that lets application code register for server pushes. The lack of bidirectional capability has often been seen as SSE’s main drawback compared to WebSockets. Thanks to HTTP/2, this is no longer the case. It opens up the opportunity to skip WebSockets and stick to HTTP-based signaling.
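As a rough illustration of such a two-way channel, the sketch below pairs an EventSource (server to client) with ordinary POST requests (client to server). Everything here is an assumption made for the example: the helper name `createHttpChannel`, the option names, and the URLs are invented, and the EventSource and fetch implementations can be injected so the sketch also runs outside a browser.

```javascript
// Hypothetical helper. `EventSourceImpl` and `fetchImpl` default to the
// browser globals; both URLs are placeholders supplied by the caller.
function createHttpChannel(options) {
  const EventSourceImpl = options.EventSourceImpl || globalThis.EventSource;
  const fetchImpl = options.fetchImpl || globalThis.fetch;

  // Server to client: one long-lived SSE stream.
  const source = new EventSourceImpl(options.eventsUrl);
  source.onmessage = function (event) {
    // Messages are assumed to be JSON in this sketch.
    options.onMessage(JSON.parse(event.data));
  };

  return {
    // Client to server: an ordinary request. Over HTTP/2 it can typically
    // share the TCP connection with the SSE stream when the origin matches.
    send: function (value) {
      return fetchImpl(options.sendUrl, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(value),
      });
    },
    close: function () {
      source.close();
    },
  };
}
```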

Use cases for WebSockets and HTTP/2

WebSockets will certainly survive alongside HTTP/2 + SSE, mainly because the technology is already widely adopted and because, in very specific use cases, it has an advantage over HTTP/2: it was built for bidirectional communication with less overhead (e.g., headers).

Say you want to build a massively multiplayer online game that needs a huge number of messages from both ends of the connection. In such a case, WebSockets will perform much better.

In general, use WebSockets whenever you need a truly low-latency, near-real-time connection between the client and the server. Keep in mind that this may require rethinking how you build your server-side applications, as well as shifting the focus to technologies such as event queues.

If your use case requires displaying real-time market news, market data, chat applications, and so on, HTTP/2 + SSE will give you an efficient bidirectional communication channel while you reap the benefits of staying in the HTTP world:

  • WebSockets can often be a source of pain when it comes to compatibility with existing web infrastructure, because they upgrade an HTTP connection to a completely different protocol that has nothing to do with HTTP.
  • Scalability and security: Network components (firewalls, intrusion detection, load balancers) are built, maintained, and configured with HTTP in mind, and large/critical applications will prefer an environment that is resilient, secure, and scalable.

Again, you have to consider browser compatibility. Check the WebSocket compatibility:

Compatibility is good.

However, the situation with HTTP/2 is not as good:

  • TLS only (which is not so bad)
  • Partial support in Internet Explorer 11, and only on Windows 10
  • Supported in Safari only on OS X 10.11+
  • HTTP/2 is supported only if it can be negotiated via ALPN (which the server needs to support explicitly)

SSE is better supported:

Only Internet Explorer and Edge lack support. (Well, Opera Mini supports neither SSE nor WebSockets, so we can take it out of the equation altogether.) There are some decent polyfills that bring SSE support to IE/Edge.

How does SessionStack choose?

SessionStack uses both WebSockets and HTTP, depending on the use case.

Once you integrate SessionStack into your web application, it starts recording DOM changes, user interactions, JavaScript exceptions, stack traces, failed network requests, and debug messages, allowing you to replay issues in your web app as videos and see everything that happened to the user. This all happens in real time and must have no performance impact on the web application.

This means that you can join a user session in real time while the user is still in the browser. In this scenario we chose HTTP, because no bidirectional communication is needed (the server streams the data to the browser). A WebSocket would be overkill here, and harder to maintain and scale.

However, the SessionStack library integrated into your web app uses a WebSocket (when possible, falling back to HTTP otherwise). It batches data and sends it to our servers, which is also one-way communication. We chose a WebSocket in this case because some of the product features on the roadmap might require bidirectional communication.
