Previously, I was solely responsible for building my company’s message push service and, working with the mobile team, delivered features such as QR code login, order message push, and activity message broadcast. To deepen my understanding of the WebSocket protocol, I studied it by capturing packets. I am sharing my notes here in the hope that they will be helpful to everyone.

Chrome console

(1) Press F12 to open the developer tools, click Network, and select WS in the filter bar.

(2) Refresh the page and a WS connection will appear.

(3) Click the connection to view its details.

Note the information in the red box, which will be explained in detail later.

(4) You can also switch to Frames to view the messages sent and received, but the view is very basic: it shows only the message content, data length, and time.

Fiddler: packet capture debugging tool

(1) Open Fiddler, click “Rules” on the menu bar, and select Customize Rules…

(2) This opens the CustomRules.js file. Add the following code to class Handlers:

static function OnWebSocketMessage(oMsg: WebSocketMessage) {
        // Log Message to the LOG tab
        FiddlerApplication.Log.LogString(oMsg.ToString());
    }

(3) You can now see WebSocket messages in the Log tab on the right side of Fiddler. Client.1 is the first message sent by the client, and the corresponding Server.1 is the first message sent by the server. MessageType: Text indicates an ordinary data message, and Close indicates that the session is being closed.

Message sent by the client:

Message sent by the server:

Then we will see that each session closing is initiated by the client:

Fiddler captures more detail than the Chrome console: it shows whether each message was sent by the client or the server, and the type of each message. But this still doesn’t meet the goal of learning the WebSocket protocol in depth. Fiddler is suited to HTTP and HTTPS; for other protocols such as TCP and UDP, use Wireshark. TCP is a transport layer protocol, mainly concerned with how data is transmitted across the network, while HTTP and WebSocket are application layer protocols, mainly concerned with how data is packaged. Since the application layer wraps data on top of the transport layer, let’s start at the bottom to understand what WebSocket really is and how it works.

Wireshark

Wireshark (formerly known as Ethereal) is network packet analysis software. Its job is to capture network packets and display as much detail about them as possible. Wireshark presents captured packets according to the five-layer TCP/IP model: physical layer, data link layer, network layer, transport layer, and application layer. We focus on the transport layer and the application layer.

TCP Three-way handshake

As we all know, TCP establishes a connection with a three-way handshake. The following figure shows the handshake as captured by Wireshark. These are called data packets, but the three handshake packets carry no application data.

To see the details of each packet, click on the packet in the image above. A few concepts are needed to understand what each packet means:

SYN: the synchronize bit, used to establish a connection.

ACK: the acknowledgement bit. 1 indicates that the packet carries a valid acknowledgement number; 0 indicates that it does not.

PSH: the push bit. When the sender sets PSH to 1, the receiver should deliver the data to the application process as soon as possible.
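As a rough illustration of how these flag bits sit inside the flags byte of a TCP header, here is a minimal Python sketch. The bit values follow the standard TCP header layout; the function name `decode_tcp_flags` and the sample byte values are assumptions made for this example, not something taken from the captures in this article.

```python
# Bit masks of the TCP flags within the flags byte of the TCP header.
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_tcp_flags(flags_byte):
    """Return the names of the flags that are set, lowest bit first."""
    return [name for name, bit in TCP_FLAGS.items() if flags_byte & bit]


# Hypothetical flag bytes for the kinds of packets discussed in this article.
print(decode_tcp_flags(0x02))  # connection request
print(decode_tcp_flags(0x12))  # reply to the connection request
print(decode_tcp_flags(0x10))  # plain acknowledgement
print(decode_tcp_flags(0x18))  # packet that carries data
```

Wireshark does the same decoding when it shows labels such as [SYN], [SYN, ACK], and [PSH, ACK] next to each packet.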

  • The first handshake

Expanding the Transmission Control Protocol (TCP) section, you can see that SYN is set to 1: the client is sending a connection request packet to the server.

  • Second handshake

After receiving the client’s TCP packet, the server knows that the client wants to establish a connection and replies with a TCP packet in which SYN=1 and ACK=1. The server sets the acknowledgement number to the client’s sequence number plus 1.

  • The third handshake

After receiving the packet from the server, the client checks that the acknowledgement number is correct, that is, the sequence number of its first packet plus 1, and that the ACK flag is 1. If both are correct, the client replies with a packet whose ACK flag is 1. The connection is now established and ready to carry data.

A special HTTP request

Next comes an HTTP request (the fourth packet), which confirms that HTTP does indeed run over a TCP connection.

Let’s look at the transport layer (TCP) first: both PSH (the push bit) and ACK are set to 1. PSH is 1 because data is now being sent, and ACK is 1 because the packet also acknowledges what has been received from the other side. PSH is normally 1 only in packets whose data length is not 0; in other words, PSH=1 means that the packet carries real TCP payload.

Now let’s look at the application layer (HTTP). Why is this a special HTTP request? Its headers include Connection: Upgrade and Upgrade: websocket, where Upgrade means upgrading to a newer version of HTTP or switching to a different protocol. WebSocket uses this mechanism to connect to HTTP servers in a compatible way. The WebSocket protocol has two parts: a handshake that establishes the upgraded connection, followed by the actual data transfer. First, the client requests a WebSocket connection by sending the Upgrade: websocket and Connection: Upgrade headers, along with some protocol-specific headers that state the version in use and set up the handshake. If the server supports the protocol, it replies with the same Upgrade: websocket and Connection: Upgrade headers and completes the handshake. After the handshake is complete, data transfer begins. The same information is visible in the Chrome console shown earlier.

Request:

Response: The status code 101 indicates that the server has understood the client’s request. After sending this response, the server switches to the protocol named in the Upgrade header.
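One of the protocol-specific headers in this exchange is worth a closer look. The article does not name it, but RFC 6455 defines a Sec-WebSocket-Key request header and a Sec-WebSocket-Accept response header: the server proves it understood the upgrade by hashing the client’s key together with a fixed GUID. The Python sketch below shows that calculation; the function name `compute_accept` is an assumption for this example, and the sample key is the one used in the RFC, not a value from the capture above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key):
    """Derive the Sec-WebSocket-Accept value from a Sec-WebSocket-Key value."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from RFC 6455; a real client generates a random one.
sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
print("Sec-WebSocket-Accept:", compute_accept(sample_key))
```

You can check a real capture the same way: feed the Sec-WebSocket-Key from the request into the function and compare the result with the Sec-WebSocket-Accept header in the 101 response.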

From this we can conclude that WebSocket is essentially a TCP-based protocol that requires a handshake to establish a connection. The client (browser) first sends a special HTTP request to the web server, and the web server parses the request and sends a reply back to the browser.

The world of WebSocket

Once the handshake is done, communication follows the WebSocket frame format. The server receives the data on its TCP socket and parses it. The frame format is as follows:

First, note that data travels as binary at the physical and data link layers, while at the application layer Wireshark shows it as a stream of bytes in hexadecimal.

The first byte:

FIN: 1 bit, indicating whether the message is finished. 1 means this frame is the end of the message; 0 means more frames follow.

RSV1, RSV2, RSV3: 1 bit each, reserved for extensions. They must be 0 unless an extension has been agreed on.

OPCODE: 4 bits, indicating the type of the frame received. If an unknown OPCODE is received, the receiver must close the connection.

WebSocket data frame OPCODE definitions:

  • 0x0: continuation frame
  • 0x1: text data frame
  • 0x2: binary data frame
  • 0x3-7: currently undefined, reserved for future non-control frames
  • 0x8: connection close
  • 0x9: Ping
  • 0xA: Pong
  • 0xB-F: currently undefined, reserved for future control frames
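The first byte can be taken apart with a few bit masks. The following Python sketch is one possible way to do it, under the bit layout described above; the function name `decode_first_byte`, the dictionary keys, and the sample bytes are assumptions chosen for this example.

```python
# Opcode values defined for WebSocket frames; anything else is reserved.
OPCODE_NAMES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


def decode_first_byte(byte):
    """Split the first byte of a frame into FIN, RSV1-3, and the opcode."""
    opcode = byte & 0x0F  # low 4 bits
    return {
        "fin": bool(byte & 0x80),   # highest bit
        "rsv1": bool(byte & 0x40),
        "rsv2": bool(byte & 0x20),
        "rsv3": bool(byte & 0x10),
        "opcode": opcode,
        "opcode_name": OPCODE_NAMES.get(opcode, "reserved"),
    }


print(decode_first_byte(0x81))  # final frame of a text message
print(decode_first_byte(0x88))  # close frame
print(decode_first_byte(0x01))  # first fragment of a text message, more to follow
```

A first byte of 0x81 is what you will most often see in a capture of ordinary text messages: FIN set, no extension bits, opcode 0x1.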

The second byte:

MASK: 1 bit, indicating whether the payload data is masked. Frames sent by the client must be masked, so this bit is 1 and the receiver has to unmask the data.

Payload length: 7 bits, 7+16 bits, or 7+64 bits. If the 7-bit value is between 0 and 125, it is the actual length of the payload. If the value is 126, the next two bytes, read as a 16-bit unsigned integer, give the actual length of the payload. If the value is 127, the next eight bytes, read as a 64-bit unsigned integer, give the actual length of the payload.

The figure above shows a packet sent by the client to the server. The 7-bit payload length field is 1111110 in binary, which is 126 in decimal, so the next two bytes hold the actual payload length as a 16-bit unsigned integer. Those bytes, marked in red, are 00C1 in hexadecimal, which is 193 in decimal. The actual payload is therefore 193 bytes long.
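The length rules can be sketched in Python as follows. This is a minimal illustration rather than a complete frame parser; the function name `parse_length` is an assumption, and the sample header bytes are built by hand to match the 193-byte example (MASK bit 1, length field 126, extended length 00C1), with 0x81 assumed as the first byte.

```python
import struct


def parse_length(header):
    """Return (masked, payload_length, header_bytes_used) for a frame header."""
    second = header[1]
    masked = bool(second & 0x80)   # highest bit of the second byte
    length = second & 0x7F         # low 7 bits
    used = 2
    if length == 126:
        # Next two bytes: 16-bit unsigned integer, network byte order.
        (length,) = struct.unpack("!H", header[2:4])
        used = 4
    elif length == 127:
        # Next eight bytes: 64-bit unsigned integer, network byte order.
        (length,) = struct.unpack("!Q", header[2:10])
        used = 10
    return masked, length, used


# Hand-built header matching the example: 0xFE = MASK bit 1 plus length field 126.
print(parse_length(bytes([0x81, 0xFE, 0x00, 0xC1])))
```

For a masked frame, the 4-byte masking key follows immediately after the bytes counted in `header_bytes_used`, and the payload follows the key.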

According to our analysis, the WebSocket frame of the client-to-server packet should look like:

Let’s capture and analyze the data packet from the server to the client:

Here the MASK bit in the second byte of the packet sent by the server to the client is 0, which means that frames sent by the server are not masked. Comparing the screenshots of the client and server packets also shows that the client’s data is masked, and so unreadable in the capture, while the server’s data is not. If the server receives an unmasked frame from the client, it closes the connection. Conversely, if the client receives a masked frame from the server, it closes the connection.
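The article does not spell out what masking does to the bytes. According to RFC 6455, a masked frame carries a 4-byte masking key, and each payload byte is XORed with the key byte at position i mod 4; applying the same operation again restores the original data. The Python sketch below illustrates this, with the function name `apply_mask` as an assumption and the key and text borrowed from the masked "Hello" example in the RFC rather than from the capture above.

```python
def apply_mask(payload, masking_key):
    """XOR each payload byte with the masking key byte at index i mod 4."""
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


# Masking key and text from the RFC 6455 example of a masked text frame.
key = bytes([0x37, 0xFA, 0x21, 0x3D])
masked = apply_mask(b"Hello", key)
print(masked.hex())             # what the payload looks like on the wire
print(apply_mask(masked, key))  # unmasking is the same operation
```

Because the key travels in the frame itself, masking hides nothing from anyone who captures the packet; it only changes how the bytes look on the wire, which is why the client payload appears scrambled in Wireshark while the server payload is readable.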

With masking:

Without masking:

According to our analysis, the WebSocket frame of the server to client packet should look like:

TCP KeepAlive

As shown in the preceding figure, TCP keepalive packets always come in pairs: a keepalive probe and a keepalive probe acknowledgement. A keepalive probe takes the acknowledgement number of the previous TCP packet minus 1 as its sequence number and carries a single byte of data, 00, as shown in the following figure:

The TCP keepalive acknowledgement packet confirms the keepalive probe. Its format is as follows:

Since WebSocket runs over TCP sockets, consider this question: how does the server know the order of the messages on a long-lived connection? This is where TCP sequence numbers and acknowledgement numbers come in. The sequence number tracks how much data a side has sent; the acknowledgement number tracks how much data it has received, and therefore which byte it expects next. This is rather abstract, so let’s start with the TCP three-way handshake (see the figure below).

Packet 1: The (relative) sequence number of each end of a TCP session starts at 0. The acknowledgement number also starts at 0, because the conversation has not yet started and there is nothing from the other end to acknowledge.

Packet 2: The server responds to the client’s request with sequence number 0 (since this is the first packet sent by the server in this TCP session) and relative acknowledgement number 1 (indicating that the server received the SYN from the client in packet 1). Note that even though the client is not sending any valid data, the acknowledgement number is incremented by one because the received packet contains the SYN or FIN flag bit.

Packet 3: As in packet 2, the client responds to the server’s sequence number 0 with acknowledgement number 1. The packet also carries the client’s own sequence number, which has gone from 0 to 1 because the server’s packet confirmed receipt of the client’s SYN. At this point, the sequence number on both sides is 1.

Packet 4: client -> server. This is the first packet in the stream that carries valid data (specifically, the HTTP request sent by the client). The sequence number is still 1 because no data has been sent since the last packet, and the acknowledgement number remains 1 because the client has not received any data from the server. Note that the valid data in the packet is 505 bytes in length.

Packet 5: While the upper layer is processing the HTTP request, the server sends this packet to acknowledge the data the client sent in packet 4. Note that the acknowledgement number has increased by 505 (the length of the valid data in packet 4) to 506. In simple terms, the server is telling the client: so far I have received everything up to byte 505, and I expect byte 506 next. The server’s sequence number is unchanged at 1.

Packet 6: server -> client. This packet marks the start of the HTTP response returned by the server. Its sequence number is still 1, because the server has sent no valid data before this packet. The packet carries 129 bytes of valid data.

Packet 7: Having received 129 bytes of data from the server in the previous packet, the client raises its acknowledgement number from 1 to 130.

Now that we understand how sequence numbers and acknowledgement numbers work, we can see why, as noted earlier, a TCP keepalive probe uses the acknowledgement number of the previous TCP packet minus 1 and carries 1 byte: this ensures that the sequence numbers and acknowledgement numbers of the connection are not affected by keepalive. The 1 byte of 00 data in the keepalive probe is not actual data to be delivered, but a common convention of TCP keepalive.
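To make the bookkeeping concrete, the seven packets can be replayed with a small Python model. This is a simplified sketch that uses relative sequence numbers as Wireshark displays them and ignores windows, retransmission, and everything else TCP does; the class `Endpoint` and the functions `send` and `keepalive_probe` are names invented for this example.

```python
class Endpoint:
    """One side of a TCP connection, using relative sequence numbers."""

    def __init__(self, name):
        self.name = name
        self.seq = 0  # sequence number of the next byte this side will send
        self.ack = 0  # sequence number of the next byte expected from the peer


def send(sender, receiver, payload_len=0, syn=False):
    """Emit one segment as (sender, seq, ack, payload_len) and update both sides."""
    segment = (sender.name, sender.seq, sender.ack, payload_len)
    # A SYN consumes one sequence number even though it carries no data.
    sender.seq += payload_len + (1 if syn else 0)
    receiver.ack = sender.seq
    return segment


def keepalive_probe(sender):
    """Build a probe: sequence number one below the next one, 1 byte, no state change."""
    return (sender.name, sender.seq - 1, sender.ack, 1)


client, server = Endpoint("client"), Endpoint("server")
trace = [
    send(client, server, syn=True),         # packet 1: SYN
    send(server, client, syn=True),         # packet 2: SYN, ACK
    send(client, server),                   # packet 3: ACK
    send(client, server, payload_len=505),  # packet 4: HTTP request
    send(server, client),                   # packet 5: ACK
    send(server, client, payload_len=129),  # packet 6: HTTP response
    send(client, server),                   # packet 7: ACK
]
for number, segment in enumerate(trace, start=1):
    print(number, segment)
print("probe", keepalive_probe(client))
```

In this model the probe leaves both counters untouched, which mirrors the point above: because its sequence number lies one below data the peer has already acknowledged, the byte it carries is never counted as new data.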

Conclusion: WebSocket is an independent TCP-based protocol whose only relationship to HTTP is that its handshake request can be interpreted by an HTTP server as an Upgrade request. To be more precise, WebSocket is a network communication protocol. Once you understand the data frame format and handshake process described above, you can build WebSocket-based instant messaging.

