Preface

I haven’t posted in two months, and I’ve seen people asking, “Where’s that web-crawler guy Quinn?” So here I am.

I plan to go to Beijing this summer, and I want to be technically well prepared before I go, so I have been studying recently. My way of learning is simple: read documentation, read source code, and build wheels. I think building your own wheel is the fastest and most effective way to make progress.

Some time ago, I needed to crawl some data over WebSocket. The articles I found online all used the websocket-client library. But my project is asynchronous, and I wanted WebSocket reads to be asynchronous too, so I searched GitHub and found the websockets library. After using it and reading its source, I found that websockets was still not quite the library I had in mind. So I decided to develop an async WebSocket client myself.

This time I will share some WebSocket protocol knowledge and introduce my open source library, aiowebsocket.

WebSocket protocol and knowledge

WebSocket is a protocol for full-duplex communication over a single TCP connection. The WebSocket protocol was standardized by the IETF in 2011 as RFC 6455, which is supplemented by RFC 7936. The WebSocket API is also a W3C standard.

WebSocket makes it easier to exchange data between the client and the server, allowing the server to actively push data to the client. In the WebSocket API, the browser and server only need to complete a handshake to create a persistent connection and two-way data transfer.

Why WebSocket

In the past, many websites used polling to implement push. With polling, the browser makes an HTTP request to the server at a fixed interval (for example, every second), and the server returns the latest data to the browser. This traditional pattern has an obvious disadvantage: the browser has to keep sending requests to the server. HTTP requests may carry long headers while only a small portion of the data is actually useful, which wastes a lot of bandwidth and other resources. A newer polling technique is Comet. It allows two-way communication but still requires repeated requests, and the long-lived connections that are common in Comet also consume server resources. Against this background, HTML5 defined the WebSocket protocol, which saves server resources and bandwidth and allows communication that is closer to real time.

What are the advantages of WebSocket

In short: low overhead, better real-time behavior, good binary support, extensibility, and better compression.

  • Less control overhead. Once the connection is established, the frame headers used for protocol control are relatively small when the server and client exchange data. Without extensions, the header is only 2 to 10 bytes (depending on payload length) for server-to-client frames; client-to-server frames need an additional 4-byte masking key. This is significantly less overhead than HTTP requests that carry a full set of headers every time.
  • More real-time. Because the protocol is full-duplex, the server can proactively send data to the client at any time. Compared with HTTP, where the server can only respond after the client initiates a request, latency is significantly lower; even compared with Comet-style long polling, more data can be delivered in a shorter time.
  • Connection state is kept. Unlike HTTP, WebSocket has to create a connection first, which makes it a stateful protocol: later messages can omit some state information. HTTP requests, on the other hand, may need to carry state information (such as authentication) with every request.
  • Better binary support. WebSocket defines binary frames, making it easier to handle binary content than with HTTP.
  • Extension support. WebSocket defines extensions, so users can extend the protocol and implement custom subprotocols. For example, some browsers support compression.
  • Better compression. With appropriate extension support, WebSocket can reuse the context of previous content, which can significantly improve the compression ratio over HTTP compression when similar data is sent repeatedly.
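To make the "2 to 10 bytes, plus 4 for the mask" figure concrete, here is a minimal sketch that computes the header size of a single frame without extensions. The function name `frame_header_size` is made up for this illustration and is not part of any library.

```python
def frame_header_size(payload_len: int, masked: bool) -> int:
    """Return the header size in bytes of one WebSocket frame (no extensions)."""
    if payload_len < 0:
        raise ValueError("payload length cannot be negative")
    size = 2  # byte 1: FIN/RSV/opcode, byte 2: MASK bit + 7-bit length
    if payload_len > 0xFFFF:
        size += 8  # 64-bit extended payload length
    elif payload_len > 125:
        size += 2  # 16-bit extended payload length
    if masked:
        size += 4  # masking key, present on client-to-server frames
    return size


if __name__ == "__main__":
    for length in (5, 126, 65536):
        print(length, frame_header_size(length, False), frame_header_size(length, True))
```

So a short server-to-client message costs 2 bytes of header, and the largest client-to-server frame costs 14.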

What’s the handshake?

WebSocket is a standalone protocol built on top of TCP.

WebSocket uses the 101 status code of HTTP/1.1 for its handshake.

To create a WebSocket connection, the browser sends a request and the server responds; this process is usually called the “handshake.”

WebSocket protocol specification

WebSocket is a communication protocol with a formal specification. The protocol standard is RFC 6455, which can be found on tools.ietf.org.

The specification consists of 14 sections, including the protocol background and introduction, the handshake, the design philosophy, terminology, client and server requirements, masking, and closing the connection.

The interaction between the two ends

The interaction flow between the client and server is as follows:

  • The client initiates a handshake request.
  • The server returns a response after receiving the request.
  • The connection is established.

So, the first problem to solve is the handshake problem.

Handshake – client

The protocol says the following about the handshake:

The opening handshake is intended to be compatible with HTTP-based server-side software and intermediaries, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server. To this end, the WebSocket client’s handshake is an HTTP Upgrade request:

    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13

In compliance with [RFC2616], header fields in the handshake may be sent by the client in any order, so the order in which different header fields are received is not significant.

The WebSocket handshake uses HTTP rather than WebSocket itself. The request sent during the handshake can be called an upgrade request. During the handshake phase, the client sends:

Upgrade: websocket
Connection: Upgrade

The Connection and Upgrade header fields tell the server to switch the communication protocol to WebSocket. Sec-WebSocket-Version and Sec-WebSocket-Protocol indicate the protocol version and the subprotocol. Sec-WebSocket-Key acts as a safeguard against unintended connections (because the value of the key is completely controlled by the client, it gives the server no way to authenticate the client). The other header fields have the same meaning as in HTTP.
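As a rough sketch of the client side, the upgrade request can be assembled by hand as below. RFC 6455 requires the Sec-WebSocket-Key to be a Base64-encoded random 16-byte value; the function name `build_handshake` and its parameters are assumptions made for this example, not the API of any library mentioned in this article.

```python
import base64
import os
from typing import Optional, Tuple


def build_handshake(host: str, path: str = "/",
                    origin: Optional[str] = None) -> Tuple[bytes, str]:
    """Build an HTTP Upgrade request; return (request bytes, key that was sent)."""
    # The key is a random 16-byte nonce, Base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    if origin is not None:
        lines.append(f"Origin: {origin}")
    # HTTP header lines end with CRLF; an empty line ends the header block.
    request = ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
    return request, key


if __name__ == "__main__":
    request, key = build_handshake("server.example.com", "/chat", "http://example.com")
    print(request.decode("ascii"))
```

The key is returned alongside the request so that the caller can later check the server's Sec-WebSocket-Accept value against it.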

Handshake – server side

So far, the client has sent an HTTP request saying that it wants to shake hands. The server must verify the request before the handshake succeeds; once it does, the connection is established and two-way communication can begin. The server then replies to the client: “Hello, little brother, there’s no mole here, the connection is made!”

What does the server need to reply with?

Status Code: 101 Web Socket Protocol Handshake
Sec-WebSocket-Accept: T5ar3gbl3rZJcRmEmBT8vxKjdDo=
Upgrade: websocket
Connection: Upgrade

First, the server returns status code 101 to indicate that it has understood the client’s request, and it replies with Connection and Upgrade to indicate that the protocol has been switched to WebSocket. Sec-WebSocket-Accept is not an encrypted value: the server appends a fixed GUID to the client’s Sec-WebSocket-Key, takes the SHA-1 hash, and Base64-encodes the result, which confirms that the server understood the WebSocket request.

In this way, the client and server complete the handshake operation, and reach an agreement to use the WebSocket protocol to communicate.
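The Sec-WebSocket-Accept value can be reproduced in a few lines. This sketch follows the derivation in RFC 6455 (SHA-1 over the key plus a fixed GUID, then Base64); the function name `compute_accept` is chosen for this example only.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key from the RFC 6455 handshake example.
    print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can run the same computation on the key it sent and compare the result with the header in the response; if the two differ, the handshake should be treated as failed.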

Back and forth – data exchange

After a successful handshake and agreement on the protocol, the two parties can send messages to each other. How are their messages sent? Is it:

client: Hello, server boy

server: Hello, client girl

Is it the same as how we send messages in WeChat and QQ?

That’s what we see, but it’s not what the data looks like in transit. The protocol also has rules for transmission:

In the WebSocket Protocol, data is transmitted using a sequence of frames. To avoid confusing network intermediaries (such as intercepting proxies) and for security reasons that are further discussed in Section 10.3, a client MUST mask all frames that it sends to the server (see Section 5.3 for further details). (Note that masking is done whether or not the WebSocket Protocol is running over TLS.) The server MUST close the connection upon receiving a frame that is not masked. In this case, a server MAY send a Close frame with a status code of 1002 (protocol error) as defined in Section 7.4.1. A server MUST NOT mask any frames that it sends to the client. A client MUST close a connection if it detects a masked frame. In this case, it MAY use the status code 1002 (protocol error) as defined in Section 7.4.1. (These rules might be relaxed in a future specification.)

The base framing protocol defines a frame type with an opcode, a payload length, and designated locations for “Extension data” and “Application data”, which together define the “Payload data”. Certain bits and opcodes are reserved for future expansion of the protocol.

The format of the data frame is shown below:

A frame consists of FIN, RSV1, RSV2, RSV3, opcode, MASK, Payload Length, Masking-key, and payload-data. Their meanings and functions are as follows:

1. FIN: 1 bit

    0: not the last fragment of the message
    1: the last fragment of the message

2. RSV1, RSV2, RSV3: 1 bit each

In general, they are all 0. When the client and server negotiate a WebSocket extension, these three flag bits can be non-zero, and the meaning of the values is defined by the extension. If a non-zero value is present and no WebSocket extension is in use, the connection fails.

3. Opcode: 4 bits

    %x0: a continuation frame. An opcode of 0 means the message is fragmented and the received frame is one of its fragments.
    %x1: a text frame
    %x2: a binary frame
    %x3-7: reserved for non-control frames to be defined later
    %x8: connection close
    %x9: a heartbeat request (ping)
    %xA: a heartbeat response (pong)
    %xB-F: reserved for control frames to be defined later
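FIN, the three RSV bits, and the opcode all live in the first byte of a frame. Here is a minimal sketch of pulling them apart with bit masks; the `Opcode` enum and the `decode_first_byte` function are names invented for this example.

```python
from enum import IntEnum
from typing import Tuple


class Opcode(IntEnum):
    """Opcodes defined by RFC 6455; every other value is reserved."""
    CONTINUATION = 0x0
    TEXT = 0x1
    BINARY = 0x2
    CLOSE = 0x8
    PING = 0x9
    PONG = 0xA


def decode_first_byte(byte: int) -> Tuple[bool, int, Opcode]:
    """Split the first frame byte into (FIN, RSV bits, opcode)."""
    fin = bool(byte & 0x80)       # highest bit
    rsv = (byte >> 4) & 0x07      # RSV1..RSV3 packed into three bits
    opcode = Opcode(byte & 0x0F)  # raises ValueError for reserved opcodes
    return fin, rsv, opcode


if __name__ == "__main__":
    print(decode_first_byte(0x81))  # final text frame
    print(decode_first_byte(0x89))  # ping
```

Treating reserved opcodes as an error matches the rule that undefined values must fail the connection unless an extension has given them a meaning.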

4. Mask: 1 bit

Indicates whether the payload data is masked with an XOR operation.

    0: no
    1: yes

5. Payload length: 7 bits, (7 + 16) bits, or (7 + 64) bits

Represents the length of the payload data.

    0 to 125: the payload length is equal to this value.
    126: the next 2 bytes are a 16-bit unsigned integer whose value is the payload length.
    127: the next 8 bytes are a 64-bit unsigned integer (highest bit 0) whose value is the payload length.
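The three length encodings can be sketched as follows. The function produces the second header byte (MASK bit plus 7-bit length) followed by any extended length bytes in network byte order; `encode_length` is a hypothetical helper name used only here.

```python
import struct


def encode_length(length: int, masked: bool) -> bytes:
    """Encode the MASK bit and payload length part of a frame header."""
    if length < 0 or length >= 1 << 63:
        raise ValueError("payload length out of range")
    mask_bit = 0x80 if masked else 0x00
    if length <= 125:
        # The length fits directly into the 7-bit field.
        return bytes([mask_bit | length])
    if length <= 0xFFFF:
        # 126 signals a 16-bit unsigned length in the next 2 bytes.
        return bytes([mask_bit | 126]) + struct.pack("!H", length)
    # 127 signals a 64-bit unsigned length in the next 8 bytes.
    return bytes([mask_bit | 127]) + struct.pack("!Q", length)


if __name__ == "__main__":
    for n in (5, 256, 65536):
        print(n, encode_length(n, masked=False).hex())
```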

6. Masking-key: 0 or 4 bytes

    If Mask is 1, the frame carries a 4-byte Masking-key.
    If Mask is 0, there is no Masking-key.
    Masking algorithm: a cyclic XOR, byte by byte. Take the index of each payload byte modulo 4 to select the corresponding Masking-key byte x, then XOR the payload byte with x to obtain the real byte.

Note: Masking is not intended to prevent data leaks, but to prevent proxy cache poisoning attacks and other problems that existed in earlier versions of the protocol.
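The masking algorithm is short enough to write out in full. The sketch below XORs each payload byte with the key byte at `index % 4`; since XOR is its own inverse, the same function both masks and unmasks. The name `apply_mask` is an assumption for this example, and the sample bytes come from the masked "Hello" frame in RFC 6455.

```python
def apply_mask(data: bytes, key: bytes) -> bytes:
    """Mask or unmask payload data with a 4-byte masking key."""
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    # Byte i of the payload is XORed with byte (i mod 4) of the key.
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))


if __name__ == "__main__":
    key = bytes.fromhex("37fa213d")
    masked = apply_mask(b"Hello", key)
    print(masked.hex())             # 7f9f4d5158
    print(apply_mask(masked, key))  # b'Hello'
```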

7. Payload data

After receiving a frame, each end can process it or extract information from it according to the values of the fields described above.
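Putting the fields together, a single frame can be decoded roughly like this. It is only a sketch: it assumes the whole frame is already in one buffer and ignores fragmentation, extensions, and control-frame rules. `Frame` and `parse_frame` are names made up for the example; the sample bytes are the unmasked and masked "Hello" text frames from RFC 6455.

```python
import struct
from typing import NamedTuple


class Frame(NamedTuple):
    fin: bool
    opcode: int
    payload: bytes


def parse_frame(data: bytes) -> Frame:
    """Parse one complete WebSocket frame from a byte buffer."""
    if len(data) < 2:
        raise ValueError("incomplete frame header")
    first, second = data[0], data[1]
    fin = bool(first & 0x80)
    opcode = first & 0x0F
    masked = bool(second & 0x80)
    length = second & 0x7F
    offset = 2
    if length == 126:    # 16-bit extended length follows
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:  # 64-bit extended length follows
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    key = b""
    if masked:           # 4-byte masking key precedes the payload
        key = data[offset:offset + 4]
        offset += 4
    payload = data[offset:offset + length]
    if len(payload) != length or (masked and len(key) != 4):
        raise ValueError("incomplete frame")
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return Frame(fin, opcode, payload)


if __name__ == "__main__":
    print(parse_frame(bytes.fromhex("810548656c6c6f")))
    print(parse_frame(bytes.fromhex("818537fa213d7f9f4d5158")))
```

Both calls yield a final text frame (opcode 1) carrying `b'Hello'`; the second one had to be unmasked first.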

Mask

Note that data sent from the client to the server must be masked, while data sent from the server to the client must not be masked. If the server receives data that has not been masked, it must close the connection. When Mask is 1, the Masking-key field carries the key that is used to mask the payload data. Every frame sent by the client to the server has Mask set to 1.
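On the server side, this rule could be enforced along the following lines: inspect the MASK bit of an incoming frame and, if it is not set, prepare a Close frame with status code 1002 (protocol error) before dropping the connection. This is a sketch under simplifying assumptions; `build_close_frame` and `check_client_frame` are hypothetical helpers.

```python
import struct
from typing import Optional

PROTOCOL_ERROR = 1002  # close status code defined in RFC 6455, section 7.4.1


def build_close_frame(code: int, reason: str = "") -> bytes:
    """Build an unmasked (server-to-client) Close frame."""
    payload = struct.pack("!H", code) + reason.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("control frame payload is limited to 125 bytes")
    # 0x88 = FIN set + opcode 0x8 (close); the server does not set the MASK bit.
    return bytes([0x88, len(payload)]) + payload


def check_client_frame(header: bytes) -> Optional[bytes]:
    """Return None for a masked frame, else the Close frame to send back."""
    masked = bool(header[1] & 0x80)
    return None if masked else build_close_frame(PROTOCOL_ERROR)


if __name__ == "__main__":
    print(check_client_frame(bytes.fromhex("8185")))  # masked: accepted
    print(check_client_frame(bytes.fromhex("8105")))  # unmasked: close frame
```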

Stay connected

The WebSocket protocol is two-way communication, so once the connection is established, does it never drop?

In principle yes, but the server cannot keep every connection open forever, so it usually sends a Ping frame to the client at regular intervals, and the client replies with a Pong frame. If the client does not respond, the server closes the connection.

An opcode of 0x09 means the frame is a Ping; 0x0A means it is a Pong.
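A client answers a Ping by sending back a Pong that carries the same payload, masked like every other client frame. The sketch below builds such a frame; `build_client_pong` is a made-up name, and the optional `key` parameter exists only so the result is reproducible (normally the key is random).

```python
import os
from typing import Optional


def build_client_pong(ping_payload: bytes, key: Optional[bytes] = None) -> bytes:
    """Build a masked Pong frame that echoes the payload of a received Ping."""
    if len(ping_payload) > 125:
        raise ValueError("control frame payload is limited to 125 bytes")
    if key is None:
        key = os.urandom(4)  # a fresh random masking key per frame
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(ping_payload))
    # 0x8A = FIN set + opcode 0xA (pong); 0x80 sets the MASK bit.
    return bytes([0x8A, 0x80 | len(ping_payload)]) + key + masked


if __name__ == "__main__":
    pong = build_client_pong(b"Hello", key=bytes.fromhex("37fa213d"))
    print(pong.hex())  # 8a8537fa213d7f9f4d5158
```

With that fixed key, the output matches the masked Pong example given in RFC 6455.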

WebSocket protocol learning summary

The WebSocket specification is written in a fairly standard way and is easy to read and understand. As long as you follow the protocol, you can achieve stable connections and data transmission.

aiowebsocket design

Based on what I learned about the protocol, I wrote an open source asynchronous WebSocket library, aiowebsocket. Its file structure and class design are shown in the following figure:

aiowebsocket

aiowebsocket is a WebSocket client that is faster, lighter, and more flexible than other libraries of the same type. It is based on asyncio and is as easy to use as the websocket-client and websockets libraries. It is the result of seven days spent studying WebSocket and the Streams section of the Python documentation.

Installation and use

Install: Like other libraries, you can install it via pip: pip install aiowebsocket, or clone it locally from GitHub.

Use: The WebSocket protocol scheme is abbreviated ws. Just as HTTP has HTTPS, ws has a more secure counterpart, wss. The difference in usage is minor: SSL is simply turned on when the connection is created.

ws protocol example code:

import asyncio
import logging
from datetime import datetime
from aiowebsocket.converses import AioWebSocket


async def startup(uri):
    async with AioWebSocket(uri) as aws:
        converse = aws.manipulator
        message = b'AioWebSocket - Async WebSocket Client'
        while True:
            await converse.send(message)
            print('{time}-Client send: {message}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), message=message))
            mes = await converse.receive()
            print('{time}-Client receive: {rec}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), rec=mes))


if __name__ == '__main__':
    remote = 'ws://echo.websocket.org'
    try:
        asyncio.get_event_loop().run_until_complete(startup(remote))
    except KeyboardInterrupt as exc:
        logging.info('Quit.')

When run, you get the following result:

2019-03-04 15:11:25-Client send: b'AioWebSocket - Async WebSocket Client'
2019-03-04 15:11:25-Client receive: b'AioWebSocket - Async WebSocket Client'
2019-03-04 15:11:25-Client send: b'AioWebSocket - Async WebSocket Client'
2019-03-04 15:11:25-Client receive: b'AioWebSocket - Async WebSocket Client'

This shows that the client has connected to the server and is communicating with it.

wss protocol example code:

# Enable SSL
import asyncio
import logging
from datetime import datetime
from aiowebsocket.converses import AioWebSocket


async def startup(uri):
    async with AioWebSocket(uri, ssl=True) as aws:
        converse = aws.manipulator
        message = b'AioWebSocket - Async WebSocket Client'
        while True:
            await converse.send(message)
            print('{time}-Client send: {message}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), message=message))
            mes = await converse.receive()
            print('{time}-Client receive: {rec}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), rec=mes))


if __name__ == '__main__':
    remote = 'wss://echo.websocket.org'
    try:
        asyncio.get_event_loop().run_until_complete(startup(remote))
    except KeyboardInterrupt as exc:
        logging.info('Quit.')

The results are similar to those above. aiowebsocket also allows custom headers, which is useful when connecting to sites that verify header fields such as Origin, User-Agent, and Host:

import asyncio
import logging
from datetime import datetime
from aiowebsocket.converses import AioWebSocket


async def startup(uri, header):
    async with AioWebSocket(uri, headers=header) as aws:
        converse = aws.manipulator
        message = b'AioWebSocket - Async WebSocket Client'
        while True:
            await converse.send(message)
            print('{time}-Client send: {message}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), message=message))
            mes = await converse.receive()
            print('{time}-Client receive: {rec}'
                  .format(time=datetime.now().strftime('%Y-%m-%d %H:%M:%S'), rec=mes))


if __name__ == '__main__':
    remote = 'ws://123.207.167.163:9010/ajaxchattest'
    # Custom handshake headers; this endpoint checks the Origin header field.
    header = [
        'GET /ajaxchattest HTTP/1.1',
        'Connection: Upgrade',
        'Host: 123.207.167.163:9010',
        'Origin: http://coolaf.com',
        'Sec-WebSocket-Key: RmDgZzaqqvC4hGlWBsEmwQ==',
        'Sec-WebSocket-Version: 13',
        'Upgrade: websocket',
    ]
    try:
        asyncio.get_event_loop().run_until_complete(startup(remote, header))
    except KeyboardInterrupt as exc:
        logging.info('Quit.')


ws://123.207.167.163:9010/ajaxchattest is a free, open endpoint for connection testing. It checks the Origin header field during the handshake and refuses the client connection if the field does not meet its requirements.

The project’s GitHub address is

https://github.com/asyncins/aiowebsocket

Stars are welcome, and it would be even better if you could offer advice or report bugs.