This article is adapted, with revisions, from a technical article by “TheAlchemist”. Thanks to the original author; the link to the original is at the end of the article.


1. Preface



WebSocket in Detail (4): Getting to the Bottom of the Relationship Between HTTP and WebSocket (Part 1)


2. Series of articles



This is the fifth article in the series, which is outlined as follows:





  • WebSocket in Detail (1): A Preliminary Understanding of WebSocket Technology
  • WebSocket in Detail (2): Technical Principles, Code Demonstrations and Application Cases
  • WebSocket in Detail (3): In-depth Details of the WebSocket Communication Protocol
  • WebSocket in Detail (4): Getting to the Bottom of the Relationship Between HTTP and WebSocket (Part 1)
  • WebSocket in Detail (5): Getting to the Bottom of the Relationship Between HTTP and WebSocket (Part 2) (this article)
  • WebSocket in Detail (6): Probing the Relationship Between WebSocket and Socket
  • WebSocket Protocol and Socket.IO Open Source Project


3. More information



For a beginner's introduction to Web instant messaging, see:



Beginner's notes: how instant messaging works on the Web






For an inventory of instant messaging technologies on the Web, see:



Web-side IM technology inventory: short polling, Comet, WebSocket, SSE






About Ajax short polling:









For details on Comet technology, see:



Comet in detail: Web-side real-time communication based on HTTP long connections



Web-side IM: HTTP long connections and long polling (detailed explanation)



Web-side IM: instant messaging can also be done without WebSocket



Open source Comet server iComet: a Web-side instant messaging solution that supports millions of concurrent messages






For more information about WebSocket, see:



Quick start for beginners: WebSocket concise tutorial



WebSocket in Detail (1): A Preliminary Understanding of WebSocket Technology



WebSocket in Detail (2): Technical Principles, Code Demonstrations and Application Cases



WebSocket in Detail (3): In-depth Details of the WebSocket Communication Protocol



Socket.IO: a WebSocket framework for instant messaging on the Web



What is the relationship between Socket.IO and WebSocket? What is the difference?






For a detailed introduction to SSE, see:



SSE technology details: a new HTML5 server push event technology






For more Web-side IM articles, see:



www.52im.net/forum.php?m…


4. WebSocket protocol










RFC 6455: The WebSocket Protocol






WebSocket in Detail (1): A Preliminary Understanding of WebSocket Technology
WebSocket in Detail (2): Technical Principles, Code Demonstrations and Application Cases
WebSocket in Detail (3): In-depth Details of the WebSocket Communication Protocol


5. Why WebSocket instead of HTTP















To achieve two-way communication on the Web side, the following solutions are generally used:





  • 1) Polling: wastes both network and server resources, and is not truly real-time;
  • 2) Long polling: the client sends a Request with a long timeout, and the server holds the connection open and returns a Response when new data arrives. Compared with 1), it consumes less network bandwidth but is otherwise similar;
  • 3) Long connection: this term is often used ambiguously. Here it means an HTTP long connection (1). If you instead use a Socket to set up a TCP long connection (2), then long connection (2) is essentially the same thing as WebSocket; in fact, a TCP long connection is the foundation of WebSocket. An HTTP long connection, however, is in essence still a Request/Response message exchange, so it still suffers from wasted resources and poor real-time performance.













For how instant messaging was traditionally implemented on the Web before the advent of WebSocket, see the following articles:



Comet in detail: Web-side real-time communication based on HTTP long connections



Web-side IM: HTTP long connections and long polling (detailed explanation)



Web-side IM: instant messaging can also be done without WebSocket


6. WebSocket protocol basics














6.1 Handshake





GET /chat HTTP/1.1            //1
Host: server.example.com   //2
Upgrade: websocket            //3
Connection: Upgrade            //4
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==            //5
Origin: http://example.com            //6
Sec-WebSocket-Protocol: chat, superchat            //7
Sec-WebSocket-Version: 13            //8



As you can see, the first two lines are the same as the start of an ordinary HTTP Request. What really matters in the WS handshake are the following header fields:





  • 1) Upgrade: the HTTP/1.1 header field used to request a protocol switch. It tells the server that, if the server supports it, the client wants to switch to another application-layer protocol (here, WebSocket) over the connection already established at the transport layer (here, the TCP connection).
  • 2) Connection: HTTP/1.1 specifies that Upgrade applies only to the immediate ("direct") connection. Therefore, an HTTP/1.1 message that carries an Upgrade header must also carry a Connection header. Any intermediary that receives the message (usually a proxy server) processes the fields named in Connection before forwarding the message, and does not forward the Upgrade field. If the client and server are connected through a proxy, the client first sends a CONNECT message to establish a direct connection, and only then sends the handshake message.
  • 3) Sec-WebSocket-*: line 7 lists the subprotocols supported by the client (more about subprotocols below), line 8 gives the version of the WS protocol the client is using, and line 5 carries a key to the server (the server uses this field to compute another key value, which it returns to the client in the handshake response).
  • 4) Origin: a security measure against cross-site attacks; browsers generally set this field to identify the origin of the page that opened the connection.









RFC 2616 specifies that the server may agree to switch only if the protocol being switched to is “better than HTTP/1.1”. The server's handshake Response looks like this:


HTTP/1.1 101 Switching Protocols //1
Upgrade: websocket //2
Connection: Upgrade //3
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo= //4
Sec-WebSocket-Protocol: chat //5


6.2 The WebSocket protocol URI



By default, the ws protocol uses port 80 and the wss protocol uses port 443:


ws-URI = "ws:" "//" host [ ":" port ] path [ "?" query ]
wss-URI = "wss:" "//" host [ ":" port ] path [ "?" query ]

host = <host, defined in [RFC3986], Section 3.2.2>
port = <port, defined in [RFC3986], Section 3.2.3>
path = <path-abempty, defined in [RFC3986], Section 3.3>
query = <query, defined in [RFC3986], Section 3.4>
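The grammar above can be illustrated with a short Python sketch that splits a WS URI into its parts and applies the default ports. The function name `parse_ws_uri` and the `WsTarget` tuple are hypothetical names chosen for this example. Rejecting fragments follows RFC 6455, which does not allow them in WebSocket URIs.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Default ports stated above: 80 for ws, 443 for wss.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


class WsTarget(NamedTuple):
    host: str
    port: int
    resource: str  # path plus optional "?query", as sent in the request line
    secure: bool


def parse_ws_uri(uri: str) -> WsTarget:
    """Split a ws:// or wss:// URI following the grammar above (a sketch)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URI: {uri!r}")
    if "#" in uri:
        raise ValueError("WebSocket URIs must not contain a fragment")
    if not parts.hostname:
        raise ValueError("WebSocket URI has no host")
    # Fall back to the scheme's default port when none is given.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    # An empty path is requested as "/".
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    return WsTarget(parts.hostname, port, resource, scheme == "wss")
```

For instance, `parse_ws_uri("ws://example.com/chat")` yields host `example.com`, port 80 and resource `/chat`.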






6.3 What the client does before sending the handshake









The process from establishing a connection to sending a handshake message looks something like this:





  • 1) The client checks whether the given URI is valid.
  • 2) If a connection to the same IP address is already in the CONNECTING state, the client must wait until that connection succeeds or fails before establishing a new one. If the client is behind a proxy and cannot determine the remote IP address, each host name is treated as a separate destination. In addition, the client should limit the number of connections in the CONNECTING state. Note 1: this helps prevent some DDoS attacks. Note 2: the client does not limit the number of simultaneous connections in the established state; however, if a client holds a large number of established connections, the server may reject new connections from that client.
  • 3) If the client is in a proxy environment, it first asks its proxy to establish a TCP connection to the destination address. For example, to connect to port 80 of example.com, it might send the following message:
      CONNECT example.com:80 HTTP/1.1
      Host: example.com
    If the client is not in a proxy environment, it first establishes a direct TCP connection to the destination address.
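As a small illustration of step 3, the CONNECT message from the example can be assembled as below. This is only a sketch: `build_connect_request` is a hypothetical helper, and a real client would also read the proxy's reply and handle proxy authentication.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT message a client sends to its proxy (a sketch)."""
    lines = [
        f"CONNECT {host}:{port} HTTP/1.1",  # ask the proxy for a TCP tunnel
        f"Host: {host}",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Once the proxy confirms the tunnel, the client sends the WebSocket handshake through it as if the connection were direct.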








6.4 Requirements for the client handshake message



Requirements for client handshake information:





  • The handshake must be a valid HTTP Request message as defined in RFC 2616.
  • The method of the Request message must be GET, and the HTTP version must be at least 1.1. For example, the WS URI ws://example.com/chat corresponds to the request line GET /chat HTTP/1.1.
  • The resource named in the Request-URI part of the Request message (a concept from RFC 2616) must match the resource named in the WS URI.
  • The Request message must contain the Host header field, whose value must match the host given in the WS URI.
  • The Request message must contain the Upgrade header field, whose value must include the websocket keyword.
  • The Request message must contain the Connection header field, whose value must include the Upgrade token.
  • The Request message must contain the Sec-WebSocket-Key header field, whose value is a randomly generated 16-byte value encoded in Base64.
  • If the client is a browser, the Request message must contain the Origin header field; see RFC 6454 for details.
  • The Request message must contain the Sec-WebSocket-Version header field; the version number defined by this protocol is 13.
  • The Request message may contain the Sec-WebSocket-Protocol header field, whose meaning is described above.
  • The Request message may contain the Sec-WebSocket-Extensions header field, which the client and server can use to negotiate extended functionality.
  • The Request message may contain any other valid header field defined in RFC 2616.
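The requirements above can be sketched as a small Python function that assembles a handshake Request. The name `build_handshake_request` and its parameters are assumptions made for this example. It covers only the mandatory fields plus the optional Origin and Sec-WebSocket-Protocol fields.

```python
import base64
import os


def build_handshake_request(host, resource, origin=None, subprotocols=()):
    """Return (request_bytes, key) for a WebSocket handshake (a sketch)."""
    # Sec-WebSocket-Key: 16 random bytes, Base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {resource} HTTP/1.1",  # method GET, HTTP version 1.1
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    if origin is not None:  # browsers always send Origin
        lines.append(f"Origin: {origin}")
    if subprotocols:  # optional list of subprotocols the client supports
        lines.append("Sec-WebSocket-Protocol: " + ", ".join(subprotocols))
    request = "\r\n".join(lines) + "\r\n\r\n"
    return request.encode("ascii"), key
```

The key is returned alongside the request because the client needs it later to check the server's Sec-WebSocket-Accept value.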


6.5 What the client does after receiving the handshake Response



After receiving the Response handshake message:





  • If the status code is not 101, handle the response according to RFC 2616. If it is 101, go on to parse the header fields. Header field names are case-insensitive.
  • Check that the Upgrade header contains websocket.
  • Check that there is a Connection header and that its value contains Upgrade.
  • Check that there is a Sec-WebSocket-Accept header; how its value is computed is described below.
  • If there is a Sec-WebSocket-Extensions header, check that the extension it names was offered in the Request handshake. If not, the connection fails.
  • If there is a Sec-WebSocket-Protocol header, check that the subprotocol it names was offered in the Request handshake. If not, the connection fails.
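The checks above might look roughly like this in Python. This is a sketch under stated assumptions: `check_handshake_response` is a hypothetical function, the headers are assumed to be already parsed into a plain dict, and `expected_accept` is assumed to have been computed by the caller from the key it sent. Comparing the Sec-WebSocket-Accept value, rather than only checking that the header exists, follows RFC 6455.

```python
def _tokens(value):
    """Split a comma-separated header value into lowercase tokens."""
    return {t.strip().lower() for t in value.split(",") if t.strip()}


def check_handshake_response(status, headers, expected_accept,
                             offered_protocols=(), offered_extensions=()):
    """Validate a handshake Response; return the chosen subprotocol or None."""
    if status != 101:
        raise ValueError(f"status {status}: handle as an ordinary HTTP response")
    # Header field names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    if "websocket" not in _tokens(h.get("upgrade", "")):
        raise ValueError("Upgrade header missing or does not contain 'websocket'")
    if "upgrade" not in _tokens(h.get("connection", "")):
        raise ValueError("Connection header missing or does not contain 'Upgrade'")
    if h.get("sec-websocket-accept", "").strip() != expected_accept:
        raise ValueError("Sec-WebSocket-Accept missing or unexpected")
    # Every extension the server names must have been offered by the client.
    for item in h.get("sec-websocket-extensions", "").split(","):
        name = item.split(";")[0].strip()  # drop extension parameters
        if name and name not in offered_extensions:
            raise ValueError(f"extension was not offered: {name}")
    # The subprotocol, if present, must be one the client offered.
    protocol = h.get("sec-websocket-protocol")
    if protocol is not None:
        protocol = protocol.strip()
        if protocol not in offered_protocols:
            raise ValueError(f"subprotocol was not offered: {protocol}")
    return protocol
```

Raising an exception on any failed check mirrors the rule that a failed check fails the whole connection.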


6.6 Concepts of Servers








6.7 What the server does after receiving a connection request from a client




















6.8 The successful handshake Response sent by the server



The handshake message is a standard HTTP Response message, and it contains the following parts:





  • The status line (as defined in RFC 2616), with status code 101;
  • The Upgrade header field, whose value is websocket;
  • The Connection header field, whose value contains Upgrade;
  • The Sec-WebSocket-Accept header field, whose value is generated in the following steps: a. append the string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (a UUID) to the value of Sec-WebSocket-Key; b. compute the SHA-1 hash of the string from step a; c. Base64-encode the hash from step b;
  • The Sec-WebSocket-Protocol header field (optional);
  • The Sec-WebSocket-Extensions header field (optional).
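The three steps for generating Sec-WebSocket-Accept can be written out as a short Python sketch. The function name `compute_accept` is an assumption of this example. The GUID is written in the uppercase form given in RFC 6455, which matters because it is concatenated as a literal string.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; it must be used exactly as written.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from a Sec-WebSocket-Key."""
    # a. append the GUID to the key sent by the client
    joined = sec_websocket_key.strip() + WS_GUID
    # b. take the SHA-1 hash of that string (raw 20-byte digest)
    digest = hashlib.sha1(joined.encode("ascii")).digest()
    # c. Base64-encode the digest
    return base64.b64encode(digest).decode("ascii")
```

With the key from the sample handshake in section 6.1, `compute_accept("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`, the value shown in the sample Response.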








6.9 Some extensions to WebSocket








6.10 Sending Data














6.11 Frame types









The frame types defined in RFC6455 are as follows:





  • 1) Opcode == 0 Continuation: indicates that this frame is a continuation frame and must be appended to the previously received frame to form a complete message. Because of this, non-control frames must be received in the same order in which they were sent.
  • 2) Opcode == 1 Text frame.
  • 3) Opcode == 2 Binary frame.
  • 4) Opcode == 3-7 future use (non-control frames).
  • 5) Opcode == 8 Close (control frame): this frame may carry content indicating the reason for closing the connection. Either party sends this frame to close the WebSocket connection, and a party that receives it must send a close frame in response if it has not already sent one. If both parties send this frame at the same time, each has then both sent and received a close frame, and both should treat the WebSocket connection as closed. Ideally, the server closes the TCP connection once the WebSocket closure is confirmed and the client waits for the server to do so, but the client may close the TCP connection itself in certain circumstances.
  • 6) Opcode == 9 Ping (control frame): similar to a heartbeat. When a party receives a Ping, it should send a Pong in response as soon as possible.
  • 7) Opcode == 10 Pong (control frame): if a party receives a Pong without having sent a Ping, it is not required to reply. The content of a Pong frame should be the same as that of the Ping it answers. If a party receives several Pings before it has responded, it only needs to respond to the most recent one.
  • 8) Opcode == 11-15 future use (control frame).
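The opcode table above maps naturally onto an enumeration. The sketch below uses hypothetical names (`Opcode`, `describe_opcode`) and relies on the fact, visible in the list, that control frames are exactly the opcodes 8 to 15, which are the values with the highest bit of the 4-bit field set.

```python
from enum import IntEnum


class Opcode(IntEnum):
    """Frame types defined in RFC 6455 (reserved values are omitted)."""
    CONTINUATION = 0x0
    TEXT = 0x1
    BINARY = 0x2
    CLOSE = 0x8
    PING = 0x9
    PONG = 0xA


def describe_opcode(opcode: int) -> str:
    """Name an opcode and say whether it is a control frame."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode is a 4-bit field (0-15)")
    # Opcodes 8-15 have the top bit set and are control frames.
    kind = "control" if opcode & 0x8 else "non-control"
    try:
        name = Opcode(opcode).name
    except ValueError:
        name = "RESERVED"  # 3-7 and 11-15 are reserved for future use
    return f"{name} ({kind})"
```

A frame parser can use the same bit test to decide whether a frame may be fragmented, since only non-control frames use continuation frames.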


6.12 Frame format















6.13 Summary



With all that said, the relationship between HTTP and WebSocket can be easily understood from the following figure:






7. Comparison with HTTP








7.1 Similarities


  • Both are application-layer protocols based on TCP;
  • Both use the Request/Response model to establish a connection;
  • Errors during connection establishment are handled in the same way, and WS may return the same status codes as HTTP at this stage;
  • Both can transfer data over the network.


7.2 Differences


  • WS uses HTTP to establish connections, but defines a new set of header fields that are not used in HTTP;
  • A WS connection cannot be forwarded by an intermediary; it must be a direct connection;
  • After the WS connection is established, either party can send data to the other at any time;
  • After the WS connection is established, data is transmitted in frames, and Request messages are no longer needed;
  • WS data frames are ordered.



www.jianshu.com/p/f666da1b1…


Appendix: Other Web-side IM technology articles



Quick start for beginners: WebSocket concise tutorial



Socket.IO message push: practice and ideas



LinkedIn's Web-side instant messaging practice: hundreds of thousands of long connections on a single machine



The development of Web-side instant messaging technology, and WebSocket and Socket.IO in practice



Instant Messaging Security on the Web: Cross-site WebSocket hijacking (with sample code)



Open source framework Pomelo in practice: building a high-performance distributed Web-side IM chat server



Using WebSocket and SSE to push Web messages



Detailed explanation of the evolution of Web-side communication: from Ajax and JSONP to SSE and WebSocket



Why does the mobile IMSDK-Web network layer framework use Socket.IO instead of Netty?



More of the same…