
Preface

Since the birth of the Web in 1989 and the first HTTP release (HTTP 0.9) shortly afterwards, the Internet has been developing for nearly three decades, and new concepts have kept appearing: HTTPS, WebSocket, HTTP 2.0, SPDY, and so on. Networking is a foundational computer science subject, so I expect you already know these terms to some degree. Here, I want to take another look at how they evolved, the design thinking behind them, the problems they solve, and some practical experience, to bring you back to where it all began.

Let's start by throwing out a few questions; we will come back to the answers at the end.

  1. When you open a browser and visit a web page, how does the server get its data to you safely?

  2. Why did HTTP 2.0 take nearly two decades to emerge, and what problems did it solve?

  3. What is the relationship between a socket and HTTP?

  4. What is WebSocket? Which layer does it work at?

  5. How can we make data transfer lighter and more efficient?

Disclaimer: this article was originally published on my blog four years ago and has been partially updated; please point out anything that is still out of date.

1. Network protocols

1.1 The OSI seven-layer model

OSI model: The Open Systems Interconnection model

A conceptual model proposed by the International Organization for Standardization (ISO): a standard framework that attempts to interconnect computers of all kinds worldwide.

In layman's terms, it is a protocol framework designed so that computers around the world can talk to each other; any device or piece of software that wants to join the network is expected to follow its rules. The layered design also exists to keep all the functions from being tangled together and to let different companies or organizations focus on a single layer.

We can see that at each layer, the corresponding protocol is designed according to the actual application scenario.

From the model we can see two broad halves: the upper layers face applications, and the lower layers face the network media. The application-facing layers are mainly concerned with data and how it is represented and transferred, while the media-facing layers are concerned with network communication and hardware.

1.2 TCP/IP

Derived from the OSI reference model, the TCP/IP protocol stack is the most widely used today. Note that "TCP/IP" does not refer merely to the TCP and IP protocols; it is the name of a whole, simplified version of the OSI model.

The following figure shows how it differs from the OSI model:

In the TCP/IP model, the OSI application, presentation, and session layers are merged into a single application layer. The main reason is that, in practice, these three layers are tightly coupled and hard to separate cleanly.

In fact, the current TCP/IP model is much simpler and better suited to today's application scenarios.

In addition to the protocols in the stack above, two more are worth singling out: TLS and SPDY.

1.3 TLS/SSL

TLS/SSL is one of the most important security protocols today. It does not appear in the stack listed above, but it is designed to sit between the application layer and the transport layer: it secures data before transmission without changing the application protocol itself, so application-layer protocols (HTTP, FTP, Telnet, and so on) can be built transparently on top of TLS.

We often hear of HTTPS: the S stands for Secure and refers to SSL/TLS. SSL was introduced by Netscape in 1994.

SSL (Secure Sockets Layer, 1994; the early versions are credited to Taher Elgamal and his colleagues at Netscape).

SSL ensures secure communication between clients and servers through mutual authentication, digital signatures to ensure integrity, and encryption to ensure privacy.

TLS (Transport Layer Security, 1999)

TLS 1.0 is based on the SSL 3.0 specification and behaves much the same, with only small differences; you can think of it as SSL 3.1.

Both SSL and TLS are split into two sub-protocols: the handshake protocol and the record protocol.

Various security problems have been found in SSL over the years, so today TLS is used almost exclusively.

Here’s how to establish a connection:

TLS requires reliable transport underneath, so it usually runs on top of TCP.

Note that when SSL/TLS is used, establishing a connection takes longer, because the TLS handshake adds extra round trips on top of the TCP handshake, as shown in the figure:
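As a concrete illustration of how TLS layers on top of TCP without changing the application protocol, here is a minimal Python sketch (my own, not from the original article; the host name is only an example):

```python
import socket
import ssl

HOST = "www.google.com"   # example host, replace as needed

# 1. Plain TCP connection: this is where the three-way handshake happens.
tcp_sock = socket.create_connection((HOST, 443), timeout=5)

# 2. TLS sits between the transport and application layers: wrapping the
#    existing TCP socket triggers the TLS handshake (the extra round trips).
context = ssl.create_default_context()          # verifies the CA chain and host name
tls_sock = context.wrap_socket(tcp_sock, server_hostname=HOST)

print("TLS version:", tls_sock.version())       # e.g. TLSv1.3
print("Cipher:", tls_sock.cipher())

# 3. The application layer (here raw HTTP/1.1) is unchanged; it simply writes
#    to the wrapped socket instead of the plain one.
tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: " + HOST.encode() + b"\r\nConnection: close\r\n\r\n")
print(tls_sock.recv(200))
tls_sock.close()
```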

1.4 SPDY

SPDY (pronounced "speedy") is an open networking protocol developed by Google for delivering web content, aimed at minimizing latency, speeding up the web, and improving the user experience. It is an application-layer protocol carried over TCP and is the precursor to HTTP/2; it first shipped in Chrome 6 in September 2010.

SPDY is designed to reduce page load time. Through prioritization and multiplexing, SPDY can transfer all the resources of a page (HTML, images, and so on) over a single TCP connection. TLS encryption is used almost everywhere in SPDY, and data is compressed with gzip or DEFLATE; unlike plain HTTP, the headers themselves are also compressed. In addition, instead of only waiting passively for browser requests as an HTTP server does, a SPDY server can push content proactively.

SPDY is not intended to replace HTTP; it only changes the way HTTP requests and responses travel over the network. This means only a SPDY transport layer needs to be added, with no changes to existing server applications. When carried over SPDY, HTTP requests are processed, simplified, and compressed. For example, each SPDY endpoint keeps track of the headers already sent in previous requests, so it can avoid resending headers that have not changed; the header data that still has to be sent is compressed before transmission.

The reason Google chose to change HTTP rather than TCP/IP is that changing HTTP only requires updating browsers and web servers, whereas changing TCP/IP would require updating every router, server, and client operating system.

1.5 TCP

No matter how an application-layer protocol is defined and used, underneath it data is sent and received over some transport path. So let's look at connection establishment and data transfer from the transport layer we are most familiar with.

This layer has two important protocols, TCP and UDP.

The operation of TCP can be divided into three phases: connection establishment, data transfer, and connection termination.

Connection establishment

TCP creates a connection with a three-way handshake. During connection creation, many parameters are initialized, such as sequence numbers, to ensure sequential transmission and connection robustness.

It is possible for two endpoints to open a connection to each other simultaneously, but usually one end opens a socket and listens for connections from the other; this is called a passive open. Once the server side is passively open, the client can start an active open:

  1. The client performs an active open by sending a SYN to the server as the first step of the three-way handshake. It sets the sequence number of this connection to a random value A.

  2. The server replies to a valid SYN with a SYN/ACK. The acknowledgement number is A+1, and the SYN/ACK segment carries its own randomly chosen sequence number B.

  3. Finally, the client sends an ACK back. When the server receives this ACK, the three-way handshake is complete and the connection enters the established state. This segment's sequence number is A+1 and its acknowledgement number is B+1.

A: Are you there? B: Yes, come on! A: OK!
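In code, this whole exchange is hidden inside a single connect() call: the kernel carries out the SYN, SYN/ACK, ACK exchange before the call returns. A tiny sketch (the host and port are only examples):

```python
import socket

# connect() (inside create_connection) blocks until the kernel has completed
# the SYN -> SYN/ACK -> ACK exchange, or raises if the handshake fails or times out.
sock = socket.create_connection(("example.com", 80), timeout=5)
print("handshake done, local endpoint:", sock.getsockname())
sock.close()
```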

Connection retries

If the server receives a SYN from the client and replies with a SYN-ACK, but the client then goes offline and the server never receives the final ACK, the connection is stuck in an intermediate state: neither established nor failed. So if the server hears nothing within a certain time, it resends the SYN-ACK. Under Linux, the default number of retries is 5, with the interval doubling from 1 second: the five retries are spaced 1 s, 2 s, 4 s, 8 s, and 16 s apart, for a total of 31 seconds. After the fifth retry the server still waits another 32 seconds before concluding that it, too, has timed out, so TCP gives up on the connection after 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds.

Data transfer

When a TCP segment arrives, the host identifies which session it belongs to using the IP addresses and port numbers of both ends.

TCP and UDP use port numbers to identify the sending and receiving applications; an IP address plus a port is also known as an Internet socket. Each TCP connection has a 16-bit unsigned port number assigned at each end. Ports fall into three categories: well-known, registered, and dynamic/private. Well-known port numbers are assigned by the Internet Assigned Numbers Authority (IANA) and are usually used by system-level or root processes; well-known server applications passively listen on them, for example FTP (20 and 21), SSH (22), TELNET (23), SMTP (25), HTTP over SSL/TLS (443), and HTTP (80). Registered port numbers are typically used as short-lived source ports when end users connect to servers, but they can also identify named services registered with a third party. Dynamic/private port numbers have no meaning outside the particular TCP connection that uses them. Since port numbers are 16 bits, there are 65535 usable ports (1 through 65535).

Timeout retransmission

Another important concept in TCP data transfer is retransmission on timeout. This mechanism is a big part of why TCP counts as a reliable transport.

The sender uses a conservative estimate of how long an acknowledgement should take as its retransmission timeout. If this limit is exceeded without an acknowledgement arriving, the sender retransmits the packet. The retransmission timer is reset whenever an acknowledgement is received.

The timeout itself is derived from the measured round-trip time (RTT) of the connection.
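To make the relationship between RTT and the timeout concrete, here is a small sketch of the standard estimator from RFC 6298 (my own illustration, not code from the article; real TCP stacks differ in details such as the minimum timeout):

```python
# Sketch of the classic retransmission-timeout (RTO) estimator from RFC 6298.
class RtoEstimator:
    ALPHA = 1 / 8   # weight of new samples for the smoothed RTT
    BETA = 1 / 4    # weight of new samples for the RTT variance

    def __init__(self):
        self.srtt = None    # smoothed round-trip time (seconds)
        self.rttvar = None  # round-trip time variance (seconds)
        self.rto = 1.0      # initial retransmission timeout

    def on_rtt_sample(self, rtt):
        if self.srtt is None:               # first measurement
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        # RTO = SRTT + 4 * RTTVAR, clamped to at least 1 second per the RFC.
        self.rto = max(1.0, self.srtt + 4 * self.rttvar)
        return self.rto

est = RtoEstimator()
for sample in (0.120, 0.100, 0.300, 0.110):   # made-up RTT samples in seconds
    print(f"RTT={sample:.3f}s -> RTO={est.on_rtt_sample(sample):.3f}s")
```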

Connection termination

A four-way handshake (a FIN and an ACK in each direction) is used to terminate a TCP connection, closing each direction independently.

A: Let's hang up. B: OK, wait a moment! B: Done on my side too! A: Bye~

More information on Wikipedia

1.6 UDP

Alongside reliable, connection-oriented TCP there is a connectionless, "unreliable" counterpart: the User Datagram Protocol (UDP), a simple datagram-oriented transport-layer protocol. It was designed by David P. Reed in 1980 and standardized as RFC 768.

UDP does not establish a connection, does not guarantee delivery, and has no retransmission mechanism, so it is more efficient; the trade-off, of course, is that delivery quality is not guaranteed.

In the TCP/IP model, UDP provides a thin interface above the network layer and below the application layer. It offers only unreliable delivery: once a datagram handed over by the application has been passed to the network layer, UDP keeps no copy of it (which is why it is sometimes called the "unreliable datagram protocol"). On top of an IP datagram, UDP adds only port-based multiplexing and a checksum.
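A minimal UDP sketch (my own illustration; the port number is arbitrary): the receiver just binds a port and the sender fires a datagram at it, with no handshake and no delivery guarantee.

```python
import socket

# Receiver: bind a port and wait for whatever arrives. No listen(), no accept().
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 9999))          # arbitrary example port

# Sender: no connection is established; sendto() simply hands a datagram to IP.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"hello over UDP", ("127.0.0.1", 9999))

data, addr = recv_sock.recvfrom(1024)        # would block forever if the datagram were lost
print(f"got {data!r} from {addr}")

send_sock.close()
recv_sock.close()
```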

Many key network applications use UDP, including the Domain Name System (DNS), the Simple Network Management Protocol (SNMP), the Dynamic Host Configuration Protocol (DHCP), the Routing Information Protocol (RIP), and many video and audio streaming services.

1.7 Socket

We mentioned sockets above. The word literally means an "outlet" or "jack": something you plug into in order to set up a channel for communication.

In terms of the OSI model, a socket is a concrete programming abstraction built on the transport layer (TCP/UDP). Using TCP or UDP, the two parties set up a connection according to the protocol; once it is up, we say a socket connection has been established, which is like plugging a cable into a socket.

In computer science

It usually refers to a network socket, the mechanism by which processes communicate with each other across a computer network. A network socket that uses the Internet Protocol (IP) as its basis for communication is called an Internet socket; because of the dominance of the Internet protocols, the vast majority of modern network sockets are Internet sockets.

In the operating system

It is a mechanism for inter-process communication provided by the operating system. Applications are given a set of application programming interfaces (APIs), usually called the socket API, through which they exchange data over network sockets. The earliest socket interface came from 4.2BSD, so most modern socket implementations derive from the Berkeley sockets standard. In the socket interface, an IP address plus a port forms a socket address; once a local and a remote socket address are connected, they, together with the protocol in use, form a five-tuple (the socket pair) that identifies the connection over which the two sides exchange data.

For example, TCP and UDP can use the same port number on the same machine without interfering with each other. Based on the socket address, the operating system decides which process or thread the data should be delivered to. This is similar to the telephone system, where a phone number plus an extension decides who picks up.

Socket = IP + Port

Here is a socket demo written in C for reference. Note that accept() does not itself perform the three-way handshake: the kernel completes the handshake when the client connects, and accept() simply hands the already-established connection to the application.
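For a quick feel of the socket API, here is a minimal echo server/client sketch in Python (my own illustration, not the C demo referenced above; the address and port are arbitrary examples):

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 8888   # arbitrary example address

def server():
    # Passive open: bind an address, listen, then accept incoming connections.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen()
        # accept() returns a connection whose handshake the kernel already finished.
        conn, addr = srv.accept()
        with conn:
            conn.sendall(b"echo: " + conn.recv(1024))

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                      # crude: give the server a moment to start listening

# Active open: connect() triggers the three-way handshake with the listener.
with socket.create_connection((HOST, PORT)) as cli:
    cli.sendall(b"hello socket")
    print(cli.recv(1024))            # b'echo: hello socket'
```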

2. The application layer

2.1 HTTP

One of the most commonly used protocols in the application layer is the Hypertext Transfer Protocol (HTTP).

HTTP history

– In 1965, a young man named Ted Nelson came up with the concepts of hypertext and hypermedia. (What a hero!)

– In 1989, more than two decades later, Tim Berners-Lee and his team at CERN started the WorldWideWeb project based on the hypertext concept. The World Wide Web as we know it combined hypertext with the Internet to transmit hypertext data. The first version of the protocol had a single method, GET, which let a client request an HTML page from a server.

– In 1991, the first HTTP specification, HTTP 0.9, was written down, and the world's first website went live at CERN.

– In 1996, the HTTP 1.0 standard was published (RFC 1945). The HTTP Working Group (HTTP WG), led by Dave Raggett, had expanded the protocol with richer media, security considerations, header fields, and more.

– In 1997, the HTTP 1.1 standard was developed (RFC 2068, later revised as RFC 2616 in 1999).

– In 2015, HTTP 2.0 was released (RFC 7540).

About HTTP Session

An HTTP session is a sequence of request-response exchanges over the network. The HTTP client first establishes a TCP connection to a given port on the server (usually 80), where the HTTP server is listening for requests. Based on the request it receives, the server returns a status line such as "HTTP/1.1 200 OK", followed by headers and the response body.
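You can watch this request-response exchange with nothing more than a TCP socket; a minimal sketch (my own illustration, the host is just an example):

```python
import socket

HOST = "example.com"   # example host

# One HTTP session: open a TCP connection, write a request, read the response.
with socket.create_connection((HOST, 80)) as sock:
    request = (
        f"GET / HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        f"Connection: close\r\n"
        f"\r\n"
    )
    sock.sendall(request.encode())
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

print(response.split(b"\r\n", 1)[0])   # status line, e.g. b'HTTP/1.1 200 OK'
```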

2.2 The evolution from HTTP 0.9 to 2.0

HTTP 0.9

There was only one method, GET, and request headers were not supported.

HTTP 1.0

Added request headers and, besides GET, several more methods such as POST, HEAD, and PUT.

HTTP 1.1

HTTP 1.0 only supports short-lived connections: the TCP connection is closed after each request, so a large amount of time goes into establishing connections. Another problem is that HTTP 1.0 requests behave like a queue, following a first-in, first-out (FIFO) discipline: the second request can only be sent after the response to the first has come back. This is what is known as head-of-line (HOL) blocking.

HTTP 1.1 improves on these two major issues:

  1. Persistent connections: multiple HTTP requests and responses can be sent over a single TCP connection, reducing the cost and latency of repeatedly opening and closing connections.

  2. Pipelining: a client no longer has to wait for the previous response before sending the next request. The server, however, must send its responses back in the same order in which it received the requests, so the client can match each response to its request. This noticeably reduces the total download time.

In addition, HTTP 1.1 adds mechanisms such as authentication, state management, and caching, and features like range requests make resumable downloads easy to implement.
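To see persistent connections at work, here is a small sketch (my own illustration; the host is an example) that sends two requests over the same TCP connection instead of opening a new one each time:

```python
import http.client

# http.client keeps the underlying TCP connection open between requests
# (HTTP/1.1 persistent connection), so both requests share one handshake.
conn = http.client.HTTPConnection("example.com", 80, timeout=5)

conn.request("GET", "/")                        # first request
resp1 = conn.getresponse()
resp1.read()                                    # drain the body before reusing the connection
print("first:", resp1.status, resp1.version)    # version 11 means HTTP/1.1

conn.request("GET", "/")                        # second request reuses the same connection
resp2 = conn.getresponse()
print("second:", resp2.status)

conn.close()
```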

Key Differences between HTTP/1.0 and HTTP/1.1

HTTP Pipelining

It is a technique in which multiple HTTP requests are sent out back to back, without waiting for the corresponding responses from the server.

Pipelining can noticeably improve HTML page load times, especially on high-latency connections; on broadband links the gain is smaller. It also requires support from the HTTP/1.1 server: responses must be returned in the order the requests were received, so the connection as a whole is still first-in, first-out, and head-of-line blocking can still occur and cause delays.

HTTP 2.0

HTTP 1.1 was in use for a very long time without any major overhaul; changes were limited to small optimizations and extensions of the existing protocol. It was only after Google launched SPDY that HTTP 2.0 was formally proposed, with SPDY as its starting point.

It primarily addresses the head-of-line blocking problem that had always been present in HTTP 1.0 and 1.1.

It does not break the existing HTTP semantics; it switches the wire format to binary, which is more compact and efficient than the old text-based framing. At the binary framing layer, HTTP/2 splits everything being transmitted into smaller messages and frames and encodes them in binary: what used to be the HTTP/1.x header block is carried in HEADERS frames, and the request body is carried in DATA frames.

The new binary framing mechanism changes the way data is exchanged between clients and servers. To illustrate this process, we need to understand three concepts of HTTP/2:

Data stream: a bidirectional byte stream within an established connection that can carry one or more messages.

Message: A complete sequence of frames corresponding to a logical request or response message.

Frame: The smallest unit of HTTP/2 communication. Each frame contains a frame header that at least identifies the data stream to which the current frame belongs.

The relationship between these concepts is summarized as follows:

  1. All communication happens over a single TCP connection, which can carry any number of bidirectional streams.

  2. Each stream has a unique identifier and optional priority information, and carries bidirectional messages.

  3. Each message is a logical HTTP message (a request or a response) consisting of one or more frames.

  4. A frame is the smallest unit of communication and carries a specific type of data, such as HTTP headers or the message payload. Frames from different streams can be interleaved on the wire and are reassembled using the stream identifier in each frame header.
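To make the framing concrete, here is a small sketch (my own illustration based on the frame layout in RFC 7540, not code from the article) that packs and unpacks the fixed 9-byte HTTP/2 frame header:

```python
import struct

# HTTP/2 frame header (RFC 7540): 24-bit length, 8-bit type, 8-bit flags,
# then 1 reserved bit + 31-bit stream identifier.
def pack_frame_header(length, frame_type, flags, stream_id):
    return struct.pack("!I", length)[1:] + struct.pack("!BBI", frame_type, flags, stream_id & 0x7FFFFFFF)

def unpack_frame_header(header):
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# A HEADERS frame (type 0x1) with the END_HEADERS flag (0x4) on stream 1,
# announcing a 32-byte payload of compressed headers.
hdr = pack_frame_header(32, 0x1, 0x4, 1)
print(hdr.hex())                       # the 9 bytes that go on the wire
print(unpack_frame_header(hdr))        # (32, 1, 4, 1)
```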

The key point is that streams can be prioritized: instead of processing requests strictly in order as before, the server can handle them according to their priority.

Another big improvement is server push. HTTP/2 no longer needs a request in order to send a response, breaking the strict request-response pairing: the server can push a response to the client even though the client never asked for it.

You can type the following into a terminal to see the difference between the two HTTP/1.x versions for yourself:

telnet www.google.com 80

  • GET / HTTP/1.0
  • GET / HTTP/1.1

2.3 HTTPS

As mentioned earlier, the S in HTTPS refers to the security layer that sits between the transport layer and the application layer, encrypting and encapsulating data on top of a reliable transport (such as TCP).

Detailed explanations of HTTPS are easy to find elsewhere, so I won't walk through the whole thing; just remember the following points.

First, symmetric versus asymmetric encryption. In practice, symmetric encryption is fast and can handle data of any length, while asymmetric encryption is slow and (with algorithms such as RSA) can only encrypt a small amount of data at a time. So asymmetric encryption is used only to transport the symmetric key, and the actual traffic is encrypted symmetrically.
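A small sketch of this hybrid idea (my own illustration using the third-party cryptography package; this is not the actual TLS key exchange, which nowadays usually uses ephemeral Diffie-Hellman rather than RSA key transport):

```python
# pip install cryptography
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Server side: an asymmetric key pair; the public key can be handed out freely.
server_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
server_public = server_private.public_key()

# Client side: generate a symmetric session key and send it encrypted with
# the server's public key; only the server's private key can recover it.
session_key = Fernet.generate_key()
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = server_public.encrypt(session_key, oaep)

# Server side: unwrap the session key, then both sides use fast symmetric
# encryption for the bulk of the traffic.
recovered_key = server_private.decrypt(wrapped_key, oaep)
print(Fernet(recovered_key).decrypt(Fernet(session_key).encrypt(b"hello over HTTPS-ish")))
```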

Second, the server's public key cannot simply be sent to the peer and trusted as-is, otherwise a man-in-the-middle could substitute his own key. A trusted third party must vouch for it: the Certificate Authority (CA).

The CA typically authenticates the server based on its domain name, and the CA's own public (root) certificates are shipped inside the browser or operating system, so they never need to be transmitted. This is what prevents a man in the middle from forging certificates.

Someone has written a very easy-to-understand article on this before, which you can use as a reference.

2.4 WebSocket

Speaking of application-layer protocols, another popular one is WebSocket, which the IETF standardized as RFC 6455 in 2011.

WebSocket makes it easier to exchange data between client and server and allows the server to push data to the client proactively. With the WebSocket API, the browser and the server only need to complete one handshake to create a persistent connection with bidirectional data transfer.

Why is it called WebSocket? What does it have to do with sockets?

First of all, WebSocket has little to do with the sockets described earlier; the name is mostly there to aid understanding. The emphasis is on "Web", because the protocol is switched to over an existing HTTP connection. Remember the HTTP/1.1 request headers? The client sends "Upgrade: websocket" and "Connection: Upgrade", and the server answers with "HTTP/1.1 101 Switching Protocols", after which the connection speaks the WebSocket protocol.
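A small sketch of that handshake (my own illustration following RFC 6455; the key value is the example from the RFC), showing how the server derives the Sec-WebSocket-Accept header from the client's Sec-WebSocket-Key:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing the handshake response.
WS_MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    digest = hashlib.sha1((sec_websocket_key + WS_MAGIC).encode()).digest()
    return base64.b64encode(digest).decode()

client_key = "dGhlIHNhbXBsZSBub25jZQ=="   # example key from the RFC
print("GET /chat HTTP/1.1")
print("Upgrade: websocket")
print("Connection: Upgrade")
print("Sec-WebSocket-Key:", client_key)
print()
print("HTTP/1.1 101 Switching Protocols")
print("Sec-WebSocket-Accept:", websocket_accept(client_key))   # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```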

WebSocket URLs use the ws or wss scheme, for example: ws://www.abc.com

For a practical reference, see the open-source WebSocket client and server project based on Node.js.

3. Summary

Let's answer the questions from the beginning of the article.

1. When we open a browser and visit a web page, how does it safely send data to us?

If the address is a domain name, it is first resolved to an IP address through DNS; if it is already an IP address, it is used directly. Leaving the lowest layers aside and starting from the transport layer, a TCP three-way handshake establishes the connection; with HTTPS, an encrypted channel is then set up and the session keys are negotiated. After that, requests and responses are exchanged over an application-layer protocol such as HTTP. Depending on that upper-layer protocol (HTTP, FTP, or something else), the connection may be persistent or one-off.

2. Why did HTTP 2.0 take nearly two decades to emerge, and what problems did it solve?

After HTTP 1.1 came out, the protocol saw only supplements and small adjustments, with no substantive change. Many years later, Google's engineering strength and modern network infrastructure made it possible to propose binary, stream-based transmission to solve the head-of-line blocking problem, and this eventually became the HTTP/2 standard. Today there is less room left for optimization at the application layer; the next bottleneck lies more at the transport layer, but changing the transport layer touches every existing browser, server, and operating system, so that evolution moves more slowly than people would like. Perhaps one day newer technologies will emerge.

3. What is the relationship between a socket and HTTP?

A socket is a channel. In the networking domain it is a network socket, which establishes a channel between two endpoints for transferring data. HTTP rides on top of this channel to transfer hypertext. Simply put, once a socket is set up you can transmit anything over it, whether HTTP or FTP; it depends on which application protocol you speak.

4. What is WebSocket? Which layer does it work at?

WebSocket is essentially full-duplex communication: it was created to get around HTTP's strict request-response pattern, so that both the client and the server can send data proactively. HTTP/2 later added a similar capability (server push), but WebSocket came first, so it can be seen as the predecessor of the idea.

It works on top of the transport layer, i.e. it is an application-layer protocol, and can be thought of as an upgraded flavour of HTTP: since it is reached by upgrading an HTTP connection, it carries "Web" in its name.

5. How can we make data transfer lighter and more efficient?

You can program directly against sockets at the transport layer and invent your own lighter, more efficient application-layer protocol. Operating systems expose a very standard, well-polished TCP programming interface, so development is not difficult, and you can tailor the application-layer protocol to your scenario. Picking up a book on socket network programming is a good place to start.