Preface

A network is made up of nodes and the links that connect them, and the huge network formed by interconnecting many such networks is called the Internet. HTTP (HyperText Transfer Protocol), which we are going to talk about today, is the most widely used application protocol on the Internet. Its standards were developed jointly by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF).

1. Features

The HTTP protocol has five features:

1. Client/server: HTTP supports the client/server mode.
2. Simple and fast: when a client requests a service from the server, it only needs to send the request method and path.
3. Flexible: HTTP allows any type of data object to be transferred. The type being transferred is marked by Content-Type, the header field used in an HTTP message to indicate the content type.
4. Connectionless: each connection processes only one request. The server closes the connection after it has processed the request and the client has acknowledged the response, which saves transmission time.
5. Stateless: the protocol has no memory of earlier transactions, so the server does not know the state of the client. After we send an HTTP request, the server returns data based on that request but records nothing about it afterwards. Cookies and sessions were created to fill this gap.
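To make the stateless point concrete, here is a minimal sketch of how a cookie can let a server recognise a returning client. The function name handle, the cookie name sid, and the in-memory sessions dictionary are all assumptions made for illustration; a real server would do this inside its framework.

```python
import uuid
from http.cookies import SimpleCookie


def handle(request_headers, sessions):
    """Toy request handler: counts visits per client using a session cookie.

    request_headers is a plain dict of header names to values, and
    sessions is a dict kept by the server (both hypothetical).
    """
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    sid = cookie["sid"].value if "sid" in cookie else None

    response_headers = {}
    if sid not in sessions:
        # Unknown client: HTTP itself remembers nothing, so we hand out an id.
        sid = uuid.uuid4().hex
        sessions[sid] = 0
        response_headers["Set-Cookie"] = f"sid={sid}"

    sessions[sid] += 1
    return response_headers, f"visit {sessions[sid]}"
```

The first call returns a Set-Cookie header. If the client sends that value back in its Cookie header, the second call is recognised as the same visitor; without the cookie every request looks new.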

2. TCP/IP protocol

HTTP transfers data on top of the TCP/IP protocol suite, which is a four-layer model.

From the figure above, we can clearly see that HTTP uses TCP as the transport layer, while the network layer uses IP (among other protocols), so HTTP relies on TCP/IP to transfer data. Here’s how TCP/IP works:

We can see that the sender encapsulates the data layer by layer, the receiver decapsulates it layer by layer, and finally the receiver's application layer obtains the data.
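The layer-by-layer wrapping and unwrapping can be sketched in a few lines. This is only a toy model: the bracketed strings stand in for real TCP, IP, and link-layer headers, and the names HEADERS, encapsulate, and decapsulate are made up for this example.

```python
# Toy headers, listed from the transport layer down to the link layer.
HEADERS = [b"[TCP]", b"[IP]", b"[ETH]"]


def encapsulate(app_data: bytes) -> bytes:
    """Sender side: each lower layer prepends its own header."""
    packet = app_data
    for header in HEADERS:
        packet = header + packet
    return packet


def decapsulate(packet: bytes) -> bytes:
    """Receiver side: strip the headers in the opposite order."""
    for header in reversed(HEADERS):
        if not packet.startswith(header):
            raise ValueError(f"expected header {header!r}")
        packet = packet[len(header):]
    return packet
```

Calling encapsulate(b"GET / HTTP/1.1") yields b"[ETH][IP][TCP]GET / HTTP/1.1", and decapsulate returns the original application data.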

3. Establish a TCP connection

Now that we know roughly how the TCP/IP protocol suite works, let’s take a look at how HTTP establishes connections.

1. TCP packet header information

As mentioned earlier, HTTP transfers data over the TCP/IP protocol suite, so establishing an HTTP connection really means establishing a TCP connection. To see how TCP establishes a connection, let’s first look at the structure of a TCP packet.

TCP packet = TCP header information + TCP data body. The TCP header contains 6 control bits (in the red box above) that represent the state of the TCP connection:

1. URG: urgent data; this is an urgent message.
2. ACK: confirms receipt.
3. PSH: indicates that the application on the receiving end should immediately read the data from the TCP receive buffer.
4. RST: resets the connection, requiring the peer to re-establish it.
5. SYN: requests that a connection be established.
6. FIN: requests that the connection be closed.
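As a rough illustration of where the control bits live, the sketch below builds a 20-byte TCP header without options and reads the flags back. The bit values follow the standard TCP header layout; the helper names build_header and parse_flags and the sample port numbers are assumptions for this example, and the checksum is simply left at zero.

```python
import struct

# Control bits in the low 6 bits of the 16-bit "data offset + flags" field.
FLAG_BITS = {
    "URG": 0x20,
    "ACK": 0x10,
    "PSH": 0x08,
    "RST": 0x04,
    "SYN": 0x02,
    "FIN": 0x01,
}


def build_header(src_port, dst_port, seq, ack, flags):
    """Pack a minimal TCP header (no options, checksum left as 0)."""
    offset_and_flags = (5 << 12) | flags  # data offset = 5 words = 20 bytes
    return struct.pack(
        "!HHIIHHHH",
        src_port, dst_port, seq, ack,
        offset_and_flags,
        65535,  # window size
        0,      # checksum (not computed in this sketch)
        0,      # urgent pointer
    )


def parse_flags(tcp_header: bytes):
    """Return the names of the control bits set in a TCP header."""
    (offset_and_flags,) = struct.unpack("!H", tcp_header[12:14])
    flags = offset_and_flags & 0x3F
    return [name for name, bit in FLAG_BITS.items() if flags & bit]
```

For example, a header built with flags=FLAG_BITS["SYN"] parses back to ["SYN"], and one built with SYN and ACK together parses to ["ACK", "SYN"].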

With the TCP header information, we can take a look at the three-way handshake used to establish a TCP connection.

Three-way handshake:

1. The client sends the server a packet with the flag SYN=1 and seq number=1234567. From SYN=1 the server knows that the client wants to establish a connection. (Client: I want to connect to you.)
2. The server replies with ack number=(client seq+1), SYN=1, ACK=1, and its own seq number=7654321. (Server: OK, you can connect.)
3. After receiving the packet, the client checks that the ack number is correct, that is, the seq number it sent the first time plus 1, and that the flag ACK is 1. If so, the client sends ack number=(server seq+1) with ACK=1. When the server receives the correct ack number and ACK=1, the connection is established. (Client: OK, I’m coming.)

Interviewer: Why does establishing the TCP connection behind HTTP require three handshakes, not two or four?
A: Three is the minimum needed to be reliable, because each side must have its own sequence number acknowledged by the other. With two, the server never learns whether the client received its reply; a fourth adds nothing and wastes resources.
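The sequence and acknowledgement arithmetic of the handshake above can be traced with a small simulation. This does not touch the network; the function name three_way_handshake and the dictionary layout of each packet are assumptions made for illustration.

```python
def three_way_handshake(client_seq: int, server_seq: int):
    """Simulate the three packets and the checks each side performs."""
    # Step 1: client -> server, "I want to connect".
    syn = {"SYN": 1, "seq": client_seq}

    # Step 2: server -> client, acknowledges the client's seq + 1.
    syn_ack = {"SYN": 1, "ACK": 1, "seq": server_seq, "ack": syn["seq"] + 1}

    # Client checks the acknowledgement before answering.
    if syn_ack["ACK"] != 1 or syn_ack["ack"] != client_seq + 1:
        raise ConnectionError("server acknowledged the wrong sequence number")

    # Step 3: client -> server, acknowledges the server's seq + 1.
    ack = {"ACK": 1, "seq": client_seq + 1, "ack": syn_ack["seq"] + 1}

    # Server checks the final acknowledgement.
    established = ack["ACK"] == 1 and ack["ack"] == server_seq + 1
    return [syn, syn_ack, ack], established
```

With the numbers used in the text, three_way_handshake(1234567, 7654321) produces ack numbers 1234568 and 7654322 and reports the connection as established.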

4. Client request

Once the client is connected to the server, it can start sending HTTP requests for resources.

1. HTTP request packet structure

We have said that TCP packet = TCP header + TCP data body. Having covered the TCP header, let’s now look at the TCP data body, which is our HTTP request packet.

Take a look at an actual HTTP request example:

1. ① is the request method. HTTP/1.1 defines eight request methods: GET, POST, PUT, DELETE, HEAD, OPTIONS, TRACE, and CONNECT (PATCH was added later as an extension). GET and POST are the two most common. RESTful interfaces usually use GET, POST, DELETE, and PUT.
2. ③ is the protocol name and version number.
3. ④ is the HTTP packet header. The header contains several attributes in the format attribute name: attribute value.

The message body encodes the component values of a page form into a formatted string that holds multiple request parameters as key-value pairs, in the form param1=value1&param2=value2. Request parameters can be passed not only in the message body but also in the request URL, for example "/chapter15/user.html?param1=value1&param2=value2".

Common header attributes:

User-Agent: the name and version of the operating system and browser used by the client. Some sites restrict which browsers may make requests.
Referer: the address of the previous page, indicating where the request came from.
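The parts of a request described above can be pulled apart with the standard library. The sketch below is a simplified parser, not a full HTTP implementation: the function name parse_request is hypothetical, and it assumes a well-formed request with a form-encoded body.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into request line, headers, and parameters."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Request line: method, request URL, protocol name and version.
    method, target, version = lines[0].split(" ", 2)

    # Header lines in the format "attribute name: attribute value".
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()

    url = urlsplit(target)
    return {
        "method": method,
        "path": url.path,
        "version": version,
        "headers": headers,
        "query": parse_qs(url.query),  # parameters passed in the URL
        "form": parse_qs(body),        # parameters passed in the body
    }
```

Given a POST to /chapter15/user.html?param1=value1 with the body param2=value2, the result holds param1 under "query" and param2 under "form", each as a list of values.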

5. Server response

The server processes the request and sends a response back to the client. The HTTP response packet has the same overall structure as the request packet.

1. HTTP response packet structure

2. HTTP response instance

3. Response status code

In the response message, pay special attention to the server's response status code, which often comes up in interviews. The following is only a list of categories.
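The five standard categories are determined by the first digit of the status code. A minimal sketch follows; the function name status_category is an assumption for this example, while HTTPStatus comes from the Python standard library.

```python
from http import HTTPStatus

# The first digit of a status code selects its category.
CATEGORIES = {
    1: "Informational",  # 1xx: request received, processing continues
    2: "Success",        # 2xx: request was received and accepted
    3: "Redirection",    # 3xx: further action is needed
    4: "Client error",   # 4xx: the request is wrong or cannot be fulfilled
    5: "Server error",   # 5xx: the server failed on a valid request
}


def status_category(code: int) -> str:
    """Return the category name for an HTTP status code."""
    try:
        return CATEGORIES[code // 100]
    except KeyError:
        raise ValueError(f"not a valid HTTP status code: {code}") from None


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its category."""
    return f"{code} {HTTPStatus(code).phrase} ({status_category(code)})"
```

For instance, describe(404) gives "404 Not Found (Client error)".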

6. Disconnect

After the server responds, a session ends. Will the connection be disconnected?

1. Short and long connections

Whether the connection is closed depends on the HTTP version:

(1) In HTTP/1.0, once the client and the server complete a request/response, the TCP connection is closed, and the next request must establish a new TCP connection. This is known as a short connection.
(2) HTTP/1.1 was released in January 1997, less than a year after HTTP/1.0 (May 1996). It added a feature that keeps the TCP connection open after a request/response has completed, so the next request reuses the same TCP connection instead of establishing a new one through another handshake. This is known as a long connection.

A long connection is a TCP connection that carries multiple HTTP exchanges. HTTP itself is always request/response and each exchange ends when the response arrives, so HTTP has no long connection of its own; "long" describes the underlying TCP connection. A long connection is requested with the header Connection: keep-alive, and HTTP/1.1 keeps connections open by default.

Advantages: when a site has many static resources (pictures, CSS, JS, and so on), enabling long connections lets several of them be sent over a single TCP connection.
Disadvantages: if the client makes no further requests while the server keeps the long connection open, the resources it holds are wasted.
So whether to enable long connections, and how long to keep them open, should be set according to the needs of the website itself.

PS: Do not underestimate the cost of this TCP connection. Within a complete client HTTP request (DNS lookup, establishing the TCP connection, sending the request, waiting, parsing the page, closing the TCP connection), establishing the TCP connection takes a large share of the time.
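Connection reuse can be observed locally with Python's standard library. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface and sends two requests over one client connection. The handler class, the function name, and the paths /a.css and /b.js are all made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the response ends,
        # which is required for reusing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Return the client's local port seen for each of two requests."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    local_ports = []
    try:
        for path in ("/a.css", "/b.js"):
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before sending the next request
            local_ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return local_ports
```

Both entries in the returned list are the same port number, which shows that the second request travelled over the same TCP connection and needed no new handshake.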

3. Disconnection process

Establishing a TCP connection takes a three-way handshake, while closing it takes a four-way wave.

PS: HTTP also has two major disadvantages: it transmits data in plaintext and it does not guarantee integrity, so it is gradually being replaced by HTTPS.