This is day 8 of my participation in the More Text Challenge.

This article summarizes my study notes on the HTTP protocol. If you spot any shortcomings, corrections are welcome.

Network layering

When data is transferred from host A to host B, it may go through the following process:

As you can see, this is a complex process, which is why the concept of network layering exists. The following explanation is excerpted from Baidu Encyclopedia:

Network layering assigns the tasks a network node must perform, such as sending or forwarding data, packing or unpacking it, and loading or removing control information, to separate hardware and software modules. This simplifies the otherwise complex problems of communication and network interconnection.

To reduce the complexity of the network, the different aspects of network communication are broken into a multi-layer structure. Each layer interacts only with the layer directly above or below it, so a layer's software can be modified or even replaced without affecting the other layers, as long as the interfaces between layers remain the same.

Two protocol stacks

Layered networking is described by two main protocol stacks:

  • OSI
  • TCP/IP protocol suite

Layer-by-layer processing of an HTTP request

What happens when we enter the domain name of the page we want to visit into the browser address bar and press enter?

Network connections are made to IP addresses, not domain names, so the browser first has to resolve the domain name to an IP address. It checks whether a cache already holds DNS information for the domain; if so, the IP address is obtained directly. If not, it looks in the local hosts file. If the domain is not there either, it sends a DNS request to obtain the server's IP address.
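The lookup order described above can be sketched in Python. This is only an illustration under stated assumptions: the `resolve` function, the dictionaries standing in for the cache and the hosts file, and the `dns_query` callback are all hypothetical names, and real browsers and operating systems layer several caches on top of each other.

```python
from typing import Callable, Dict, Optional


def resolve(domain: str,
            cache: Dict[str, str],
            hosts: Dict[str, str],
            dns_query: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the IP address for `domain`: cache first, then hosts, then DNS."""
    if domain in cache:            # 1. a cached answer is used directly
        return cache[domain]
    if domain in hosts:            # 2. entries from the local hosts file
        return hosts[domain]
    ip = dns_query(domain)         # 3. fall back to a real DNS request
    if ip is not None:
        cache[domain] = ip         # assumption: only DNS answers are cached
    return ip


# Hypothetical data standing in for a real hosts file and DNS server.
cache: Dict[str, str] = {}
hosts = {"localhost": "127.0.0.1"}
fake_dns = {"example.test": "192.0.2.10"}.get

print(resolve("localhost", cache, hosts, fake_dns))
print(resolve("example.test", cache, hosts, fake_dns))
```

After the second call the answer sits in `cache`, so a repeated lookup of the same domain would not reach `dns_query` again.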

The process of obtaining the server IP address is as follows:

  1. The application layer constructs a DNS request packet
  2. The transport layer's UDP protocol is invoked: a UDP header is added to the DNS request packet, and the result is passed to the network layer
  3. The network layer adds an IP header to the UDP packet and passes the IP packet to the data link layer
  4. The data link layer adds its own MAC header, including the MAC address of the next hop, and hands the frame to the physical layer. The next hop is usually a router, which is a layer 3 device:
    • The router's link layer checks whether the frame's destination MAC address is its own. If so, it strips the MAC header and forwards the packet upward to the network layer
    • The network layer looks up the next router the data should be sent to, then passes it to the carrier's router through the carrier's network interface
    • If the computer is configured to use the carrier's DNS server, the request goes directly to that server, which finds the IP address for the domain name; the answer then returns layer by layer
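Step 1, constructing the DNS request packet at the application layer, can be sketched as follows. This is a minimal illustration that builds only a standard query for an A record; the function name `build_dns_query` and the fixed query ID are assumptions made for the example, and the UDP, IP, and MAC headers from steps 2 to 4 are added by the operating system, not by this code.

```python
import struct


def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build the bytes of a DNS query asking for the A record of `domain`."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # one question, zero answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends it.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


packet = build_dns_query("example.com")
print(len(packet), packet.hex())
```

To actually send it, the bytes would be handed to a UDP socket (`socket.SOCK_DGRAM`) addressed to port 53 of the configured DNS server, which is where the transport layer in step 2 takes over.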

After receiving the IP address, the application layer sends the HTTP request packet:

  1. The transport layer's TCP protocol is invoked, adding the TCP header
  2. The network layer's IP protocol is invoked, adding the IP header
  3. The data link layer is invoked, adding the MAC header
  4. The data is transmitted through the physical layer and routers. This time the packet already carries the destination IP address, so the carrier's DNS server does not need to be consulted
  5. In the server's network environment, the packet is again processed layer by layer
  6. The physical layer passes it to the data link layer
  7. The link layer determines whether the frame is addressed to itself, strips the MAC header, and passes the data to the network layer
  8. The network layer determines whether the IP address is its own, strips the IP header, and passes the data to the transport layer
  9. The transport layer reads the TCP port, such as 80, and passes the request to the application layer
  10. The application layer parses the request, constructs an HTTP response packet, and returns it to the client layer by layer
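The wrapping on the client and the unwrapping on the server can be illustrated with a toy model. The bracketed tags below are placeholders invented for this sketch, not real header formats; the point is only the order in which headers are added and then removed.

```python
# Order in which headers are added on the sending side (steps 1 to 3).
LAYERS = ["TCP", "IP", "MAC"]


def encapsulate(payload: bytes) -> bytes:
    """Wrap the application data with one placeholder header per layer."""
    frame = payload
    for layer in LAYERS:
        frame = f"[{layer}]".encode() + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, as the server does in steps 6 to 9."""
    for layer in reversed(LAYERS):
        tag = f"[{layer}]".encode()
        if not frame.startswith(tag):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(tag):]
    return frame


request = b"GET / HTTP/1.1\r\nHost: example.test\r\n\r\n"
frame = encapsulate(request)
print(frame[:14])                      # b'[MAC][IP][TCP]'
print(decapsulate(frame) == request)   # True
```

The outermost header is the one added last, which is why the receiving side sees the MAC header first and the HTTP data last.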

The HTTP protocol

Hypertext Transfer Protocol (HTTP) is a stateless, request/response protocol that uses extensible semantics and self-describing message formats to interact flexibly with web-based hypertext information systems. As for what "stateless" means, here is an explanation from Zhihu:

Each HTTP request is completely independent and contains all the data needed to process it, so sending a request does not involve any state change. HTTP/2, by contrast, should arguably be considered a stateful protocol (it has handshakes and GOAWAY messages, and flow control similar to TCP's), so it is more accurate to say “HTTP 1.x is a stateless protocol” than “HTTP is a stateless protocol”.

HTTP Message Format

HTTP request packets and response packets have the same structure and consist of three parts:

  • Start line
    • Describes the basic information about the request or response
    • Request: GET /api/app/warehouseArea HTTP/1.1
    • Response: HTTP/1.1 200 OK
  • Header field collection
    • Uses key-value pairs to describe the message in more detail, for example, Connection: keep-alive
    • Field names are case insensitive and may not contain spaces. Hyphens (-) are allowed, while underscores (_) are often rejected by servers in practice. The field name must be followed immediately by a colon with no space in between, and the field value after the colon may be preceded by multiple spaces
    • The order of the fields can be arbitrary
    • In principle, fields cannot be repeated unless the semantics of the field itself allow it, as with Set-Cookie
  • Message body (entity)
    • The actual transmitted data; it need not be plain text and can be binary data such as pictures and videos
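The three-part structure can be made concrete with a small parser. This is a sketch, not a complete HTTP implementation: the function name `parse_http_message` is made up for the example, and it ignores details such as chunked bodies. Headers are kept as a list of pairs so that repeated fields like Set-Cookie survive.

```python
from typing import List, Tuple


def parse_http_message(raw: bytes) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Split a raw HTTP message into start line, header fields, and body."""
    # A blank line (CRLF CRLF) separates the head from the message body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Names are case insensitive; spaces before the value are ignored.
        headers.append((name.lower(), value.strip()))
    return start_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection:   keep-alive\r\n"
    b"Set-Cookie: a=1\r\n"
    b"Set-Cookie: b=2\r\n"
    b"\r\n"
    b"\x00\x01binary"
)
start_line, headers, body = parse_http_message(raw)
print(start_line)   # HTTP/1.1 200 OK
print(headers)
print(body)         # b'\x00\x01binary'
```

The body is returned as bytes and never decoded, which matches the point that it may carry binary data.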

The complete process of an HTTP request

Transmission Control Protocol

A connection-oriented, reliable, byte-stream-based transport-layer communication protocol.

Characteristics

  • Connection-based: a connection must be established before data is transmitted
  • Full-duplex: data can be transmitted in both directions
  • Byte stream: the data size is not limited; data is packaged into segments, delivery in order is ensured, and duplicate packets are automatically discarded
  • Traffic buffering: resolves mismatches between the processing capabilities of the two sides
  • Reliable transmission service: delivery is guaranteed, with reliability achieved through retransmission when packets are lost
  • Congestion control: prevents the network from becoming severely congested
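The byte-stream behavior (ordered delivery, duplicates discarded) can be sketched with a simplified receiver. This is a toy under stated assumptions: `reassemble` is a hypothetical helper, sequence numbers start at 0 and count bytes, and real TCP additionally handles overlapping segments, sequence number wraparound, and retransmission timers.

```python
from typing import Dict, Iterable, Tuple


def reassemble(segments: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the in-order byte stream from (sequence number, data) segments."""
    received: Dict[int, bytes] = {}
    for seq, data in segments:
        if seq not in received:      # a repeated segment is simply dropped
            received[seq] = data
    stream = b""
    for seq in sorted(received):
        if seq != len(stream):       # gap: later bytes wait for a retransmission
            break
        stream += received[seq]
    return stream


# Segments arrive out of order, and one of them arrives twice.
print(reassemble([(5, b"world"), (0, b"hello"), (5, b"world")]))  # b'helloworld'
# A missing segment holds back everything after the gap.
print(reassemble([(0, b"ab"), (4, b"ef")]))                        # b'ab'
```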

TCP Connection Management

TCP connection: identified by the four-tuple [source address, source port, destination address, destination port]

Establishing a connection: the TCP three-way handshake

  • The three-way handshake means that establishing a TCP connection takes a total of three packets exchanged between the client and the server
  • Exchanges the initial sequence numbers (ISN)
  • Negotiates TCP communication parameters (MSS, window information, checksum algorithm)

TCP packet

How the handshake works (three-way handshake)

  • SYN_SENT indicates that the client has requested a connection. When you want to access a service on another computer, you first send a synchronization (SYN) signal to its port; at that point the state is SYN_SENT, and once the connection succeeds it becomes ESTABLISHED
  • TCB stands for Transmission Control Block, the data structure in which TCP keeps the state of a connection
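The three packets and the state changes can be traced with a small simulation. It does not open a real connection; the function name and the example sequence numbers are assumptions for illustration, and sequence number wraparound is ignored.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, Dict[str, object], str, str]


def three_way_handshake(client_isn: int, server_isn: int) -> List[Step]:
    """Return (sender, packet, client state, server state) for each packet sent."""
    # 1. The client sends SYN carrying its initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn}
    # 2. The server answers with SYN+ACK: its own ISN, acknowledging client_isn + 1.
    syn_ack = {"flags": "SYN,ACK", "seq": server_isn, "ack": client_isn + 1}
    # 3. The client acknowledges server_isn + 1; the server becomes
    #    ESTABLISHED once this ACK arrives.
    ack = {"flags": "ACK", "seq": client_isn + 1, "ack": server_isn + 1}
    return [
        ("client", syn, "SYN_SENT", "LISTEN"),
        ("server", syn_ack, "SYN_SENT", "SYN_RCVD"),
        ("client", ack, "ESTABLISHED", "ESTABLISHED"),
    ]


for sender, packet, client_state, server_state in three_way_handshake(100, 300):
    print(sender, packet, client_state, server_state)
```

Each acknowledgment number is the other side's sequence number plus one, which is how both ends confirm that they received the peer's ISN.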

TCP four-way wave (closing the connection)

A: Sends a FIN packet, which means that A has no more data to send

B: After receiving it, replies with an ACK so that A does not resend the FIN

B: After it finishes processing its data, sends its own FIN to close the connection

A: After receiving B's FIN, sends an ACK; when B receives this ACK, B releases the connection


MSL stands for Maximum Segment Lifetime. It is the maximum time any packet may exist on the network; a packet that exceeds this time is discarded. A waits for 2MSL before releasing the connection, for two reasons:

  1. If A's final ACK is lost, B will resend its FIN; waiting lets A receive the resent FIN and acknowledge it again
  2. It prevents delayed packets from the old connection from corrupting data on a newly established connection
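The four packets and the 2MSL wait can be summarized in a short sketch. The state names follow the standard TCP state machine; the MSL value of 60 seconds is an assumption chosen for illustration, since the actual value depends on the operating system.

```python
from typing import List, Tuple

MSL_SECONDS = 60  # assumed value for illustration only


def four_way_wave() -> List[Tuple[str, str, str, str]]:
    """Return (sender, flag, state of A, state of B) once each packet is delivered."""
    return [
        ("A", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),  # A has no more data to send
        ("B", "ACK", "FIN_WAIT_2", "CLOSE_WAIT"),  # B confirms A's FIN
        ("B", "FIN", "TIME_WAIT", "LAST_ACK"),     # B is done sending as well
        ("A", "ACK", "TIME_WAIT", "CLOSED"),       # B releases the connection
    ]


def time_wait_seconds(msl: int = MSL_SECONDS) -> int:
    """A stays in TIME_WAIT for 2 * MSL before it moves to CLOSED."""
    return 2 * msl


for step in four_way_wave():
    print(step)
print("A waits", time_wait_seconds(), "seconds before closing")
```

Note that B reaches CLOSED as soon as the last ACK arrives, while A is the side that keeps the connection in TIME_WAIT.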


Because HTTP is a “plaintext” protocol, the whole transmission process is completely transparent. Anyone on the link can intercept, modify, or forge request and response packets, so the data cannot be trusted. This is why the HTTPS protocol exists: with HTTPS, all HTTP requests and responses are encrypted before being sent over the network.

One More Thing

It is often said that request header field names may not use the underscore _, but underscores are in fact valid and conform to the HTTP standard. Servers reject them by default because of a historical CGI legacy: both underscores and hyphens are mapped to underscores in CGI system variable names, which can cause confusion. In nginx, underscores in field names can be allowed by setting underscores_in_headers on.
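The collision is easy to reproduce. The sketch below applies the CGI naming convention (an HTTP_ prefix, upper case, hyphens replaced by underscores) to two header names; the helper name `cgi_variable_name` and the example headers are made up for the illustration.

```python
def cgi_variable_name(header_name: str) -> str:
    """Map an HTTP header field name to its CGI environment variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")


print(cgi_variable_name("X-Auth-User"))  # HTTP_X_AUTH_USER
print(cgi_variable_name("X_Auth_User"))  # HTTP_X_AUTH_USER
```

Two different header names end up as the same variable, so an application behind the server cannot tell which one the client actually sent.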