HTTP is a communication protocol between a client and a server. The HTTP server listens on a port (80 by default for HTTP, 443 for HTTPS). When an HTTP client initiates a request, it connects to that port, and the HTTP server returns a status code and content.

Start by typing in a web address

When we type a web address into the browser and press Enter, the browser displays the page. With a good Internet connection this may take about a second, but what happens in that second?

This article tries to record the detailed process of a complete Web request: starting from the URL the user enters in the browser, it covers how the browser finds the server address and initiates the request, how the request passes through the reverse proxy server, and finally, after the server has processed the request, how the browser renders the response page.

How a Web request works can be summarized as follows:

  1. The browser uses DNS to resolve the domain name into an IP address.
  2. Using the IP address, it locates the corresponding server on the Internet and establishes a socket connection.
  3. The client sends an HTTP request message to the server, asking for a resource document on the server.
  4. The server side often involves complex business logic: there may be multiple servers, and a load balancing device is needed to decide which server handles each request and to distribute the requests from all users evenly.
  5. The requested data may be stored in a distributed cache, in a static file, or in a database, and the server has to fetch it from the right place.
  6. When the data is returned, the browser parses it and, on finding references to static resources (such as CSS, JS or images), issues further requests. These resources may be hosted on a CDN, in which case the CDN server handles those requests.
  7. The client disconnects from the server. The client interprets the HTML document and renders the graphical result on the screen.
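
Steps 1 to 3 of the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `HelloHandler`, `fetch` and `demo` are hypothetical names, and a throwaway local server on 127.0.0.1 stands in for the real website so the example does not depend on the Internet.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler standing in for the real web application."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host, port, path="/"):
    # Step 1: resolve the host name to an IP address.
    ip = socket.gethostbyname(host)
    # Step 2: open a TCP socket connection to that address.
    with socket.create_connection((ip, port), timeout=5) as sock:
        # Step 3: send an HTTP request message asking for a resource.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Read until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


def demo():
    # Port 0 asks the operating system to pick any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("localhost", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the status line and the HTML body. A real browser does far more (caching, TLS, redirects, rendering), so this covers only the network skeleton of the request.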

An HTTP transaction is implemented in this way. It looks simple, but the underlying mechanics are quite complex. Note that in HTTP/1.0, communication between the client and server is non-persistent by default: once the server has sent its reply, it closes the connection and waits for the next request.

Starting with HTTP/1.1, however, connections are persistent by default: the server can keep the connection to the client open after a request completes, and whether it actually does so depends on how the server is configured.
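
As a rough illustration of a long connection, the sketch below sends two requests over a single `http.client.HTTPConnection` to a local HTTP/1.1 server. The handler name `KeepAliveHandler` and the idea of echoing the client's source port are assumptions made for this example; if both responses report the same port, the same TCP connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets http.server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Echo the client's source port so connection reuse becomes visible.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def two_requests_one_connection():
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        ports = []
        for _ in range(2):
            conn.request("GET", "/")
            # Reading the whole body frees the connection for the next request.
            ports.append(conn.getresponse().read())
        return ports
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

With a server that answers in HTTP/1.0 mode and no keep-alive header, the client would have to open a new connection for the second request, and the two ports would normally differ.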

DNS domain name resolution

Let’s start with what happens first: DNS domain name resolution, which simply translates a domain name into an IP address. For example, the domain name www.test.com might be translated to the IP address 192.168.1.1 (an illustrative value only).

If you type an IP address directly into your browser, this step is skipped. Otherwise, the following steps are carried out:

1. Check browser cache

The browser first searches its own DNS cache. Entries are cached only briefly (about 1 minute) and the cache holds a limited number of entries (about 1000); the exact values vary by browser. The browser checks whether the cache contains an entry for the domain name and whether that entry has not yet expired.
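
A cache with a short lifetime and a fixed capacity can be sketched as follows. This is not how any particular browser implements it; `DnsCache`, its parameters, and the oldest-first eviction policy are assumptions chosen to mirror the numbers mentioned above.

```python
import time
from collections import OrderedDict


class DnsCache:
    """Toy DNS cache: entries expire after ttl_seconds, oldest entry is evicted first."""

    def __init__(self, max_entries=1000, ttl_seconds=60.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # host -> (ip, expires_at)

    def put(self, host, ip):
        self._entries.pop(host, None)  # re-inserting moves the host to the end
        self._entries[host] = (ip, self.clock() + self.ttl_seconds)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # drop the oldest entry

    def get(self, host):
        entry = self._entries.get(host)
        if entry is None:
            return None  # cache miss: fall through to the next resolution step
        ip, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[host]  # expired entries count as misses
            return None
        return ip
```

A `None` result corresponds to the case where the browser has to continue with the operating system cache described next.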

2. Operating system cache check + hosts resolution

If the browser does not find the domain name in its own cache, the operating system performs its own domain name resolution. The browser looks for a result in the operating system’s DNS cache; if an entry exists and has not expired, the search stops there.

On Linux, this can be configured through the /etc/hosts file, which can map any domain name to any IP address. If you specify an IP address for a domain name there, the browser will use that IP address first. When a domain name is resolved from this configuration file, the operating system caches the result; how long it is kept depends on the domain name’s expiration time and on the cache space available.
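
The hosts file format is simple enough to parse by hand. The function below is a minimal sketch, not the operating system’s resolver: `parse_hosts` is a hypothetical name, and the rule that the first mapping for a name wins is an assumption of this example.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: ip} dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()  # one IP followed by one or more names
        for name in names:
            # Assumption: keep the first mapping seen for each name.
            mapping.setdefault(name.lower(), ip)
    return mapping


SAMPLE = """
# sample entries in /etc/hosts format
127.0.0.1    localhost
192.168.1.1  www.test.com test.com   # illustrative address
"""

hosts = parse_hosts(SAMPLE)
```

Here `hosts["www.test.com"]` is `"192.168.1.1"`, so a lookup for that name would be answered locally without contacting any DNS server.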

3. Local DNS Server resolution

If no corresponding entry is found in the hosts file, the browser makes a system call to the operating system’s resolver, which sends a domain name resolution request to the locally configured preferred DNS server. This is a recursive query sent over UDP to port 53 of the DNS server: the carrier’s DNS server is responsible for returning the IP address of the domain name to us.
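
To make the UDP request more concrete, here is a sketch that builds the bytes of a standard DNS query for an A record with the “recursion desired” flag set. The function name `build_dns_query` and the fixed query ID are assumptions for illustration; the packet is only constructed, not sent.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


packet = build_dns_query("www.test.com")
# Sending it would look roughly like this (dns_server_ip is a placeholder):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.sendto(packet, (dns_server_ip, 53))
```

The query is small enough to fit in one UDP datagram, which is one reason ordinary lookups use UDP rather than a full TCP connection.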

We all have a “DNS server address” in our network configuration, and this address is used when the previous two steps fail to resolve the name. The operating system sends the domain name to the LDNS configured there, that is, the local domain name server.

This DNS server usually provides the DNS resolution service for your local Internet access. For example, if you access the Internet at school, your DNS server is usually at your school. If you access the Internet from a residential community, the DNS server belongs to the Internet service provider that gives you access, such as China Telecom or China Unicom. About 80% of domain name resolution is completed at this point, so the LDNS carries most of the domain name resolution work.

4. Root DNS server resolution (Root Server)

If the LDNS does not find a corresponding entry, the carrier’s DNS server starts an iterative DNS resolution on behalf of our browser. It first looks up the IP address of a root DNS server and sends the request there. The root DNS server then returns to the local DNS server the address of the top-level domain server (gTLD Server) responsible for the domain being queried.

5. Top-level domain server (gTLD Server)

The local domain name server (LDNS) sends the request to the gTLD Server returned in the previous step.

The gTLD Server that receives the request looks up and returns the address of the Name Server that is responsible for the domain name. This Name Server is usually the one belonging to the provider with which you registered the domain name, so the final resolution is performed by the domain name provider’s server.

The Name Server looks up its mapping table between domain names and IP addresses. Normally it finds the destination IP address record for the domain name and returns it, together with a TTL value, to the local DNS server (LDNS).
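
The chain of referrals in steps 4 and 5 can be imitated with plain dictionaries. Everything here is made up for illustration: the server names, the tables, and the TTL of 300 seconds are assumptions, and no real DNS traffic is involved.

```python
# Toy data standing in for the DNS hierarchy (all names and values are invented).
ROOT_SERVER = {"com": "gtld-com"}                          # TLD -> gTLD server
GTLD_SERVERS = {"gtld-com": {"test.com": "ns.test.com"}}   # domain -> Name Server
NAME_SERVERS = {"ns.test.com": {"www.test.com": ("192.168.1.1", 300)}}


def resolve_iteratively(hostname):
    """Follow root -> gTLD -> Name Server, as the local DNS server would."""
    labels = hostname.rstrip(".").lower().split(".")
    name = ".".join(labels)
    tld = labels[-1]
    domain = ".".join(labels[-2:])
    steps = []

    gtld = ROOT_SERVER[tld]  # the root server refers us to the gTLD server
    steps.append(f"root server -> ask {gtld}")

    name_server = GTLD_SERVERS[gtld][domain]  # the gTLD server names the Name Server
    steps.append(f"{gtld} -> ask {name_server}")

    ip, ttl = NAME_SERVERS[name_server][name]  # the Name Server holds the record
    steps.append(f"{name_server} -> {ip} (TTL {ttl}s)")
    return ip, ttl, steps
```

The local DNS server would then cache the answer for the duration of the TTL and pass it back to the operating system and the browser.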

The following figure summarizes the DNS resolution process described above:

TCP three-way handshake

After obtaining the IP address corresponding to the domain name, the User-Agent (usually a browser) opens a TCP connection to the Web server program, using a random local port (1024 < port < 65535) as its source port.
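
The random source port can be observed directly. The sketch below, with the hypothetical function name `client_source_port`, connects to a listener on 127.0.0.1 and compares the port the client was given with the port the server sees.

```python
import socket


def client_source_port():
    """Return (port chosen for the client, client port seen by the server, server port)."""
    # Port 0 lets the operating system choose a free listening port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        server_port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", server_port), timeout=5) as client:
            conn, peer = listener.accept()
            with conn:
                # getsockname() is the client's own (ip, port) for this connection.
                return client.getsockname()[1], peer[1], server_port
```

The exact range the operating system picks from is configurable, so the only safe expectation is that the port lies above the well-known range.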

ARP (Address Resolution Protocol) obtains a physical address (MAC address) from an IP address. The network layer uses IP addresses, so after a packet has been routed across several networks to the destination network, the last router knows only the destination IP address, not the hardware address of the target host. To send the data frame on the actual network link, however, the destination hardware address is required, so ARP is used to look up the physical address of the host that owns that IP address.

The connection request (the original HTTP request is encapsulated layer by layer according to the four-layer TCP/IP model) travels through various routing devices (unless client and server are on the same LAN) and arrives at the server. It enters through the NIC and is handed to the kernel’s TCP/IP stack, which identifies the connection request and unpacks the packet layer by layer. It may also pass through the Netfilter firewall (a kernel module) before finally reaching the Web application, at which point the TCP/IP connection is established.

  1. The Client sends a connection request segment. SYN = 1 marks the segment as a connection request (or, combined with ACK, a connection accept), and a SYN segment normally carries no data. seq = x is the Client’s initial sequence number (packet captures often display it as a relative seq = 0). The Client then enters the SYN_SENT state and waits for the Server’s reply.

  2. After the Server receives the connection request and agrees to establish a connection, it replies with a segment that has both SYN = 1 and ACK = 1. The acknowledgement number ack = x + 1 means that everything up to x has been received correctly and that the Server expects the next byte from the Client to carry sequence number x + 1 (with a relative seq of 0, this shows up as ack = 1, that is, 0 + 1). seq = y is the Server’s own initial sequence number (again shown as relative seq = 0). The Server enters the SYN_RCVD state: it has received the Client’s request and is waiting for confirmation.

  3. After receiving this reply, the Client sends an acknowledgement of its own, which may already carry data for the Server. ACK = 1 marks the acknowledgement number ack = y + 1 as valid (the Client expects the Server’s next byte to be y + 1), and the Client’s own sequence number is seq = x + 1. The TCP connection enters the ESTABLISHED state and the HTTP request can be sent.
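
The sequence and acknowledgement numbers in the three steps follow a fixed pattern, which the sketch below reproduces. It only models the numbers, not real packets; `Segment` and `three_way_handshake` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


@dataclass(frozen=True)
class Segment:
    syn: bool
    ack_flag: bool
    seq: int
    ack: Optional[int]  # acknowledgement number, meaningful only when ack_flag is set


def three_way_handshake(x, y):
    """Return the three segments for initial sequence numbers x (Client) and y (Server)."""
    # Step 1: Client -> Server, SYN with the Client's initial sequence number.
    syn = Segment(syn=True, ack_flag=False, seq=x, ack=None)
    # Step 2: Server -> Client, SYN + ACK, acknowledging x by asking for x + 1.
    syn_ack = Segment(syn=True, ack_flag=True, seq=y, ack=(x + 1) % SEQ_SPACE)
    # Step 3: Client -> Server, ACK, acknowledging y by asking for y + 1.
    final_ack = Segment(syn=False, ack_flag=True,
                        seq=(x + 1) % SEQ_SPACE, ack=(y + 1) % SEQ_SPACE)
    return [syn, syn_ack, final_ack]
```

With `three_way_handshake(0, 0)` the result matches the relative numbers a packet capture shows: the second segment has ack = 1, and the third has seq = 1 and ack = 1.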

Nginx reverse proxy

1. Reverse proxy

In Reverse Proxy mode, a proxy server receives connection requests from the Internet, forwards them to a server on the Intranet, and returns the results obtained from the Intranet server to the Internet client. To the outside world the proxy server therefore appears to be the origin server itself, and the client needs no special settings.

Reverse proxy functions:

  1. Intranet security: a reverse proxy can provide WAF (web application firewall) functionality and block Web attacks.

  2. Load balancing: a reverse proxy can distribute requests across the backend servers to optimize the load on your website.
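
Load balancing can follow many strategies. The sketch below shows plain round robin, one common choice, as a way to picture how requests are spread across backends. `RoundRobinBalancer` and the backend addresses are hypothetical; this is not Nginx’s actual implementation.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
assigned = [balancer.pick() for _ in range(6)]
```

After six requests, each of the three backends has been chosen exactly twice. Real load balancers usually add weights, health checks and session stickiness on top of this basic rotation.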

2. Forward proxy

If there is a reverse proxy, there must be a forward proxy. What is forward proxy?

Forward Proxy is usually referred to as Proxy. It enables users to bypass the firewall and connect to the target network or service when they cannot access external resources.

Forward proxies work like a springboard.

For example: I can’t access Google.com, but I can access proxy server A, and A can access Google.com. So I connect to proxy server A and tell it that I need the content of Google.com; A fetches it and then returns it to me.

From the site’s point of view, there is only one visit on record: the proxy server fetching the content. The site may not know that the request came from a user at all, so the user’s information stays hidden; this depends on whether the proxy tells the site or not.

A forward proxy is a server that sits between the client and the origin server. To get content from the origin server, the client sends a request to the proxy and names the target (the origin server); the proxy then forwards the request to the origin server and returns the content it obtains to the client.
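
Whether the proxy “tells the site” is often a matter of one request header. The sketch below uses the widely used `X-Forwarded-For` header to show both behaviors; the function `forward_headers` and its `reveal_client` switch are hypothetical and only model the header handling, not a working proxy.

```python
def forward_headers(client_ip, headers, reveal_client=True):
    """Return the headers a forward proxy would send on to the origin server."""
    outgoing = dict(headers)  # never modify the caller's dictionary
    if reveal_client:
        # Append the client's address so the origin server can see who asked.
        previous = outgoing.get("X-Forwarded-For")
        outgoing["X-Forwarded-For"] = (
            f"{previous}, {client_ip}" if previous else client_ip
        )
    else:
        # An anonymizing proxy drops the header and hides the user.
        outgoing.pop("X-Forwarded-For", None)
    return outgoing


request_headers = {"Host": "google.com", "User-Agent": "example-browser"}
transparent = forward_headers("203.0.113.7", request_headers, reveal_client=True)
anonymous = forward_headers("203.0.113.7", request_headers, reveal_client=False)
```

In the first case the origin server can read the user’s address from the header; in the second it only ever sees the proxy’s own address.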

3. Comparison between forward proxy and reverse proxy

Close the TCP connection

Not all web pages do this. For example, the web version of WeChat does not close the TCP connection, because other people can send you messages on WeChat at any time. In fact, they first send their messages to the WeChat server, and the WeChat server pushes the messages to your screen over the TCP connection.

Imagine what would happen if the web version of WeChat closed the TCP connection.

The result: unless you refreshed the page, you would never receive a message. And if you were constantly sending messages, you would be constantly creating and closing connections, which wastes resources. So WeChat does not close the TCP connection at all, which lets the WeChat server send messages to our browser.

The following figure shows the header information of an HTTP request. Connection: keep-alive indicates that the TCP connection will not be closed after the request ends.
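
The effect of the Connection header can be summarized as a small decision function. This is a sketch of the general HTTP rule (HTTP/1.1 keeps the connection open unless told to close, HTTP/1.0 closes unless keep-alive is requested); `should_keep_alive` is a hypothetical helper, not part of any library.

```python
def should_keep_alive(http_version, connection_header=""):
    """Decide whether the TCP connection stays open after this message."""
    # The header may list several comma-separated tokens; compare case-insensitively.
    tokens = {t.strip().lower() for t in connection_header.split(",") if t.strip()}
    if "close" in tokens:
        return False  # an explicit close always wins
    if http_version == "HTTP/1.1":
        return True   # persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit opt-in
```

For example, `should_keep_alive("HTTP/1.0", "keep-alive")` and `should_keep_alive("HTTP/1.1")` both return `True`, while `should_keep_alive("HTTP/1.1", "close")` returns `False`.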

Of course, some HTTP requests do close the connection. Take a blog post: the browser receives the data and displays it, there is little dynamic data, and once I have read the page I close it. At that point the TCP connection should be closed, although this again depends on the server. So far, though, nothing has been said about how the connection is actually closed.

Closing a TCP connection is known as the “four-way wave” (four-way handshake), in contrast to the “three-way handshake” used to establish one.

Because a TCP connection is full-duplex, each direction must be closed separately. The principle is that a party sends a FIN to terminate its own direction once it has finished sending its data. Receiving a FIN only means that no more data will arrive from that direction; the side that received the FIN can still send data of its own. The party that closes first performs an active close and the other party performs a passive close.
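
The fact that each direction closes separately can be tried out on a loopback connection. In the sketch below, with the hypothetical function name `half_close_demo`, the client closes only its sending direction with `shutdown(SHUT_WR)`, which sends a FIN, and the server is still able to reply afterwards.

```python
import socket


def read_until_fin(sock):
    """Read until recv() returns b'', which signals that the peer's FIN arrived."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk


def half_close_demo():
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            server_side, _ = listener.accept()
            with server_side:
                server_side.settimeout(5)
                client.sendall(b"request")
                # Active close of one direction: the client will send no more data.
                client.shutdown(socket.SHUT_WR)
                received = read_until_fin(server_side)
                # The server-to-client direction is still open after the FIN.
                server_side.sendall(b"response after FIN")
                # The server now closes its own direction as well.
                server_side.shutdown(socket.SHUT_WR)
                reply = read_until_fin(client)
    return received, reply
```

Here the client is the active closer and the server the passive one. The FIN and ACK segments themselves are exchanged by the operating system and are not visible at this level.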


Conclusion: to be a hard-working, diligent, proactive front-end engineer