HyperText Transfer Protocol (HTTP) is a simple request-response protocol that usually runs on top of TCP. It specifies which messages the client may send to the server and which responses it may receive (Baidu Encyclopedia).

Requests are usually made by browsers to obtain different types of files, such as HTML files, CSS files, JavaScript files, images, and videos. HTTP is also the protocol most widely used by browsers.

First, take a look at the detailed “HTTP request schematic”, which shows the stages an HTTP request goes through in the browser.

Let’s take a closer look at the complete HTTP request flow.

1. Build the request

First, the browser builds the request line information (as shown below), and once it’s built, the browser is ready to make the web request.

GET /index.html HTTP/1.1
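As a rough illustration of this step, the sketch below derives a request line from a URL. The function name `build_request_line` and the example URLs are hypothetical and chosen only for this example; a real browser does considerably more work here.

```python
from urllib.parse import urlsplit


def build_request_line(method, url, version="HTTP/1.1"):
    """Return a request line such as 'GET /index.html HTTP/1.1' for the URL."""
    parts = urlsplit(url)
    target = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method.upper()} {target} {version}"


print(build_request_line("GET", "http://example.com/index.html"))
```

Only the path and query string end up in the request line; the host name is carried separately in the request headers.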

2. Search the cache

Before actually making a web request, the browser checks the browser cache to see whether it already holds a copy of the requested file. Browser caching is a technique that saves a copy of a resource locally so that it can be used immediately on the next request.

When the browser finds that the requested resource is already in the browser cache, it does not send the request but returns the cached resource directly. If the lookup fails, it enters the network request process. The benefits of doing this include:

  • Relieves server-side stress and improves performance because resources take less time to acquire
  • For web sites, caching is an important part of fast resource loading
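The lookup described above can be sketched as a small in-memory cache that is consulted before the network. This is a simplified model, not how a real browser cache works: `ResourceCache`, `fetch`, the `max_age` expiry rule, and the injected `network_fetch` callable are all assumptions made for the example.

```python
import time


class ResourceCache:
    """Toy cache keyed by URL; each entry expires after max_age seconds."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}
        self._clock = clock

    def store(self, url, body, max_age):
        self._entries[url] = (body, self._clock() + max_age)

    def lookup(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # stale copy: treat it as a miss
            return None
        return body


def fetch(url, cache, network_fetch, max_age=60):
    """Return (body, source); the network is used only on a cache miss."""
    cached = cache.lookup(url)
    if cached is not None:
        return cached, "cache"
    body = network_fetch(url)
    cache.store(url, body, max_age)
    return body, "network"
```

The first call for a URL goes to the network and stores the result; later calls within `max_age` seconds are answered from the local copy.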

For more information on browser caching, please refer to my article: Browser Static Resource Caching mechanisms.

3. Prepare IP addresses and ports

The browser uses HTTP as the application layer protocol to encapsulate the requested text information, and uses TCP/IP as the transport layer protocol to send it over the network, so the browser needs to establish a connection with the server via TCP before HTTP work can begin. The following diagram illustrates the relationship between TCP and HTTP.

Let’s look at what information is required to establish a TCP connection by following the flow of a single packet under TCP. As shown in the figure, the first step in establishing a TCP connection is to prepare the IP address and port number. How are they obtained? It depends on what we already have: we have a URL, so can we use the URL to get the IP address and port?

Packets are delivered to their recipient by IP address. An IP address is a numeric identifier that is hard to remember, whereas a domain name is much easier to remember, so a service emerged to maintain the mapping between domain names and IP addresses. The system that maps domain names to IP addresses is called the Domain Name System (DNS).

So the first step is for the browser to ask DNS for the IP address corresponding to the domain name. The browser also caches DNS data: once a domain name has been resolved, the browser caches the result for the next query, which also reduces network requests.

Once you have the IP address, the next step is to get the port number. If the URL does not specify a port number, HTTP defaults to port 80 (and HTTPS to port 443).
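A minimal sketch of this step, assuming the standard library's `urlsplit` and `socket.gethostbyname` are good enough stand-ins for the browser's own URL parser and resolver. The helper names `host_and_port` and `resolve`, and the module-level `_dns_cache` dictionary, are hypothetical.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

_dns_cache = {}  # host name -> IP address, filled on first lookup


def host_and_port(url):
    """Extract the host name and port, falling back to the scheme default."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


def resolve(host, lookup=socket.gethostbyname):
    """Resolve a host name to an IPv4 address, reusing cached answers."""
    if host not in _dns_cache:
        _dns_cache[host] = lookup(host)
    return _dns_cache[host]
```

The `lookup` parameter exists only so the resolver can be swapped out, for example in tests; by default a real DNS query is made the first time each host is seen.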

4. Wait for the TCP queue

A TCP connection cannot always be established immediately once the IP address and port are ready. Chrome allows at most six simultaneous TCP connections to the same domain name. If 10 requests are made to the same domain name at once, four of them are queued until one of the in-flight requests completes. If fewer than six requests are in flight, the browser goes straight to the next step and establishes the TCP connection.
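The queuing rule can be modelled with a small scheduler. This is only a sketch of the idea, not Chrome's actual implementation; the class `ConnectionScheduler` and its method names are made up for the example, and the limit of six is taken from the text above.

```python
from collections import defaultdict, deque

MAX_CONNECTIONS_PER_HOST = 6


class ConnectionScheduler:
    """Tracks active and waiting requests per host name."""

    def __init__(self, limit=MAX_CONNECTIONS_PER_HOST):
        self.limit = limit
        self.active = defaultdict(set)
        self.waiting = defaultdict(deque)

    def submit(self, host, request_id):
        """Return True if the request may connect now, False if it was queued."""
        if len(self.active[host]) < self.limit:
            self.active[host].add(request_id)
            return True
        self.waiting[host].append(request_id)
        return False

    def finish(self, host, request_id):
        """Mark a request done and promote the oldest queued request, if any."""
        self.active[host].discard(request_id)
        if self.waiting[host]:
            promoted = self.waiting[host].popleft()
            self.active[host].add(promoted)
            return promoted
        return None
```

Submitting ten requests for one host admits the first six and queues the remaining four; each call to `finish` frees a slot for the request that has waited longest.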

5. Establish the TCP connection

Once queuing is over, the browser can establish a TCP connection with the server through the three-way handshake. For details about how a TCP connection is established, see: the TCP three-way handshake.

For details on network layer protocols, see this article: Illustrating the OSI seven-layer protocol model, TCP/IP four-layer model, and five-layer protocol architecture
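From application code, the handshake itself is not visible: the operating system performs it when the program asks for a connection. The sketch below assumes Python's `socket.create_connection`; the wrapper name `open_tcp_connection` is hypothetical.

```python
import socket


def open_tcp_connection(ip, port, timeout=5.0):
    """Open a TCP connection to (ip, port).

    The three-way handshake (SYN, SYN-ACK, ACK) is carried out by the
    operating system inside this call; it returns once the connection
    is established or raises OSError if it cannot be.
    """
    return socket.create_connection((ip, port), timeout=timeout)
```

The returned socket is the channel over which the HTTP request is sent in the next step.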

6. Send an HTTP request

Once the TCP connection is established, the browser can communicate with the server. The data in HTTP is transferred during this communication.

The following figure shows the request information that the browser sends to the server.

The request line tells the server which resource the browser needs. The most common request methods are GET, POST, PUT, and so on.

HTTP request methods are as follows:

  • GET Gets a resource (idempotent)
  • POST Creates a new resource
  • HEAD Gets the response header metadata only (idempotent)
  • PUT Updates a resource (idempotent)
  • DELETE Deletes a resource (idempotent)
  • CONNECT Establishes a tunnel
  • OPTIONS Gets the methods the server supports for accessing the resource (idempotent)
  • TRACE Echoes back the request as received by the server, which can be used to locate faults (security risk)

The request header tells the server some basic information about the browser, such as the operating system it runs on, the browser kernel, the requested domain name, and the browser's Cookie information.

The request body carries the content to be transmitted: if the browser sends a POST request, the request body tells the server the specific content being submitted.
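To make the three parts concrete, here is a sketch that assembles a raw HTTP/1.1 request message from a request line, headers, and an optional body. The function `build_request` and the example header values are assumptions for illustration; real browsers send many more headers.

```python
def build_request(method, target, host, headers=None, body=b""):
    """Assemble request line + headers + blank line + body as raw bytes."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The server needs the body length to know where the message ends.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


message = build_request(
    "POST",
    "/login",
    "example.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"user=alice",
)
print(message.decode("ascii"))
```

Each header sits on its own line ending in CRLF, and an empty line separates the headers from the body.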

7. The server processes the request

After the browser sends the request, the server processes the request information.

8. The server responds to the request

Once the server has finished processing, it can return the data to the browser. The data format of the server response is shown below.

First, the server returns the response line, which includes the protocol version and the status code. For example, 200 indicates success, and 404 is returned if the page is not found.

For details about the status codes, see: The most complete list of HTTP response status codes.

The server also sends the response header to the browser along with the response line. The response header contains information about the server and the response, such as when the server generated the returned data, the type of the returned data (JSON, HTML, streaming media, and so on), and the Cookie that the server wants to save on the client.

After sending the response header, the server goes on to send the data in the response body, which usually contains the actual HTML content.
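The response structure described above (response line, response header, response body) can be taken apart with a few lines of code. This is a deliberately simplified sketch: `parse_response` is a hypothetical helper, it assumes the whole response is already in memory, and it keeps only the last value when a header is repeated.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Response line, e.g. "HTTP/1.1 200 OK"; assumes a reason phrase is present.
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(status), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
version, status, reason, headers, body = parse_response(raw)
print(status, headers["content-type"], body)
```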

9. Disconnect

In HTTP/1.0, the server normally closes the TCP connection once it has returned the response data to the client. However, if the browser or server adds “Connection: keep-alive” to its headers, the TCP connection stays open after the response is sent, so the browser can keep sending requests over the same TCP connection (in HTTP/1.1, connections are persistent by default unless “Connection: close” is sent). Keeping a TCP connection open saves the time needed to establish a connection for the next request and speeds up resource loading. For example, if the images embedded in a web page all come from the same website, a persistent connection can be reused to request them without establishing a new TCP connection each time.

A TCP connection is closed through a four-way handshake (the “four waves”). For details, see: how TCP/IP works.
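Connection reuse can be observed with a small local experiment. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface, which replies with the client's TCP port number, and sends several requests through one `http.client.HTTPConnection`. The handler class and the function `client_ports_seen` are hypothetical names for this example; if the connection is reused, the server sees the same client port every time.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables persistent connections

    def do_GET(self):
        # Reply with the TCP port the client connected from.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def client_ports_seen(paths):
    """Request each path over ONE connection; return the ports the server saw."""
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        ports = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            ports.append(int(response.read()))  # read fully before reusing
        return ports
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

Calling `client_ports_seen(["/index.html", "/logo.png", "/app.js"])` should return the same port number three times, which indicates that all three requests travelled over a single TCP connection.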

10. Special case: redirection

If the response line returns status code 301, the server is telling the browser that it needs to redirect to a different URL, which is given in the Location field of the response header. The browser then reads the address in the Location field and navigates to it. This completes one redirection.
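The redirect handling can be sketched as a loop that keeps following the Location field until a non-redirect response arrives. The text above discusses only 301; the other status codes in `REDIRECT_STATUSES` are added here because they also carry a Location header. The function names and the injected `fetch` callable, which is assumed to return `(status, headers, body)` with lowercase header names, are hypothetical.

```python
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirect_target(current_url, status, headers):
    """Return the absolute URL to navigate to, or None if this is not a redirect."""
    if status in REDIRECT_STATUSES and "location" in headers:
        # Location may be relative, so resolve it against the current URL.
        return urljoin(current_url, headers["location"])
    return None


def follow_redirects(url, fetch, max_hops=5):
    """Fetch url, following redirects; give up after max_hops requests."""
    for _ in range(max_hops):
        status, headers, body = fetch(url)
        target = redirect_target(url, status, headers)
        if target is None:
            return url, status, body
        url = target
    raise RuntimeError("too many redirects")
```

The hop limit guards against redirect loops, where two URLs keep pointing at each other.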

Related questions

  • Why do many sites open quickly the second time?

The second visit to many sites is much faster because the browser has cached many of the site's resources locally. The browser cache saves time by answering requests directly from a local copy instead of making actual network requests. DNS data is also cached by the browser, which removes the need for a DNS query.

  • How is login status maintained?

If the response header sent by the server contains a Set-Cookie field, the browser saves the contents of that field locally. The next time the browser sends a request to that server, it automatically adds the cookie value to the Cookie request header. When the server finds the cookie in the request, it identifies which client sent the request and looks up its own records to obtain that user's status information.
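The cookie round trip can be sketched with the standard library's `http.cookies.SimpleCookie`. The class `TinyCookieJar`, the function `user_for_request`, the cookie name `sessionid`, and the `sessions` dictionary standing in for the server's records are all assumptions for this example; real browsers also honour attributes such as Path, Domain, and expiry, which are ignored here.

```python
from http.cookies import SimpleCookie


class TinyCookieJar:
    """Browser side: remember cookies from Set-Cookie, replay them in Cookie."""

    def __init__(self):
        self._cookies = {}

    def store_from_response(self, set_cookie_header):
        parsed = SimpleCookie()
        parsed.load(set_cookie_header)
        for name, morsel in parsed.items():
            self._cookies[name] = morsel.value

    def cookie_header(self):
        return "; ".join(f"{name}={value}" for name, value in self._cookies.items())


def user_for_request(cookie_header, sessions):
    """Server side: map the session cookie back to a logged-in user, if any."""
    parsed = SimpleCookie()
    parsed.load(cookie_header)
    if "sessionid" not in parsed:
        return None
    return sessions.get(parsed["sessionid"].value)


jar = TinyCookieJar()
jar.store_from_response("sessionid=abc123; Path=/; HttpOnly")
print(jar.cookie_header())
```

The server never stores the login state in the cookie itself in this sketch; the cookie only carries an identifier that the server looks up in its own records.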