HTTP request flow diagram

As the figure shows, an HTTP request in the browser passes through the following stages from initiation to termination: building the request, looking up the cache, preparing the IP address and port, waiting in the TCP queue, establishing the TCP connection, sending the HTTP request, the server processing the request, the server returning the response, and closing the connection.

1. Initiate a network request

1. User input

When a user types into the address bar, the address bar determines whether the input is a search query or a URL.

  • If it is a search query, the address bar uses the browser’s default search engine to build a search URL that contains the keywords.
  • If the input conforms to URL rules, for example time.geekbang.org, the address bar adds the protocol to it according to the rules to form a complete URL, such as https://time.geekbang.org.
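
The decision described in the two bullets above can be sketched roughly as follows. This is only an illustration: the search endpoint search.example, the function name, and the rule used to recognise a URL are assumptions, and always prepending https is a simplification rather than what any particular browser does.

```python
from urllib.parse import quote_plus, urlsplit

# Hypothetical search endpoint; a real browser uses the user's default search engine.
SEARCH_TEMPLATE = "https://search.example/?q={query}"


def resolve_address_bar_input(text):
    """Return the URL to navigate to for the text typed into the address bar."""
    text = text.strip()
    # Input that already carries an http(s) scheme is used as-is.
    if urlsplit(text).scheme in ("http", "https"):
        return text
    # Crude URL test (assumption): no spaces and a dot-separated host name.
    host = text.split("/", 1)[0]
    if " " not in text and "." in host and all(host.split(".")):
        return "https://" + text
    # Everything else is treated as a search query.
    return SEARCH_TEMPLATE.format(query=quote_plus(text))
```

With this sketch, time.geekbang.org becomes https://time.geekbang.org, while a phrase containing spaces is turned into a search URL.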

2. After the user finishes typing and presses Enter, the browser navigation bar shows a loading state, but the previous page is still displayed because the response data for the new page has not been received yet.

The browser process builds the request line and sends the URL request to the network process via interprocess communication (IPC).

2. URL request process

Next comes the page resource request process. The browser process sends the URL request to the network process through interprocess communication (IPC). After receiving it, the network process starts the actual URL request.

1. URL request process

First, the network process looks up whether the local cache has cached the resource. If there is a cached resource, it is returned directly to the browser process.

If the resource is not found in the cache, the request proceeds to the network.

The first step before the request is to perform a DNS resolution to get the server IP address for the requested domain name. If the request protocol is HTTPS, you also need to establish a TLS connection.

The next step is to establish a TCP connection with the server using the IP address. After the connection is established, the browser side will construct the request line, request information, etc., and attach the data related to the domain name, such as cookies, to the request header, and then send the constructed request information to the server.

2. Caching

Before actually making a network request, the browser checks the browser cache for the requested file. Browser caching is a technique that saves a copy of a resource locally so that it can be used immediately on the next request.

When the browser discovers that the requested resource already has a copy in the browser cache, it intercepts the request, returns a copy of the resource, and ends the request without going to the source server for a new download. The benefits of this are:

  • Alleviating server-side stress and improving performance (less time to acquire resources);
  • For web sites, caching is an important part of fast resource loading.

Of course, if the cache lookup fails, it enters the network request process.

What data will be cached? How do browsers cache? How do network requests look up in the cache?

1. Why do many sites open quickly the second time?

If a page opens quickly the second time, the main reason is that some time-consuming data was cached when the page was first loaded.

So what data is cached? As you can see from the core request path above, the DNS cache and the page resource cache are two pieces of data that are cached by the browser.

The DNS cache is relatively simple. It mainly associates the corresponding IP and domain name locally in the browser.

After obtaining the domain name, the system first looks in the hosts file for the IP address that corresponds to it. If the IP address is found, the system establishes a TCP connection with the server. If it is not found, the system submits the domain name to a DNS server for resolution. I won’t go into more detail here.

Let’s focus on the browser resource cache. Here’s how it works:

Schematic diagram of cache lookup process:

First, how does the server get the browser to cache data?

As the first request above shows, when the server returns the HTTP response headers to the browser, the browser uses the Cache-Control field in the response headers to decide whether to cache the resource. Usually the server also sets a cache expiration time for the resource through the max-age parameter of Cache-Control, for example 2000 seconds.

Cache-Control: max-age=2000

This means that if the resource is requested again before the cache expires, the cached copy is returned to the browser directly.

But if the cache has expired, the browser sends a network request that carries the following HTTP request header:

If-None-Match:"4f80f-13c-3a1xb12a"

After receiving the request header, the server uses the If-None-Match value to determine whether the requested resource has been updated.

  • If there is no update, the server returns status code 304, which is equivalent to telling the browser: “This cache can continue to be used, so I won’t send you the data again this time.”
  • If the resource has been updated, the server returns the latest resource directly to the browser.
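
The freshness check and the If-None-Match round trip can be sketched as below. This is a minimal model under stated assumptions: the CachedResponse class and the three function names are hypothetical, time is passed in as plain seconds, and the server side is reduced to a single ETag comparison.

```python
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: bytes
    etag: str
    stored_at: float  # time the response was stored, in seconds
    max_age: int      # value of Cache-Control: max-age, in seconds


def is_fresh(entry, now):
    """A cached copy is fresh while its age is below max-age."""
    return now - entry.stored_at < entry.max_age


def revalidation_headers(entry):
    """Request headers the browser sends once the cached copy has expired."""
    return {"If-None-Match": entry.etag}


def server_respond(request_headers, current_etag, current_body):
    """Server side: answer 304 if the client's ETag still matches, else 200."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, b""  # no body: the client keeps using its cached copy
    return 200, current_body
```

A fresh entry is served without touching the network; an expired one costs a request, but a 304 answer still avoids transferring the body again.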

HTTP cache

Developer.mozilla.org/zh-CN/docs/…

Take a look at the page linked above. I plan to write a separate article on caching and link to it here.

Cache-Control and Expires are both used to set the cache expiration time. The common Cache-Control directives are:

  • max-age: the maximum time, in seconds, for which the response can be cached;
  • s-maxage: the same as max-age, but applies only to proxy caches;
  • public: the response can be cached by any cache;
  • private: the response is intended for a single user and cannot be cached by a proxy server;
  • no-cache: the cache must check with the server before every reuse. The server receives the request and determines whether the resource has changed; it returns the new content if it has, or 304 (not modified) if it has not. The name is misleading and is easily mistaken for “the response is not cached”. A response marked Cache-Control: no-cache is in fact cached, but the cache revalidates it with the server each time before serving it to the client (browser);
  • no-store: disables all caching (this is the directive that really means the response is not cached).

Cache-Control takes precedence over Expires.

ETag (entity tag) is an HTTP/1.1 header that is generated by the server and returned to the front end. It is generally a hash of the resource content. On the first HTTP request the server returns an ETag; on the next request for the same resource the client sends If-None-Match, whose value is that ETag. The server then checks whether the ETag sent by the client matches its own. If it matches, the If-None-Match condition evaluates to false and the server returns status 304 without data, so the client continues to use its local cache. If it does not match, the condition evaluates to true, the server returns status 200, and the client parses the new data returned by the server.

Last-Modified indicates the time when the resource was last modified on the server. It has several limitations:

  • Last-Modified is only accurate to the second. If a file is modified more than once within one second, Last-Modified cannot record the modification time accurately.
  • If a file is regenerated periodically, its content may be unchanged while Last-Modified changes, so the cached copy cannot be reused.
  • The server may not obtain the file modification time accurately, or its time may be inconsistent with that of a proxy server.

ETag has a higher priority than Last-Modified.

In short, many sites open within seconds on a second visit because a lot of their resources are cached locally. The browser cache saves time by answering requests directly from a local copy instead of making actual network requests. DNS data is also cached by the browser, which saves the DNS lookup.

3. IP connection (DNS resolution)

If the cache has no matching resource, or the cached copy is expired or invalid, a new network request has to be made.

The network process asks DNS for the IP address that corresponds to the domain name. If the DNS cache already holds an entry for the domain name, the cached IP address is returned directly; otherwise a DNS query is made to resolve the domain name. The port number comes from the URL rather than from DNS: if the URL does not specify one, the default is 80 for HTTP and 443 for HTTPS. If the request uses HTTPS, a TLS connection also has to be established.
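
A small sketch of this step, assuming an in-memory cache in front of an injectable resolver function. The DnsCache class and endpoint_for are hypothetical names, and details such as record expiry (TTL) are left out.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


class DnsCache:
    """Hypothetical in-memory cache mapping host names to IP addresses."""

    def __init__(self, resolver):
        self._resolver = resolver  # callable that performs the real DNS query
        self._cache = {}

    def lookup(self, host):
        if host not in self._cache:  # cache miss: ask the resolver once
            self._cache[host] = self._resolver(host)
        return self._cache[host]


def endpoint_for(url, dns):
    """Return the (ip, port) pair the network process would connect to."""
    parts = urlsplit(url)
    # The port comes from the URL; fall back to the scheme's default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return dns.lookup(parts.hostname), port
```

For real lookups, socket.gethostbyname from the standard library could be passed as the resolver; passing a stub keeps the sketch testable without a network.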

DNS article reference:

Mp.weixin.qq.com/s/WvL_d54Ot…

4. TCP connection

The next step is to establish a TCP connection with the server using the IP address.

TCP’s job is to deliver the data to the application in its entirety.

For browser requests and mail applications that require reliability of data transmission, UDP has two problems:

  • Data packets are easily lost during transmission.
  • Large files are broken into smaller packets for transmission. These packets take different routes and arrive at the receiver at different times. UDP does not know how to assemble these packets into a complete file.

TCP exists to address these two problems. Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport layer protocol. Compared with UDP, TCP has the following characteristics:

  • TCP provides a retransmission mechanism for packet loss.
  • TCP introduces the packet sorting mechanism to ensure that out-of-order packets are combined into a complete file.

Like the UDP header, the TCP header contains the destination port and the source port number. It also carries a sequence number, so that the receiver can put the packets back in order by sequence number.
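
To make the role of the sequence number concrete, here is a simplified sketch of reassembly on the receiving side. It assumes each segment is a (sequence_number, payload) pair in which the sequence number is the byte offset of the payload, and the function name is hypothetical; real TCP also handles wrap-around, overlapping segments, and windows.

```python
def reassemble(segments):
    """Rebuild the byte stream from (sequence_number, payload) pairs.

    Segments may arrive out of order or duplicated; sorting by sequence
    number restores the original order.
    """
    stream = b""
    for seq, payload in sorted(set(segments)):
        if seq != len(stream):
            # A gap means an earlier segment has not arrived yet.
            raise ValueError(f"missing data before offset {seq}")
        stream += payload
    return stream
```

For example, segments arriving as offsets 5, 0, 10 are put back in the order 0, 5, 10 before being handed to the application.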

Let’s take a look at the transmission flow of a single packet under TCP:

A simplified four-layer transmission model for TCP networks

The figure above should give you an idea of how a packet is transmitted over TCP. The transmission flow of a single TCP packet is similar to that of UDP. The difference is that the information in the TCP header ensures the integrity of a large piece of data.

Let’s take a look at the complete TCP connection process to see how TCP guarantees retransmission and packet ordering.

As can be seen from the following figure, a complete TCP connection life cycle consists of three phases: “Establish a connection”, “transfer data” and “disconnect”.

The lifetime of a TCP connection

  • First, establish the connection phase. This phase establishes the connection between the client and server through a “three-way handshake.” TCP provides connection-oriented communication transport. Connection-oriented refers to the preparation work between the two ends before data communication begins. The three-way handshake means that when a TCP connection is established, the client and server send a total of three packets to confirm the connection.
  • Secondly, the data transmission stage. At this stage, the receiving end needs to confirm each packet. That is, after receiving the packet, the receiving end needs to send the confirmation packet to the sender. Therefore, if the sender does not receive the confirmation message from the receiver within a specified period after sending a data packet, the packet is considered lost and the retransmission mechanism is triggered. Similarly, a large file is divided into many small packets during transmission. After these packets arrive at the receiving end, the receiving end sorts them according to the sequence number in the TCP header to ensure complete data.
  • Finally, the disconnect phase. Once the data has been transferred, the connection is terminated. This involves a final “four waves” exchange (four packets) to ensure that both sides close the connection.
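
The acknowledgement and retransmission behaviour of the data transmission phase can be illustrated with a deliberately simple stop-and-wait simulation. This is an assumption-laden sketch: real TCP keeps many packets in flight with a sliding window, and here a missing acknowledgement is modelled by listing which attempts the simulated network drops.

```python
def send_reliably(packets, dropped, max_attempts=5):
    """Resend each packet until it is acknowledged.

    `dropped` is a set of (packet_index, attempt) pairs that the simulated
    network loses; a lost packet produces no acknowledgement, which stands
    in for the sender's timeout.
    """
    received = []
    transmissions = 0
    for index, packet in enumerate(packets):
        for attempt in range(1, max_attempts + 1):
            transmissions += 1
            if (index, attempt) in dropped:
                continue             # no ACK before the timer expires: retransmit
            received.append(packet)  # receiver got the packet and sent an ACK
            break
        else:
            raise ConnectionError(f"packet {index} was never acknowledged")
    return received, transmissions
```

If the second of three packets is lost twice, the receiver still ends up with all three in order, at the cost of five transmissions instead of three.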

By now you can see that TCP sacrifices transmission speed to ensure reliable delivery, because the three-way handshake and the acknowledgement of each packet significantly increase the number of packets sent.

Note:

Chrome allows a maximum of six TCP connections to the same domain name at one time. If 10 requests are made to the same domain name at once, four of them are queued until one of the in-progress requests completes. If fewer than six requests are in progress, the request goes straight to the next step, establishing the TCP connection.
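
The queueing rule can be modelled in a few lines. The HostQueue class below is a hypothetical illustration of the limit described above, not Chrome’s actual scheduler.

```python
from collections import deque


class HostQueue:
    """Tracks requests to one domain name under a connection limit."""

    def __init__(self, limit=6):
        self.limit = limit
        self.active = set()     # requests that currently hold a connection
        self.waiting = deque()  # requests queued in arrival order

    def submit(self, request):
        """Start the request if a connection is free; otherwise queue it."""
        if len(self.active) < self.limit:
            self.active.add(request)
            return True
        self.waiting.append(request)
        return False

    def finish(self, request):
        """Release a connection and start the oldest waiting request, if any."""
        self.active.remove(request)
        if self.waiting:
            self.active.add(self.waiting.popleft())
```

Submitting ten requests leaves six active and four waiting; each call to finish lets the oldest waiting request proceed.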

HTTP/2 uses a single TCP connection per domain name and requests resources over it in parallel, so it can handle many requests at once.

2. Like the UDP header, the TCP header contains the destination port and the source port number. It also carries a sequence number, so that the receiver can put the packets back in order by sequence number.

3. TCP has a retransmission mechanism. If the sender sends a data packet but does not receive an acknowledgement from the receiver within a specified period of time, the packet is considered lost and is retransmitted.

4. A TCP connection is first established through the three-way handshake. Once it is established, the browser sends the HTTP request line and request headers to the server, and the server returns the response line, response headers and response body. Finally, the TCP connection is closed with four waves.

5. HTTP requests

Once the TCP connection is established, the browser can communicate over it. HTTP data is transferred during this communication: the browser sends the HTTP request to the target server.

The request includes:

  • The request line
  • Request header
  • Request body

First, the browser sends the request line to the server. It contains the request method, the Uniform Resource Identifier (URI), and the HTTP protocol version.

Sending the request line tells the server what resource the browser needs. The most common request method is GET. For example, typing the Geek Time domain name (time.geekbang.org) directly into the browser address bar tells the server to return its home page.

Another common request method is POST, which is used to send some data to the server. For example, when you log in to a website, you need to send the user information to the server through POST. If the POST method is used, the browser also prepares data for the server, which is sent in the request body.

After the browser sends the request line command, it also sends additional information in the form of a request header, which tells the server the basic information of the browser. For example, it contains the operating system used by the browser, the browser kernel and other information, as well as the domain name information of the current request, Cookie information of the browser, and so on.
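
As a rough illustration of the three parts, the sketch below assembles a raw HTTP/1.1 request message. The function name is hypothetical and the header set is minimal; a real browser sends many more headers (User-Agent, Cookie, and so on).

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble request line, request headers, and optional body into bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]  # request line + Host
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

A GET for the home page produces only a request line and headers; a POST additionally carries the submitted data after the blank line.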

6. Summary

Before looking at network requests, we need to understand the relationship between HTTP and TCP. The browser uses HTTP as the application layer protocol to encapsulate the text of the request, and uses TCP/IP as the transport layer protocol to send it over the network. The browser therefore has to establish a TCP connection with the server before HTTP can start working, and HTTP content is carried in the data transfer phase of TCP. The following diagram illustrates the relationship between the two.

The relationship between TCP and HTTP

2. Respond to the request

After receiving the request, the server generates response data (the response line, response headers, and response body) and sends it to the network process. After the network process receives the response line and headers, it parses them. (For ease of explanation, the response line and response headers returned by the server are together called the response header below.)

1. Return the request

Once the server has finished processing, it can return the data to the browser. You can use the curl tool to view the returned data:

curl -i https://time.geekbang.org/

Note that the -i option makes curl print the response line and headers together with the body, as shown below. You can use this data to see how the server responds to the browser.

The data format of the server response:

First the server returns a response line, including the protocol version and status code.

However, not every request can be handled by the server. What happens when a request cannot be processed or is processed incorrectly? The server tells the browser the outcome through the status code in the response line, for example:

  • The most commonly used status code is 200, indicating that the processing is successful.
  • If the page is not found, a 404 is returned.

There are many status codes, and I won’t go over them here; plenty of material is available online.

Then, just as the browser sends request headers along with the request, the server sends response headers along with the response. The response headers contain information about the server and the response, such as when the server generated the returned data, the type of the returned data (JSON, HTML, streaming media, and so on), and the cookies that the server wants the client to save.

After sending the response header, the server can continue sending the data in the response body, which usually contains the actual content of the HTML.

This is how the server responds to the browser.
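
The structure of a response (response line, response headers, response body) can be seen by splitting a raw message. This is a minimal sketch with a hypothetical function name; it ignores chunked encoding, repeated headers, and other details a real client must handle.

```python
def parse_response(raw):
    """Split a raw HTTP/1.x response into its three parts."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)           # e.g. "HTTP/1.1 200 OK"
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body
```

Header names are lower-cased here because HTTP header names are case-insensitive.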

2. Redirect

Upon receiving the response header from the server, the network process begins to parse the response header. If the status code returned is 301 or 302, the server needs the browser to redirect to another URL. The network process reads the redirected address from the Location field in the response header, then initiates a new HTTP or HTTPS request and starts all over again.

For example, we enter the following command in the terminal:

curl -I http://time.geekbang.org/

The curl -I command followed by a URL returns only the response line and response headers sent by the server. After executing the command, we see the following response header returned from the server:

The response line returns the status code 302

As the figure shows, the Geek Time server turns all HTTP requests into HTTPS requests through redirection. When you make an HTTP request to the Geek Time server, it returns a response header with a 301 or 302 status code and puts the HTTPS address in the Location field, which tells the browser to navigate to the new address.

Let’s request Geek Time over HTTPS to see what the server’s response header looks like.

curl -I https://time.geekbang.org/

We see the following message from the server:

The response line returns the status code 200

OK, that is redirection. During navigation, if the status code in the server’s response line is a 301 or 302 redirect, the browser continues navigation at the new address. If the status code is 200, the browser can go on to process the response.
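
The redirect loop can be sketched as follows. The fetch argument is a hypothetical callable that returns a (status, headers) pair for a URL, so the example runs without a network; the hop limit is likewise an assumption.

```python
from urllib.parse import urljoin


def follow_redirects(url, fetch, max_hops=5):
    """Follow 301/302 responses via the Location header.

    Returns the final URL and its status code.
    """
    for _ in range(max_hops):
        status, headers = fetch(url)
        if status in (301, 302) and "Location" in headers:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, headers["Location"])
            continue
        return url, status
    raise RuntimeError("too many redirects")
```

With a stub that answers 302 for the http address and 200 for the https address, the function ends at the https URL, mirroring the two curl runs above.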

3. Processing of response data types

After the redirect has been handled, we continue with the navigation flow. The data returned for a URL request is sometimes a download and sometimes a normal HTML page, so how does the browser tell them apart?

The answer is Content-Type. Content-Type is a very important field in the HTTP headers that tells the browser what type of data the response body contains. The browser uses the value of Content-Type to decide how to handle the response body.

1. Content-Type with HTML format

Let’s take Geek Time as an example and see what Content-Type value it returns. Enter the following command on the terminal:

2. Content-Type with stream format

Next, use curl to request the address of the Geek Time Android installation package:

curl -i https://res001.geekbang.org/apps/geektime/android/2.3.1/official/geektime_2.3.1_20190527-2136_offical.apk

The Content-Type value in the response header is application/octet-stream, which means the data is a byte stream. Normally, the browser handles such a response as a download.

Note that if Content-Type is configured incorrectly on the server, for example a text/html resource is sent as application/octet-stream, the browser may misinterpret the content: a page that was meant to be displayed becomes a file download.

The subsequent processing therefore differs greatly depending on Content-Type. If the browser determines from the Content-Type value that the response is a download, the request is handed to the browser’s download manager and the navigation for that URL ends. If it is HTML, the browser continues with navigation. Since Chrome renders pages in the render process, the next step is to prepare the render process.
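
The branching on Content-Type can be sketched like this. The function name and the three return labels are hypothetical, and only the two media types discussed above are handled; real browsers also take other signals into account, such as the Content-Disposition header.

```python
def handle_response(content_type):
    """Decide what to do with a response based on its Content-Type value."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "render"    # continue navigation and prepare a render process
    if media_type == "application/octet-stream":
        return "download"  # hand over to the download manager
    return "other"
```

This also shows why a misconfigured server causes trouble: an HTML page sent as application/octet-stream takes the download branch.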

4. Disconnect

Normally, once the server returns the request data to the client, it closes the TCP connection. But if the browser or server adds the following header:

Connection: Keep-Alive

The TCP connection will remain open after the response has been sent, so the browser can keep sending requests over the same TCP connection. Keeping the connection open saves the time needed to establish a connection for the next request and speeds up resource loading. For example, if the images embedded in a web page all come from the same site, a persistent connection can be reused to request them without establishing a new TCP connection each time.

3. Page rendering

1. Same site

By default, Chrome assigns a render process to each page, meaning that a new render process is created for each new page opened. There are exceptions, however: in some cases the browser lets multiple pages run in the same render process.

For example, from Geek Time’s home page I opened another page, Algorithmic Boot Camp. Take a look at the Chrome task manager screenshot below:

Multiple pages run in a render process

As you can see from the figure, the three open pages are all running in the same render process with the process ID 23601.

When can multiple pages be running in a render process at the same time?

To solve this problem, we need to understand what same-site is. Specifically, we define “same site” as the root domain (for example, geekbang.org) plus the protocol (for example, https:// or http://), plus all the subdomains and different ports under the root domain, such as the following three:

https://time.geekbang.org
https://www.geekbang.org
https://www.geekbang.org:8080

They all belong to the same site because their protocol is HTTPS and the root domain name is geekbang.org.

Chrome’s default strategy is one render process per tab. However, if a new page is opened from another page and belongs to the same site as that page, the new page reuses the parent page’s render process. Officially, this default policy is called process-per-site-instance.

What happens when the new page and the current page are not on the same site? For example, I opened InfoQ’s official website (www.infoq.cn/) through a link on geekbang.org. Since infoq.cn and geekbang.org are not the same site, infoq.cn uses a new render process, as you can see below:

Different sites use different rendering processes

As the task manager in the figure shows, the Geek Bang and Geek Time tabs have the same protocol and root domain name, so they belong to the same site and run in the same render process. infoq.cn has a different root domain name from geekbang.org, so InfoQ and Geek Bang are not the same site and run in two different render processes.

To summarize, the rendering process strategy used to open a new page is:

  • Typically, a separate render process is used for each newly opened page;
  • If page B is opened from page A, and A and B belong to the same site, page B reuses the render process of page A; otherwise, the browser process creates a new render process for B.
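
The same-site test used by this strategy can be approximated as below, following the definition given earlier (protocol plus root domain, ignoring subdomain and port). Taking the last two labels of the host name as the root domain is a simplifying assumption; real browsers consult the Public Suffix List, which matters for suffixes such as co.uk. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Return the (protocol, root domain) pair that identifies a site."""
    parts = urlsplit(url)
    # Simplification: treat the last two labels as the root domain.
    root_domain = ".".join(parts.hostname.split(".")[-2:])
    return parts.scheme, root_domain


def same_site(url_a, url_b):
    """True if both URLs share the protocol and the root domain."""
    return site_of(url_a) == site_of(url_b)
```

Under this rule, a page opened from https://time.geekbang.org to https://www.geekbang.org:8080 would reuse the render process, while a link to https://www.infoq.cn/ would get a new one.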

Once the renderer process is ready, it cannot immediately enter the document parsing state because the document data is still in the network process and has not been submitted to the renderer process, so the next step is to submit the document.

2. Submit documents

Submitting the document (also called submitting the navigation) means that the browser process hands the HTML data received by the network process over to the renderer process. The process looks like this:

  • First, when the browser process receives the response header data from the network process, it sends a “submit document” message to the renderer process;
  • After receiving the “submit document” message, the renderer process establishes a “pipeline” with the network process to transfer the data;
  • After the document data has been transferred, the renderer process returns a “confirm submission” message to the browser process;
  • After receiving the “confirm submission” message, the browser process updates the browser interface state, including the security state, the URL in the address bar, the back and forward history state, and the web page itself.

After the renderer process confirms the submission, the interface is updated as shown below:

This explains why, when you type an address into your browser’s address bar, the previous page doesn’t disappear immediately, but instead takes a while to load before the page is updated.

At this point, a complete navigation flow has finished, and the rendering phase begins.

3. Rendering stage

Once the document is submitted, the renderer process begins parsing the page and loading subresources.

General process of rendering page:

The browser takes the HTML file and parses it into a DOM tree. At the same time, it parses the CSS to generate the CSSOM, then combines the DOM tree and the CSSOM into a render tree (or layout tree), and finally paints the page.
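
The first step, turning HTML text into a tree, can be illustrated with the standard library’s html.parser module. The DomBuilder class and its dictionary node format are hypothetical, and the sketch assumes well-formed markup without void elements such as br or img; it says nothing about CSSOM, layout, or painting.

```python
from html.parser import HTMLParser


class DomBuilder(HTMLParser):
    """Builds a nested dict tree from well-formed HTML."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self._stack = [self.root]  # path from the root to the open element

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)   # this element is now the current parent

    def handle_endtag(self, tag):
        if len(self._stack) > 1:
            self._stack.pop()      # close the current element

    def handle_data(self, data):
        if data.strip():
            self._stack[-1]["children"].append({"text": data.strip()})
```

Feeding it a small document yields a #document node whose children mirror the nesting of the tags.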

The only thing you need to know here is that once the page has been generated, the renderer process sends a message to the browser process, and when the browser process receives it, it stops the loading animation on the tab icon. As follows:

At this point, a complete page has been generated. This answers the question from the beginning of the article, “What happens between entering the URL and presenting the page?”, and ties the whole process together.

Conclusion:

Ok, that’s all for today. Let me briefly summarize the main points of this article:

  • The server can control the behavior of the browser through the response header, for example redirects and how the type of the returned data is determined.
  • Chrome defaults to one render process per tab, but if a page is opened from another page of the same site, both tabs use the same render process.
  • The browser navigation process covers all the intermediate stages from the user initiating the request to submitting the document to the rendering process.

The navigation process is important. It is a bridge between the web loading process and the rendering process. If you understand the navigation process, you will be able to string together the entire page display process, which is the key to understanding how the browser works.

This article does not cover the page rendering stage in detail. Please refer to these articles:

1. How do HTML, CSS, and JavaScript become pages? juejin.cn/post/699018…

2. DOM trees: How do JavaScript and CSS affect DOM tree building and rendering? juejin.cn/post/699018…

Supplementary notes:

Browsers have four types of processes: the browser process, the network process, the renderer process, and the GPU process.

The main responsibilities of the browser process, renderer process, and network process are:

  • The browser process is responsible for user interaction, child process management, and file storage.
  • The network process provides network downloads for the renderer process and the browser process.
  • The renderer process is mainly responsible for turning the HTML, JavaScript, CSS, images and other resources downloaded from the network into pages that can be displayed and interacted with. Because everything in the renderer process comes from the network, some of it may be malicious code that exploits browser vulnerabilities to attack the system, so code running in the renderer process is not trusted. This is why Chrome runs the renderer process in a security sandbox: to keep the system safe.

Consider:

1. If the input conforms to URL rules, such as time.geekbang.org, the address bar adds the protocol to it according to the rules to form a complete URL, such as https://time.geekbang.org.

How does the browser know whether to add HTTP or HTTPS?

2. The browser process submits the document to the renderer process, but at that point the data returned by the server is in the network process. What does the data transfer look like? Is it network process to browser process to renderer process, or directly network process to renderer process?

3. A browser can have multiple tabs open at the same time. Do they use the same port? If so, how does the data know which tab to go to?

4. Does the browser render while it is still receiving data over TCP? If the first packet is lost, can the second packet be used first? How does this kind of incremental rendering cope with the ordering of packets?

Reference article:

1. TCP protocol: how to ensure that the page file can be fully delivered to the browser?

Time.geekbang.org/column/arti…

2. HTTP request flow: Why do many sites open quickly the second time?

Time.geekbang.org/column/arti…

3. Navigation flow: What happens between entering the URL and presenting the page?

Time.geekbang.org/column/arti…

4. Front end test – what happens in the process from the input URL to the completion of page loading and display?

zhuanlan.zhihu.com/p/53351608

5. Interviewer: Tell me what happens when you enter the URL in the address bar and press Enter.

Mp.weixin.qq.com/s/Ql1tD-YJS…

6. How do HTML, CSS and JavaScript become pages?

juejin.cn/post/699018…

7. DOM trees: How do JavaScript and CSS affect DOM tree building and rendering?

juejin.cn/post/699018…