1. What happens between entering a URL in the browser and the page being displayed?

  • What does the browser do behind the scenes after you enter a URL and press Enter, and where can the process be optimized?

  • To get a general understanding of this question, I read some blog posts on the Internet and recommend two clear ones:

    1. What really happens when you navigate to a URL: mainly introduces the sequence of events after the browser accepts the request. http://igoro.com/archive/what-really-happens-when-you-navigate-to-a-url/
    2. How browsers work: mainly introduces browser rendering and its optimization points. https://www.html5rocks.com/zh/tutorials/internals/howbrowserswork/

    Note: these are recommended for background understanding only.

2. Main steps

2.1 Enter the URL in the Browser

Take typing facebook.com as an example.

2.2 Looking Up the IP Address for a Domain Name: DNS

DNS resolves domain names in two ways: recursive query and iterative query.

DNS recursive query search process:

  • Browser cache: the browser caches DNS records for a period of time; each record has an expiration time, its time to live (TTL).
  • Operating system cache: if the browser cache does not contain the required record, the browser asks the operating system, which consults its own DNS cache and the hosts file on the local disk.
  • Router cache: if the operating system has no cached record, the request continues to the router, which generally has its own DNS cache.
  • ISP DNS cache (local name server cache): the next place checked is the cache of your ISP's DNS server.
  • Recursive search: the ISP's DNS server starts a recursive search, from the root name server, through the .com top-level name server, to Facebook's name server. Typically the DNS server already has the .com name servers in its cache, so there is no need to hit the root name server.
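The lookup order above can be sketched as a chain of caches that falls back to a recursive search. This is only an illustration: the layer names mirror the list, while `CACHE_LAYERS`, `AUTHORITATIVE`, `recursive_search`, `lookup`, and the IP addresses (taken from a documentation-only range, not Facebook's real addresses) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical cache layers, checked in the order described above.
CACHE_LAYERS: List[Tuple[str, Dict[str, str]]] = [
    ("browser cache", {}),
    ("operating system cache / hosts file", {}),
    ("router cache", {}),
    ("ISP DNS cache", {"facebook.com": "192.0.2.10"}),
]

# Stand-in for the records held by authoritative name servers (made-up data).
AUTHORITATIVE: Dict[str, str] = {
    "facebook.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}


def recursive_search(domain: str) -> str:
    """Stand-in for the ISP server's walk from the root to the authoritative server."""
    return AUTHORITATIVE[domain]


def lookup(domain: str) -> Tuple[str, str]:
    """Return (ip, where_it_was_found) and fill every cache that missed."""
    missed: List[Dict[str, str]] = []
    for name, cache in CACHE_LAYERS:
        if domain in cache:
            ip, source = cache[domain], name
            break
        missed.append(cache)
    else:
        # No cache had the record, so fall back to the recursive search.
        ip, source = recursive_search(domain), "recursive search"
    for cache in missed:
        cache[domain] = ip
    return ip, source
```

Calling `lookup("facebook.com")` twice shows the effect of caching: the first call is answered by the ISP cache, the second by the browser cache that the first call filled.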

Ways to relieve the DNS bottleneck

  • Round-robin DNS: the DNS lookup returns multiple IP addresses. Example: facebook.com actually maps to four IP addresses.
  • Load balancer: hardware that listens on a specific IP address and forwards requests to other servers. Major sites often use expensive, high-performance load balancers.
  • Geographic DNS: improves scalability by mapping a domain name to different IP addresses depending on the geographic location of the client. This is ideal for hosting static content, so that different servers don't have to update shared state.
  • Anycast: a routing technique in which a single IP address is mapped to multiple physical servers. Unfortunately, anycast does not fit well with TCP and is rarely used in that scenario.
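Round-robin DNS from the list above can be sketched in a few lines. This is a simplified model, not how any particular DNS server is implemented: `RoundRobinResolver` and `RECORDS` are hypothetical names, and the four addresses are documentation-range placeholders standing in for the four IP addresses mentioned.

```python
from typing import Dict, List

# Hypothetical record set with four placeholder addresses.
RECORDS: Dict[str, List[str]] = {
    "facebook.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"],
}


class RoundRobinResolver:
    """Rotates the answer list by one position on every query."""

    def __init__(self, records: Dict[str, List[str]]) -> None:
        self._records = records
        self._offsets = {name: 0 for name in records}

    def resolve(self, name: str) -> List[str]:
        ips = self._records[name]
        start = self._offsets[name]
        # Remember where the next answer should start.
        self._offsets[name] = (start + 1) % len(ips)
        return ips[start:] + ips[:start]
```

Clients usually try the first address in the answer, so rotating the list spreads successive clients across the servers.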

Most DNS servers themselves use anycast to achieve high availability and low latency for DNS lookups.

DNS iterative query process: the DNS client itself is at the center and queries each server in turn.

What is the difference between DNS recursive query and iterative query?

  • Recursive query: the query activity between DNS clients and servers is centered on the local name server. The client asks the local name server once, and the local name server then queries other servers on the client's behalf, so the sender of the query changes along the way. The result directly tells the DNS client the IP address of the website being queried.
  • Iterative query: the query activity is centered on the DNS client itself, which asks each server in turn, so the sender of the query does not change. Each intermediate result only indirectly helps the DNS client, by telling it the address of another DNS server to ask next.
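The iterative style can be illustrated with a toy model in which the same querier follows each referral itself. Everything here is an assumption made for the example: the server names, the `SERVERS` table, and the placeholder address are invented, and real DNS messages carry far more detail.

```python
from typing import Dict, List, Tuple

# Hypothetical name servers. Each maps a domain suffix either to a referral
# ("NS", name of the next server) or to a final answer ("A", IP address).
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"com": ("NS", "com-tld")},
    "com-tld": {"facebook.com": ("NS", "facebook-ns")},
    "facebook-ns": {"facebook.com": ("A", "192.0.2.10")},
}


def query(server: str, domain: str) -> Tuple[str, str]:
    """Ask one server one question; it returns a referral or an answer."""
    zone = SERVERS[server]
    labels = domain.split(".")
    # Try the most specific suffix first: "facebook.com", then "com".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone:
            return zone[suffix]
    raise LookupError(f"{server} knows nothing about {domain}")


def resolve_iteratively(domain: str, max_hops: int = 10) -> Tuple[str, List[str]]:
    """The querier stays the same and follows referrals until it gets an address."""
    server = "root"
    asked: List[str] = []
    for _ in range(max_hops):
        asked.append(server)
        kind, value = query(server, domain)
        if kind == "A":
            return value, asked
        server = value  # referral: ask the next server ourselves
    raise LookupError(f"too many referrals for {domain}")
```

In the recursive style, by contrast, the client would make a single call and the server it asked would run a loop like this on its behalf.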

2.3 The Browser Sends an HTTP request to the server
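As a rough illustration of what the browser sends at this step, the sketch below builds a minimal HTTP/1.1 GET request by hand. The function name `build_get_request` and the chosen headers are assumptions for illustration; a real browser sends many more headers.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is mandatory in HTTP/1.1
        "Accept: text/html",
        "Connection: close",
        "",                      # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

For example, `build_get_request("facebook.com")` yields the bytes that would be written to the TCP connection once it is open.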








2.4 Data transmission process

The application layer



The transport layer

  • When the HTTP request at the application layer is ready, the browser opens a TCP connection to the server at the transport layer.
  • TCP provides a reliable byte-stream service for transmitting the data.
  • The transport layer splits large data into segments so that they can be managed.
  • TCP establishes the connection with a three-way handshake and keeps the transfer reliable through sequence numbers, acknowledgements, and retransmission.
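The three-way handshake mentioned above comes down to an exchange of sequence and acknowledgement numbers. The sketch below only models those numbers; `Segment` and `three_way_handshake` are hypothetical names, and real initial sequence numbers are chosen by the operating system rather than passed in.

```python
from typing import List, NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int]


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged when a connection is opened."""
    # 1. The client proposes its initial sequence number.
    syn = Segment("client", "SYN", client_isn, None)
    # 2. The server acknowledges it (ISN + 1) and proposes its own.
    syn_ack = Segment("server", "SYN+ACK", server_isn, (syn.seq + 1) % SEQ_SPACE)
    # 3. The client acknowledges the server's number; both sides are now in sync.
    ack = Segment("client", "ACK", (syn.seq + 1) % SEQ_SPACE, (syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]
```

After the third segment, each side knows the other's starting sequence number, which is what later lets TCP detect loss and put data back in order.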

TCP head-of-line blocking: suppose the client sends three TCP segments numbered 1, 2, and 3. If segment 1 is lost in transit, segments 2 and 3 cannot be handed to the application even if they have already arrived, because TCP must deliver data in order. HTTP pipelining, which allows multiple HTTP requests to be sent over a single TCP connection, makes the problem worse. For example, if you request two images, you may have received all the data for the second image but still have to wait for the data of the first.

To address TCP's performance problems, the Chrome team proposed the QUIC protocol, a reliable transport built on UDP. It reduces round-trip time significantly compared with TCP and has features such as forward error correction. QUIC is now used by Google Plus, Gmail, Google Search, Blogspot, YouTube, and almost all of Google's products; you can check this on the chrome://net-internals/#spdy page.

In addition, browsers limit the number of connections to the same domain, in most cases to 6. However, increasing the number of connections to the same domain does not necessarily improve performance. The Chrome team ran some experiments and found that increasing the number of connections from 6 to 10 actually reduced performance. Several factors contribute to this, such as the overhead of establishing connections and congestion control. Protocols like SPDY and HTTP 2.0 use only one TCP connection to transfer data, yet perform better and can prioritize requests.
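The head-of-line blocking scenario with segments 1, 2, and 3 can be reproduced with a small receiver model. This is a sketch under simplifying assumptions: `InOrderReceiver` is a hypothetical class, and segments are numbered 1, 2, 3 as in the text rather than by byte offset as in real TCP.

```python
from typing import Dict, List


class InOrderReceiver:
    """Buffers out-of-order segments and releases data only in sequence."""

    def __init__(self, first_seq: int = 1) -> None:
        self._next = first_seq
        self._buffer: Dict[int, str] = {}

    def receive(self, seq: int, data: str) -> List[str]:
        """Accept one segment and return the data that can now be delivered."""
        if seq >= self._next:  # ignore duplicates of already delivered data
            self._buffer[seq] = data
        delivered: List[str] = []
        # Release every segment that is now contiguous with what was delivered.
        while self._next in self._buffer:
            delivered.append(self._buffer.pop(self._next))
            self._next += 1
        return delivered
```

If segments 2 and 3 arrive first, `receive` returns an empty list for both; only when the retransmitted segment 1 arrives are all three released at once.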

The network layer

The IP protocol encapsulates the segments produced by TCP into IP packets and transmits them toward the receiver. To actually deliver a packet on a link, the receiver's MAC address, that is, its physical address, is also required. Each network interface has both an IP address and a MAC address; the IP address of a network device can be changed, but the MAC address is usually fixed. ARP resolves an IP address into a MAC address. When the two communicating parties are not on the same LAN, the packet must be forwarded several times to reach its final destination, and at each step the MAC address of the next hop is used to find the next forwarding target. Typical device: router.

The data link layer

After the MAC address of the next hop is found, the IP packet is encapsulated into a data link layer frame and handed to the data link layer, which sends it out as a bit stream over the physical layer. Typical device: switch.

The physical layer

Transmits the bit stream. Typical hardware: optical fiber interfaces and the like.
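The choice of which MAC address to put on a frame can be sketched as follows: a destination on the same LAN is resolved directly, while anything else is sent to the router. The network, gateway, and ARP table entries below are invented for illustration, and `next_hop_mac` is a hypothetical helper; a real host would send an ARP request when the table has no entry.

```python
import ipaddress
from typing import Dict

# Hypothetical local configuration and ARP table (IP address -> MAC address).
LOCAL_NETWORK = ipaddress.ip_network("192.168.1.0/24")
DEFAULT_GATEWAY = ipaddress.ip_address("192.168.1.1")
ARP_TABLE: Dict[str, str] = {
    "192.168.1.1": "aa:aa:aa:aa:aa:01",   # the router
    "192.168.1.20": "aa:aa:aa:aa:aa:20",  # another host on the same LAN
}


def next_hop_mac(destination: str) -> str:
    """Return the MAC address the outgoing frame should be addressed to."""
    dest = ipaddress.ip_address(destination)
    # Same LAN: deliver directly. Otherwise: hand the packet to the router.
    next_hop = dest if dest in LOCAL_NETWORK else DEFAULT_GATEWAY
    return ARP_TABLE[str(next_hop)]
```

The destination IP address in the packet stays the same along the whole path; only the MAC address on the frame changes from hop to hop.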

At this point, the client phase of sending the request ends.

2.5 The Server Receives Data

The server's data link layer receives the frames, and the transport layer uses TCP to reassemble the segmented data into the original HTTP request.
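Reassembly on the server side can be pictured as sorting the received segments by their position in the byte stream and joining them. The helper names `split_into_segments` and `reassemble` are assumptions for this sketch, which ignores loss, duplicates, and headers.

```python
from typing import List, Tuple


def split_into_segments(data: bytes, size: int) -> List[Tuple[int, bytes]]:
    """Sender side: cut the byte stream into (offset, chunk) pairs."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(segments: List[Tuple[int, bytes]]) -> bytes:
    """Receiver side: restore the original byte order, whatever the arrival order."""
    ordered = sorted(segments, key=lambda seg: seg[0])
    return b"".join(chunk for _, chunk in ordered)
```

Splitting a request such as `b"GET / HTTP/1.1\r\nHost: facebook.com\r\n\r\n"` into 8-byte segments, shuffling them, and passing them to `reassemble` gives back the original request bytes.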

What happens next:

  • The server processes the request and sends back a response
  • The browser renders the page
  • The browser sends requests for the objects embedded in the HTML
  • The browser sends further asynchronous (AJAX) requests

3. Optimization points

These are mainly found in the DNS caching part and the rendering part.

The DNS query process goes through many steps. Repeating all of them for every lookup would consume too much time and too many resources, so the real IP address should be returned as early as possible to shorten the query process. This is the job of the DNS cache: after the browser obtains the IP address, it usually adds it to the browser cache, and the local DNS cache server can record the IP address as well.

In addition, how can name servers cope with hundreds of millions of domain name lookups every day, and tens of millions of requests per second? The answer is DNS load balancing. Our websites often use a variety of cloud services, or several providers offering similar services, and they help us solve these problems. The DNS system provides efficient and fast resolution by taking into account the capacity of each machine, geographical constraints (long-distance transmission is less efficient), and so on.
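The caching idea described in this section, keeping a resolved address until its time to live runs out, can be sketched as a small TTL cache. `DnsCache` is a hypothetical class written for illustration; the injectable `clock` parameter is an assumption that makes expiry easy to demonstrate without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DnsCache:
    """Maps a domain to an IP address until the record's TTL expires."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, domain: str, ip: str, ttl_seconds: float) -> None:
        # Store the address together with the moment it stops being valid.
        self._entries[domain] = (ip, self._clock() + ttl_seconds)

    def get(self, domain: str) -> Optional[str]:
        entry = self._entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the record so the caller performs a fresh lookup.
            del self._entries[domain]
            return None
        return ip
```

A `None` result from `get` means the caller has to go through the full query process again and then `put` the fresh answer.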

4. Summary

1. I now understand the browser's internal working mechanism when it sends an HTTP request and receives the response from the server;

2. I have an application-level understanding of the TCP/IP network protocols from the textbook: the point of layering is division of labor and cooperation;

  • The data link layer ensures the transmission of data packets between two neighboring hosts, for example through the CSMA/CD protocol on shared Ethernet.

  • At the network layer, IP packets are routed and forwarded by routers between different subnets, which provides host-to-host communication between two remote hosts on the Internet. This delivery is not reliable, so reliability is guaranteed by the TCP protocol at the transport layer.

  • TCP uses slow start, multiplicative decrease, and other means to control the flow and avoid congestion, and at the same time provides process-to-process communication between the two remote hosts;

  • In this way, the HTTP request can be received by the HTTP server process that is listening on the remote server.

  • Along the way, the data packet is decapsulated and re-encapsulated at each hop and forwarded again between subnets, and finally enters the server operating system's buffer;

  • The server's operating system then lets the blocked accept function return, waking up the server process.

3. The purpose of the three-way handshake is clearer to me;
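The blocked accept call mentioned in the summary can be observed with a small loopback experiment using Python's standard `socket` module. This is a minimal sketch, not a real web server: the function names, the fixed response, and the 5-second timeouts (added only so the sketch cannot hang) are assumptions, and it reads the request with a single `recv`.

```python
import socket
import threading


def serve_one_request(listener: socket.socket) -> None:
    # accept() blocks here until a client has completed the TCP handshake.
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read the request (assumed to fit in one read)
        conn.sendall(
            b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\nConnection: close\r\n\r\nhello"
        )


def fetch_from_local_server() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        listener.settimeout(5)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_one_request, args=(listener,), daemon=True)
        server.start()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
            chunks = []
            while True:
                chunk = client.recv(4096)
                if not chunk:  # the server closed the connection
                    break
                chunks.append(chunk)
        server.join()
        return b"".join(chunks)
```

The server thread sits inside `accept()` until the client connects; the operating system completes the handshake, queues the connection, and only then lets `accept()` return.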

    Recently I reviewed the network model and the three-way handshake for interviews, but I was confused about where they actually apply. Because I did not really understand them, they were easy to forget. One of my senior schoolmates happened to ask me how the browser sends a request from the front end to the back end, so I did some reading. I hope this summary helps others who are also unfamiliar with computer networks.

Reference: www.jianshu.com/p/d616d8879… Github.com/sunyongjian…