What happens between typing a URL, pressing Enter, and seeing the rendered web page? In other words, what process does a web page go through to reach the user? Here is a look at some of the details.

Keyboard and hardware interrupt

Entering a URL starts, of course, with typing it by hand. The two most common kinds of keyboard in everyday use are the membrane keyboard and the mechanical keyboard.

  • Membrane keyboard: consists of a panel, an upper circuit layer, an isolation layer, and a lower circuit layer. It is the most popular type of keyboard, with a clean appearance, long life, and low cost. A two-layer membrane runs under the whole keyboard. The membrane provides the rebound of each key, and a keystroke is registered when pressing the key brings the carbon contact points of the two circuit layers together.
  • Mechanical keyboard: consists of keycaps and mechanical switches. It has a pronounced tactile feel and is common among gaming and typing enthusiasts. Each key has its own mechanical contact switch, which uses a coil spring to provide the rebound and a metal contact to register the keystroke.

Schematic diagram of independent contact of mechanical keyboard

The keyboard sends a signal that triggers a hardware interrupt handler in the operating system. Hardware interrupts are an important signal-handling mechanism that operating systems use to improve efficiency and meet real-time requirements. An interrupt is an asynchronous signal, supported by a table of registered handlers (the interrupt descriptor table, IDT) and by interrupt request lines (IRQ). When a key is pressed, the signal reaches the system through the request line. After finishing the current instruction, the CPU responds by looking up the interrupt in the table and loading the entry address of the interrupt handler (a segment selector and an offset) from the matching entry. See books on operating systems and assembly language for details.
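
The table lookup described above can be pictured with a small simulation. This is only a sketch in Python, not how a kernel is written: the class name InterruptTable, the vector number, and the sample scancode are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical vector number for the keyboard; real values depend on the platform.
KEYBOARD_VECTOR = 0x21


class InterruptTable:
    """Toy stand-in for an interrupt table: vector number -> handler."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[int], None]] = {}

    def register(self, vector: int, handler: Callable[[int], None]) -> None:
        self._handlers[vector] = handler

    def dispatch(self, vector: int, data: int) -> bool:
        """Run the handler for a vector; return False if none is registered."""
        handler = self._handlers.get(vector)
        if handler is None:
            return False
        handler(data)
        return True


# The "handler" here just stores the scancode for later processing.
key_buffer: List[int] = []
table = InterruptTable()
table.register(KEYBOARD_VECTOR, key_buffer.append)

# Simulate the keyboard raising its interrupt with a sample scancode.
table.dispatch(KEYBOARD_VECTOR, 0x1C)
```

The point of the sketch is the indirection: the device only names a vector, and the table decides which code runs.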

Of course, this article is not mainly about the details of hardware and operating systems. The introduction above is only meant to show how much low-level machinery lies between entering a URL in the browser and seeing the result page. That machinery deserves respect, and it cannot be covered in detail in limited space. The rest of this article therefore takes a slightly higher-level view: what happens between the moment the browser sends the request for us and the moment the page is fully displayed, including DNS resolution and browser rendering.

Browser parsing URL

Before you hit the Enter key

For example, when I press the ‘B’ key, many URLs are offered for me to choose from, and the first one is Baidu. When the browser receives the keystroke, it triggers its auto-complete mechanism, which uses a specific algorithm to show the previously visited URLs that best match what has been typed so far, for the user to select.

After you hit the Enter key

Following the keyboard trigger principle described earlier, a circuit dedicated to the Enter key is closed (how exactly depends on the keyboard type). A hardware interrupt is then triggered and handled by the operating system kernel. Skipping the details, the browser is eventually handed an “Enter” event. The browser (Chrome 61 in this article) then performs cool (messy) steps that include, but are not limited to, the following:

  1. Parse the URL: did you enter a network resource starting with http or https, a file resource starting with file, or a keyword to search for? The browser then runs the corresponding resource loading process.
  2. URL encoding: according to the RFC standard, some characters can be used directly in a URL without encoding, but Chinese characters are not among them. Chinese characters in the URL path are therefore percent-encoded, for example zh.wikipedia.org/wiki/HTTP%E… is converted to zh.wikipedia.org/wiki/HTTP%E…
  3. HSTS: to close the security gaps that remain around HTTPS, most modern browsers support HSTS. The browser checks whether the site is in its preconfigured list of HTTPS-only sites, or whether it has a saved record from a previous visit marking the site as HTTPS-only. If so, the browser forces HTTPS when accessing the site.
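
The three steps above can be sketched with Python's standard urllib.parse module. This is a simplified illustration, not Chrome's actual logic: the function names, the three-way classification, and the hsts_hosts set are assumptions made for the example.

```python
from urllib.parse import quote, urlsplit


def classify_input(text: str) -> str:
    """Roughly classify address-bar input as 'network', 'file', or 'search'."""
    scheme = urlsplit(text).scheme.lower()
    if scheme in ("http", "https"):
        return "network"
    if scheme == "file":
        return "file"
    return "search"


def encode_path(path: str) -> str:
    """Percent-encode non-ASCII characters in a URL path as UTF-8 bytes."""
    return quote(path, safe="/")


def upgrade_if_hsts(url: str, hsts_hosts: set) -> str:
    """Rewrite http:// to https:// when the host is known to be HTTPS-only.

    hsts_hosts is a hypothetical stand-in for the browser's preloaded list
    plus the hosts it has recorded from earlier visits.
    """
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in hsts_hosts:
        return "https://" + url[len("http://"):]
    return url
```

For example, encode_path("/wiki/中文") gives "/wiki/%E4%B8%AD%E6%96%87", because each Chinese character becomes three UTF-8 bytes and each byte is written as %XX.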

DNS resolution

Check the caches before querying DNS

  • Browser cache: in Chrome, the DNS cache can be viewed at chrome://net-internals/#dns
  • Local hosts file: on macOS and Linux, for example, the hosts file is stored at /etc/hosts. One way to get around the firewall is therefore to edit the hosts file, which avoids the GFW’s interference with DNS resolution and reaches the real IP address directly. It is not fully effective, however, because the GFW also filters by IP address.
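
As a rough sketch of this lookup order, the code below parses hosts-file text and checks the browser cache before the hosts entries. The function names and the dictionary used as a "browser cache" are assumptions for illustration; real resolvers handle many more cases.

```python
from typing import Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each host name in hosts-file text to its IP address."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), ip)  # the first entry wins
    return mapping


def lookup(name: str, browser_cache: Dict[str, str],
           hosts: Dict[str, str]) -> Optional[str]:
    """Check the browser cache, then the hosts entries.

    Returning None means neither had an answer and a DNS query is needed.
    """
    return browser_cache.get(name) or hosts.get(name)


SAMPLE_HOSTS = """
127.0.0.1   localhost
192.0.2.7   intranet.example intranet  # documentation-range address
"""
hosts_entries = parse_hosts(SAMPLE_HOSTS)
```

Only when lookup returns None does the resolver go on to send a DNS request.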

Send a DNS lookup request

DNS lookup proceeds in this order: root domain -> top-level domain -> second-level domain -> host name. The parts of a domain name map onto that hierarchy as follows:

host name . second-level domain . top-level domain . root domain
host      . SLD                 . TLD              . root

The query procedure is as follows:

  1. Query the local DNS server. Its address is the DNS address handed out when connecting to the network. It is usually assigned automatically by the router’s DHCP server, is typically the router’s own address, and is stored in /etc/resolv.conf. The router’s DNS forwarder passes requests on to the upstream ISP’s DNS.
  2. Ask a root DNS server for the NS records and A records (IP addresses) of the top-level domain’s DNS servers. There are 13 groups of root DNS servers in the world, named A.ROOT-SERVERS.NET through M.ROOT-SERVERS.NET. Their IP addresses are already stored in the local DNS server.
  3. Ask a top-level domain DNS server for the NS records and A records (IP addresses) of the second-level domain’s DNS servers.
  4. Ask the second-level domain’s DNS server for the IP address of the host name.

Taking www.google.com as an example, the entire DNS query process is as follows:

  1. Because this test was run on an Alibaba Cloud instance, the list of root DNS servers is first fetched from Alibaba’s internal DNS server at 100.100.2.138.
  2. Query the root name server (f.root-servers.net) and get the NS records and IP addresses of the com top-level domain name servers.
  3. Query the top-level domain name server (e.gtld-servers.net) and get the NS records and IP addresses of the name servers for the second-level domain google.com.
  4. Query the second-level domain’s name server (ns2.google.com) and get the IP address of www.google.com.

So in general, DNS resolution is a process of narrowing down the search step by step.
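
The narrowing-down can be simulated without touching the network. The sketch below walks a small in-memory table from the root zone to the host; the ZONES data, the example.com names, and the 192.0.2.10 address are all made up for illustration and do not reflect real DNS records.

```python
from typing import List, Tuple

# Hypothetical delegation data: zone -> {name: ("NS", next zone) or ("A", ip)}.
ZONES = {
    ".": {"com.": ("NS", "com.")},
    "com.": {"example.com.": ("NS", "example.com.")},
    "example.com.": {"www.example.com.": ("A", "192.0.2.10")},
}


def suffixes(name: str) -> List[str]:
    """'www.example.com' -> ['com.', 'example.com.', 'www.example.com.']"""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


def resolve(name: str) -> Tuple[str, List[str]]:
    """Follow NS delegations from the root until an A record is found.

    Returns the IP address and the list of zones that were asked, in order.
    """
    zone = "."
    asked = [zone]
    for suffix in suffixes(name):
        record = ZONES[zone].get(suffix)
        if record is None:
            continue
        kind, value = record
        if kind == "A":
            return value, asked
        zone = value  # delegated: ask the more specific zone next
        asked.append(zone)
    raise LookupError(name)
```

Each step asks a zone that is responsible for a longer suffix of the name, which is the same root, top-level, second-level order as in the steps above.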

Establish HTTPS and TCP connections

Determine the sending target

After obtaining the IP address, the sender also needs a MAC address to deliver the frame to. Under the Ethernet protocol, communicating directly with another host on the same LAN requires the MAC address of that host; if the server is on a different network, the frame is addressed to the MAC address of the gateway instead. The MAC address is obtained through ARP (a protocol in the TCP/IP suite that resolves an IP address to a physical address) and saved in the local ARP cache before communication with the target begins. See DHCP/ARP for details.
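
ARP only resolves addresses on the local network, so the sender first has to decide whose MAC address to ask for. A minimal sketch of that decision with Python's ipaddress module follows; the function name and the sample addresses are assumptions, and the gateway rule for off-LAN destinations is standard IP behavior rather than something taken from this article.

```python
import ipaddress


def arp_target(local_ip: str, prefix_len: int, dest_ip: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC address should be resolved with ARP.

    If the destination is on the same subnet, resolve it directly;
    otherwise resolve the default gateway, which forwards the packet.
    """
    network = ipaddress.ip_network(f"{local_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip
    return gateway_ip
```

For a machine at 192.168.1.20/24 with gateway 192.168.1.1, a destination of 192.168.1.50 is resolved directly, while a destination such as 203.0.113.5 leads to an ARP query for the gateway.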

Establishing a TCP Connection

Why is a three-way handshake needed?

  • Completing the first and second handshakes shows that A can send a request to B and that B can parse A’s request.
  • Completing the second and third handshakes shows that B can send a request to A and that A can parse B’s request.

This ensures that A and B can each send requests to, and parse requests from, the other. It also avoids spurious connections caused by network delay. For example, suppose A sends a connection request that is delayed in the network, so A resends the request, completes its communication with B, and closes the connection; only then does the delayed request reach B. With a three-way handshake, A does not confirm the acknowledgment that B returns for the stale request, so B does not end up holding open a connection that A never wanted.
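
The delayed-request case can be made concrete with a toy state machine. This sketch models only the flags and sequence numbers needed for the argument; the class names and the reset-on-mismatch behavior are simplifications for illustration and not a full TCP implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: str  # "SYN", "SYN-ACK", "ACK", or "RST"
    seq: int = 0
    ack: int = 0


class Client:
    def __init__(self, isn: int) -> None:
        self.isn = isn  # initial sequence number for the current attempt
        self.state = "CLOSED"

    def connect(self) -> Segment:
        self.state = "SYN_SENT"
        return Segment("SYN", seq=self.isn)  # first handshake

    def on_syn_ack(self, seg: Segment) -> Segment:
        # Accept only an acknowledgment of the SYN we are actually waiting on.
        if self.state == "SYN_SENT" and seg.ack == self.isn + 1:
            self.state = "ESTABLISHED"
            return Segment("ACK", seq=self.isn + 1, ack=seg.seq + 1)  # third handshake
        return Segment("RST", seq=seg.ack)  # stale or unexpected: refuse


class Server:
    def __init__(self, isn: int) -> None:
        self.isn = isn
        self.state = "LISTEN"

    def on_syn(self, seg: Segment) -> Segment:
        self.state = "SYN_RCVD"
        return Segment("SYN-ACK", seq=self.isn, ack=seg.seq + 1)  # second handshake

    def on_reply(self, seg: Segment) -> None:
        if seg.flags == "ACK" and seg.ack == self.isn + 1:
            self.state = "ESTABLISHED"
        else:
            self.state = "LISTEN"  # no confirmation: drop the half-open attempt
```

In the normal case both sides end in ESTABLISHED. If a stale SYN reaches the server after the client has moved on, the client answers the server's SYN-ACK with RST, and the server returns to LISTEN instead of waiting on a connection nobody will use.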

Short connections and long connections

The figure above shows the process of a short connection. With a long connection, the connection between A and B is not actively closed after a read/write operation, and later read/write operations reuse it. Long connections are hard to implement well: when no data is flowing, packets (heartbeats) must be sent periodically to keep the connection alive, and holding many long connections puts heavy pressure on the server. Push services are therefore difficult for ordinary developers to build, which is why so many large vendors offer their own push services.
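
The heartbeat bookkeeping for a long connection can be sketched as follows. The class name KeepAlive and the interval and timeout parameters are hypothetical; times are passed in explicitly so the logic can be followed without real sockets or clocks.

```python
class KeepAlive:
    """Track when to send a heartbeat and when to give up on a connection."""

    def __init__(self, interval: float, timeout: float, now: float = 0.0) -> None:
        self.interval = interval  # seconds between heartbeats on an idle link
        self.timeout = timeout    # seconds of silence before the peer is considered gone
        self.last_sent = now
        self.last_received = now

    def on_data(self, now: float) -> None:
        """Call whenever anything (data or heartbeat reply) arrives."""
        self.last_received = now

    def mark_sent(self, now: float) -> None:
        """Call whenever anything (data or heartbeat) is sent."""
        self.last_sent = now

    def should_send_heartbeat(self, now: float) -> bool:
        return now - self.last_sent >= self.interval

    def is_dead(self, now: float) -> bool:
        return now - self.last_received > self.timeout
```

A server holding many long connections has to keep this state, and the timers behind it, for every client, which is part of the server pressure mentioned above.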

Perform TLS encryption

  • Hello: the handshake starts when the client sends a Hello message. It contains the information the server needs to connect to the client over SSL/TLS, including the cipher suites the client supports and the highest protocol version it supports. The server replies with its own Hello message containing the corresponding information the client needs, including which cipher suite and protocol version will be used.
  • Certificate exchange: now that the connection is set up, the server must prove its identity. It does so with an SSL certificate, which works like a passport. The certificate contains several pieces of data, including the owner’s name, associated attributes (the domain name), the public key, a digital signature, and the certificate’s validity period. The client checks that the certificate was issued by a CA it trusts. Note that the server may also require a certificate to prove the client’s identity, but this only happens in sensitive applications.
  • Key exchange: with RSA key exchange, the client generates a symmetric key, encrypts it with the server’s public key from the SSL certificate, and sends it to the server, which decrypts it with its private key; at that point the handshake phase is complete. Alternatively, the client and server agree on the key with a Diffie-Hellman (DH) exchange. Either way, the result is a single symmetric key shared by both sides, and the process relies on asymmetric cryptography and the server’s public/private key pair.
  • Encrypted communication: the actual data between server and client is encrypted with a symmetric algorithm chosen during the Hello phase. Unlike asymmetric algorithms, which need a public/private key pair, a symmetric algorithm uses one key for both encryption and decryption, namely the key negotiated in the key exchange step.
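
To illustrate the DH variant of the key exchange and the symmetric step that follows, here is a toy example with deliberately tiny numbers. It is not secure and is not how TLS libraries are implemented: the prime 23, the generator 5, the private values, and the XOR "cipher" are all chosen only to keep the arithmetic visible.

```python
import hashlib

P = 23  # toy prime modulus; real exchanges use far larger parameters
G = 5   # toy generator


def public_value(private: int) -> int:
    """Value a party sends in the clear: G^private mod P."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Both parties compute the same number without ever sending it."""
    return pow(their_public, my_private, P)


def derive_key(secret: int) -> bytes:
    """Stretch the shared number into a fixed-length symmetric key."""
    return hashlib.sha256(str(secret).encode("ascii")).digest()


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: the same call with the same key encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


client_private, server_private = 6, 15  # never leave their owners
client_public = public_value(client_private)
server_public = public_value(server_private)

client_key = derive_key(shared_secret(server_public, client_private))
server_key = derive_key(shared_secret(client_public, server_private))
```

Both sides arrive at the same key while only the public values cross the network, and from then on one key is used in both directions, which is the property the encrypted communication step depends on.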

Server-side processing

Static cache, CDN

To speed up website access and reduce server load, static files such as HTML, JS, and CSS are usually placed on a separate cache server or deployed to a CDN cloud service such as Amazon CloudFront. The cache expiration configuration then determines whether a given access goes back to the origin server to refresh the cache.
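
One common form of cache expiration configuration is the max-age directive of the HTTP Cache-Control header. The sketch below shows how a cache might use it to decide whether to go back to the origin; the function names are hypothetical, and treating a missing max-age as "always refetch" is a simplifying assumption.

```python
import re
from typing import Optional


def max_age(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control header value."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def must_refetch(cache_control: str, stored_at: float, now: float) -> bool:
    """Return True when the cached copy should be refreshed from the origin."""
    age_limit = max_age(cache_control)
    if age_limit is None:
        return True  # assumption: no explicit lifetime means do not reuse
    return now - stored_at >= age_limit
```

With "public, max-age=3600", a copy stored at time 0 is served from the cache at time 100 and refreshed once its age reaches 3600 seconds.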

Load balancing

Load balancing can be implemented in several ways, including hardware-based appliances such as F5, LVS at the operating system’s transport layer (TCP), and reverse proxies (also called layer-7 proxies) at the application layer (HTTP). The following briefly introduces the last of these.

Before a request reaches the server that actually handles it, it has to be routed to a suitable server. Once the load balancer receives a request, it does some processing first. This includes compression (in nginx, gzip compression is configured in nginx.conf; if you have no special requirements, the default configuration covers basic needs) and buffering (the load balancer receives the complete request before passing it to the server, which improves the server’s processing efficiency). It then forwards the request to a back-end server chosen by the configured routing algorithm.

The reverse proxy deserves a closer look. First, recall how a forward proxy works: the client tells the proxy which resource it wants, and the proxy fetches the data on the client’s behalf. With a reverse proxy, the client likewise tells the proxy which resource it wants, and the proxy fetches the data and returns it. The difference is that a reverse proxy works on behalf of the servers: after receiving a request, it forwards it to a suitable server, and the client knows neither the routing rules nor which server handled the request. This is why reverse proxies are used for load balancing and security control.
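
The routing algorithm is left unspecified here; round robin is one of the simplest choices and is used below purely as an example. The class name and the backend addresses are made up for illustration.

```python
import itertools
from typing import Iterable


class RoundRobinBalancer:
    """Hand out back-end servers in a fixed rotating order."""

    def __init__(self, backends: Iterable[str]) -> None:
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self) -> str:
        """Return the backend that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
first_four = [balancer.pick() for _ in range(4)]
```

The client only ever talks to the proxy's address, so which of the three back ends served a given request stays invisible to it.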

Server processing

An HTTPD (HTTP daemon) is deployed on the server; the most common ones on Linux are Apache and Nginx. On the Java platform, Tomcat is a Servlet container implementation and the one Spring Boot chooses by default. Tomcat handles a request as follows:

  1. The request arrives at the TCP port that Tomcat listens on when it starts.
  2. After parsing the information in the request, Tomcat creates a Request object and populates it with information the target Servlet may use, such as parameters, headers, cookies, and the query string.
  3. Tomcat creates a Response object that the target Servlet uses to send a response to the client.
  4. Tomcat calls the Servlet’s service method, passing in the Request and Response objects. The Servlet reads values from the Request object and writes values to the Response object.
  5. Further processing (business logic, further handling of the request) is carried out by our own Servlet code or by Servlet classes provided by the framework.
  6. Finally, the HTTP response message is generated from the Response returned by the Servlet.
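
The same flow can be mirrored in a few lines of Python to show how the pieces fit together. This is an analogy only, not Tomcat's API: the Request and Response classes, HelloServlet, and the helper functions are hypothetical names written for this sketch.

```python
from dataclasses import dataclass, field
from http import HTTPStatus
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


@dataclass
class Request:
    method: str
    path: str
    params: Dict[str, List[str]]
    headers: Dict[str, str]


@dataclass
class Response:
    status: int = 200
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


def parse_request(raw: str) -> Request:
    """Step 2: turn the raw request text into a Request object."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, target, _version = request_line.split(" ")
    url = urlsplit(target)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return Request(method, url.path, parse_qs(url.query), headers)


class HelloServlet:
    def service(self, request: Request, response: Response) -> None:
        """Steps 4 and 5: read from the request, write to the response."""
        name = request.params.get("name", ["world"])[0]
        response.headers["Content-Type"] = "text/plain; charset=utf-8"
        response.body = f"Hello, {name}!"


def to_http(response: Response) -> bytes:
    """Step 6: serialize the Response object into an HTTP response message."""
    body = response.body.encode("utf-8")
    lines = [f"HTTP/1.1 {response.status} {HTTPStatus(response.status).phrase}"]
    lines += [f"{name}: {value}" for name, value in response.headers.items()]
    lines.append(f"Content-Length: {len(body)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


def handle(raw: str) -> bytes:
    request = parse_request(raw)
    response = Response()  # step 3
    HelloServlet().service(request, response)
    return to_http(response)
```

Calling handle("GET /hello?name=Ada HTTP/1.1\r\nHost: example.com\r\n\r\n") produces a complete response whose body is "Hello, Ada!".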

Browser rendering

The browser’s job is to fetch the resources you request from the server and display them in the browser window. Resources are usually HTML files, but they can also be PDFs, images, or other types of content. Other types can be displayed with plug-ins (browser extensions), for example a PDF plug-in for displaying PDFs. The location of a resource is given by the Uniform Resource Identifier (URI) the user provides.

The way browsers interpret and display HTML files is detailed in the HTML and CSS standards. These standards are maintained by the World Wide Web Consortium (W3C), a Web standards organization.

Taking WebKit, the engine that Chrome’s rendering engine derives from, as an example, here is a brief introduction to browser rendering. Parsing and rendering involve a great many details; for those, refer to the HTML5 rendering specification and the corresponding GPU rendering implementation.

HTML parsing

After the browser gets the HTML document, it calls the HTML parser in the browser engine to parse the document into a DOM tree that external interfaces (JS) can call.

  • Document content parsing: before one large string can be turned into the DOM, its structure has to be extracted so that the HTML parser can easily pull out the data for later operations, so parsing the document content is the first step. The parser works in two phases: lexical analysis (splitting the string into tokens that conform to a particular syntax) and syntax analysis (building a syntax tree of the document from those tokens).
  • HTML parsing: following the HTML syntax, the HTML markup in the syntax tree is used to build the DOM (Document Object Model).
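
The two phases can be seen in miniature with Python's standard html.parser module, which does the tokenizing and reports start tags, end tags, and text; the code below then assembles those events into a tree. This is a toy sketch, far simpler than a browser's parser: the Node class, the small VOID_TAGS set, and the whitespace handling are assumptions made for the example.

```python
from html.parser import HTMLParser

# A few elements that never have a closing tag (not a complete list).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, attrs=None, text=""):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Build a simple DOM-like tree from the parser's token events."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag name, if there is one.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].tag == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self._stack[-1].children.append(Node("#text", text=data.strip()))


def parse_html(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

The stack of open elements is what turns a flat stream of tokens into parent and child relationships.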

CSS parsing

  • The content of CSS files and <style> tags, and the values of style attributes, are parsed according to the CSS lexical and syntax rules.
  • Each CSS file is parsed into a StyleSheet object, which contains CSS rules; each rule holds selectors and declaration objects that correspond to the CSS syntax.
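
As a rough picture of what such a rule list looks like, the sketch below splits a stylesheet into (selectors, declarations) pairs. It is a toy that assumes flat rules without nesting, at-rules, or braces inside strings; real CSS parsers follow the tokenization and grammar defined in the CSS specifications.

```python
import re
from typing import Dict, List, Tuple

Rule = Tuple[List[str], Dict[str, str]]


def parse_css(css: str) -> List[Rule]:
    """Parse flat CSS rules into (selector list, declaration dict) pairs."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # strip comments
    rules: List[Rule] = []
    for selector_text, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations: Dict[str, str] = {}
        for declaration in body.split(";"):
            prop, sep, value = declaration.partition(":")
            if sep and prop.strip():
                declarations[prop.strip().lower()] = value.strip()
        selectors = [s.strip() for s in selector_text.split(",") if s.strip()]
        rules.append((selectors, declarations))
    return rules
```

For the input "h1, .title { color: red; margin: 0 } p { display: none }" the result has two rules: one with the selectors h1 and .title, and one with the selector p.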

Page rendering

After parsing, the browser engine builds the render tree from the DOM tree and the CSS rule tree. Layout then computes the position and style of each element, taking into account properties such as position, overflow, and z-index.

Next, the page is rendered (which can be understood as “drawing” the elements) according to the render tree.

Of course, getting the finished rendering onto the screen also involves the graphics card drawing and video memory being modified; interested readers can explore this further.
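
A very reduced sketch of these two steps follows: combining a DOM-like tree with style rules into a render tree, then walking it in drawing order. The dictionary-based node format, matching styles by tag name only, and dropping display: none elements are assumptions for illustration; real engines match full selectors and compute far more.

```python
from typing import Dict, List, Optional


def build_render_tree(node: dict, styles: Dict[str, Dict[str, str]]) -> Optional[dict]:
    """Attach styles to a DOM-like node and drop elements that are not displayed.

    node has the form {"tag": str, "children": [...]}; styles maps a tag name
    to its declarations (tag-only matching is a simplification).
    """
    style = styles.get(node["tag"], {})
    if style.get("display") == "none":
        return None  # the element and its subtree are not part of the render tree
    children = [build_render_tree(child, styles) for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        "style": style,
        "children": [child for child in children if child is not None],
    }


def paint_order(render_node: dict) -> List[str]:
    """List tags in depth-first order, parents before their children."""
    order = [render_node["tag"]]
    for child in render_node["children"]:
        order.extend(paint_order(child))
    return order


dom = {"tag": "body", "children": [
    {"tag": "h1", "children": []},
    {"tag": "script", "children": []},
    {"tag": "p", "children": [{"tag": "span", "children": []}]},
]}
styles = {"h1": {"color": "red"}, "script": {"display": "none"}}
render_tree = build_render_tree(dom, styles)
```

The script element is in the DOM but not in the render tree, which is why the two trees are kept separate.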
