What happens from URL input to page rendering?

Preface

This question comes up often in interviews.

Think about it from the interviewer’s point of view:

It is asked so frequently because interviewers like questions that test both breadth and depth of knowledge
Many interviewers also like to follow up on the candidate’s answer, including on points the candidate mentioned only in passing

Basic answer:

The browser parses the URL to get the protocol, host, port, and path
The browser obtains the IP address of the host
The browser establishes a TCP connection and sends the HTTP request
The server sends the response back over the TCP connection; the browser receives the HTTP response and handles it according to the resource type (suppose the resource is an HTML document)
The browser parses the HTML document, builds the DOM tree, downloads resources, builds the CSSOM tree, executes the JS scripts, and finally presents the page to the user
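
The first step of the basic answer, splitting the URL into its parts, can be sketched in a few lines. This is a minimal illustration using Python's standard library, not how a browser implements it; the `parse_url` helper and the default-port table are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Default ports per scheme, used when the URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_url(url: str) -> dict:
    """Split a URL into the parts needed before a connection can be opened."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",  # an empty path means the root path
        "query": parts.query,
    }


print(parse_url("https://www.baidu.com/s?wd=dns"))
```

The host is what DNS resolution needs next, and the port and protocol decide how the TCP connection is opened.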

A candidate who lists only these steps and leaves out the key details is unlikely to give the answer the interviewer is looking for.

The rest of this article goes through the key steps in detail, so that this question becomes a plus in the interview.

Network request

Build request

The browser builds the request line:

// The request method is GET, the path is the root path, and the HTTP protocol version is 1.1

GET / HTTP/1.1


The browser then checks for a strong cache hit based on the Cache-Control and Expires fields. If the cache is hit, the cached copy is used directly; otherwise, the browser moves on to the next step. If you are not familiar with strong caching, refer to the following figure:

DNS resolution

We enter a domain name, but packets are delivered by IP address, so the browser needs the IP address that corresponds to the domain name. This relies on a service that maps domain names to IP addresses, called DNS (Domain Name System).

The DNS protocol provides lookup of an IP address from a domain name, as well as reverse lookup of a domain name from an IP address. The process of obtaining the specific IP address is called DNS resolution.

A DNS server is a server on the network. Setting up resolution for a domain name simply means adding a record on a DNS server.

For example: baidu.com 220.114.23.56 (external IP address of the server) 80 (server port number)


The browser looks up the IP address for the domain name in the URL in the following order:

Browser cache: the browser caches DNS records for a certain period of time
Operating system cache: if the record is not in the browser cache, the browser looks in the operating system cache
Router cache: routers also keep a DNS cache
ISP DNS server: an ISP (Internet Service Provider) runs dedicated DNS servers that answer DNS queries
Root server: if the ISP’s DNS server cannot find the record, it sends a request to the root server and resolves the name recursively (the DNS server first asks the root name server for the IP address of the .com name server, then asks the .com name server about baidu, and so on)
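
The lookup order above amounts to trying each cache level in turn and stopping at the first hit. The sketch below is a simplified model of that idea only: the `resolve` helper, the cache contents, and the documentation-range IP addresses are all hypothetical, and a real resolver would go on to query the DNS hierarchy on a miss.

```python
def resolve(domain, caches):
    """Return (ip, source) from the first cache level that knows the domain."""
    for name, records in caches:
        if domain in records:
            return records[domain], name
    # Every level missed; a real resolver would now ask the root servers.
    return None, None


# Hypothetical cache contents, ordered from nearest to farthest.
lookup_chain = [
    ("browser cache", {}),
    ("operating system cache", {}),
    ("router cache", {"baidu.com": "192.0.2.10"}),
    ("ISP DNS server", {"baidu.com": "192.0.2.10", "example.com": "192.0.2.20"}),
]

print(resolve("baidu.com", lookup_chain))
print(resolve("example.com", lookup_chain))
```

Because the nearest level that has the record answers first, a record cached in the router is never looked up at the ISP.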

Establishing a TCP Connection

The TCP three-way handshake process is as follows:

The client sends a segment with SYN=1, Seq=X to the server port (the first handshake, initiated by the browser, tells the server that it wants to send a request).
The server replies with an acknowledgement segment with SYN=1, ACK=X+1, Seq=Y (the second handshake, initiated by the server, tells the browser that it is ready to receive, so go ahead and send).
The client replies with a segment with ACK=Y+1, Seq=X+1, which completes the handshake (the third handshake, sent by the browser, tells the server that it is about to send, so get ready to receive).

In his book “Computer Network”, Xie Xiren explains that the purpose of the three-way handshake is “to prevent an expired connection request segment from suddenly reaching the server and causing errors”.
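
To make the numbering concrete, the three segments can be modelled as plain dictionaries. This is only a sketch of the bookkeeping, not a network implementation; the `three_way_handshake` helper and the initial sequence numbers are made up for illustration. It follows standard TCP, where the third segment carries sequence number X+1 because the SYN consumed one sequence number.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments with their flags and numbers."""
    # 1. The client opens with SYN and its initial sequence number X.
    syn = {"from": "client", "SYN": 1, "seq": client_isn}
    # 2. The server answers with SYN+ACK: its own number Y, acknowledging X+1.
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1,
               "seq": server_isn, "ack": syn["seq"] + 1}
    # 3. The client acknowledges Y+1 and continues from X+1.
    ack = {"from": "client", "ACK": 1,
           "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```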

Sending an HTTP request

Once the TCP connection is established, the browser can start communicating with the server, that is, sending HTTP requests. An HTTP request carries three things: a request line, request headers, and a request body.

1. The request line contains the request method, URL, and protocol version

Common request methods include these 8: GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS, and TRACE
The URL is the request address, in the form <protocol>://<host>:<port>/<path>?<parameters>
The protocol version is the HTTP version number

POST /user.html HTTP/1.1


2. The request headers contain additional information about the request, as key/value pairs like the following:

// The content types that the browser can accept

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng;q=0.8,application/signed-exchange;v=b3

// Specify the type of content compression encoding returned by the Web server that the browser can support

Accept-Encoding: gzip, deflate, br

// Languages supported by the browser

Accept-Language: zh-CN,zh;q=0.9

// Cache mechanism

Cache-Control: no-cache

// Whether a persistent connection is required

Connection: keep-alive

// Send all Cookie values under the requested domain name to the server

Cookie: /* Omit cookie information */

// Specify the domain name and port number of the requested server

Host: www.baidu.com

Pragma: no-cache

Upgrade-Insecure-Requests: 1

// User agent (UA): identifies the client software making the request, such as the browser and operating system

User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1

3. The request body carries the request data, such as multiple request parameters. It is separated from the headers by a blank line (carriage return and line feed) and is generally present with the POST method.
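
Putting the three parts together, an HTTP/1.1 request is plain text with CRLF line endings and a blank line before the body. The sketch below serializes one by hand to show the layout; `build_request` is a hypothetical helper written for this example, and real clients add many more headers.

```python
def build_request(method: str, path: str, headers: dict, body: str = "") -> bytes:
    """Serialize a request line, headers, and optional body for HTTP/1.1."""
    payload = body.encode("utf-8")
    lines = [f"{method} {path} HTTP/1.1"]  # the request line
    all_headers = dict(headers)
    if payload:
        # The server needs the body length to know where the request ends.
        all_headers["Content-Length"] = str(len(payload))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line (CRLF CRLF) separates the headers from the body.
    return "\r\n".join(lines).encode("utf-8") + b"\r\n\r\n" + payload


raw = build_request("POST", "/user.html", {"Host": "www.baidu.com"}, "name=demo")
print(raw.decode("utf-8"))
```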

Network response

Like the request part, the network response has three parts: the response line, the response header, and the response body.

1. The response line contains protocol version, status code, and status code description

HTTP/1.1 200 OK


The status code rules are as follows:

1xx: Informational: the request has been received and processing continues
2xx: Success: the request was successfully received, understood, and accepted
3xx: Redirection: further action must be taken to complete the request
4xx: Client error: the request has a syntax error or cannot be fulfilled
5xx: Server error: the server failed to fulfill a valid request
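
Since the class of a status code is determined by its first digit, the rules above reduce to an integer division. A minimal sketch, with `classify_status` as a hypothetical helper name:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def classify_status(code: int) -> str:
    """Map a status code to its class using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 404, 500):
    print(code, classify_status(code))
```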

2. The response header contains the additional information of the response packet, which consists of name and value pairs, as follows:

// Cache mechanism

Cache-Control: no-cache

Connection: keep-alive

Content-Encoding: gzip

// The media type of the response body, which determines in what form and with what encoding the browser reads the content

Content-Type: text/html; charset=utf-8

// The time at which the origin server sent the message

Date: Wed, 04 Dec 2019 12:29:13 GMT

// Name of the Web server software

Server: apache

// The server sends cookies to the client

Set-Cookie: rsv_i=f9a0SIItKqzv7kqgAAgphbGyRts3RwTg%2FLyU3Y5Eh5LwyfOOrAsvdezbay0QqkDqFZ0DfQXby4wXKT8Au8O7ZT9UuMsBq2k; path=/; domain=.baidu.com


Note the two security-related attributes of Set-Cookie: HttpOnly and SameSite.

Cookies with the HttpOnly attribute cannot be read by JavaScript through the document.cookie property, XMLHttpRequest, or the Request APIs, which helps protect against cross-site scripting (XSS) attacks.

The SameSite attribute (for example, SameSite=Lax) lets the server restrict a cookie from being sent with cross-site requests, which provides some protection against cross-site request forgery (CSRF) attacks.
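
To see where these two attributes sit in a Set-Cookie value, the sketch below splits a header value into the cookie pair and its attributes. It is a deliberately small parser written for illustration (`parse_set_cookie` and the sample cookie are assumptions), not a replacement for a standards-compliant cookie library.

```python
def parse_set_cookie(header_value: str) -> dict:
    """Split a Set-Cookie value into the cookie pair and its attributes."""
    first, *attributes = [part.strip() for part in header_value.split(";")]
    name, _, value = first.partition("=")
    cookie = {"name": name, "value": value, "httponly": False, "samesite": None}
    for attribute in attributes:
        key, _, attr_value = attribute.partition("=")
        key = key.lower()  # attribute names are case-insensitive
        if key == "httponly":
            cookie["httponly"] = True  # a flag attribute, it has no value
        elif key == "samesite":
            cookie["samesite"] = attr_value
        elif key:
            cookie[key] = attr_value
    return cookie


cookie = parse_set_cookie("sid=abc123; Path=/; Domain=.baidu.com; HttpOnly; SameSite=Lax")
print(cookie)
```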

3. The response body carries the data returned by the server, separated from the headers by a blank line (carriage return and line feed). Not every response has a body.

If the request or response headers contain Connection: keep-alive, a persistent connection is in use: the TCP connection stays open and is reused for later requests for resources on the same site. Otherwise, the TCP connection is closed and the request-response cycle ends.

To summarize the web request process on the browser side:

Browser parsing and rendering

The browser parses and renders the page in five steps:

Parse the HTML into the DOM tree
Parse the CSS into the CSS rule tree
Combine the DOM tree and the CSS rule tree to generate the render tree
Compute the layout information of each node from the render tree
Paint the page from the computed information

On reflow, the whole process above runs again. On repaint, the styles are recalculated, the intermediate steps are skipped, and the draw list is generated directly. So a repaint does not necessarily cause a reflow, but a reflow always causes a repaint.
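
The relationship between the two can be sketched as a choice of which pipeline stages run again after a style change. The property groupings below are illustrative assumptions, not an exhaustive or engine-accurate list, and `stages_to_rerun` is a hypothetical helper.

```python
# Assumed groupings: properties that change geometry vs. only appearance.
GEOMETRY_PROPERTIES = {"width", "height", "margin", "padding", "font-size"}
PAINT_ONLY_PROPERTIES = {"color", "background-color", "visibility"}


def stages_to_rerun(changed_property: str) -> list:
    """Return the pipeline stages that run again after a property changes."""
    if changed_property in PAINT_ONLY_PROPERTIES:
        # Repaint: styles are recalculated, layout is skipped.
        return ["style", "paint"]
    # Reflow: geometry changed (or, in this sketch, the property is unknown,
    # so the full pipeline runs to be safe). It always ends with a paint.
    return ["style", "layout", "paint"]


print(stages_to_rerun("color"))
print(stages_to_rerun("width"))
```

Both results contain the paint stage, while only the reflow case contains layout.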

Build a DOM tree

HTML Syntax Definition

The vocabulary and syntax of HTML were originally defined in specifications created by the W3C; today HTML is maintained by the WHATWG as the HTML Living Standard.

Not a context-free grammar

A grammar can be defined formally in a notation such as BNF. Unfortunately, the conventional parser techniques do not apply to HTML (they are still worth knowing, because they can be used to parse CSS and JavaScript). HTML cannot easily be defined by the context-free grammar that such parsers need. There is a formal format for defining HTML, the DTD (Document Type Definition), but it is not a context-free grammar.

HTML is quite close to XML, and many parsers are available for XML. HTML even has an XML variant, XHTML, so what is the main difference? HTML is more “forgiving”: it lets you omit certain start or end tags, and so on. As a whole it is a “soft” syntax, unlike the strict syntax of XML. This seemingly small difference creates two different worlds. On the one hand, it is a big part of why HTML is so popular: it tolerates your mistakes and makes life easier for web authors. On the other hand, it makes a formal grammar difficult to write. In short, HTML is not easy to parse: conventional context-free parsers do not work, and neither do XML parsers.

Parsing algorithm

The parsing algorithm has two stages:

Tokenization
Tree construction

These correspond to lexical analysis and syntax analysis (compare the parsing process in Babel’s compilation).
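
The two stages can be illustrated with Python's standard `html.parser` module, which does the tokenization and reports start tags, end tags, and text through callbacks. The `TreeBuilder` class below is a hypothetical, much simplified tree construction step built on a stack of open elements; it has none of the error recovery of a real HTML5 parser.

```python
from html.parser import HTMLParser

# Elements that never have children or an end tag (a partial list).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TreeBuilder(HTMLParser):
    """Consume tokens from HTMLParser and build a nested dict tree."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#document", "children": []}
        self.stack = [self.root]  # open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; ignore stray end tags.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index]["tag"] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append({"text": data.strip()})


builder = TreeBuilder()
builder.feed("<html><body><p>Hello</p></body></html>")
builder.close()
print(builder.root)
```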

Here are some examples that highlight the error tolerance of HTML5 parsing:

Using </br> instead of <br> (WebKit treats it like <br>):

if (t->isCloseTag(brTag) && m_document->inCompatMode()) {

    reportError(MalformedBRError);

    t->beginTag = true;

}


A stray table (a table nested directly inside another table rather than inside a table cell)

<table>

    <table>

        <tr><td>inner table</td></tr>

    </table>

    <tr><td>outer table</td></tr>

</table>


WebKit automatically turns this into two sibling tables:

<table>

    <tr><td>outer table</td></tr>

</table>

<table>

    <tr><td>inner table</td></tr>

</table>


Nested form elements

If a form is placed inside another form, the inner form is ignored.

Style calculation

CSS styles generally come from three sources:

Style sheets referenced by link tags
Styles in style tags
The inline style attribute of an element

Formatting style sheets

Browsers cannot work with raw CSS text directly, so when the rendering engine receives CSS text it converts it into structured objects called styleSheets.

You can inspect the resulting structure by typing document.styleSheets in the browser console (it lists the style sheets that come from link and style tags).

Standardizing style values

Some CSS values are not easy for the rendering engine to work with directly, so they are standardized before styles are computed. For example: em -> px, red -> #ff0000, bold -> 700, and so on.
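
The three conversions mentioned above can be written as a small lookup-and-convert function. This is a sketch under simplifying assumptions: the `standardize` helper, the tiny color and weight tables, and the 16px reference font size are all chosen for illustration, and a real engine handles far more units and keywords.

```python
import re

NAMED_COLORS = {"red": "#ff0000", "green": "#008000", "blue": "#0000ff"}
FONT_WEIGHTS = {"normal": "400", "bold": "700"}
EM_VALUE = re.compile(r"(\d*\.?\d+)em")


def standardize(prop: str, value: str, font_size_px: float = 16.0) -> str:
    """Convert a declared value into a standardized form where one is known."""
    value = value.strip().lower()
    em = EM_VALUE.fullmatch(value)
    if em:
        # em is relative to a reference font size (assumed 16px by default).
        return f"{float(em.group(1)) * font_size_px:g}px"
    if prop in ("color", "background-color"):
        return NAMED_COLORS.get(value, value)
    if prop == "font-weight":
        return FONT_WEIGHTS.get(value, value)
    return value


print(standardize("font-size", "2em"))
print(standardize("color", "red"))
print(standardize("font-weight", "bold"))
```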

Calculate the specific style of each node

Computing the specific styles follows two main rules: inheritance and the cascade.

Inheritance:

Each child node inherits the inheritable style properties of its parent node by default. If a property is not found on any ancestor, the browser’s default style, also known as the user agent style, is used.

Cascade:

The cascade means that the final style is determined by how all the declarations that apply to an element combine.

Once the styles are computed, the final values can be read from JS through window.getComputedStyle.
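
The inheritance rule can be sketched as a walk up the tree: use the node's own declaration, otherwise look at ancestors for inheritable properties, otherwise fall back to the user agent default. The property sets, default values, and the `computed_value` helper below are illustrative assumptions, and the cascade between competing declarations is not modelled.

```python
# Assumed, partial tables for the example.
INHERITED_PROPERTIES = {"color", "font-size"}
USER_AGENT_DEFAULTS = {"color": "#000000", "font-size": "16px", "display": "inline"}


def computed_value(node: dict, prop: str):
    """Resolve one property: own declaration, then inheritance, then default."""
    current = node
    while current is not None:
        if prop in current["style"]:
            return current["style"][prop]
        if prop not in INHERITED_PROPERTIES:
            break  # non-inherited properties do not look at ancestors
        current = current.get("parent")
    return USER_AGENT_DEFAULTS.get(prop)


body = {"style": {"color": "#333333", "display": "block"}, "parent": None}
span = {"style": {}, "parent": body}

print(computed_value(span, "color"))      # inherited from body
print(computed_value(span, "font-size"))  # falls back to the default
print(computed_value(span, "display"))    # not inherited, uses the default
```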

Generate layout tree

Layout tree generation has two main parts:

Traverse the generated DOM tree and add its nodes to the layout tree
Compute the coordinates of the nodes in the layout tree

The layout tree contains only visible elements. The head tag and elements with display: none are not added to it.
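
The first part, copying only visible nodes, can be sketched as a recursive filter over the DOM tree. The dictionary node shape and the `build_layout_tree` helper are assumptions for this example, and the second part (computing coordinates) is left out.

```python
def build_layout_tree(dom_node: dict):
    """Copy visible DOM nodes into a layout tree; return None for skipped nodes."""
    hidden = dom_node.get("style", {}).get("display") == "none"
    if dom_node["tag"] == "head" or hidden:
        return None  # the whole subtree is left out of the layout tree
    children = [build_layout_tree(child) for child in dom_node.get("children", [])]
    return {
        "tag": dom_node["tag"],
        "children": [child for child in children if child is not None],
    }


dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "title"}]},
    {"tag": "body", "children": [
        {"tag": "div", "style": {"display": "none"}},
        {"tag": "p"},
    ]},
]}

print(build_layout_tree(dom))
```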

For more details on layout, see the renrenfed team’s article on how browser layout works, as seen from the Chrome source code.

Building a Layer Tree

There are two cases: explicit compositing and implicit compositing.

Explicit compositing

1. Nodes that have a stacking context

A stacking context is generally created by certain CSS properties, under conditions such as the following:

The HTML root element itself has a stacking context
An ordinary element whose position is not static and whose z-index is set creates a stacking context
The opacity value of the element is not 1
The transform value of the element is not none
The filter value of the element is not none
The isolation value of the element is isolate
will-change specifies any of the above properties
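
The conditions in this list translate directly into a predicate over an element's tag and computed style. The sketch below checks only the conditions listed here; `creates_stacking_context` is a hypothetical helper, the style is assumed to be a plain dictionary of strings, and the will-change check is limited to three properties for simplicity.

```python
def creates_stacking_context(tag: str, style: dict) -> bool:
    """Check the listed conditions against a dict of computed style values."""
    if tag == "html":
        return True  # the root element always has a stacking context
    positioned = style.get("position", "static") != "static"
    if positioned and style.get("z-index", "auto") != "auto":
        return True
    if float(style.get("opacity", "1")) != 1:
        return True
    if style.get("transform", "none") != "none":
        return True
    if style.get("filter", "none") != "none":
        return True
    if style.get("isolation", "auto") == "isolate":
        return True
    # will-change naming one of these properties (a simplified subset).
    hinted = {item.strip() for item in style.get("will-change", "").split(",")}
    return bool(hinted & {"opacity", "transform", "filter"})


print(creates_stacking_context("div", {"position": "relative", "z-index": "1"}))
print(creates_stacking_context("div", {"position": "relative"}))
```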

2. Places that need clipping

For example, if a div is only 100 by 100 pixels and holds a lot of text, the text that overflows needs to be clipped. If a scrollbar appears, the scrollbar is also promoted to a separate layer.

Implicit compositing

Simply put, once a node lower in the stacking order is promoted to a separate layer, all the nodes stacked above it also become separate layers.

Implicit compositing carries a real risk. In a large application, when an element with a low z-index is promoted to a separate layer, all the elements stacked above it are promoted too, which can add thousands of layers, greatly increase memory pressure, and even crash the page. This is known as a layer explosion.
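
A toy model shows why the layer count grows so quickly. It follows the simplified rule stated above, that everything stacked above a promoted element is promoted as well; the `promoted_layers` helper and the element names are made up, and real engines apply further conditions before promoting an element.

```python
def promoted_layers(elements: list, explicit: str) -> list:
    """Return the names that get their own layer once `explicit` is promoted.

    `elements` is a list of (name, z_index) pairs.
    """
    base = dict(elements)[explicit]
    return [name for name, z_index in elements
            if name == explicit or z_index > base]


elements = [("banner", 1), ("card", 2), ("menu", 3), ("footer", 0)]

print(promoted_layers(elements, "banner"))
print(promoted_layers(elements, "menu"))
```

Promoting the element with the lowest z-index pulls most of the page into separate layers, while promoting the topmost one affects only itself.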

When a repaint is needed, only the affected layer is repainted; the other layers are not affected.

Generate draw list

The rendering engine breaks the painting of a layer into individual drawing commands, such as drawing the background first and then the border. These commands are then combined, in order, into a draw list.

You can open Chrome DevTools with F12, expand More tools in the settings menu, and select Layers to see the draw list.

The main thread of the render process then commits the draw list to the compositing thread. The compositing thread picks the tiles near the viewport and hands them to the rasterization thread pool to generate bitmaps.
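
Choosing the tiles near the viewport can be pictured with a one-dimensional model: the page is cut into horizontal strips, and only the strips that overlap the viewport, plus a small margin, are rasterized first. The tile height, the margin, and the `tiles_near_viewport` helper are assumptions for illustration and do not reflect Chrome's actual tiling parameters.

```python
def tiles_near_viewport(page_height: int, tile_height: int,
                        viewport_top: int, viewport_height: int,
                        margin: int = 1) -> list:
    """Return the indices of the tiles that overlap the viewport, plus a margin."""
    last_tile_on_page = (page_height - 1) // tile_height
    first = max(viewport_top // tile_height - margin, 0)
    last_visible = (viewport_top + viewport_height - 1) // tile_height
    last = min(last_visible + margin, last_tile_on_page)
    return list(range(first, last + 1))


# A 2560px page cut into 256px tiles, with a 600px viewport scrolled to 512px.
print(tiles_near_viewport(2560, 256, viewport_top=512, viewport_height=600))
```

The remaining tiles can be rasterized later, which is why content far from the viewport does not delay the first paint.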

When rasterization is complete, the compositing thread generates a DrawQuad command and sends it to the browser process. The Viz component in the browser process receives the command and draws the page contents into memory, which produces the page.

Closing the connection

When data transfer is complete, the TCP connection needs to be closed, which takes four waves.

The initiator sends the passive side a segment with Fin, Ack, and Seq, indicating that it has no more data to send, and enters the FIN_WAIT_1 state. (The request message has been sent.)
The passive side sends a segment with Ack and Seq, acknowledging the close request. The initiator then enters the FIN_WAIT_2 state. (The request message has been received.)
The passive side sends the initiator a segment with Fin, Ack, and Seq, requesting to close the connection, and enters the LAST_ACK state. (The response message has been sent.)
The initiator sends the passive side a segment with Ack and Seq and enters the TIME_WAIT state. The passive side closes the connection after receiving this segment. If the initiator receives no further reply within a certain period, it closes as well. (The response message has been received.)
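
The state changes on both sides can be written as a small transition table. The sketch below follows the standard TCP state diagram, which also includes the CLOSE_WAIT state on the passive side that the steps above do not name; the event labels and the `run` helper are invented for this example.

```python
# (current state, event) -> next state, for the connection-closing part only.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events: list, state: str = "ESTABLISHED") -> list:
    """Apply the events in order and return every state that was visited."""
    states = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        states.append(state)
    return states


initiator = run(["send FIN", "recv ACK", "recv FIN, send ACK", "timeout"])
passive = run(["recv FIN, send ACK", "send FIN", "recv ACK"])
print(initiator)
print(passive)
```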

