What happens after you enter the URL in the browser?

I have recently been tidying up scattered bits of knowledge. Two days ago, while searching for questions about browser URL parsing, I came across this article, shared by the Front-end University public account.

Action!

Note: The steps in this article assume a simple HTTP request: no HTTPS, no HTTP/2, the simplest possible DNS setup, no proxy, and no server problems, although this is unrealistic.

A flowchart

URL parsing
The DNS query
A TCP connection
Handle the request
Accept the response
Render the page

First, URL parsing

Address resolution:

The browser first determines whether the input is a valid URL or a search keyword, then performs operations such as auto-completion and character encoding based on what was typed.
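As a rough illustration of this step, here is a minimal sketch. The function name classifyInput and its heuristics (whitespace means search, a host needs a dot) are assumptions made up for this example; real browsers use far more elaborate rules. It relies only on the standard URL constructor.

```javascript
// Hypothetical helper: classify address-bar text as a URL or a search query.
function classifyInput(text) {
  const input = text.trim();
  // In this sketch, empty input or anything containing whitespace is a search.
  if (input === '' || /\s/.test(input)) {
    return { kind: 'search', query: input };
  }
  // "Auto-completion": prepend a default scheme when none was typed.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const candidate = hasScheme ? input : 'http://' + input;
  try {
    const url = new URL(candidate);
    // Require a dot in the host (or "localhost") so bare words become searches.
    if (url.hostname.includes('.') || url.hostname === 'localhost') {
      // url.href is normalized, including percent-encoding of the path.
      return { kind: 'url', href: url.href };
    }
  } catch (err) {
    // Not parseable as a URL: fall through to search.
  }
  return { kind: 'search', query: input };
}

console.log(classifyInput('example.com/caf\u00e9'));
console.log(classifyInput('what happens after you enter a url'));
```

The first call shows both completions at work: the scheme is added and the non-ASCII character in the path is percent-encoded.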

HSTS

Because plain HTTP carries security risks, HSTS forces the client to access the page over HTTPS. See: What You Don’t Know About HSTS [1].
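To make the idea concrete, here is a simplified sketch of how a client might remember an HSTS policy and upgrade later requests. The store and the function names recordHsts and upgradeIfHsts are invented for this example, and the includeSubDomains directive is ignored; only the Strict-Transport-Security header name and its max-age directive come from the real mechanism.

```javascript
// Hypothetical in-memory HSTS store: host name -> expiry time in milliseconds.
const hstsStore = new Map();

// Record a Strict-Transport-Security header value seen for a host.
function recordHsts(host, headerValue, now) {
  const match = /max-age=(\d+)/i.exec(headerValue);
  if (!match) return;
  const maxAgeSeconds = Number(match[1]);
  if (maxAgeSeconds === 0) {
    hstsStore.delete(host); // max-age=0 clears the policy
  } else {
    hstsStore.set(host, now + maxAgeSeconds * 1000);
  }
}

// Before sending a request, upgrade http:// to https:// for known HSTS hosts.
function upgradeIfHsts(urlString, now) {
  const url = new URL(urlString);
  const expiry = hstsStore.get(url.hostname);
  if (url.protocol === 'http:' && expiry !== undefined && expiry > now) {
    url.protocol = 'https:';
  }
  return url.href;
}

recordHsts('example.com', 'max-age=31536000', 0);
console.log(upgradeIfHsts('http://example.com/login', 1000));
```

Passing the current time in as a parameter keeps the sketch deterministic; a real client would read the clock itself.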

Other operations

Browsers also perform some additional operations, such as security checks and access restrictions (for example, some domestic browsers previously restricted access to 996.icu).

Check the cache

Second, DNS query

Basic steps

1. Browser cache

The browser checks whether the domain name is in its own cache; if not, it calls a system library function to perform the lookup.

2. Operating system cache

The operating system has its own DNS cache, but before consulting it, the system checks whether the domain name exists in the local hosts file. If neither has an answer, it sends a query to the DNS server.

3. Router cache

The router also has its own cache.

4. ISP DNS cache

The ISP’s DNS server is usually the preferred DNS server configured on the client computer, and in most cases it already has the answer cached.
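The four cache layers above can be pictured as a chain that is checked in order. The following is only a sketch: the layer objects, the resolveWithCaches name, and the addresses are made up, and the hosts file and cache expiry (TTL) are not modeled.

```javascript
// Check each cache layer in order; fall back to an upstream query on a full miss.
function resolveWithCaches(hostname, layers, queryUpstream) {
  let ip;
  let source = 'upstream';
  let missed = layers.length; // how many layers missed before a hit
  for (let i = 0; i < layers.length; i++) {
    if (layers[i].cache.has(hostname)) {
      ip = layers[i].cache.get(hostname);
      source = layers[i].name;
      missed = i;
      break;
    }
  }
  if (ip === undefined) ip = queryUpstream(hostname);
  // Layers that missed remember the answer for next time.
  for (let i = 0; i < missed; i++) layers[i].cache.set(hostname, ip);
  return { ip, source };
}

const layers = [
  { name: 'browser', cache: new Map() },
  { name: 'operating system', cache: new Map() },
  { name: 'router', cache: new Map([['example.com', '192.0.2.1']]) },
  { name: 'ISP DNS', cache: new Map() },
];
const upstream = () => '192.0.2.99'; // stand-in for the root server query
const first = resolveWithCaches('example.com', layers, upstream);
const second = resolveWithCaches('example.com', layers, upstream);
console.log(first.source, second.source);
```

The second lookup is answered by the browser layer because the first lookup filled every cache in front of the router.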

Root DNS server query

If none of the previous steps finds a cached answer, the local DNS server forwards the request to the root name servers on the Internet. The following diagram illustrates the process:

Root Domain name Server (Wikipedia)

Points to be aware of

Recursive mode: the query is followed all the way through, and only the final result is returned (this is how the browser queries the local DNS server)
Iterative mode: the local DNS server queries the root DNS server and then each server it is referred to, one step at a time
What is DNS hijacking
Front-end DNS-Prefetch optimization
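The difference between the recursive and iterative modes listed above can be sketched with a toy set of name servers. Everything here (server names, zones, the address) is invented for illustration; real resolution involves record types, TTLs, and network round trips.

```javascript
// Toy name servers (all names and addresses are made up for illustration).
const servers = {
  'root': { referrals: { 'com': 'com-tld' }, records: {} },
  'com-tld': { referrals: { 'example.com': 'example-ns' }, records: {} },
  'example-ns': { referrals: {}, records: { 'www.example.com': '192.0.2.10' } },
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Iterative mode: the resolver itself follows each referral, one server at a time.
function iterativeResolve(name) {
  let current = 'root';
  const visited = [];
  while (current !== undefined) {
    visited.push(current);
    const server = servers[current];
    if (has(server.records, name)) {
      return { ip: server.records[name], visited };
    }
    // Find a delegated zone that the name belongs to.
    const zone = Object.keys(server.referrals).find(
      (z) => name === z || name.endsWith('.' + z)
    );
    current = zone === undefined ? undefined : server.referrals[zone];
  }
  return { ip: null, visited };
}

// Recursive mode, as seen by the browser: one question, one final answer.
function recursiveResolve(name) {
  return iterativeResolve(name).ip;
}

console.log(iterativeResolve('www.example.com').visited.join(' -> '));
console.log(recursiveResolve('www.example.com'));
```

The caller of recursiveResolve never sees the referrals; that is the part the local DNS server does on its behalf.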

TCP connection

TCP/IP is divided into four layers, each layer encapsulates data when sending:

1. Application layer: Sends HTTP requests

Having obtained the IP address of the server in the previous step, the browser will start constructing an HTTP message that contains:

Request line and headers: request method, destination address, protocol version, and so on
Request body (other parameters)

Points to note:

From the address bar and HTML forms, browsers can only send GET and POST requests; opening a web page uses GET
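To show what such a message looks like on the wire, here is a small sketch that assembles an HTTP/1.1 request as text. The function name buildHttpRequest and its option names are assumptions for this example; a real browser adds many more headers.

```javascript
// Assemble a plain-text HTTP/1.1 request: request line, headers, blank line, body.
function buildHttpRequest({ method = 'GET', url, headers = {}, body = '' }) {
  const { host, pathname, search } = new URL(url);
  const lines = [
    `${method} ${pathname}${search} HTTP/1.1`, // method, path, protocol
    `Host: ${host}`,                            // destination
  ];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  if (body !== '') {
    // Content-Length is measured in bytes, not characters.
    lines.push(`Content-Length: ${new TextEncoder().encode(body).length}`);
  }
  return lines.join('\r\n') + '\r\n\r\n' + body;
}

console.log(buildHttpRequest({ url: 'http://example.com/index.html?x=1' }));
```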

2. Transport layer: TCP transmits packets

The transport layer initiates a TCP connection to the server. To make transmission easier, it splits the data into segments and numbers them, so that the server can reassemble the data accurately on receipt.
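The splitting and numbering can be sketched as follows. This is a simplification under stated assumptions: sequence numbers count characters from 0, whereas real TCP counts bytes from a random initial sequence number, and the function names are made up.

```javascript
// Split data into numbered segments of at most maxSegmentSize characters.
function segment(data, maxSegmentSize) {
  const segments = [];
  for (let seq = 0; seq < data.length; seq += maxSegmentSize) {
    segments.push({ seq, payload: data.slice(seq, seq + maxSegmentSize) });
  }
  return segments;
}

// Restore the original data even if segments arrive out of order.
function reassemble(segments) {
  return [...segments]
    .sort((a, b) => a.seq - b.seq)
    .map((s) => s.payload)
    .join('');
}

const sent = segment('GET /index.html HTTP/1.1', 8);
const received = [sent[2], sent[0], sent[1]]; // simulate reordering in transit
console.log(reassemble(received));
```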

Before establishing a connection, the TCP three-way handshake is performed.

The TCP three-way handshake has been vividly described in many articles and diagrams on the Internet.

Related knowledge:

SYN flood attack
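For readers who prefer code to diagrams, the three packets of the handshake can be sketched like this. The initial sequence numbers are passed in as parameters purely for clarity; real TCP stacks choose them unpredictably, and the object shape is an assumption of this example.

```javascript
// Return the three packets exchanged during the TCP three-way handshake.
function threeWayHandshake(clientIsn, serverIsn) {
  // 1. Client -> server: SYN carrying the client's initial sequence number.
  const syn = { from: 'client', flags: ['SYN'], seq: clientIsn };
  // 2. Server -> client: SYN+ACK acknowledging clientIsn + 1.
  const synAck = { from: 'server', flags: ['SYN', 'ACK'], seq: serverIsn, ack: syn.seq + 1 };
  // 3. Client -> server: ACK acknowledging serverIsn + 1.
  const ack = { from: 'client', flags: ['ACK'], seq: synAck.ack, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

console.log(threeWayHandshake(100, 300));
```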

3. Network layer: the IP protocol and MAC address lookup

Package the data segment, add the source and destination IP addresses, and find the transmission route.

Check whether the destination IP address is on the same network as the current IP address. If so, the packet is sent directly using the destination’s MAC address. Otherwise, look up the next-hop address in the routing table and use ARP to query its MAC address.
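The same-network check is a bitwise comparison under the subnet mask. Here is a minimal IPv4-only sketch; the function names and the single default gateway standing in for a routing table are assumptions of this example.

```javascript
// Convert dotted-quad IPv4 text into an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Two addresses are on the same network if their masked prefixes are equal.
function sameNetwork(a, b, mask) {
  const m = ipToInt(mask);
  return ((ipToInt(a) & m) >>> 0) === ((ipToInt(b) & m) >>> 0);
}

// Pick whose MAC address ARP should look up: the destination or the gateway.
function chooseNextHop(source, destination, mask, gateway) {
  return sameNetwork(source, destination, mask) ? destination : gateway;
}

console.log(chooseNextHop('192.168.1.10', '192.168.1.20', '255.255.255.0', '192.168.1.1'));
console.log(chooseNextHop('192.168.1.10', '203.0.113.5', '255.255.255.0', '192.168.1.1'));
```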

Note: In the OSI reference model ARP is at the link layer, but in TCP/IP it is at the network layer.

4. Link layer: Ethernet protocol

Ethernet protocol

According to the Ethernet protocol, data is divided into packets in the unit of “frame”, and each frame is divided into two parts:

Header: Sender, receiver, and data type of a packet
Data: The content of the packet
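The header-plus-data layout can be sketched as bytes. This example assumes the common Ethernet II layout (6-byte receiver address, 6-byte sender address, 2-byte type, then the data) and leaves out the preamble and the trailing checksum; the addresses and payload bytes are made up.

```javascript
// Convert "aa:bb:cc:dd:ee:ff" into six byte values.
function macToBytes(mac) {
  return mac.split(':').map((hex) => parseInt(hex, 16));
}

// Build header + data. The preamble and trailing checksum are left out.
function buildFrame(dstMac, srcMac, etherType, payload) {
  const frame = new Uint8Array(14 + payload.length);
  frame.set(macToBytes(dstMac), 0); // receiver
  frame.set(macToBytes(srcMac), 6); // sender
  frame[12] = etherType >> 8;       // data type, high byte
  frame[13] = etherType & 0xff;     // data type, low byte
  frame.set(payload, 14);           // data
  return frame;
}

// 0x0800 is the type value for IPv4 payloads.
const frame = buildFrame('00:00:5e:00:53:02', '00:00:5e:00:53:01', 0x0800, [0x45, 0x00]);
console.log(frame.length);
```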

MAC address

Ethernet requires every device connected to the network to have a network adapter (network interface card). Data packets travel from one network adapter to another, and the address of a network adapter is its MAC address. Each MAC address is unique, which makes one-to-one delivery possible.

Broadcast

The way data is sent is very primitive: it is sent directly to every machine on the local network, using the ARP protocol. Each receiver compares the header information with its own MAC address to decide whether to accept the data.

Note: The receiver responds unicast.

Related knowledge:

ARP attack
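The broadcast-then-unicast pattern described above can be modeled in a few lines. This is a toy simulation, not a packet format: the host list, the addresses, and the arpResolve function are invented for this example.

```javascript
const BROADCAST_MAC = 'ff:ff:ff:ff:ff:ff';

// Hypothetical hosts on one local network.
const hosts = [
  { ip: '192.168.1.1', mac: '00:00:5e:00:53:01' },
  { ip: '192.168.1.20', mac: '00:00:5e:00:53:14' },
];

function arpResolve(senderMac, targetIp) {
  // The request is addressed to every machine on the local network.
  const request = { dstMac: BROADCAST_MAC, srcMac: senderMac, targetIp };
  for (const host of hosts) {
    // Each host compares the requested IP with its own and ignores mismatches.
    if (host.ip === request.targetIp) {
      // Only the owner answers, unicast, straight back to the sender.
      return { dstMac: request.srcMac, srcMac: host.mac, ip: host.ip };
    }
  }
  return null; // nobody on this network owns that IP
}

console.log(arpResolve('00:00:5e:00:53:0a', '192.168.1.20'));
```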

The server receives the request

Receiving works by reversing the steps above.

The server processes the request

A flowchart

HTTPD

The most common types of HTTPD are Apache and Nginx, commonly used on Linux, and IIS on Windows.

It listens for the incoming request and then starts a child process to process the request.

Handle the request

After receiving the TCP packets, the server handles the connection, parses the HTTP request (request method, domain name, and path), and performs some checks:

Verify that a virtual host is configured
Verify that the virtual host accepts this method
Verify that the user is allowed to use this method (based on IP address, identity information, etc.)

Redirect

If the server has configured HTTP redirection, it will return a 301 permanent redirect response, and the browser will re-send the HTTP request based on the response (rerun the process above).

For more: see this article [2]
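The browser side of a 301 redirect can be sketched as a loop. The responses table below is a made-up stand-in for the network, and fetchFollowingRedirects is a hypothetical name; only the 301 status and the Location header come from HTTP itself.

```javascript
// Hypothetical stand-in for the network: maps a URL to a canned response.
const responses = {
  'http://example.com/old': { status: 301, headers: { location: '/new' } },
  'http://example.com/new': { status: 200, headers: {}, body: '<h1>Hello</h1>' },
};

function fetchFollowingRedirects(startUrl, maxRedirects = 5) {
  let url = startUrl;
  for (let hops = 0; hops <= maxRedirects; hops++) {
    const response = responses[url];
    if (response === undefined) throw new Error('No response for ' + url);
    if (response.status !== 301) return { url, response, hops };
    // Resolve a possibly relative Location header against the current URL,
    // then run the whole request process again for the new address.
    url = new URL(response.headers.location, url).href;
  }
  throw new Error('Too many redirects');
}

console.log(fetchFollowingRedirects('http://example.com/old').url);
```

The redirect limit is there so that two pages redirecting to each other cannot loop forever.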

URL rewriting

The server then checks its URL rewriting rules. If the requested file actually exists, such as an image or an HTML, CSS, or JS file, it is returned directly.

Otherwise, the server rewrites the request to a REST-style URL according to the rules.

It then decides which dynamic language interpreter to call to handle the request, based on the type of script.

Taking a PHP MVC framework as an example: it first initializes some environment parameters, matches routes from top to bottom against the URL, and then hands the request to the method defined for the matching route.
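The top-to-bottom route matching can be sketched as follows. The text uses PHP as its example; this sketch uses JavaScript instead, and the route table and handler strings are invented for illustration.

```javascript
// Hypothetical route table, checked from top to bottom; the first match wins.
const routes = [
  { pattern: /^\/users\/(\d+)$/, handler: (id) => `show user ${id}` },
  { pattern: /^\/users$/, handler: () => 'list users' },
  { pattern: /^\/.*$/, handler: () => 'not found' }, // catch-all goes last
];

function dispatch(path) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    // Captured groups become the arguments of the handler method.
    if (match) return route.handler(...match.slice(1));
  }
  return 'not found';
}

console.log(dispatch('/users/42'));
```

Because the first match wins, the order of the table matters: moving the catch-all to the top would hide every other route.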

5. The browser receives the response

After the browser receives the response resource from the server, it analyzes the resource.

Start by looking at the Response header and doing different things based on the different status codes (such as the redirection mentioned above).

If the response resource is compressed (such as gzip), you also need to decompress it.

The response resources are then cached.

Next, the response content is parsed according to the MIME[3] type in the response resource (for example, HTML and Image are parsed in different ways).

Render the page

Browser kernel

The rendering process varies from browser kernel to browser kernel, but the general flow is similar.

The basic flow

6.1 HTML parsing

The first thing to know is that browser parsing is done line by line from top to bottom.

The parsing process can be divided into four steps:

① Encoding

What comes back over the network is raw binary data, which the browser must convert into a string, that is, HTML code, according to the file’s encoding (such as UTF-8).

② Pre-parsing.

Pre-parsing loads resources ahead of time to reduce processing time. It identifies attributes that will trigger resource requests, such as the src attribute of an img tag, and adds those requests to the request queue.

③ Tokenization

Tokenization is the lexical analysis step: it parses the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.

It uses a state machine to recognize tokens; for example, the state changes when it encounters < or >.
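A two-state machine is enough to show the idea. This is a deliberately tiny sketch: it only knows the states "data" (outside a tag) and "tag" (between < and >), does not handle attributes, comments, or entities, and the token shapes are assumptions of this example. The real HTML tokenizer has dozens of states.

```javascript
function tokenize(html) {
  const tokens = [];
  let state = 'data'; // 'data' = outside a tag, 'tag' = between < and >
  let buffer = '';
  for (const ch of html) {
    if (state === 'data') {
      if (ch === '<') {
        // Leaving text: emit what was collected, then switch state.
        if (buffer.trim() !== '') tokens.push({ type: 'text', value: buffer.trim() });
        buffer = '';
        state = 'tag';
      } else {
        buffer += ch;
      }
    } else if (ch === '>') {
      // Leaving a tag: a leading slash marks an end tag.
      if (buffer.startsWith('/')) {
        tokens.push({ type: 'endTag', name: buffer.slice(1).trim() });
      } else {
        tokens.push({ type: 'startTag', name: buffer.trim() });
      }
      buffer = '';
      state = 'data';
    } else {
      buffer += ch;
    }
  }
  if (state === 'data' && buffer.trim() !== '') {
    tokens.push({ type: 'text', value: buffer.trim() });
  }
  return tokens;
}

console.log(tokenize('<p>Hi</p>'));
```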

④ Tree construction

Note: Tokenization and tree construction run in parallel, meaning that a DOM node is created as soon as a start tag is parsed.

The parser takes the tokens produced by the tokenization step, creates DOM objects from them, and inserts those objects into the tree at the appropriate place.

<html>
<head>
    <title>Web page parsing</title>
</head>
<body>
    <div>
        <h1>Web page parsing</h1>
        <p>This is an example Web page.</p>
    </div>
</body>
</html>
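Tree construction from a stream of tokens is usually explained with a stack of open elements, and that idea can be sketched briefly. The token and node shapes here are assumptions of this example, the tokens are written out by hand, and the real algorithm has many more rules.

```javascript
// Build a simple tree from tokens using a stack of currently open elements.
function buildTree(tokens) {
  const root = { name: '#document', children: [] };
  const stack = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === 'startTag') {
      // A node is created and inserted as soon as its start tag is seen.
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === 'endTag') {
      // Close the nearest open element with that name; ignore stray end tags.
      const index = stack.map((n) => n.name).lastIndexOf(token.name);
      if (index > 0) stack.length = index;
    } else if (token.type === 'text') {
      parent.children.push({ name: '#text', value: token.value });
    }
  }
  return root;
}

const tree = buildTree([
  { type: 'startTag', name: 'div' },
  { type: 'startTag', name: 'h1' },
  { type: 'text', value: 'Web page parsing' },
  { type: 'endTag', name: 'h1' },
  { type: 'startTag', name: 'p' },
  { type: 'text', value: 'This is an example Web page.' },
  { type: 'endTag', name: 'p' },
  { type: 'endTag', name: 'div' },
]);
console.log(JSON.stringify(tree, null, 2));
```

Ignoring end tags that match nothing is a small taste of why a browser keeps going instead of reporting a syntax error.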

Browser error tolerance

You’ll never see a “syntax invalid” error in a browser, because the browser corrects the syntax and moves on.

Events

When parsing is complete, the browser fires the DOMContentLoaded event to signal that the DOM has been fully parsed.

6.2. CSS parsing

Once the browser has downloaded the CSS, the CSS parser processes it: it parses all the CSS according to the syntax specification [4], tokenizes it, and produces a table of rules.

CSS Matching Rules

CSS rules are matched against a node from right to left. For example, div p {font-size: 14px} first finds all p tags and then checks whether each one has a div ancestor.

So when writing CSS, prefer ID and class selectors, and avoid stacking too many levels of selectors.
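Right-to-left matching can be sketched for the simplest case, selectors made only of tag names separated by spaces. The element shape ({ tag, parent }) and the function name are assumptions of this example; real engines also handle classes, IDs, and other combinators.

```javascript
// Match a descendant selector such as "div p" against one element,
// starting from the rightmost part and walking up the tree.
function matchesDescendantSelector(element, selector) {
  const parts = selector.trim().split(/\s+/); // e.g. ['div', 'p']
  // The rightmost part must match the element itself.
  if (element.tag !== parts[parts.length - 1]) return false;
  let ancestor = element.parent;
  // Move left through the selector while moving up through the ancestors.
  for (let i = parts.length - 2; i >= 0; i--) {
    while (ancestor && ancestor.tag !== parts[i]) ancestor = ancestor.parent;
    if (!ancestor) return false;
    ancestor = ancestor.parent;
  }
  return true;
}

const div = { tag: 'div', parent: null };
const section = { tag: 'section', parent: div };
const p = { tag: 'p', parent: section };
console.log(matchesDescendantSelector(p, 'div p'));
```

Most elements fail the very first comparison, which is why starting from the right is cheap, and why long selector chains cost more for the elements that do pass it.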

6.3. The render tree

This is a merging of the DOM tree and the CSS rule tree.

Note: The render tree ignores nodes that do not need to be rendered, such as those with display: none set.

Computation

Every size value is computed down to one of three possibilities: auto, a percentage, or px. For example, rem is converted to px.

Cascade

The browser needs a way to determine which styles actually apply to a given element, so it uses a formula called specificity, which takes into account:

Tag name, class, ID
Whether the style is inline
!important

This yields a weight, and the declaration with the highest weight wins.
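The weighting can be sketched as a comparison of small tuples. This is a simplification under stated assumptions: the regular expressions only understand tag names, .classes, and #ids; the declaration objects ({ selector, inline, important, order }) and the function names are invented for this example.

```javascript
// Count IDs, classes, and tag names in a simple selector.
function specificity(selector) {
  const count = (pattern) => (selector.match(pattern) || []).length;
  return [
    count(/#[\w-]+/g),                   // IDs
    count(/\.[\w-]+/g),                  // classes
    count(/(^|[\s>+~])[a-zA-Z][\w-]*/g), // tag names
  ];
}

// Sortable key: !important, then inline, then specificity, then source order.
function cascadeKey(decl) {
  const [ids, classes, tags] = decl.inline ? [0, 0, 0] : specificity(decl.selector);
  return [decl.important ? 1 : 0, decl.inline ? 1 : 0, ids, classes, tags, decl.order];
}

function pickWinner(declarations) {
  const compare = (a, b) => {
    const ka = cascadeKey(a);
    const kb = cascadeKey(b);
    for (let i = 0; i < ka.length; i++) {
      if (ka[i] !== kb[i]) return ka[i] - kb[i];
    }
    return 0;
  };
  return declarations.reduce((best, d) => (compare(d, best) > 0 ? d : best));
}

const declarations = [
  { selector: 'p', value: 'black', order: 0 },
  { selector: '.note', value: 'blue', order: 1 },
  { selector: '#main p', value: 'red', order: 2 },
];
console.log(pickWinner(declarations).value);
```

Comparing the tuple position by position, rather than adding everything into one number, keeps any amount of class selectors from outweighing a single ID.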

Render blocking

When a script tag is encountered, DOM construction pauses until the script finishes executing, and then resumes.

But if the JS depends on CSS that has not yet been downloaded and built, the browser delays script execution until the CSS rules are built.

In summary:

CSS blocks JS execution
JS blocks subsequent DOM parsing

To avoid this, follow these principles:

CSS resources come before JavaScript resources
JS is placed at the bottom of the HTML, before the closing </body> tag

Alternatively, if you want to change the blocking behavior, you can use defer and async, as described in this article [5].

6.4. Layout and painting

Determine the geometry of every node in the render tree, such as position and size. The output is a box model that captures the exact position and size of each element on the screen.

It then traverses the render tree, calling the renderer’s paint() method to display its contents on the screen.

6.5. Merge render layers

Combine all the images drawn above and output a single image.

6.6. Reflow and repaint

Reflow

When the browser finds that a change to some part of the page affects the layout, it has to go back and re-render, recalculating positions and sizes recursively starting from the html tag.

Reflow is almost unavoidable, because the page changes whenever you scroll with the mouse or resize the window.

Repaint

Repaint occurs when you change an element’s background color, text color, and so on, without affecting the position of surrounding elements.

After each repaint, the browser also needs to merge the render layers and output them to the screen.

Reflow is much more expensive than repaint, so we should avoid reflow as much as possible.

For example:

display: none triggers reflow, while visibility: hidden triggers only repaint.

6.7. JavaScript compilation and execution

A flowchart

It can be divided into three stages:

1. Lexical analysis

After a JS script is loaded, it first enters the syntax analysis stage, which checks whether the syntax of the code block is correct. If it is not, a syntax error is thrown and execution stops.

The steps are:

Tokenizing: for example, var a = 2 is split into the lexical units var, a, =, and 2.
Parsing: the lexical units are converted into an abstract syntax tree (AST).
Code generation: the abstract syntax tree is translated into machine instructions.
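The first two steps can be sketched for the single statement used in the example above. The lexer below only understands identifiers, integers, = and ;, and the parser only accepts the shape var name = number. The token and AST node shapes are assumptions of this example, not those of any real engine.

```javascript
const KEYWORDS = new Set(['var', 'let', 'const']);

// Tokenizing: split source text into lexical units.
function lex(source) {
  const tokens = [];
  for (const [text] of source.matchAll(/[A-Za-z_$][\w$]*|\d+|[=;]/g)) {
    let type;
    if (KEYWORDS.has(text)) type = 'keyword';
    else if (/^\d/.test(text)) type = 'number';
    else if (text === '=' || text === ';') type = 'punctuator';
    else type = 'identifier';
    tokens.push({ type, text });
  }
  return tokens;
}

// Parsing: turn the tokens of `var name = number` into a tiny AST.
function parseVarDeclaration(tokens) {
  const [kw, id, eq, num] = tokens;
  const ok =
    kw && kw.text === 'var' &&
    id && id.type === 'identifier' &&
    eq && eq.text === '=' &&
    num && num.type === 'number';
  // Invalid syntax stops everything before any code runs.
  if (!ok) throw new SyntaxError('Unexpected token');
  return {
    type: 'VariableDeclaration',
    name: id.text,
    init: { type: 'NumericLiteral', value: Number(num.text) },
  };
}

console.log(lex('var a = 2').map((t) => t.text));
console.log(parseVarDeclaration(lex('var a = 2')));
```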

2. Precompilation

JS code runs in three kinds of environments:

The global environment
The function environment
eval

Each time execution enters a different environment, a corresponding execution context is created. These contexts form a function call stack: the bottom of the stack is always the global execution context, and the top of the stack is always the current execution context.
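The stack behavior can be imitated with an ordinary array. This is only a model: the engine manages its real call stack internally, and the contextStack array and withContext helper are invented for this example.

```javascript
const contextStack = ['global']; // bottom of the stack: the global context
const snapshots = [];

// Hypothetical helper that mimics entering and leaving a function context.
function withContext(name, body) {
  contextStack.push(name);                  // entering: new context on top
  snapshots.push(contextStack.join(' > ')); // record the stack at this moment
  try {
    return body();
  } finally {
    contextStack.pop();                     // leaving: context is removed
  }
}

function inner() {
  return withContext('inner', () => 'done');
}

function outer() {
  return withContext('outer', () => inner());
}

outer();
console.log(snapshots);
```

After outer() returns, only the global entry remains on the stack, which matches the description above.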

Creating an execution context

Creating an execution context involves three things:

Creating the variable object: parameters, functions, and variables
Establishing the scope chain: determines which variables the current execution environment can access
Determining what this points to

3. Execution

JS threads

Although JS is single-threaded, four threads are actually involved in the work (three of them only assist; only the JS engine thread actually executes code):

JS engine thread: also called the JS kernel; the main thread responsible for parsing and executing JS scripts (the V8 engine, for example)
Event trigger thread: belongs to the browser kernel and controls events such as mouse and keyboard input. When an event fires, it pushes the event handler into the event queue to wait for the JS engine thread to execute it
Timer trigger thread: controls setInterval and setTimeout and does the timing. When a timer expires, it pushes the timer callback into the event queue to wait for the JS engine thread
HTTP asynchronous request thread: a thread the browser opens after an XMLHttpRequest connects, to monitor changes in readyState. If a callback is set for the state change, it pushes that callback into the event queue to wait for the JS engine thread to execute it

Note: Browsers are limited in the number of concurrent connections to the same domain, usually 6.

Macro tasks

These can be divided into:

Synchronous tasks: executed in order; a task can start only after the previous one has finished
Asynchronous tasks: not executed directly. Only when the trigger condition is met does the relevant thread push the task into the task queue, where it waits until the JS engine’s main thread has finished its current tasks. Examples include asynchronous Ajax, DOM events, and setTimeout.

Microtasks

Microtasks exist in ES6 and in Node environments. The main APIs are Promise and process.nextTick.

Microtasks run after the synchronous tasks of the current macro task and before the asynchronous tasks.

Code example

console.log('1'); // Macro task, synchronous
setTimeout(function() {
    console.log('2'); // Macro task, asynchronous
})
new Promise(function(resolve) {
    console.log('3'); // Macro task, synchronous
    resolve();
}).then(function() {
    console.log('4') // Microtask
})
console.log('5') // Macro task, synchronous

The output order of the above code is: 1,3,5,4,2
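To see why the order comes out this way, here is a toy scheduler with one macro task queue and one microtask queue. It is a model of the rule described above, not the browser’s real event loop; runToyEventLoop and the api object it passes in are invented for this example, and timers have no delay.

```javascript
function runToyEventLoop(mainScript) {
  const macroQueue = [];
  const microQueue = [];
  const output = [];
  const api = {
    log: (value) => output.push(value),
    setTimeout: (fn) => macroQueue.push(fn),     // schedules a macro task
    queueMicrotask: (fn) => microQueue.push(fn), // schedules a microtask
  };
  macroQueue.push(() => mainScript(api)); // the script itself is a macro task
  while (macroQueue.length > 0) {
    macroQueue.shift()(); // run exactly one macro task
    // Then drain every microtask before touching the next macro task.
    while (microQueue.length > 0) microQueue.shift()();
  }
  return output;
}

const order = runToyEventLoop(({ log, setTimeout, queueMicrotask }) => {
  log(1);
  setTimeout(() => log(2));
  log(3);                       // a Promise executor runs synchronously
  queueMicrotask(() => log(4)); // stands in for the .then() callback
  log(5);
});
console.log(order.join(','));
```

The toy produces the same sequence as the example: the synchronous part first, then the microtask, then the timer callback.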

Reference documentation

[1] What you don’t know about HSTS: http://t.cn/AiR8pTqx

[2] See this article for details: http://t.cn/AiR8pnEC

[3] MIME: http://t.cn/AiR8prtm

[4] Syntax: http://t.cn/AiR80GdO

[5] This article: http://t.cn/AiR80c1k

[6] what-happens-when-zh_CN: http://t.cn/AiR80xb5

[7] Tags to DOM: http://t.cn/AiR80djX

[8] Thoroughly understand browser caching: http://t.cn/AiR8Ovob

[9] How browsers work: Behind the scenes of the new Web browser: http://t.cn/AiR8Oz06

[10] Simple browser rendering principle: http://t.cn/AiR8O4fO

[11] Js engine execution process (I): http://t.cn/AiR8Ot3s
