Why does Chrome have 4 processes?

1. Processes and threads

Process: A process is a running instance of a program. When a program starts, the operating system allocates a block of memory to hold its code and runtime data, and creates a main thread to execute tasks. This running environment is called a process.

Thread: A thread is a unit of execution inside a process. A process consists of one or more threads, which are responsible for executing code.

The relationship between processes and threads has the following four characteristics:

The failure of any thread in the process will cause the entire process to crash.
Threads share data in a process.
When a process is shut down, the operating system reclaims the memory occupied by the process.
The contents of the processes are isolated from each other.
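
The sharing and isolation points above can be sketched in a few lines of Python. This is only an illustration, not how a browser is built: the names `shared`, `worker`, `run_threads`, and `run_child_process` are made up for this example, and the child process is simply a second Python interpreter.

```python
import subprocess
import sys
import threading

shared = []  # lives in this process's memory


def worker(n):
    # Every thread in the process sees the same list object.
    shared.append(n)


def run_threads(count=4):
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared)


def run_child_process():
    # The child is a separate process with its own memory. It builds its own
    # `shared` list and can only report back through its standard output.
    code = "shared = ['child']; print(len(shared))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

After `run_threads()` the parent's list holds what every thread appended, while nothing the child does to its own `shared` list is visible in the parent.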

2. Single-process browser

In a single-process browser, all of the browser's functional modules run in the same process. This causes several problems:

Unstable: a plug-in that crashes unexpectedly can bring down the entire browser.
Unsafe: plug-ins can access any operating system resource, so running a plug-in on a page means that the plug-in can fully control your computer.
Not smooth: only one module can execute at any given moment.

3. Multi-process browser

The latest Chrome includes one main browser process, one GPU process, one network process, multiple renderer processes, and multiple plug-in processes.

Browser process. It is mainly responsible for interface display, user interaction, child-process management, and storage.
Renderer process. Its core task is to turn HTML, CSS, and JavaScript into web pages that users can interact with. Both the layout engine Blink and the JavaScript engine V8 run in this process. By default, Chrome creates a renderer process for each tab. For security reasons, renderer processes run in sandbox mode.
GPU process. Chrome did not have a GPU process when it was first released. The GPU was originally used to implement 3D CSS effects, but later both web pages and Chrome's own UI came to be drawn on the GPU, which made GPU support a general requirement of the browser. Eventually, Chrome introduced a GPU process into its multi-process architecture.
Network process. It is mainly responsible for loading network resources for pages. It used to run as a module inside the browser process and only recently became a separate process.
Plug-in process. It is mainly responsible for running plug-ins. Plug-ins are prone to crashing, so they are isolated in plug-in processes to ensure that a plug-in crash does not affect the browser or the page.

While the multi-process model improves browser stability, smoothness, and security, it also inevitably introduces some problems:

Higher resource usage. Because each process contains a copy of the common infrastructure (such as the JavaScript runtime environment), this means that the browser consumes more memory resources.
More complex architecture. High coupling between browser modules and poor scalability make it difficult for the current architecture to adapt to new requirements.

TCP protocol

Data on the Internet is transmitted in packets. If a large amount of data is sent, it is broken up into smaller packets for transmission. The audio data you listen to now, for example, is broken down into small packets, not one big file.

1. IP: sends the data packet to the destination host

Let's follow the journey of a packet from host A to host B:

The upper layer hands a packet containing "geek time" to the network layer.
The network layer attaches an IP header to the packet to form a new IP packet, which is handed to the bottom layer.
The bottom layer transmits the packet to host B over the physical network.
The packet arrives at the network layer of host B, where host B strips the IP header and delivers the remaining data to the upper layer.
Finally, the packet containing the "geek time" message reaches the upper layer of host B.

2. UDP: Sends data packets to the application

The User Datagram Protocol (UDP) is a simple, datagram-oriented communication protocol at the OSI transport layer.

IP sends packets to specific computers using IP address information, while UDP sends packets to correct programs using port numbers. Compared with IP, UDP headers contain information such as source port numbers as well as destination port numbers.

To support UDP, the previous three-layer structure is extended to four layers by adding a transport layer between the network layer and the upper layer.

Let's see how a packet now travels from host A to host B:

The upper layer hands a packet containing "geek time" to the transport layer.
The transport layer attaches a UDP header in front of the packet to form a new UDP packet, and then hands the new UDP packet to the network layer.
The network layer attaches an IP header to the packet to form a new IP packet, which is handed to the bottom layer.
The packet arrives at the network layer of host B, where host B strips the IP header and passes the remaining data to the transport layer.
At the transport layer, the UDP header is stripped and the data is handed to the upper-layer application identified by the port number in the UDP header.
Finally, the packet containing the "geek time" message reaches the upper-layer application on host B.
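
The wrapping and unwrapping steps above can be sketched as follows. The UDP header uses the real four-field layout (source port, destination port, length, checksum), but the checksum is left at zero. The "IP header" here is a deliberately simplified stand-in that carries only the two addresses. All function names, ports, and addresses are invented for the example.

```python
import socket
import struct


def udp_wrap(payload: bytes, src_port: int, dst_port: int) -> bytes:
    # UDP header: source port, destination port, length, checksum (0 here).
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + payload


def ip_wrap(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    # Simplified stand-in for an IP header: just the two 4-byte addresses.
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment


def ip_unwrap(packet: bytes):
    # Host B's network layer: strip the addresses, pass the rest upward.
    src_ip = socket.inet_ntoa(packet[0:4])
    dst_ip = socket.inet_ntoa(packet[4:8])
    return src_ip, dst_ip, packet[8:]


def udp_unwrap(segment: bytes):
    # Host B's transport layer: the destination port selects the application.
    src_port, dst_port, length, _checksum = struct.unpack("!HHHH", segment[:8])
    return src_port, dst_port, segment[8:length]


packet = ip_wrap(udp_wrap(b"geek time", 5000, 8080), "192.168.0.2", "192.168.0.3")
_, dst_ip, segment = ip_unwrap(packet)
_, dst_port, data = udp_unwrap(segment)
```

Each layer only looks at its own header: the network layer reads the addresses, and the transport layer reads the ports.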

2.1 Advantages

UDP doesn’t guarantee data reliability, but it can transmit data very fast, so UDP will be used in areas where speed is a concern but data integrity is not strictly required, such as online video and interactive games.

2.2 Shortcomings

1. Data packets are easily lost during transmission.
2. Large files are split into many small packets for transmission. These packets travel over different routes and arrive at the receiver at different times, but UDP does not know how to reassemble them into the complete file.

3. TCP: Delivers data to the application in its entirety

Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. Compared with UDP, TCP has the following two features:

1. TCP provides a retransmission mechanism for lost packets.
2. TCP introduces a packet ordering mechanism to ensure that out-of-order packets are combined into a complete file.

The transmission flow of a single TCP packet is similar to that of UDP. The difference is that the information in the TCP header ensures the integrity of a large piece of data.
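
As a rough illustration of how sequence numbers in the header make reordering possible, here is a minimal sketch. It is not real TCP: segments are plain `(sequence_number, data)` pairs, numbering starts at 0, and the function name `reassemble` is made up for this example.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, data) pairs.

    Returns (stream, missing): `stream` is the contiguous data starting at
    sequence number 0, and `missing` is the sequence number of the first gap
    (a segment that would have to be retransmitted), or None if there is none.
    """
    received = {}
    for seq, data in segments:
        if data:
            received.setdefault(seq, data)  # ignore duplicate arrivals

    stream = b""
    next_seq = 0
    while next_seq in received:
        chunk = received[next_seq]
        stream += chunk
        next_seq += len(chunk)

    has_gap = any(seq > next_seq for seq in received)
    return stream, (next_seq if has_gap else None)
```

Segments that arrive out of order are put back in sequence, and a gap in the numbering tells the receiver which part is still missing.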

The life cycle of a complete TCP connection includes three stages: establishing a connection, transmitting data, and disconnecting.

Conclusion:

Data on the Internet is transmitted in data packets, which can easily be lost or corrupted during transmission.
IP is responsible for delivering packets to the destination host.
UDP is responsible for delivering data packets to specific applications.
TCP ensures the complete transmission of data. Its connection can be divided into three stages: establishing a connection, transmitting data and disconnecting the connection.

HTTP request process

HTTP is a protocol that allows browsers to fetch resources from servers, and it is the foundation of the Web. It is built on top of TCP connections. Two questions to keep in mind:

1. Why is a site usually slow to open the first time you visit it, and fast the second time?
2. After you log in to a website, you are still logged in the next time you visit it. How does this work?

What the browser does after you type a site's address into the address bar:

1. Build request:

The browser constructs the request line information and prepares to make a network request.

2. Find cache:

Browser caching is a technique for saving a copy of a resource locally for immediate use on the next request. When the browser discovers that the requested resource already has a copy in the browser cache, it intercepts the request, returns a copy of the resource, and ends the request without going to the source server for a new download. The benefits of this are:

Ease server-side stress and improve performance (less time to acquire resources).
For web sites, caching is an important part of fast resource loading.

Of course, if the cache lookup fails, it enters the network request process.

3. Prepare IP addresses and ports

Before HTTP work can begin, the browser needs to establish a connection with the server over TCP. In other words, HTTP content is carried in the data transmission stage of TCP. The browser asks DNS to return the IP address for the domain name. The browser also provides a DNS data caching service: if a domain name has already been resolved, the browser caches the result and uses it for the next query, which saves a network request. If the URL does not specify a port number, HTTP defaults to port 80.
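
A minimal sketch of this step, assuming a resolver function is passed in (so no real DNS query is made). The class `DnsCache`, the function `endpoint`, and the port table are hypothetical names for illustration. Port 443 for HTTPS is a well-known default that the text above does not mention, and a real cache would also expire entries.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


class DnsCache:
    def __init__(self, resolver):
        self._resolver = resolver  # function: hostname -> IP address string
        self._cache = {}
        self.lookups = 0           # number of real resolver calls made

    def resolve(self, hostname):
        if hostname not in self._cache:
            self.lookups += 1
            self._cache[hostname] = self._resolver(hostname)
        return self._cache[hostname]


def endpoint(url, dns):
    # Return the (ip, port) pair the browser would connect to for this URL.
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return dns.resolve(parts.hostname), port
```

Calling `endpoint` twice for the same host triggers only one resolver call; the second answer comes from the cache.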

4. Wait for the TCP queue

Chrome has a mechanism that allows you to establish a maximum of six TCP connections under the same domain name. If 10 requests occur at the same time under the same domain name, four of the requests will be queued until the ongoing requests are completed.
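
The queueing behavior can be sketched like this. It is a toy model, not Chrome's scheduler: requests are just strings, and `schedule` and `on_request_finished` are invented names.

```python
from collections import deque

MAX_CONNECTIONS_PER_HOST = 6  # the per-domain limit described above


def schedule(requests, limit=MAX_CONNECTIONS_PER_HOST):
    # The first `limit` requests get a connection; the rest wait in a queue.
    active = list(requests[:limit])
    queued = deque(requests[limit:])
    return active, queued


def on_request_finished(done, active, queued):
    # A finished request frees its slot for the oldest queued request.
    active.remove(done)
    if queued:
        active.append(queued.popleft())


requests = [f"/img/{i}.png" for i in range(10)]
active, queued = schedule(requests)
```

With ten simultaneous requests, six are active and four are queued; each completed request lets one queued request start.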

5. Establish a TCP connection: three-way handshake

6. Send an HTTP request

After the browser sends the request line, it sends additional information in the form of request headers, which tell the server some basic facts about the browser. For example, the headers include the operating system the browser runs on, the browser kernel, the domain name of the current request, the browser's Cookie information, and so on.
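
For a concrete picture of the request line followed by request headers, here is a minimal sketch that builds the text of an HTTP/1.1 request. The helper name `build_request` and the header values are made up for illustration.

```python
def build_request(method, path, host, headers=None):
    # Request line first, then one "Name: value" line per header.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_request(
    "GET", "/index.html", "example.com",
    {"User-Agent": "demo-browser/0.1", "Cookie": "UID=3431uad"},
)
```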

7. The server processes the HTTP request and returns response data

8. Disconnect

Normally, once the server has returned the response data to the client, it closes the TCP connection. But if the browser or server adds the Connection: Keep-Alive header, the TCP connection remains open after the response is sent. Keeping the TCP connection open saves the time needed to establish a connection for the next request and speeds up resource loading.

9. Redirection

When the response line returns a status code of 301, the server is telling the browser that it needs to redirect to another URL, which is contained in the Location field of the response header.
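
A small sketch of how a client might detect a redirect from the response line and headers. The parsing is simplified, and the names `parse_response_head` and `next_url` are hypothetical. Status code 302 is treated like 301 here.

```python
def parse_response_head(raw: str):
    # Split off the head (response line + headers) from the body.
    head = raw.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def next_url(raw: str):
    # Return the redirect target, or None if no redirect is requested.
    status, headers = parse_response_head(raw)
    if status in (301, 302):
        return headers.get("location")
    return None
```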

Answers:

1. Why do many sites open quickly the second time?

DNS cache: The DNS cache is relatively simple. It mainly keeps a local mapping in the browser between domain names and their IP addresses.
Page resource cache:

The browser uses the Cache-Control field in the response header to decide whether to cache the resource; the cache lifetime is set by the max-age parameter of Cache-Control.

But if the cache has expired, the browser makes a network request again and includes an If-None-Match field in the HTTP request header. After receiving the request header, the server uses the If-None-Match value to determine whether the requested resource has been updated.

If there is no update, the 304 status code is returned, which is equivalent to the server telling the browser, “This cache can continue to be used and I won’t send you the data again this time.”
If the resource is updated, the server returns the latest resource directly to the browser.
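
The freshness check and the 304 decision can be sketched as below. This is a simplification under stated assumptions: times are plain numbers in seconds, only max-age is parsed, and the server compares the If-None-Match value against a current version identifier (called `current_etag` here, a name not used in the text above). All function names are made up.

```python
import re


def max_age(cache_control: str):
    # Extract the max-age value (in seconds) from a Cache-Control header.
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(stored_at: float, cache_control: str, now: float) -> bool:
    # Browser side: can the cached copy still be used without a request?
    limit = max_age(cache_control)
    return limit is not None and (now - stored_at) < limit


def revalidate(if_none_match: str, current_etag: str, body: bytes):
    # Server side: 304 with no body if unchanged, otherwise the new resource.
    if if_none_match == current_etag:
        return 304, b""
    return 200, body
```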

In short, many sites open almost instantly on the second visit because many of their resources are cached locally. The browser cache saves time by answering requests directly from a local copy instead of making real network requests. DNS data is also cached by the browser, which eliminates DNS queries.

2. How is the login status maintained?

After receiving the information submitted by the browser, the server queries its backend to verify that the user's login information is correct. If it is, the server generates a string that identifies the user and writes it to the Set-Cookie field of the response header. The response header is then sent to the browser.

After receiving the response header from the server, the browser parses it. If the response header contains a Set-Cookie field, the browser saves the value of this field locally. For example, it keeps UID=3431uad locally.

When the user visits again, the browser makes an HTTP request, but before sending it, the browser reads the saved Cookie data and writes it into the Cookie field of the request header. The browser then sends the request to the server.

After receiving the HTTP request header, the server looks for the Cookie field in it. When it finds UID=3431uad, the server queries its backend, determines that the user is logged in, generates page data containing the user's information, and sends the generated data to the browser.

After the browser receives the page data containing the current user, it can correctly display the status information of the user login.

Simply put, if the response header sent by the server contains a Set-Cookie field, the browser keeps the contents of that field locally. The next time the client sends a request to the server, it automatically adds the Cookie value to the request header and then sends the request. When the server sees the Cookie sent by the client, it checks which client sent the request and compares it with the records on the server to obtain the user's status information.
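
The browser's side of this round trip can be sketched with Python's standard `http.cookies` module. The class `TinyCookieStore` is a made-up name, and real browsers also track domain, path, and expiry, which this sketch ignores.

```python
from http.cookies import SimpleCookie


class TinyCookieStore:
    def __init__(self):
        self._values = {}

    def store(self, set_cookie_header: str):
        # Called when a response header contains Set-Cookie.
        cookie = SimpleCookie()
        cookie.load(set_cookie_header)
        for name, morsel in cookie.items():
            self._values[name] = morsel.value

    def cookie_header(self) -> str:
        # Value for the Cookie field of the next request header.
        return "; ".join(f"{k}={v}" for k, v in self._values.items())


store = TinyCookieStore()
store.store("UID=3431uad; Path=/")
```

On the next visit, the browser would send `store.cookie_header()` as the Cookie field, which lets the server recognize the user.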


From entering a URL to displaying the page, what happens in between?

1. Synthesize the complete URL

If the input complies with URL rules, for example baidu.com, the address bar combines it with a protocol according to the rules to synthesize a complete URL, such as https://www.baidu.com.

2. URL request process

Next comes the page resource request process. The browser process sends the URL request to the network process through inter-process communication (IPC). After receiving the URL request, the network process starts the actual URL request flow. So what does this flow look like?

1. The network process checks whether the resource is in the local cache. If there is a cached resource, it is returned directly to the browser process. If the resource is not found in the cache, the flow proceeds directly to a network request.

2. Perform DNS resolution to obtain the IP address of the server for the requested domain name. If the request protocol is HTTPS, a TLS connection also needs to be established.

3. Establish a TCP connection with the server using an IP address. After the connection is established, the browser side will construct the request line, request information, etc., and attach the data related to the domain name, such as cookies, to the request header, and then send the constructed request information to the server.

4. After receiving the request information, the server generates response data (including the response line, response header, and response body) based on the request information and sends it to the network process. After the network process receives the response line and header, it parses the contents of the header.

5. After receiving the response header from the server, the network process parses the response header. If the returned status code is 301 or 302, the server requires the browser to redirect to another URL. The network process reads the redirected address from the Location field in the response header, then initiates a new HTTP or HTTPS request and starts all over again. If the response line is 200, then the browser can continue processing the request.

6. Process the response content based on the Content-Type returned in the response header.
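
As a rough sketch of this decision: HTML goes on to rendering, while a byte stream is handed to the download manager. The function name `handle_response`, the returned labels, and the choice of `application/octet-stream` as the byte-stream type are assumptions for this example.

```python
def handle_response(content_type: str) -> str:
    # Ignore parameters such as "; charset=utf-8" and compare the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "render"      # continue navigation and prepare a renderer
    if media_type == "application/octet-stream":
        return "download"    # hand over to the download manager
    return "other"
```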

7. Prepare the rendering process

Typically, opening a new page uses a separate renderer process.
If page B is opened from page A, and A and B belong to the same site, page B reuses the renderer process of page A. Otherwise, the browser process creates a new renderer process for B.
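
The reuse rule can be sketched as follows. "Same site" is approximated here as same protocol plus same root domain (the last two labels of the host name), which is a simplification that breaks for suffixes such as co.uk. The names `site_of` and `Browser` are invented for this example.

```python
from urllib.parse import urlsplit


def site_of(url):
    # Simplified notion of a site: (protocol, root domain).
    parts = urlsplit(url)
    root = ".".join(parts.hostname.split(".")[-2:])
    return parts.scheme, root


class Browser:
    def __init__(self):
        self._next_pid = 1
        self.renderer_of = {}  # page URL -> renderer process id

    def open(self, url, opener=None):
        if opener is not None and site_of(opener) == site_of(url):
            pid = self.renderer_of[opener]  # reuse page A's renderer
        else:
            pid = self._next_pid            # start a new renderer
            self._next_pid += 1
        self.renderer_of[url] = pid
        return pid
```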

8. Submit the document

The browser process sends a "submit document" message. When the renderer process receives it, the renderer process establishes a "pipeline" with the network process to transfer data.
After the document data transfer is complete, the renderer process returns a "confirm submit" message to the browser process.
After receiving the "confirm submit" message, the browser process updates the browser interface state, including the security state, the URL in the address bar, the forward and back history state, and the web page.

This explains why, when you type an address into your browser’s address bar, the previous page doesn’t disappear immediately, but instead takes a while to load before the page is updated.

9. Render updated page content

Summary

A summary of the whole process:

1. The user enters the URL and presses Enter.
2. The browser process checks the URL and adds the protocol to form a complete URL.
3. The browser process sends the URL request to the network process through IPC.
4. The network process checks whether the resource is in the local cache. If yes, the resource is returned to the browser process.
5. If no, the network process sends an HTTP request (network request) to the web server. The request flow is as follows:
5.1 Perform DNS resolution to obtain the server IP address and port. (Is the port obtained through DNS resolution?)
5.2 Establish a TCP connection with the server using the IP address.
5.3 Build the request header information.
5.4 Send the request header information.
5.5 After the server responds, the network process receives the response header and response information and parses the response content.
6. The network process parses the response:
6.1 Check the status code. If the status code is 301/302, a redirect is required: the address is read automatically from Location and step 4 is repeated. (Does a 301/302 redirect also read the local cache? This is an open question.) If the status code is 200, continue processing the request.
6.2 Handling a 200 response: check the Content-Type of the response. If it is a byte stream, the request is handed to the download manager and the navigation ends without further rendering. If it is HTML, the browser process is notified to prepare a renderer process for rendering.
7. Prepare the renderer process:
7.1 The browser process checks whether the current URL has the same root domain name as a previously opened renderer process. If it does, the existing process is reused. If not, a new renderer process is started.
8. Submit the document:
8.1 When the renderer process is ready, the browser process sends a "submit document" message to the renderer process. The renderer process receives the message and establishes a "pipeline" with the network process to transfer data.
8.2 After the renderer process receives the data, it returns a "confirm submit" message to the browser process.
8.3 After receiving the "confirm submit" message, the browser process updates the browser interface state, including the security state, the URL, the forward and back history state, and the web page.

9. Browsers cannot understand HTML data directly, so the first step is to convert it into a DOM tree structure that browsers can understand. After the DOM tree is generated, the style of every node in the DOM tree is calculated according to the CSS style sheets. Finally, the layout information of the DOM elements is computed and stored in the layout tree.

How do HTML, CSS, and JavaScript become pages?

1. Build a DOM tree

Because browsers can’t understand and use HTML directly, you need to transform the HTML into a structure that browsers can understand — a DOM tree.

2. Style calculation

2.1 Convert CSS to a structure that browsers can understand

When the rendering engine receives CSS text, it performs a conversion operation to convert the CSS text into a structure that the browser can understand — styleSheets.

2.2 Transform property values in the stylesheet to standardize them

2.3 Calculate the specific style of each node in the DOM tree

3. Layout stage

You need to figure out the geometry of the visible elements in the DOM tree

3.1 Creating a Layout tree

Traverse all visible nodes in the DOM tree and add them to the layout tree.
Invisible nodes are ignored by the layout tree, such as everything under the head tag, or the body.p.span element, which is not included in the layout tree because its style contains display: none.
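
A minimal sketch of these two rules, using a toy `Node` class rather than a real DOM. All names here are made up, and real engines handle many more cases (for example pseudo-elements).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    style: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def build_layout_tree(node):
    # Skip the head subtree and anything styled with display: none.
    if node.tag == "head" or node.style.get("display") == "none":
        return None
    kept = [build_layout_tree(child) for child in node.children]
    return Node(node.tag, dict(node.style), [c for c in kept if c is not None])


def tags(node):
    # Flatten a tree into a list of tag names, in document order.
    return [node.tag] + [t for child in node.children for t in tags(child)]


dom = Node("html", children=[
    Node("head", children=[Node("title")]),
    Node("body", children=[
        Node("p", children=[Node("span", {"display": "none"})]),
        Node("div"),
    ]),
])
layout_tree = build_layout_tree(dom)
```

The resulting layout tree keeps html, body, p, and div, and drops head, title, and the hidden span.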

3.2 Layout Calculation

4. Layering

Because pages contain many complex effects, such as 3D transforms, page scrolling, or z-index ordering, the rendering engine also needs to generate dedicated layers for specific nodes and build a corresponding layer tree to make these effects easier to implement.

5. Layer drawing

After building the layer tree, the rendering engine draws each layer in the tree.

6. Rasterization operation

Rasterization refers to converting a tile into a bitmap.

7. Composition and display

8. Summary of rendering process

The renderer process converts the HTML content into a DOM tree structure that it can read.
The rendering engine converts CSS style sheets into styleSheets that the browser can understand and calculates the style of each DOM node.
Create a layout tree and calculate the layout information of the elements.
Layer the layout tree and generate a layer tree.
Generate a paint list for each layer and submit it to the compositor thread.
The compositor thread divides the layers into tiles and converts the tiles into bitmaps in the rasterization thread pool.
The compositor thread sends the DrawQuad command to the browser process.
The browser process generates the page from the DrawQuad message and displays it on the monitor.

9. Expand on relevant concepts

Rearrangement (reflow), redraw (repaint), and composition

9.1 Updating element geometry (rearrangement)

If you modify the geometry of an element using JavaScript or CSS, for example by changing its width or height, the browser triggers a re-layout followed by the whole series of later sub-stages. This process is called rearrangement. Rearrangement requires updating the entire rendering pipeline, so it is also the most expensive.

9.2 Updating an element's paint attributes (redraw)

If you change the background color of an element, the layout stage is not performed, because the geometry has not changed. The pipeline goes directly to the paint stage and then runs the subsequent sub-stages. This process is called redraw. Compared with rearrangement, redraw skips the layout and layering stages, so it is more efficient than rearrangement.

9.3 Going directly to the composition stage

If we use the CSS transform property to implement an animation, the rearrangement and redraw stages are avoided and the compositing animation runs directly on a non-main thread. This is the most efficient case: because compositing happens off the main thread, it does not occupy main-thread resources, and it skips the layout and paint sub-stages. Compared with redraw and rearrangement, composition can therefore greatly improve rendering efficiency.

10. Think

10.1 What operations trigger rearrangement?

For example, accessing properties such as offsetTop, offsetLeft, offsetWidth, offsetHeight, scrollTop, scrollLeft, scrollWidth, scrollHeight, clientTop, clientLeft, clientWidth, and clientHeight.

10.2 What operations trigger redraw?

Changing color, visibility, outline, or background-color.

10.3 How to Reduce rearrangement and redrawing

There are many ways to reduce rearrangement and redrawing:

Use classes to change styles rather than manipulating individual style properties frequently
Avoid table layouts
Batch DOM operations, for example with createDocumentFragment, or use a framework such as React
Debounce the window resize event
Read and write DOM attributes separately
Use will-change: transform for optimization
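
To see why separating reads from writes helps, here is a toy simulation. It is not a browser API: `FakeDocument` and `FakeElement` are invented stand-ins in which any write marks the layout as dirty, and any read of a geometry value forces a layout if it is dirty.

```python
class FakeDocument:
    def __init__(self):
        self.layout_count = 0  # how many forced layouts happened
        self.dirty = False


class FakeElement:
    def __init__(self, doc, width):
        self.doc = doc
        self._width = width

    def set_width(self, value):
        # A write invalidates the current layout.
        self._width = value
        self.doc.dirty = True

    def get_width(self):
        # A read needs an up-to-date layout, so a dirty layout is recomputed.
        if self.doc.dirty:
            self.doc.layout_count += 1
            self.doc.dirty = False
        return self._width


def interleaved(elements):
    for el in elements:
        el.set_width(el.get_width() + 10)  # read, write, read, write, ...


def batched(elements):
    widths = [el.get_width() for el in elements]  # all reads first
    for el, width in zip(elements, widths):       # then all writes
        el.set_width(width + 10)
```

In this model, interleaving reads and writes forces a layout on almost every read, while batching leaves a single layout to be done once at the end.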

Refer to the column: time.geekbang.org/column/intr…

Browser working principle (a) macro browser