[7,000 words] An all-nighter on browser internals: from input to render

preface

Chrome Comic, a comic book with a brief overview of the Chrome architecture, was released with the Chrome browser in 2008.

The comic catalog is as follows:

The story behind open source browsers
Stability, rigor, and multitasking architecture
Speed: WebKit and V8
Search and user experience
Security, sandbox mode, and risk-free browsing
Gears, standards and open source

1. CPU, GPU, memory, and multi-process architecture

The core of a computer is the CPU and GPU

CPU

The CPU is the brain of the computer and can handle many different tasks. Most CPUs today are a single chip. A core is like another CPU living on the same chip.

GPU

GPUs were originally developed for graphics processing. They excel at handling simple tasks across many cores at once.

Typically, applications run on the CPU and GPU using mechanisms provided by the operating system.

Processes and threads

A process can be described as the executor of an application program. A thread lives inside a process and executes any part of that process's program.

A program creates a process when it starts, and the program may also create threads to help it do its job. The operating system provides a “block of memory” for processes to use, and all application state is stored in this private memory space. When you close the application, the process also disappears and the operating system frees memory.

A process can ask the operating system to start another process to run different tasks, in which case a different portion of memory is allocated to the new process. If two processes need to talk to each other, they use Inter-Process Communication (IPC). If a worker process becomes unresponsive, it can be restarted without stopping the other processes that run different parts of the application.

Browser architecture

A browser can be built as a single process with many different threads, or as many different processes with a few threads each, communicating through IPC.

For Chrome, the latest architecture looks like this:

Process	Role
Browser	Controls the "chrome" part of the application, including the address bar, bookmarks, and back and forward buttons. It also handles the invisible, privileged parts of the web browser, such as network requests and file access.
Renderer	Controls everything displayed inside the tab where a website is shown.
Plugin	Controls any plug-ins used by the website, such as Flash.
GPU	Handles GPU tasks in isolation from other processes. It is split into a separate process because the GPU handles requests from multiple applications and draws them on the same surface.

The following image shows different processes pointing to different parts of the browser UI:

Of course, there are more processes, such as extension processes and utility processes, and so on.

Advantages of multiple processes in Chrome

Suppose you have three tabs open, each run by a separate renderer process. If one tab becomes unresponsive, you can close it and carry on while the other tabs stay alive. If all tabs ran in a single process, one unresponsive tab would make all of them unresponsive.

Another benefit of splitting the browser's work into multiple processes is security and sandboxing. Since the operating system provides a way to restrict the privileges of a process, the browser can sandbox certain processes away from certain features. For example, Chrome restricts arbitrary file access for processes that handle arbitrary user input, such as the renderer process.

Because processes have their own private memory space, they often contain copies of common infrastructure (such as V8, which is Chrome's JavaScript engine). This means more memory usage, because these copies cannot be shared the way they could be if they were threads inside the same process. To save memory, Chrome limits the number of processes it can start. The limit depends on the memory and CPU capacity of the device, but once Chrome reaches it, it starts running multiple tabs from the same site in a single process.
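
The process limit described above can be sketched in a few lines. This is a toy model, not Chrome's actual policy: `siteOf`, `assignProcesses`, and the fallback to the least-loaded process are all hypothetical, and "site" is approximated as the scheme plus the last two host labels.

```javascript
// Toy model of a renderer process limit (hypothetical, not Chrome's policy).
// Assumption: a "site" is the scheme plus the last two host labels. Real
// browsers consult the Public Suffix List instead.
function siteOf(url) {
  const { protocol, hostname } = new URL(url);
  return `${protocol}//${hostname.split('.').slice(-2).join('.')}`;
}

function assignProcesses(tabUrls, processLimit) {
  if (processLimit < 1) throw new RangeError('processLimit must be at least 1');
  const processes = []; // each entry: { id, site, tabs }
  for (const url of tabUrls) {
    const site = siteOf(url);
    let proc;
    if (processes.length < processLimit) {
      // Below the limit: every tab gets its own renderer process.
      proc = { id: processes.length + 1, site, tabs: [] };
      processes.push(proc);
    } else {
      // Limit reached: reuse a process that already runs the same site.
      proc = processes.find((p) => p.site === site);
      // Assumption: with no same-site process, pick the least loaded one.
      proc ??= processes.reduce((a, b) => (b.tabs.length < a.tabs.length ? b : a));
    }
    proc.tabs.push(url);
  }
  return processes;
}
```

With a limit of 2, a third tab from a site that already has a process is placed in that process rather than in a new one.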

Saving more memory: servicification in Chrome

Chrome is undergoing architectural changes to run each part of the browser program as a service, which can be easily split into different processes or aggregated into a single process.

When Chrome is running on powerful hardware, it may split each service into separate processes to provide greater stability, but if it is running on resource-constrained devices, Chrome will consolidate the service into a single process to save on memory footprint. Prior to this change, a similar approach was used to merge processes to reduce memory usage on platforms such as Android.

Site isolation

Site isolation runs a separate renderer process for each cross-site iframe, so that different sites no longer share a memory space. The same-origin policy is the core security model of the web; it ensures that one site cannot access data from other sites without consent. Bypassing the same-origin policy is a primary goal of security attacks, and separating sites into processes is the most effective way for a browser to keep them apart. Since Chrome 67, site isolation has been enabled by default on desktop, so each cross-site iframe in a tab gets its own renderer process. This also fundamentally changes the way iframes communicate with each other.
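
As a rough illustration of the two notions involved, the sketch below compares origins with the standard `URL` API and then decides which frames would need their own process. The helper names are hypothetical, and `approximateSite` is a deliberate simplification (last two host labels) rather than how Chrome computes a site.

```javascript
// Same-origin check: scheme, host, and port must all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Assumption: "site" approximated as scheme + last two host labels.
function approximateSite(url) {
  const { protocol, hostname } = new URL(url);
  return `${protocol}//${hostname.split('.').slice(-2).join('.')}`;
}

// Hypothetical helper: which iframes are cross-site relative to the page
// and would therefore get a separate renderer process under site isolation?
function framesNeedingOwnProcess(pageUrl, frameUrls) {
  const pageSite = approximateSite(pageUrl);
  return frameUrls.filter((frameUrl) => approximateSite(frameUrl) !== pageSite);
}
```

Note that a frame can be cross-origin yet same-site (for example, two subdomains of one registrable domain); in this sketch only cross-site frames are split out.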

2. Navigation

You type a URL into the browser, and the browser fetches data from the Internet and displays a page. What happens between requesting the site and the browser getting ready to render it?

As we've seen, everything outside of the tabs is handled by the browser process. The browser process has several threads, such as the UI thread that draws buttons and input fields, the network thread that handles the network stack to receive data from the Internet, the storage thread that controls access to files, and so on. When you enter a URL in the address bar, the input is handled by the browser process's UI thread.

start

Step 1: Process the input

When you type something into the address bar, the first thing the UI thread asks is "Is this a search query or a URL?". In Chrome, the address bar is also a search input field, so the UI thread needs to parse the input and decide whether to send it to the search engine or to the site you requested.
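
A minimal sketch of that decision, assuming a very simple heuristic: text with an explicit scheme, or text without spaces that looks like a host name, is treated as a URL, and anything else as a search query. `classifyInput` and its rules are hypothetical; the real omnibox logic is far more involved.

```javascript
// Returns the normalized URL, or null if the text does not parse as one.
function tryParseUrl(text) {
  try {
    return new URL(text).href;
  } catch {
    return null;
  }
}

// Hypothetical heuristic for "search query or URL?".
function classifyInput(text) {
  const input = text.trim();
  if (input === '') return { kind: 'empty' };

  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  // No whitespace, and something like "example.com" or "example.com/path".
  const looksLikeHost =
    !/\s/.test(input) && /^[^/]+\.[a-z]{2,}(:\d+)?(\/.*)?$/i.test(input);

  let url = null;
  if (hasScheme) {
    url = tryParseUrl(input);
  } else if (looksLikeHost) {
    url = tryParseUrl(`https://${input}`); // assumption: default to https
  }
  return url ? { kind: 'url', url } : { kind: 'search', query: input };
}
```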

Step 2: Start navigation

When the Enter key is pressed, the UI thread initiates a network request to fetch the site content. A loading spinner is displayed in the corner of the tab, and the network thread goes through the appropriate protocols, such as a DNS lookup and establishing a TLS connection for the request.

At this point, the network thread might receive a server redirect header, such as HTTP 301. In that case, the network thread tells the UI thread that the server is requesting a redirect, and another URL request is initiated.
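
The redirect loop can be sketched as follows. Nothing here performs real network I/O: `lookup` is a hypothetical stand-in for the network stack that returns a `{ status, headers }` object for a URL, and the redirect cap of 20 is an assumption.

```javascript
const REDIRECT_STATUSES = [301, 302, 303, 307, 308];

// Follows redirects until a non-redirect response arrives.
// `lookup(url)` is a hypothetical function standing in for the network.
function resolveRedirects(startUrl, lookup, maxRedirects = 20) {
  const chain = [startUrl];
  let url = startUrl;
  for (let hops = 0; hops <= maxRedirects; hops++) {
    const response = lookup(url);
    if (!REDIRECT_STATUSES.includes(response.status)) {
      return { finalUrl: url, chain, response };
    }
    // The Location header may be relative, so resolve it against the current URL.
    url = new URL(response.headers.location, url).href;
    chain.push(url);
  }
  throw new Error('Too many redirects');
}
```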

Step 3: Read the response

Once the response body (the payload) starts coming in, the network thread looks at the first few bytes of the stream if necessary. The Content-Type header of the response should say what type of data it is, but because it can be missing or wrong, MIME type sniffing is done here.

If the response is an HTML file, the next step is to pass the data to the renderer process. If it is a ZIP file or some other file, it is a download request, and the data is passed to the download manager instead.
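
A simplified sketch of sniffing and routing, assuming only three signatures are checked (ZIP, PDF, and an HTML prefix). The function names are hypothetical and this is not the algorithm browsers actually implement; it only illustrates peeking at the first bytes and choosing a destination.

```javascript
// Guess a MIME type from the first bytes, falling back to the declared type.
function sniffMimeType(bytes, declaredType) {
  const startsWith = (signature) => signature.every((b, i) => bytes[i] === b);
  if (startsWith([0x50, 0x4b, 0x03, 0x04])) return 'application/zip'; // "PK\x03\x04"
  if (startsWith([0x25, 0x50, 0x44, 0x46, 0x2d])) return 'application/pdf'; // "%PDF-"

  const head = new TextDecoder().decode(bytes.slice(0, 64)).trimStart().toLowerCase();
  if (head.startsWith('<!doctype html') || head.startsWith('<html')) return 'text/html';

  return declaredType || 'application/octet-stream';
}

// HTML goes to the renderer; everything else is treated as a download here.
function routeResponse(bytes, declaredType) {
  return sniffMimeType(bytes, declaredType) === 'text/html'
    ? 'renderer'
    : 'download-manager';
}
```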

This is also where the Safe Browsing check happens. If the domain and the response data match a known malicious site, the network thread raises an alert and a warning page is displayed. The Cross-Origin Read Blocking (CORB) check also happens at this stage, to make sure that sensitive cross-site data does not reach the renderer process.

Step 4: Find the renderer process

Once all the checks are done and the network thread is sure that the browser should navigate to the requested site, the network thread tells the UI thread that the data is ready. The UI thread then finds a renderer process to render the page.

Because a network request can take hundreds of milliseconds to get a response, an optimization is applied to speed up this step. When the UI thread sends the URL request to the network thread in step 2, it already knows which site the user is navigating to. The UI thread therefore tries to proactively find or start a renderer process in parallel with the network request. If all goes as expected, the renderer process is already on standby when the network thread receives the data. If the navigation redirects to a different site, this standby process may not be used, in which case a different process may be required.

Step 5: Commit the navigation

Now that the data and the renderer process are ready, an IPC is sent from the browser process to the renderer process to commit the navigation. The browser process also passes on the data stream, so the renderer process can keep receiving HTML data. Once the browser process hears confirmation that the commit has happened in the renderer process, the navigation is complete and the document loading phase begins.

At this point, the address bar has been updated, and the security indicator and site settings UI reflect the site information of the new page. The tab's session history is updated, so the back and forward buttons step through the site you just navigated to. To make tab and session restore possible after a tab or window is closed, the session history is stored on disk.
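
The session history behavior can be modelled as a list of entries plus a current index. The `SessionHistory` class below is a hypothetical illustration, not Chrome's data structure; `serialize` only hints at the "stored on disk" part by producing a JSON string.

```javascript
// Hypothetical model of a tab's session history.
class SessionHistory {
  constructor() {
    this.entries = [];
    this.index = -1;
  }

  get current() {
    return this.entries[this.index] ?? null;
  }

  // Called when a navigation commits. Any "forward" entries are discarded.
  commit(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  back() {
    if (this.index > 0) this.index -= 1;
    return this.current;
  }

  forward() {
    if (this.index < this.entries.length - 1) this.index += 1;
    return this.current;
  }

  // A string that could be written to disk for session restore.
  serialize() {
    return JSON.stringify({ entries: this.entries, index: this.index });
  }
}
```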

Other steps

After the navigation is committed, the renderer process carries on loading resources and rendering the page. Once the renderer process "finishes" rendering, it sends an IPC back to the browser process (this happens after the onload event has fired on every frame in the page and the handlers have finished executing). At this point, the UI thread stops the loading spinner on the tab.

After that, client-side JavaScript can still load additional resources and render new views.

Navigate to other sites

What happens if the user puts a different URL into the address bar again? The browser process goes through the same steps to navigate to the other site. But before it does, it needs to check whether the currently rendered site cares about the beforeunload event.

beforeunload can create a "Leave this site?" alert when you try to navigate away or close the tab. Everything inside a tab, including your JavaScript code, is handled by the renderer process, so the browser process has to check with the current renderer process when a new navigation request comes in.

Note: Do not add unconditional beforeunload handlers. They create more latency because the handler has to run before the navigation can even start. Add this event handler only when it is needed, for example when you need to warn users that they might lose data they have entered on the page.
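
One way to follow this advice is to attach the handler only while there is unsaved data and detach it afterwards. The sketch below takes the event target as a parameter so it is easy to try outside a page; `createUnsavedChangesGuard` is a hypothetical helper name.

```javascript
// Attaches a beforeunload handler only while there are unsaved changes.
function createUnsavedChangesGuard(target) {
  const onBeforeUnload = (event) => {
    // Calling preventDefault() asks the browser to show its confirmation dialog.
    event.preventDefault();
  };
  let attached = false;

  return {
    setDirty(dirty) {
      if (dirty && !attached) {
        target.addEventListener('beforeunload', onBeforeUnload);
        attached = true;
      } else if (!dirty && attached) {
        target.removeEventListener('beforeunload', onBeforeUnload);
        attached = false;
      }
    },
  };
}

// In a page this would be used roughly as:
//   const guard = createUnsavedChangesGuard(window);
//   form.addEventListener('input', () => guard.setDirty(true));
//   form.addEventListener('submit', () => guard.setDirty(false));
```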

When the new navigation goes to a different site than the one currently rendered, a separate renderer process is called in to handle the new navigation, while the current renderer process is kept around to handle events such as unload. For more, see the documentation on page lifecycle states.

Two IPCs are sent from the browser process here: one telling the new renderer process to render the page, and one telling the old renderer process to unload:

Service Worker

A service worker gives developers more control over what is cached locally and when new data is fetched from the network. If the service worker is set to load the page from the cache, there is no need to request the data from the network.

Note: The Service Worker is JavaScript code that runs in the renderer process.

But how does the browser process know which site has a Service Worker when a navigation request comes in?

After a Service Worker is registered, the scope of the Service Worker is preserved. When navigation occurs, the network thread checks the domain against the scope of the registered Service Worker, and if the Service Worker has been registered for the URL, the UI thread looks for the renderer process to execute the Service Worker code. The Service Worker may load data from the cache without having to request data from the network, or it may request new resources from the network.
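
The scope check amounts to a URL prefix match in which the longest matching scope wins. A minimal sketch, with `findServiceWorkerScope` as a hypothetical helper and the registered scopes supplied as a plain array:

```javascript
// Returns the longest registered scope that is a prefix of the navigation
// URL, or null if no service worker controls that URL.
function findServiceWorkerScope(navigationUrl, registeredScopes) {
  const url = new URL(navigationUrl).href;
  let best = null;
  for (const scope of registeredScopes) {
    const scopeUrl = new URL(scope).href;
    if (url.startsWith(scopeUrl) && (best === null || scopeUrl.length > best.length)) {
      best = scopeUrl;
    }
  }
  return best;
}
```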

The UI thread in the browser process starts the renderer process to handle the service worker. The worker thread in the renderer process then requests data from the network:

Navigation preloading

If the service worker ultimately decides to request data from the network, this round trip between the browser process and the renderer process can cause delays. Navigation preload is a mechanism that speeds this up by loading resources in parallel with the service worker starting up. It marks these requests with a header, allowing the server to decide to send different content for them, for example just the updated data instead of a full document.
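
On the server side, the header check might look like the sketch below. The header name `Service-Worker-Navigation-Preload` is the one browsers send for navigation preload requests; `chooseNavigationResponse` and the two response kinds are hypothetical placeholders for whatever a real server would return.

```javascript
// Decide what to send back for a navigation request, based on whether the
// browser marked it as a navigation preload.
function chooseNavigationResponse(requestHeaders) {
  // Header names are case-insensitive, so normalize them first.
  const headers = Object.fromEntries(
    Object.entries(requestHeaders).map(([name, value]) => [name.toLowerCase(), value])
  );
  const isPreload = headers['service-worker-navigation-preload'] !== undefined;

  // Assumption: the service worker already has the page shell cached, so a
  // preload request only needs the fresh data.
  return isPreload ? { kind: 'data-only' } : { kind: 'full-document' };
}
```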

3. Rendering

After navigation, the browser hands the work over to the renderer process.

The renderer process handles web content

The renderer process is responsible for everything that happens inside a tab. In the renderer process, the main thread handles most of the code you send to the user. If you use a Web Worker or a Service Worker, parts of your JavaScript are handled by worker threads. The compositor and raster threads also run inside the renderer process to render the page efficiently and smoothly.

The core job of the renderer process is to transform HTML, CSS, and JavaScript into web pages that users can interact with.

Parsing

Build the DOM

When the renderer process receives the commit message for a navigation and starts receiving HTML data, the main thread begins parsing the HTML and turning it into a Document Object Model (DOM).

The DOM is the browser's internal representation of the page, as well as the data structure and API that developers interact with through JavaScript. Parsing an HTML document into a DOM is defined by the HTML standard, which is why malformed markup is often corrected automatically, as described in the standard's section on parser error handling.

Subresource loading

External resources such as images, CSS, and JavaScript need to be loaded from the network or the cache. The main thread could request them one by one as it finds them while parsing to build the DOM, but to speed things up, the preload scanner runs concurrently. If there are things like <img> or <link> in the HTML document, the preload scanner peeks at the tokens generated by the HTML parser and sends requests to the network thread in the browser process.
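
To make the idea concrete, here is a deliberately naive scanner that pulls resource URLs out of an HTML string with regular expressions. A real preload scanner works on parser tokens rather than regexes; `scanForSubresources` is a hypothetical name and the sketch ignores many cases (unquoted attributes, `srcset`, `data-src`, and so on).

```javascript
// Naive illustration of a preload scan: find <img>, <script>, and <link>
// tags and collect the URL each one points at.
function scanForSubresources(html) {
  const found = [];
  const tagPattern = /<(img|script|link)\b([^>]*)>/gi;
  for (const [, rawTag, attrs] of html.matchAll(tagPattern)) {
    const tag = rawTag.toLowerCase();
    const attrName = tag === 'link' ? 'href' : 'src';
    const attrPattern = new RegExp(`\\b${attrName}\\s*=\\s*["']([^"']+)["']`, 'i');
    const match = attrPattern.exec(attrs);
    if (match) found.push({ tag, url: match[1] });
  }
  return found;
}
```

Each URL found this way could be handed to the network layer before the main parser reaches the corresponding tag.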

JavaScript can block parsing

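When the HTML parser finds a <script> tag, it pauses parsing of the HTML document and has to load, parse, and execute the JavaScript code first, because JavaScript can change the shape of the document using things like document.write().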

Hint to the browser how you want to load resources

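If your JavaScript does not use document.write(), you can add the async or defer attribute to the <script> tag. The browser then loads and runs the JavaScript code asynchronously and does not block parsing.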

Style parsing

The main thread parses the CSS and determines the computed style of each DOM node.

Even if you provide no CSS at all, each DOM node still has a computed style, because the browser has a default style sheet.

Layout

So far, the renderer process knows the structure of the document and the style of each node.

Layout is the process of finding the geometry of elements. The main thread walks through the DOM and the computed styles and creates a layout tree, which holds information such as x and y coordinates and bounding box sizes. The layout tree may have a structure similar to the DOM tree, but it only contains information about what is visible on the page. If display: none is applied, the element is not part of the layout tree (an element with visibility: hidden, however, is in the layout tree). Similarly, if a pseudo-element with content is applied, such as p::before{content:"Hi!"}, it is included in the layout tree even though it is not in the DOM.
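
The three rules above (display: none is dropped, visibility: hidden is kept, generated content is added) can be shown on a toy tree. The node shape used here (`tag`, `style`, `before`, `children`) is a hypothetical stand-in for real DOM nodes and computed styles, and no geometry is calculated.

```javascript
// Builds a toy layout tree from a toy DOM tree.
function buildLayoutTree(node) {
  const style = node.style ?? {};
  // display: none removes the element (and its subtree) from the layout tree.
  if (style.display === 'none') return null;

  const children = [];
  // Generated content such as p::before { content: "Hi!" } gets a box even
  // though it has no DOM node.
  if (node.before) {
    children.push({ name: '::before', text: node.before, hidden: false, children: [] });
  }
  for (const child of node.children ?? []) {
    const box = buildLayoutTree(child);
    if (box) children.push(box);
  }
  // visibility: hidden still occupies space, so the box stays in the tree.
  return { name: node.tag, hidden: style.visibility === 'hidden', children };
}
```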

Working out the layout of an entire page from its CSS is a complex task. If you want to know more about it, check out this presentation.

Paint

So far you have the DOM, the styles, and the layout, but to start painting you still need to decide in what order to paint. For example, z-index might be set on some elements, in which case painting in the order the elements are written in the HTML would produce incorrect rendering.
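
The z-index example can be sketched as a stable sort: elements with a lower z-index are painted first, and ties keep document order. This is a simplification under stated assumptions; real paint order is defined in terms of stacking contexts, and `paintOrder` is a hypothetical helper.

```javascript
// Returns element ids in the order they would be painted (back to front).
// Assumption: every element is positioned and lives in one stacking context.
function paintOrder(elements) {
  return elements
    .map((element, docOrder) => ({ ...element, docOrder }))
    .sort((a, b) => (a.zIndex ?? 0) - (b.zIndex ?? 0) || a.docOrder - b.docOrder)
    .map((element) => element.id);
}
```

Painting strictly in document order would draw a high z-index element first and let later elements cover it, which is the incorrect rendering the paragraph above warns about.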

In the paint step, the main thread walks the layout tree to create paint records. A paint record is a note of the painting process, such as "background first, then text, then rectangle". This is similar to drawing on a <canvas> element with JavaScript.

Things to note

The most important thing to grasp about the rendering pipeline is that each step uses the result of the previous step to create new data. If something changes in the layout tree, the paint order has to be regenerated for the affected parts of the document.

If you animate elements, the browser has to run these operations between every frame. Most displays refresh the screen 60 times per second (60 fps), and an animation appears smooth to the human eye when things move across the screen on every frame. If the animation misses frames in between, however, the page appears "janky".

Even if the rendering operations keep up with the screen refresh, these calculations run on the main thread, which means they can be blocked while the application is running JavaScript.

To deal with this, you can break a JavaScript operation into small chunks and schedule them to run on each frame using requestAnimationFrame(), or run the JavaScript in a Web Worker to avoid blocking the main thread. Guides on optimizing JavaScript execution cover this in more detail.
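
A minimal sketch of the chunking approach. The scheduler is passed in as a parameter so the same function works with `requestAnimationFrame` in a page or with a stand-in elsewhere; `runInChunks` and its parameters are hypothetical names.

```javascript
// Processes `items` a few at a time, yielding back to the browser between
// chunks so that rendering and input handling can happen in between.
function runInChunks(items, handle, chunkSize, schedule) {
  let next = 0;
  function step() {
    const end = Math.min(next + chunkSize, items.length);
    for (; next < end; next++) {
      handle(items[next]);
    }
    // More work left: ask for another slot instead of looping on.
    if (next < items.length) schedule(step);
  }
  schedule(step);
}

// In a page this could be called roughly as:
//   runInChunks(rows, renderRow, 50, (callback) => requestAnimationFrame(callback));
```

The chunk size is a tuning knob: smaller chunks keep each frame short but need more frames to finish the work.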

Compositing

Rasterization

By now, the browser knows the structure of the document, the style of each element, the geometry of the page, and the paint order, so it can do the actual drawing. Turning this information into pixels on the screen is called rasterization.

When Chrome was first released, it handled rasterization by rastering only the parts of the page inside the viewport. As the user scrolled the page, it moved the rastered frame and filled in the missing parts by rastering more.

However, in modern browsers there is a more complex process called compositing.

Compositing is a technique that separates parts of a page into layers, rasterizes them individually, and composes them into a page on a separate thread called the compositor thread. When a scroll happens, the layers are already rasterized, so all that has to be done is to compose a new frame. Animation can be achieved in the same way, by moving layers and compositing a new frame. To see how a page is divided into layers, open More tools > Layers in DevTools.

Dividing into layers

To figure out which elements need to be in which layers, the main thread walks the layout tree to create the layer tree (this step is called "Update Layer Tree" in the DevTools Performance panel). If a part of the page that should be a separate layer (such as a slide-in side menu) is not getting one, you can hint to the browser using the will-change CSS property.

You might be tempted to give every element its own layer, but compositing across an excessive number of layers can be slower than rasterizing a small part of the page every frame.

Raster and composite off the main thread

Once the layer tree is created and the paint order is determined, the main thread commits this information to the compositor thread. The compositor thread then rasterizes each layer. A layer can be as large as the entire length of the page, so the compositor thread divides it into tiles and sends each tile to the raster threads. The raster threads rasterize each tile and store the results in GPU memory.

The compositor thread can prioritize different raster threads so that things inside (or near) the viewport are rasterized first. A layer also has multiple tilings at different resolutions to handle things like zoom operations.
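
Tiling and viewport-first prioritization can be sketched as two small functions. The tile size, the vertical-distance metric, and both function names are assumptions made for illustration; Chrome's real tile manager is more elaborate.

```javascript
// Splits a layer into tiles of at most tileSize x tileSize pixels.
function tileLayer(layerWidth, layerHeight, tileSize) {
  const tiles = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      tiles.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return tiles;
}

// Orders tiles so that those overlapping or touching the viewport come first,
// followed by tiles at increasing vertical distance from it.
function prioritizeTiles(tiles, viewport) {
  const distance = (tile) => {
    const above = viewport.y - (tile.y + tile.height);
    const below = tile.y - (viewport.y + viewport.height);
    return Math.max(0, above, below);
  };
  return [...tiles].sort((a, b) => distance(a) - distance(b));
}
```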

Once the tiles are rasterized, the compositor thread gathers tile information called draw quads to create a compositor frame.

Name	Description
Compositor frame	A collection of draw quads that represents one frame of a page.
Draw quad	Contains information such as the tile's location in memory and where on the page to draw the tile, taking page compositing into account.

The compositor frame is then submitted to the browser process via IPC. At this point, another compositor frame may be added from the UI thread for a change to the browser UI, or from other renderer processes for extensions. These compositor frames are sent to the GPU to be displayed on the screen. If a scroll event comes in, the compositor thread creates another compositor frame to send to the GPU.

The benefit of compositing is that it is done without involving the main thread. The compositor thread does not need to wait for style calculation or JavaScript execution. This is why compositing-only animations are considered the best choice for smooth performance. If layout or paint needs to be recalculated, the main thread has to be involved.

User input and the compositor

Input events from the browser's point of view

From the browser’s point of view, input means any event from the user. A mouse wheel scroll is an event, as is a touch or mouse hover.

When a user gesture such as a touch on the screen occurs, the browser process is the one that receives it first. However, the browser process only knows where the gesture occurred, since the content inside a tab is handled by the renderer process. So the browser process sends the event type (such as touchstart) and its coordinates to the renderer process. The renderer process handles the event appropriately by finding the event target and running the event listeners attached to it.

Here's how input events are routed from the browser process to the renderer process:

Non-fast scrollable region

If no input event listeners are attached to the page, the compositor thread can create a new composite frame completely independently of the main thread. But if some event listeners are attached to the page, how does the compositor thread find out whether an event needs to be handled?

Because running JavaScript is the main thread's job, when a page is composited the compositor thread marks any region of the page that has event handlers attached as a "non-fast scrollable region". With this information, the compositor thread can make sure an input event is sent to the main thread when it occurs in that region. If the input event comes from outside the region, the compositor thread carries on compositing new frames without waiting for the main thread.
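
The routing decision comes down to a point-in-rectangle check. In the sketch below, regions are plain `{ x, y, width, height }` rectangles and `routeInput` is a hypothetical name; the real compositor tracks these regions per layer.

```javascript
// Decides whether an input event at `point` must wait for the main thread.
function routeInput(point, nonFastScrollableRegions) {
  const inRegion = nonFastScrollableRegions.some(
    (region) =>
      point.x >= region.x &&
      point.x < region.x + region.width &&
      point.y >= region.y &&
      point.y < region.y + region.height
  );
  return inRegion ? 'main-thread' : 'compositor-only';
}
```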

Event delegation

A common pattern of event handling in development is event delegation. Because the event bubbles, you can attach an event handler to the top-level element and delegate tasks based on the event target, such as:

document.body.addEventListener('touchstart', event => {
    // `area` is the element whose touches should be handled specially
    if (event.target === area) {
        event.preventDefault();
    }
});

This event delegation pattern is attractive because you only need to write one event handler for all elements. However, if you look at this code from the browser's point of view, the entire page is now marked as a non-fast scrollable region. This means that even if your application does not care about input from certain parts of the page, the compositor thread has to communicate with the main thread and wait for it every time an input event comes in. The smooth scrolling ability of the compositor is defeated.

To mitigate this, you can pass the passive: true option to the event listener. This tells the browser that you still want to listen to the event on the main thread, but the compositor can go ahead and composite new frames as well, for example:

document.body.addEventListener('touchstart', event => {
    if (event.target === area) {
        // Note: preventDefault() is ignored inside a passive listener
        event.preventDefault();
    }
}, {passive: true});

Check whether the event can be cancelled

Consider a case where an area of the page should only scroll horizontally, not vertically.

Using the passive: true option in your pointer event means that page scrolling can be smooth, but a vertical scroll might already have started by the time you want to call preventDefault to limit the scroll direction. You can check for this with the event.cancelable property, for example:

document.body.addEventListener('pointermove', event => {
    if (event.cancelable) {
        event.preventDefault(); // block the native scroll
        /*
         * do what you want the application to do here
         */
    }
}, {passive: true});

Alternatively, you can use the CSS rule touch-action to eliminate event handlers entirely, such as:

#area {
  touch-action: pan-x;
}

Finding the event target

When the compositor thread sends an input event to the main thread, the first thing to run is a hit test to find the event target. The hit test uses the paint record data generated during rendering to find out what lies underneath the coordinates of the point where the event occurred.
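
A hit test over paint records can be sketched as a back-to-front search: the record painted last is on top, so the list is walked in reverse. The record shape (`rect`, `target`) and the `hitTest` name are hypothetical.

```javascript
// Finds the topmost painted element under `point`.
// `paintRecords` is assumed to be in paint order (first painted = bottom).
function hitTest(paintRecords, point) {
  for (let i = paintRecords.length - 1; i >= 0; i--) {
    const { rect, target } = paintRecords[i];
    const inside =
      point.x >= rect.x &&
      point.x < rect.x + rect.width &&
      point.y >= rect.y &&
      point.y < rect.y + rect.height;
    if (inside) return target;
  }
  return null; // nothing painted at this point
}
```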

Minimizing event dispatches to the main thread

A typical display refreshes the screen 60 times per second, and we need to keep up with that cadence for smooth animation. Input is no different: a typical touchscreen device delivers touch events 60 to 120 times per second, and a typical mouse delivers events 100 times per second. Input events therefore arrive at a higher rate than the screen can refresh.

If a continuous event like touchmove were sent to the main thread 120 times per second, it could trigger an excessive number of hit tests and JavaScript executions compared to how slowly the screen can refresh:

To minimize excessive calls to the main thread, Chrome coalesces continuous events (such as wheel, mousewheel, mousemove, pointermove, and touchmove) and delays dispatching them until right before the next requestAnimationFrame. The timeline stays the same, but the events are coalesced and delayed.

Discrete events such as keydown, keyup, mouseup, mousedown, touchstart, and touchend are dispatched immediately.
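
The split between continuous and discrete events can be modelled as follows. `scheduleInputEvents` is a hypothetical function that takes the events arriving within one frame and returns which are dispatched immediately and which are merged for the next frame; it is a model of the behavior described above, not Chrome's implementation.

```javascript
const CONTINUOUS_EVENTS = new Set([
  'wheel', 'mousewheel', 'mousemove', 'pointermove', 'touchmove',
]);

// Discrete events go out immediately. Continuous events of the same type are
// merged into one entry that keeps the latest event plus everything it absorbed.
function scheduleInputEvents(eventsInFrame) {
  const immediate = [];
  const merged = new Map(); // event type -> { latest, coalesced }
  for (const event of eventsInFrame) {
    if (!CONTINUOUS_EVENTS.has(event.type)) {
      immediate.push(event);
      continue;
    }
    const entry = merged.get(event.type) ?? { latest: null, coalesced: [] };
    entry.coalesced.push(event);
    entry.latest = event;
    merged.set(event.type, entry);
  }
  return { immediate, atNextFrame: [...merged.values()] };
}
```

The `coalesced` array in each merged entry plays the role of the intermediate events that would otherwise be lost.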

Use `getCoalescedEvents` to get intra-frame events

For most web applications, coalesced events should be enough to provide a good user experience. However, if you are building something like a drawing application that places a path based on touchmove coordinates, you may lose the in-between coordinates needed to draw a smooth line. In that case, you can use the getCoalescedEvents method of the pointer event to get information about the coalesced events.

The image below shows a smooth touch gesture path on the left and the coalesced, limited path on the right:

window.addEventListener('pointermove', event => {
    const events = event.getCoalescedEvents();
    for (const coalesced of events) {
        const x = coalesced.pageX;
        const y = coalesced.pageY;
        // draw a line using x and y coordinates.
    }
});

References

Inside look at modern web browser