Brief introduction to Chrome architecture history

0. Processes and threads

When you open a single page in Chrome, Chrome starts five processes.

Note:

What is a process?

A process is a running instance of a program. When a program starts, the operating system sets aside a region of memory to hold its code and runtime data, and creates a main thread to execute tasks. This environment is called a process.

How are threads and processes related?
- Threads cannot exist on their own; they are started and managed by a process
- Data in a process is shared between its threads (all of them can read and write the common data)
- If any thread in a process fails, the entire process crashes
- When a process shuts down, the operating system reclaims the memory it used (even memory that was leaked)
- Processes are isolated from each other (e.g. QQ and WeChat); they can only communicate through inter-process communication (IPC)
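
The sharing and isolation rules above can be illustrated with a minimal Python sketch. This is not Chrome code: the names `shared`, `worker`, and `child_code` are made up for the example, and a pipe on standard output stands in for IPC.

```python
import subprocess
import sys
import threading

# Shared state: every thread in this process sees the same list object.
shared = []

def worker(tag):
    # Appending from a thread mutates the one list owned by the process.
    shared.append(tag)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A child process has its own memory: it builds and changes its own list,
# and the parent only learns the result through an explicit channel (stdout).
child_code = "shared = []; shared.append('child'); print(len(shared))"
child = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

threads_saw = sorted(shared)            # written by three threads of this process
child_reported = child.stdout.strip()   # sent back over a pipe, a simple form of IPC
print(threads_saw, child_reported)      # [0, 1, 2] 1
```

The parent's list ends up with three entries from its own threads, while the child's append never shows up in it.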

Why are five processes started?
- See the current multi-process architecture described in section 2.2

1. Single-process browser era

Until 2007, browsers were single-process: all of the browser’s functional modules ran in the same process.

Single-process browser architecture:

1.1 Problems with single-process Browsers

Unstable

Many early browser features (web video, web games, and so on) were implemented as plug-ins, and plug-ins were the component most prone to problems.

Besides plug-ins, the rendering engine was also unstable; some complex JavaScript code could crash it.

Both ran in the browser’s page thread, so a failure in either one could crash that thread and, with it, the entire browser process.

Not smooth

In the page thread, JavaScript, page rendering, and plug-ins all ran together. If a script on one page entered an infinite loop, it monopolized the thread: other pages and plug-ins got no chance to run their tasks, and the whole browser became sluggish or stopped responding. For details, see the page event loop.
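
A rough way to picture this is a single queue of tasks served by one thread. The sketch below uses a simulated clock rather than real time, and the task names and costs are hypothetical; it only shows that cheap tasks cannot finish until the long one ahead of them is done.

```python
from collections import deque

def run_single_thread(tasks):
    """Run (name, cost) tasks one at a time on a simulated clock.

    Returns the simulated time at which each task finishes.
    """
    queue = deque(tasks)
    clock = 0
    finished_at = {}
    while queue:
        name, cost = queue.popleft()
        clock += cost  # nothing else can run while this task holds the thread
        finished_at[name] = clock
    return finished_at

# Hypothetical workload: one page runs a very long script ahead of cheap tasks.
timeline = run_single_thread([
    ("page_a_long_script", 5000),
    ("page_b_render", 16),
    ("plugin_tick", 4),
])
print(timeline)
# {'page_a_long_script': 5000, 'page_b_render': 5016, 'plugin_tick': 5020}
```

With a true infinite loop the first task never returns, so the tasks behind it never run at all.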

Unsafe

Plug-ins could be written in C/C++, and through a plug-in, code could access any operating system resource.

Page scripts could also gain system permissions by exploiting browser vulnerabilities.

2. Multi-process browser era

2.1 Early multi-process architecture

Early Chrome multi-process architecture:

Each page runs in its own renderer process, and each plug-in runs in its own plug-in process (see section 2.4 for the “same site” exception).

2.1.1 Problems solved by multi-process architecture

Unstable

Plug-in processes, renderer processes, and the main browser process are isolated from each other. If a plug-in or page crashes, only that plug-in’s or page’s own process is affected; other pages and the browser itself keep running.
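
The effect of this isolation can be sketched with ordinary child processes. The example assumes a Python interpreter is available as `sys.executable`; the `run_isolated` helper and the two "pages" are hypothetical stand-ins, not Chrome components.

```python
import subprocess
import sys

def run_isolated(code):
    """Run a snippet in a separate Python process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode

# Hypothetical "pages": one terminates abruptly, the other is healthy.
crashing_page = "import os; os._exit(3)"  # simulate an abrupt crash
healthy_page = "pass"

exit_codes = [run_isolated(crashing_page), run_isolated(healthy_page)]
print(exit_codes)  # [3, 0]
```

The parent keeps running after the first child dies and can still start the second one, which is the property the browser process relies on.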

Not smooth

Because processes are isolated, a blocked JavaScript loop only makes the current page unresponsive. If a page leaks memory, that memory is reclaimed when the page is closed, because closing the page ends its renderer process.

Unsafe

The multi-process architecture makes it possible to run processes inside a security sandbox. You can think of the sandbox as a lock the operating system puts on a process: code inside it can run, but it cannot write to your hard drive or read data from sensitive locations.

2.2 Current multi-process architecture

Compared with the early architecture, several modules that used to run in the browser’s main process have been split out into their own processes.

Functions of each process:

- Browser process. Mainly responsible for displaying the interface, handling user interaction, managing child processes, and providing storage.
- Renderer process. Its core task is to turn HTML, CSS, and JavaScript into web pages that users can interact with. Both the rendering engine Blink and the JavaScript engine V8 run in this process. By default, Chrome creates one renderer process per tab (“same site” pages are handled differently; see section 2.4). For security reasons, renderer processes run in a sandbox.
- GPU process. It was originally introduced to support 3D CSS effects. Later, both web pages and Chrome’s own UI came to be drawn using the GPU.
- Network process. Mainly responsible for loading the network resources that pages need.
- Plug-in process. Mainly responsible for running plug-ins.

2.2.1 Problems caused by the multi-process model

Higher resource usage

Each process contains its own copy of the common infrastructure (such as the JavaScript runtime environment), so the browser as a whole uses more memory.

More complex architecture

The browser’s modules are tightly coupled and hard to extend.

2.3 The future: service-oriented architecture (SOA)

To address these issues, Chrome’s overall architecture is moving toward the “service-oriented architecture” used in modern operating systems. The various modules will be reorganized into separate services (see the screenshot of the five processes at the start), each of which can run in its own process. Services must be accessed through defined interfaces and communicate via IPC, which yields a more cohesive, loosely coupled system that is easier to maintain and extend.

Chrome “Service-oriented Architecture” process model

Chrome will eventually refactor the UI, database, file, device, network, and other modules into basic services, similar to the low-level services of an operating system.

Chrome also offers a flexible architecture that allows basic services to run in multiple processes on powerful devices, but on resource-constrained devices, Chrome can consolidate many services into one process to save memory.
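
As a toy illustration of that flexibility, the sketch below maps services to processes differently depending on the device. The function `plan_processes`, the `low_memory` flag, and the process names are all hypothetical; the service names are simply the modules listed above, and nothing here reflects Chrome’s real service configuration.

```python
def plan_processes(services, low_memory):
    """Return a mapping of service name -> name of the process that hosts it."""
    if low_memory:
        # Constrained device: fold every service into one shared process.
        return {name: "shared-process" for name in services}
    # Powerful device: give each service a process of its own.
    return {name: f"{name}-process" for name in services}

services = ["ui", "database", "file", "device", "network"]
desktop = plan_processes(services, low_memory=False)
phone = plan_processes(services, low_memory=True)

process_counts = (len(set(desktop.values())), len(set(phone.values())))
print(process_counts)  # (5, 1)
```

Because callers only name the service they want, the same calling code works whether the service lives in its own process or in a shared one.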

2.4 Common Problem Scenarios

Even with today’s multi-process architecture, users will occasionally see a single page freeze, crash, and take other pages down with it.

What causes this?

“Same site” is defined as the same root domain (for example, geekbang.org) plus the same protocol (for example, https:// or http://). It covers all subdomains and all ports under that root domain, such as the following three:

time.geekbang.org
www.geekbang.org
www.geekbang.org:8080

All three belong to the same site, because they all use HTTPS and share the root domain geekbang.org.

You may already know the same-origin policy. “Same site” and “same origin” are different concepts: same origin also requires the host and port to match exactly, while same site does not.
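
The difference can be made concrete with a small sketch. It assumes the three hosts above are served over `https://`, as the text states, and it approximates the root domain by taking the last two labels of the host name; real browsers consult the Public Suffix List instead. The helper names `root_domain`, `same_site`, and `same_origin` are made up for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def root_domain(host):
    # Simplification: treat the last two labels as the root domain.
    return ".".join(host.split(".")[-2:])

def same_site(url_a, url_b):
    """Same protocol and same root domain; subdomain and port are ignored."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme == b.scheme
            and root_domain(a.hostname) == root_domain(b.hostname))

def same_origin(url_a, url_b):
    """Same protocol, same full host name, and same port."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    port_a = a.port or DEFAULT_PORTS.get(a.scheme)
    port_b = b.port or DEFAULT_PORTS.get(b.scheme)
    return (a.scheme, a.hostname, port_a) == (b.scheme, b.hostname, port_b)

urls = [
    "https://time.geekbang.org",
    "https://www.geekbang.org",
    "https://www.geekbang.org:8080",
]
print([same_site(urls[0], u) for u in urls])    # [True, True, True]
print([same_origin(urls[1], u) for u in urls])  # [False, True, False]
```

All three URLs are same-site with each other, but no two distinct ones are same-origin.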

Chrome’s default strategy is one renderer process per tab. However, if a new page is opened from an existing page and belongs to the same site as that page, the new page reuses the parent page’s renderer process. Officially, this default policy is called process-per-site-instance.

In plain English, pages that are opened from one another and belong to the same site are assigned to one renderer process. In that case, if one of them crashes, the other same-site pages crash with it, because they all share that renderer process.
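
The allocation rule and its crash consequence can be modeled in a few lines. This is a toy model of the policy as described in this section, not Chromium’s actual implementation; the class `RendererAllocator`, its methods, and the page names are hypothetical.

```python
import itertools

class RendererAllocator:
    """Toy model: a page opened from a same-site parent reuses its renderer."""

    def __init__(self):
        self._next_pid = itertools.count(1)
        self.pid_of = {}   # page name -> renderer process id
        self.site_of = {}  # page name -> site key (protocol + root domain)

    def open_page(self, page, site, opener=None):
        if opener is not None and self.site_of.get(opener) == site:
            pid = self.pid_of[opener]   # same site as the opener: share its process
        else:
            pid = next(self._next_pid)  # otherwise start a new renderer process
        self.pid_of[page] = pid
        self.site_of[page] = site
        return pid

    def crash(self, page):
        """Return every page that goes down when this page's renderer crashes."""
        pid = self.pid_of[page]
        return sorted(p for p, owner in self.pid_of.items() if owner == pid)

alloc = RendererAllocator()
alloc.open_page("A", "https://geekbang.org")
alloc.open_page("B", "https://geekbang.org", opener="A")  # reuses A's process
alloc.open_page("C", "https://example.com", opener="A")   # different site
alloc.open_page("D", "https://geekbang.org")              # opened independently
print(alloc.crash("B"))  # ['A', 'B']
```

Page D is on the same site but was opened in its own tab, so in this model it survives the crash.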

Why let them run in one process?

Because within a single renderer process they share one JavaScript execution environment, which means page A can directly execute scripts in page B. Pages on the same site sometimes need exactly this.

Did a single-process browser with multiple pages open have only one rendering thread? Would it have been more reasonable to use one thread per page?

In the IE6 era, the browser was single-process and all pages ran in one main thread. That was how IE6 was designed, and IE6 also had no tabs: one page per window.

At that time, many domestic (Chinese) browsers were built on top of IE6. In IE6’s native architecture all pages ran in a single thread, which meant that all pages shared the same JavaScript runtime environment and that cookie storage was also handled in that one thread.

Because these domestic browsers supported multiple tabs, a freeze in one tab affected the whole browser.

To deal with the freezes, domestic browsers began to experiment with multi-threaded pages, running some pages in separate threads. Each such thread had its own JavaScript execution environment and its own cookie environment, and this is where the problems started.

For example, a page of site A logs in to the site, saves some cookie data to disk, and keeps some session data in its own thread environment. Session data does not need to be persisted, so it lives only in that thread. Now another page of site A is opened in a different thread. It first reads the cookie information from disk, but it cannot directly read the session information, which is held by the other thread, so the session has to be synchronized between threads. Because IE’s source code was not available, this was very difficult to implement, and domestic browsers took a long time to solve it.
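
The session problem can be reproduced in miniature with thread-local storage. This is only an analogy, not how those browsers were built: a plain dictionary `disk_cookies` stands in for cookies on disk, Python’s `threading.local()` stands in for a per-thread session environment, and the page functions are hypothetical.

```python
import threading

disk_cookies = {}            # stands in for cookies persisted on disk
session = threading.local()  # per-thread storage: each thread has its own attributes

def login_page():
    disk_cookies["user"] = "alice"     # the cookie is written to "disk"
    session.token = "in-memory-token"  # session data stays in this thread only

seen_by_other_page = {}

def other_page():
    # Runs in a different thread: the cookie is visible, the session is not.
    seen_by_other_page["cookie"] = disk_cookies.get("user")
    seen_by_other_page["token"] = getattr(session, "token", None)

for target in (login_page, other_page):
    t = threading.Thread(target=target)
    t.start()
    t.join()

print(seen_by_other_page)  # {'cookie': 'alice', 'token': None}
```

The second page finds the cookie but no session token, which is the gap that a synchronization mechanism has to close.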

With the session problem fixed, the freezing problem still remained. Each page used a window attached to the browser’s main window, so they all shared one message loop (discussed in more detail later). If a page’s window froze, the entire browser could freeze too. Another trick used by domestic browsers was to put each page in a separate pop-up window; if the page got stuck, its window was simply hidden.

Why does a frozen page in Chrome not affect the main window?

Because a Chrome renderer process outputs the rendered page as an image, and the browser process then paints that image into its own window. The renderer process has no window of its own, only image output, so if it gets stuck, the worst that happens is that the image stops updating.

It took domestic browsers four or five years to develop this technology, and just when it was almost ready, Chrome was released 🙁