This article covers the following topics:

  • The high-level structure of the browser
  • How browsers render pages
  • How browsers load JavaScript scripts
  • How the JavaScript engine works

Introduction

What happens when you type the URL into your browser and press Enter?

Let’s start with the most familiar interview question. Can you answer it off the top of your head, without reading on? If you can, congratulate yourself and skip this section. If not, take a look:

  1. URL parsing: extract the domain name from the URL
  2. DNS resolution:
    • Search the browser cache: the browser caches DNS records for recently visited websites (for roughly 2 to 30 minutes); if not found
    • Check the system cache: check the operating system’s DNS cache and the hosts file, which maps domain names to IP addresses; if not found
    • Check the router cache: the router has its own DNS cache; if not found
    • Check the ISP DNS cache: the ISP’s DNS server (the local DNS server) keeps its own cache; if not found
    • Recursive search: look up the IP address of the target domain name starting from the root DNS server, then the top-level domain (TLD) DNS server, and finally the authoritative DNS server for the domain
  3. Establishing a TCP connection between the browser and the server (three-way handshake):
    • First handshake: the client sends a connection request (SYN) to the server
    • Second handshake: the server receives the request, acknowledges it, and replies with its own request (SYN-ACK)
    • Third handshake: the client receives the server’s reply and returns an acknowledgement (ACK)
  4. Request and transfer data: the browser sends the HTTP request; the server parses it and returns the corresponding data
  5. The browser renders the page: we leave this blank for now and expand on it when we discuss how browsers work
  6. Close the TCP connection: after the response has been returned, the connection is either kept open or closed, depending on the keep-alive value of the Connection header. Closing releases the TCP connection with a four-way handshake (the “four waves”).
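
Steps 1 and 2 above can be sketched in a few lines of JavaScript. This is only an illustration of the lookup order, not how any browser actually implements DNS: the `layers` array, the `resolve` function, and the IP addresses are all made up for the example, and each cache is modelled as a plain Map.

```javascript
// Step 1: extract the domain name from a URL using the standard URL class.
function extractHostname(url) {
  return new URL(url).hostname;
}

// Step 2: consult each cache layer in order and fall back to a recursive
// search only when every layer misses. `layers` is a hypothetical list of
// { name, records } objects standing in for the real caches.
function resolve(hostname, layers, recursiveLookup) {
  for (const layer of layers) {
    const ip = layer.records.get(hostname);
    if (ip !== undefined) {
      return { ip, source: layer.name };
    }
  }
  return { ip: recursiveLookup(hostname), source: "recursive search" };
}

// Illustrative data only: the addresses come from a documentation range.
const layers = [
  { name: "browser cache", records: new Map() },
  { name: "system cache", records: new Map([["example.com", "192.0.2.10"]]) },
  { name: "router cache", records: new Map() },
  { name: "ISP DNS cache", records: new Map() },
];

const host = extractHostname("https://example.com/index.html?x=1");
const hit = resolve(host, layers, () => "192.0.2.99");
const miss = resolve("example.org", layers, () => "192.0.2.99");
```

Here `hit` is answered by the system cache, while `miss` falls through every layer and ends up in the (stubbed) recursive search.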

Hmm, that more or less answers the question, but it raises another one: once the browser has the data from the server, what does it do to render the web page on our screen? Let’s explore the browser’s secrets.

The high-level structure of the browser

Let’s first look at the browser’s main components:

  • User interface: includes the address bar, forward/back buttons, bookmark menu, and so on. Every part of the browser display belongs to the user interface except the main window, where the requested page is shown.
  • Browser engine: passes instructions between the user interface and the rendering engine.
  • Rendering engine: responsible for displaying the requested content. If the requested content is HTML, it parses the HTML and CSS and displays the parsed content on the screen.
  • Network: used for network calls, such as HTTP requests. Its interface is platform-independent, with an underlying implementation for each platform.
  • User interface back end: used to draw basic widgets, such as combo boxes and windows. It exposes a common, platform-independent interface, and underneath it uses the operating system’s user interface methods.
  • JavaScript interpreter: used to parse and execute JavaScript code.
  • Data storage: this is the persistence layer. Browsers need to keep all kinds of data, such as cookies, on disk. The new HTML specification (HTML5) defines a “web database,” which is a complete (but lightweight) in-browser database.

It’s worth noting that, unlike most browsers, Chrome has a separate rendering engine instance for each tab. Each tab is a separate process.

The browser’s rendering engine

Of the browser’s major components, the one we care about most is the rendering engine because, as the name implies, it determines what is rendered in the browser. The rendering engine is also called the browser kernel, and different browsers use different rendering engines. Common rendering engines are as follows:

Rendering engine    Browsers
Trident (MSHTML)    IE, Maxthon, TT, The World, 360, Sogou browser, etc.
Gecko               Netscape 6 and above, Firefox, Mozilla Suite/SeaMonkey, etc.
Presto              Opera 7 and above (Opera’s former engine; it now uses Blink)
WebKit              Safari, Chrome, etc. (Chrome now uses Blink, a WebKit fork)
EdgeHTML            Microsoft Edge (a fork of MSHTML with almost all IE-specific features removed)

Let’s take WebKit as an example of how the browser rendering engine works. Gecko’s workflow is basically the same as WebKit’s, with slightly different terminology.

Webkit’s main workflow

The data that comes back from the HTTP request arrives as a stream. The subsequent steps (DOM tree construction, CSS computation, rendering, compositing, and painting) process that stream as far as possible: each step starts working on the output of the previous step without waiting for that step to finish completely. This is why, when we browse the web, we see a page appear piece by piece.
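
The streaming idea can be illustrated with JavaScript generators. This is a toy model rather than WebKit code: the three stage names (`receive`, `parse`, `paint`) and the `trace` array are invented for the example, and each stage simply records when it handles a chunk.

```javascript
// Records the order in which the stages do their work.
const trace = [];

// Stage 1: data arrives from the network chunk by chunk.
function* receive(chunks) {
  for (const chunk of chunks) {
    trace.push(`received ${chunk}`);
    yield chunk;
  }
}

// Stage 2: each chunk is parsed as soon as it arrives.
function* parse(chunks) {
  for (const chunk of chunks) {
    trace.push(`parsed ${chunk}`);
    yield chunk;
  }
}

// Stage 3: each parsed piece is painted as soon as it is available.
function paint(pieces) {
  for (const piece of pieces) {
    trace.push(`painted ${piece}`);
  }
}

paint(parse(receive(["<header>", "<main>"])));
// trace is now:
//   received <header>, parsed <header>, painted <header>,
//   received <main>,   parsed <main>,   painted <main>
```

The first chunk is painted before the second chunk has even been received, which is the “page appears step by step” behaviour described above.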

HTML parsing: From HTML to DOM tree

Webkit uses HTML parsing algorithms to transform HTML into a DOM tree. Let’s take a look at HTML to DOM tree conversions:

Tree building

  • Tokenization: the input is parsed into tokens (HTML tokens include start tags, end tags, attribute names, and attribute values). The tokenizer recognizes a token, passes it to the tree constructor, and then consumes the next characters to recognize the next token, and so on until the end of the input. The tokenization algorithm is implemented as a state machine.
  • Tree construction: each token emitted by the tokenizer is processed by the tree constructor. The specification defines which DOM element corresponds to each token; the element is created when its token is received and is added not only to the DOM tree but also to the stack of open elements. This stack is used to correct nesting errors and handle unclosed tags. The tree construction algorithm can also be described as a state machine.
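
To make the two stages concrete, here is a heavily simplified sketch. It is not the algorithm from the HTML specification: it assumes tags have no attributes, it only knows three tokenizer states, and the names `tokenize`, `buildTree`, and `openElements` are chosen for this example.

```javascript
// Tokenizer: a tiny state machine with three states: data, tagOpen, tagName.
function tokenize(html) {
  const tokens = [];
  let state = "data";
  let buffer = "";
  let isEndTag = false;
  for (const ch of html) {
    if (state === "data") {
      if (ch === "<") {
        if (buffer) tokens.push({ type: "text", value: buffer });
        buffer = "";
        state = "tagOpen";
      } else {
        buffer += ch;
      }
    } else if (state === "tagOpen") {
      // A "/" right after "<" marks an end tag; anything else starts the name.
      isEndTag = ch === "/";
      if (!isEndTag) buffer = ch;
      state = "tagName";
    } else if (ch === ">") {
      // state === "tagName": the tag is complete, so emit it.
      tokens.push({ type: isEndTag ? "endTag" : "startTag", name: buffer });
      buffer = "";
      state = "data";
    } else {
      buffer += ch;
    }
  }
  if (state === "data" && buffer) tokens.push({ type: "text", value: buffer });
  return tokens;
}

// Tree constructor: consumes tokens and maintains the stack of open elements.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const openElements = [root];
  for (const token of tokens) {
    const parent = openElements[openElements.length - 1];
    if (token.type === "text") {
      parent.children.push({ name: "#text", value: token.value });
    } else if (token.type === "startTag") {
      const element = { name: token.name, children: [] };
      parent.children.push(element);
      openElements.push(element);
    } else {
      // End tag: pop back to the matching open element, which also closes
      // any unclosed elements above it on the stack.
      const index = openElements.map((el) => el.name).lastIndexOf(token.name);
      if (index > 0) openElements.length = index;
    }
  }
  return root;
}

// The <p> is never closed; the stack lets </div> close it implicitly.
const tree = buildTree(tokenize("<div><p>Hello</div>"));
```

In the resulting tree, the document contains one div, the div contains the p, and the p contains the text node, even though the input never closed the p tag.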

After the HTML is parsed, the browser marks the document state as interactive and begins executing the scripts that are in deferred mode. The document state is then set to complete and the load event is fired.

CSS parsing: From CSS to StyleSheet objects

The CSS parser parses the CSS file into a StyleSheet object. Let’s look at the conversion from CSS to StyleSheet:

  1. Selectors are matched in the same order in which the DOM tree is built. That is, by the time construction reaches the current node, the CSS rules that match the node can be determined exactly, without needing any information about later nodes.
  2. CSS selectors are matched from right to left. For each DOM node, the browser finds all the rules that match it and then weighs them (by specificity) to determine the final style, so it’s not hard to see why the style information in Chrome’s developer tools is displayed the way it is.
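
Right-to-left matching can be sketched as follows. This is a deliberately small model, not a real selector engine: it only supports tag names, single class names, and the descendant combinator (a space), and the node shape `{ tag, classes, parent }` is an assumption made for the example.

```javascript
// Does a single simple selector ("div" or ".box") match this node?
function matchesSimple(node, simple) {
  return simple.startsWith(".")
    ? node.classes.includes(simple.slice(1))
    : node.tag === simple;
}

// Match a descendant selector such as "div .box span" from right to left.
function matches(node, selector) {
  const parts = selector.trim().split(/\s+/);
  // The rightmost part must match the node itself.
  if (!matchesSimple(node, parts.pop())) return false;
  // Then walk up the ancestors, consuming the remaining parts right to left.
  let ancestor = node.parent;
  while (parts.length > 0 && ancestor) {
    if (matchesSimple(ancestor, parts[parts.length - 1])) parts.pop();
    ancestor = ancestor.parent;
  }
  return parts.length === 0;
}

// A three-node chain: div > section.box > span
const div = { tag: "div", classes: [], parent: null };
const section = { tag: "section", classes: ["box"], parent: div };
const span = { tag: "span", classes: [], parent: section };

const matchesInOrder = matches(span, "div .box span");   // true
const matchesReversed = matches(span, ".box div span");  // false
const wrongTarget = matches(section, "div span");        // false
```

Two things fall out of this shape. Most nodes are rejected immediately by the rightmost part, and the match only ever looks at ancestors, which already exist when the node is created. That is what makes point 1 above possible.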

Build a rendering tree: Integrate the DOM tree and StyleSheet object into a rendering tree

When building a rendering tree, you need to calculate the visual properties of each render object. Each DOM node has an “attach” method, which is called when the node is inserted into the DOM tree. The node’s style properties are calculated and a renderer is generated for it. Let’s take a look at the process of integrating (“attaching,” in WebKit jargon):

Layout: Place the renderer box in place

All renderers have a “layout” (or “reflow”) method, and each renderer calls the layout method of those children that need to be laid out. There are many layout modes: normal flow, absolute positioning, floats, Flex layout, and so on.
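
As a rough sketch of the recursive call pattern, here is a toy block layout in normal flow. It is not WebKit’s layout code: the box shape `{ height, children }` and the rule “children are stacked vertically and take the full available width” are simplifying assumptions for the example.

```javascript
// Lay out a box at (x, y) and recursively lay out its children below each other.
function layout(box, x, y, availableWidth) {
  box.x = x;
  box.y = y;
  box.width = availableWidth;
  let childY = y;
  for (const child of box.children) {
    layout(child, x, childY, availableWidth);
    childY += child.height;
  }
  // A leaf keeps its own height; a container is as tall as its children.
  if (box.children.length > 0) box.height = childY - y;
  return box;
}

const page = layout(
  {
    height: 0,
    children: [
      { height: 20, children: [] },
      { height: 30, children: [] },
    ],
  },
  0,
  0,
  800
);
```

After the call, the second child sits at y = 20 (directly below the first) and the container is 50 units tall.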

Render: Turn each renderer box into a bitmap

Rendering is the process of turning a model into a bitmap; the term is borrowed from computer graphics.

A bitmap is a two-dimensional table created in memory that stores the color of each pixel of an image. (Bitmap data is also the most memory-hungry information attached to the browser’s DOM tree, so when we optimize memory this is the main part to consider.)
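
A quick calculation shows why bitmaps dominate memory. The sketch below assumes 4 bytes per pixel (red, green, blue, alpha), which is a common layout but an assumption here; the `createBitmap` and `setPixel` helpers are invented for the example.

```javascript
// A bitmap as a flat array: 4 bytes (red, green, blue, alpha) per pixel.
function createBitmap(width, height) {
  return { width, height, data: new Uint8ClampedArray(width * height * 4) };
}

// Store a color at column x, row y.
function setPixel(bitmap, x, y, [r, g, b, a]) {
  const offset = (y * bitmap.width + x) * 4;
  bitmap.data.set([r, g, b, a], offset);
}

const bitmap = createBitmap(1920, 1080);
setPixel(bitmap, 10, 20, [255, 0, 0, 255]); // one opaque red pixel

// 1920 * 1080 * 4 = 8,294,400 bytes, roughly 7.9 MiB for a single layer.
const sizeInMiB = bitmap.data.byteLength / (1024 * 1024);
```

One full-screen layer at this resolution already costs about 7.9 MiB, and a page may hold several such layers.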

Compositing: Combining bitmaps to improve performance

This step exists purely for performance; it is not a necessary part of implementing a browser. Compositing combines bitmaps according to a compositing strategy whose goal is to minimize the amount of redrawing. The strategy “guesses” which elements are likely to change and keeps them out of the combined bitmap, so that a change to one of them does not force everything else to be redrawn.

Currently, major browsers use properties such as position and transform to decide the compositing strategy, that is, to “guess” which elements might change in the future. The accuracy of such guesses is limited, so the newer CSS standard provides the will-change property, which lets application code give the browser a hint. Used carefully, it can noticeably improve the results of the compositing strategy.

Drawing: The process of drawing a bitmap onto a screen to make it visible to the naked eye

In general, the browser does not need its own code for this step; it simply hands the final bitmap to the operating system.

So far we have outlined WebKit’s main workflow. To summarize: the data returned by the HTTP request is parsed into a DOM tree and a StyleSheet object by the HTML parser and the CSS parser, and the two are then integrated into a rendering tree. After layout, each renderer box is rendered into a bitmap, the bitmaps are composited according to the compositing strategy to improve drawing performance, and the final bitmap is handed to the operating system to be drawn on the screen. From this it is easy to see that CSS does not block DOM parsing, but it does block rendering.

Now that we’ve covered the main workflow of the rendering engine using WebKit as an example, something still seems to be missing: JavaScript. We haven’t mentioned what WebKit does when the parser reaches JavaScript code. Let’s take a look.

The browser loads the JavaScript script

Normal loading process

The browser loads JavaScript scripts mainly through the <script> element. The normal flow is as follows:

  1. The browser’s rendering engine holds rendering control and parses the HTML page as normal
  2. When the parser encounters a <script> tag, the rendering engine hands control over to the JavaScript engine (e.g. Chrome’s V8)
  3. If the <script> tag references an external script, the script is downloaded and then executed; otherwise the inline code is executed directly
  4. The JavaScript engine hands control back to the rendering engine, which continues parsing

When an external script is loaded, the browser pauses page rendering and waits for the script to download and execute before continuing. The reason is that JavaScript code can modify the DOM, so control has to be ceded to it; otherwise the result would be complicated race conditions between threads.

The defer attribute

When the browser parses a <script> tag that has the defer attribute, it proceeds as follows:

  1. The browser’s rendering engine holds rendering control and parses the HTML page as normal
  2. When the parser encounters a <script> tag with the defer attribute, it continues parsing the HTML while downloading the external script in parallel
  3. When parsing is complete, the document enters the interactive state and the scripts in deferred mode are executed
  4. After the deferred scripts have executed, the DOMContentLoaded event fires; once the remaining resources have loaded, the document state is set to complete

Points to note when using the defer attribute:

  • Scripts downloaded with the defer attribute are executed before the DOMContentLoaded event fires (that is, just after the closing </html> tag has been read)
  • The defer attribute ensures that scripts are executed in the order in which they appear on the page
  • The defer attribute has no effect on inline scripts (script tags that do not load an external file) or on dynamically generated script tags
  • External scripts loaded with defer should not use the document.write method

The async attribute

When the browser parses a <script> tag that has the async attribute, it proceeds as follows:

  1. The browser’s rendering engine holds rendering control and parses the HTML page as normal
  2. When the parser encounters a <script> tag with the async attribute, it continues parsing the HTML while the linked script is downloaded in parallel
  3. After the script is downloaded, the browser stops parsing the HTML and starts executing the downloaded script
  4. After the script is executed, the browser resumes parsing the HTML

Points to note when using the async attribute:

  • The async attribute ensures that the browser continues rendering while the script is being downloaded
  • The async attribute does not guarantee the order in which scripts are executed
  • Scripts with the async attribute should not use the document.write method
  • If both async and defer are used, defer has no effect and the browser’s behavior is determined by async
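
The difference in ordering between defer and async can be shown with a small timing model. It is only a model: it assumes every download starts at time 0, that executing a script takes no time, and that parsing ends at `parseDoneMs`. The function name `executionOrder`, the file names, and the download times are made up for the example.

```javascript
// Simplified timing model (an assumption of this sketch): all downloads start
// at time 0, script execution takes no time, and parsing ends at parseDoneMs.
function executionOrder(scripts, parseDoneMs) {
  const runs = [];
  let lastDeferredAt = 0;
  scripts.forEach((script, seq) => {
    if (script.mode === "async") {
      // async: runs as soon as its own download completes.
      runs.push({ name: script.name, at: script.downloadMs, seq });
    } else {
      // defer: waits for parsing, its own download, and earlier deferred scripts.
      lastDeferredAt = Math.max(parseDoneMs, script.downloadMs, lastDeferredAt);
      runs.push({ name: script.name, at: lastDeferredAt, seq });
    }
  });
  // Sort by start time; document order breaks ties.
  return runs
    .sort((p, q) => p.at - q.at || p.seq - q.seq)
    .map((run) => run.name);
}

// Scripts listed in document order.
const order = executionOrder(
  [
    { name: "a.js", mode: "defer", downloadMs: 300 },
    { name: "b.js", mode: "defer", downloadMs: 100 },
    { name: "c.js", mode: "async", downloadMs: 200 },
    { name: "d.js", mode: "async", downloadMs: 50 },
  ],
  150
);
// order is ["d.js", "c.js", "a.js", "b.js"]
```

The async scripts run in download order (d.js before c.js, the reverse of their order on the page). The deferred scripts keep their page order: b.js waits for a.js even though b.js finished downloading first.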

Dynamic loading of scripts
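
A script can also be loaded by creating a <script> element from code and inserting it into the page. The helper below is a minimal sketch of that common pattern, not a definitive implementation: the name `loadScript`, the callback signature, and the example URL are assumptions. The document is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Create a <script> element, point it at `src`, and append it to <head>.
// `onDone(error, element)` is called once the script has loaded or failed.
function loadScript(doc, src, onDone) {
  const script = doc.createElement("script");
  script.src = src;
  script.onload = () => onDone(null, script);
  script.onerror = () => onDone(new Error(`Failed to load ${src}`));
  // Downloading starts once the element is inserted into the document.
  doc.head.appendChild(script);
  return script;
}
```

In a page this would be called as loadScript(document, "https://example.com/widget.js", callback), where the URL is only a placeholder. A script inserted this way does not block parsing, and several scripts loaded like this are not guaranteed to run in insertion order.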

CSS blocks JS loading

A JS script may read the styles of DOM elements for its calculations. To ensure those calculations are correct, Firefox waits until all the style sheets that precede the script have been downloaded and parsed before executing it. WebKit suspends a script only when it finds that the script accesses a style, and resumes execution after the style sheet has been downloaded and parsed.

In addition, for resources from the same domain name, such as script files, style sheets, and image files, the browser generally downloads at most 6 to 20 resources at a time. In other words, it caps the number of TCP connections open at the same time, to avoid putting too much pressure on the server. Resources from different domain names are not subject to this per-domain limit, which is why static files are often placed under different domain names to speed up downloads.

Browser preparsing

Both WebKit and Firefox have made this optimization. While a script is executing, another thread parses the rest of the document to find the other resources that need to be loaded over the network and starts loading them. In this way, resources can be loaded over parallel connections, which increases overall speed. Note that the pre-parser does not modify the DOM tree; that job is left to the main parser. The pre-parser only resolves references to external resources, such as external scripts, style sheets, and images.
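
The idea of scanning ahead for resource references, without building any DOM, can be sketched with a regular expression. Real pre-parsers work on the token stream and handle many more cases; the function name `scanForResources`, the pattern, and the sample markup below are only an illustration.

```javascript
// Collect the URLs referenced by <script src>, <img src>, and <link href>.
// No DOM is built; the text is only scanned for references.
function scanForResources(html) {
  const pattern =
    /<(script|img)\b[^>]*?\bsrc=["']([^"']+)["']|<link\b[^>]*?\bhref=["']([^"']+)["']/gi;
  const urls = [];
  for (const match of html.matchAll(pattern)) {
    // Group 2 holds a src value, group 3 an href value.
    urls.push(match[2] || match[3]);
  }
  return urls;
}

const found = scanForResources(
  '<link rel="stylesheet" href="a.css">' +
    '<script src="b.js"></script>' +
    "<p>hi</p>" +
    '<img alt="x" src="c.png">'
);
// found is ["a.css", "b.js", "c.png"]
```

A browser could start fetching all three files right away, even while an earlier script is still running.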

So the browser’s rendering engine is only responsible for parsing HTML and CSS. When it comes to JS, it hands control over to the JS engine for parsing and execution. Because the JS engine takes rendering control away, JS obviously blocks DOM parsing, and browsers offer optimizations such as asynchronous loading and pre-parsing to reduce that blocking. Now that we’ve looked at the rendering engine, let’s look at the workflow of its JavaScript companion.

How Javascript engines work

First, let’s look at a few concepts that will help us better understand how JS code executes.

  • JavaScript engine: responsible for compiling and executing JavaScript programs from start to finish.
  • Compiler: responsible for parsing, code generation, and other dirty work. The specific work is shown in the figure below:

Now that we understand the basic concepts, let’s go through how the JavaScript engine works:

  1. A JavaScript engine resides in memory and waits for a host (such as a browser) to pass JavaScript code to it for execution.
  2. When the host passes JavaScript code to it, it hands the JavaScript source code to the compiler to be compiled into executable code.
  3. The engine then starts executing the executable code as follows:

The host (such as a browser) adds the macro tasks it initiates to the macro task queue. Whenever the call stack of the JS engine’s main thread is empty, the engine takes the next task from the macro task queue and executes it. When asynchronous code such as setTimeout is encountered during execution, it is first handed to the timer module, which adds the callback to the macro task queue once the timer expires. When code such as a Promise is encountered, its callback is added as a micro task to the micro task queue at the end of the current macro task. After the ordinary code in the current macro task has run, the tasks in that micro task queue are executed. Once the micro task queue is empty, the macro task is finished and the main thread takes the next macro task. Throughout this time, the host environment can add tasks to the macro task queue at any time, and the JS engine can add micro tasks to the micro task queue of the current macro task at any time. As shown, this forms the event loop.
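
The ordering described above can be observed directly with a few lines of code. The `log` array and the messages are only there for illustration; setTimeout and Promise are the standard APIs.

```javascript
const log = [];

log.push("script start");

// The timer callback becomes a new macro task once the timer expires.
setTimeout(() => log.push("timeout callback (macro task)"), 0);

// The promise callback becomes a micro task of the current macro task.
Promise.resolve().then(() => log.push("promise callback (micro task)"));

log.push("script end");

// Once everything has run, log contains, in this order:
//   script start
//   script end
//   promise callback (micro task)
//   timeout callback (macro task)
```

The promise callback runs before the timer callback even though the timer was registered first with a delay of 0. That is because micro tasks are drained at the end of the current macro task, before the next macro task is taken from the queue.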

If you want to learn more about how the JS engine works, I recommend the following article and video on how Chrome’s V8 engine works.

  • How does V8 work – V8’s JavaScript execution pipeline
  • Start with v8’s memory management algorithm – how to manage memory

Thank you for reading this far. Some parts may not be perfect, and I will keep iterating to improve them.

Closing thoughts

Finally, I would like to share my own feelings. I have always believed that learning the underlying principles is very important, and a course on problem solving that I took recently strengthened that belief. The first step in solving a problem is to clarify the problem, and knowing the principles helps us locate the problem quickly and then solve it. Einstein is often quoted as saying that if he had one hour to solve a problem on which his life depended, he would spend 55 minutes figuring out what the problem really was; once he knew exactly what was being asked, the remaining five minutes would be enough to answer it. The same is true in practical work: when a program goes wrong, we tend to spend most of our time debugging, and once the problem is found it can usually be fixed quickly.

Front-end engineers deal with the browser every day, and understanding how it works helps both with writing code and with optimizing a project’s performance. I have read many articles, books, and booklets about how browsers work, and I finally felt it was time to sort out what I had learned and write it down. I hope this deepens my own understanding and is helpful to my friends as well. If there is something wrong in this article, please point it out in the comments section. Finally, thank you for reading.
