
📢 Hello everyone, I am Xiao Cheng, a soon-to-be sophomore and front-end enthusiast

📢 This article walks you through how V8 executes JS code

📢 May you be true to yourself and love life

Introduction

The source code is first parsed by the parser into an AST, which the interpreter then compiles into bytecode

Let’s look at how the parser turns source code into an AST

First of all, what is an AST?

🍉 1. Generate an AST

An AST is an abstract representation of the syntactic structure of source code

It represents the syntactic structure of a program as a tree, with each node in the tree representing a construct in the source code

Let’s look at an example of how an AST is generated

let name = 'ljc';

Here we define a variable called name

The first step for the parser is lexical analysis: it splits the statement into its smallest, indivisible units, called tokens

This produces a token stream, which is an array of these syntax units
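To make the splitting step more concrete, here is a minimal sketch of a tokenizer. It is a toy that I wrote for illustration, not V8's actual scanner: the function name tokenize, the KEYWORDS set, and the regular expression are all my own assumptions, and it only understands identifiers, single-quoted strings, and the = and ; punctuators.

```javascript
// Toy tokenizer: a sketch only, not V8's scanner.
// The token type names follow the token stream used in this article.
const KEYWORDS = new Set(["let", "const", "var"]);

function tokenize(source) {
  const tokens = [];
  // Alternatives: identifier or keyword | single-quoted string | punctuator | whitespace
  const pattern = /([A-Za-z_$][\w$]*)|'([^']*)'|([=;])|(\s+)/y;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos; // sticky flag: match exactly at pos
    const match = pattern.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at position ${pos}`);
    }
    const [text, word, str, punct] = match;
    if (word !== undefined) {
      tokens.push({ type: KEYWORDS.has(word) ? "Keyword" : "Identifier", value: word });
    } else if (str !== undefined) {
      tokens.push({ type: "String", value: str });
    } else if (punct !== undefined) {
      tokens.push({ type: "Punctuator", value: punct });
    }
    // Whitespace matches none of the branches above and is simply skipped.
    pos += text.length;
  }
  return tokens;
}

console.log(tokenize("let name = 'ljc';"));
```

For the statement above, this yields five tokens: a keyword, an identifier, a punctuator, a string, and a final punctuator.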

[
    {
        "type": "Keyword",
        "value": "let"
    },
    {
        "type": "Identifier",
        "value": "name"
    },
    {
        "type": "Punctuator",
        "value": "="
    },
    {
        "type": "String",
        "value": "ljc"
    },
    {
        "type": "Punctuator",
        "value": ";"
    }
]

The second step is syntax analysis (parsing)

The parser converts the token stream from the previous step into an AST, which gives us a tree structure

Hence the name abstract syntax tree

Besides generating the AST, V8 also generates the corresponding scopes, which record the variables declared in them
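The second step can be sketched in the same spirit. The toy parser below only accepts statements of the form let <identifier> = <string>; and returns both a tree and a very simple scope record. The node names follow the ESTree convention used by many JavaScript tools, which is an assumption on my part: V8's internal AST and scope objects look different. The function name parseLetDeclaration and the shape of scope are also made up for this example.

```javascript
// Toy parser for statements of the form: let <identifier> = <string>;
// Node names follow the ESTree convention; this is not V8's internal AST.
function parseLetDeclaration(tokens) {
  let index = 0;

  // Consume the next token, checking its type (and its value, when given).
  function expect(type, value) {
    const token = tokens[index++];
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      throw new SyntaxError(`Expected ${value ?? type} at token ${index - 1}`);
    }
    return token;
  }

  expect("Keyword", "let");
  const id = expect("Identifier");
  expect("Punctuator", "=");
  const init = expect("String");
  expect("Punctuator", ";");

  const ast = {
    type: "VariableDeclaration",
    kind: "let",
    declarations: [{
      type: "VariableDeclarator",
      id: { type: "Identifier", name: id.value },
      init: { type: "Literal", value: init.value },
    }],
  };
  // Scope sketch: remember which names are declared here and how.
  const scope = { declared: new Map([[id.value, { kind: "let" }]]) };
  return { ast, scope };
}

const tokens = [
  { type: "Keyword", value: "let" },
  { type: "Identifier", value: "name" },
  { type: "Punctuator", value: "=" },
  { type: "String", value: "ljc" },
  { type: "Punctuator", value: ";" },
];
const { ast, scope } = parseLetDeclaration(tokens);
console.log(JSON.stringify(ast, null, 2));
console.log([...scope.declared.keys()]);
```

The tree has one VariableDeclaration node with a single declarator underneath it, and the scope records that name was declared with let.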

🍏 2. Generate bytecode

Once you have the AST and scopes, you can generate bytecode. Bytecode is an intermediate code between the AST and machine code: the interpreter can execute it directly, without first converting it to machine code. It can be understood as an abstraction of machine code. The Ignition interpreter both generates this unoptimized bytecode quickly and executes it.
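To get a feel for what "generate bytecode, then interpret it" means, here is a toy compiler and interpreter for the let declaration from earlier. This is only a sketch under my own assumptions: the opcode names LdaConstant and Star are borrowed loosely from Ignition's accumulator-style naming, but the instruction encoding, the compile and interpret functions, and the register allocation are invented for this example and are not how V8 really does it.

```javascript
// Toy bytecode compiler and interpreter: an illustration, not Ignition itself.
// The instruction encoding here is made up.
function compile(ast) {
  const constants = [];          // constant pool
  const registers = new Map();   // variable name -> register index
  const code = [];
  for (const decl of ast.declarations) {
    constants.push(decl.init.value);
    // accumulator = constants[operand]
    code.push({ op: "LdaConstant", operand: constants.length - 1 });
    registers.set(decl.id.name, registers.size);
    // registers[operand] = accumulator
    code.push({ op: "Star", operand: registers.get(decl.id.name) });
  }
  return { code, constants, registers };
}

function interpret({ code, constants, registers }) {
  let accumulator;
  const regs = [];
  for (const { op, operand } of code) {
    switch (op) {
      case "LdaConstant": accumulator = constants[operand]; break;
      case "Star": regs[operand] = accumulator; break;
      default: throw new Error(`Unknown opcode ${op}`);
    }
  }
  // Map register contents back to variable names for display.
  return Object.fromEntries([...registers].map(([name, r]) => [name, regs[r]]));
}

// Hand-written AST for: let name = 'ljc';
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [{
    type: "VariableDeclarator",
    id: { type: "Identifier", name: "name" },
    init: { type: "Literal", value: "ljc" },
  }],
};

const program = compile(ast);
console.log(program.code);
console.log(interpret(program)); // { name: 'ljc' }
```

The point of the sketch is that the two instructions are much more compact than the tree, yet the interpreter can run them right away with a simple loop.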

So why generate bytecode? Wouldn’t it be better to just convert to machine code?

  • Converting directly creates a memory problem: if the whole abstract syntax tree were compiled to machine code, that machine code would take up much more memory than bytecode

[Image: a comparison taken from the Internet]

  • For some JavaScript usage scenarios, interpreting bytecode is the better fit: code that does not need it never gets compiled to machine code, which keeps the memory footprint as small as possible

🍒 3. Code execution and optimization

The bytecode generated in the previous step is executed directly by the interpreter. As the code runs, the interpreter collects a lot of information that can be used to optimize it, such as the types of variables and which functions are executed most often. This information is passed to a compiler called TurboFan, which uses it together with the bytecode to compile optimized machine code.

V8 applies several optimization strategies

  1. A function that is declared but never called is not fully parsed into an AST
  2. A function that is called only once has its bytecode interpreted directly
  3. A function that is called many times may be marked as a hot function and compiled into machine code
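The three cases above can be modeled with a small wrapper that does nothing until the first call and counts calls afterwards. Everything here is hypothetical: HOT_THRESHOLD, the tier names, and makeTieredFunction are made up for illustration, and V8's real heuristics are far more involved than a simple counter.

```javascript
// Toy model of the three cases above. HOT_THRESHOLD and the tier names
// are invented for illustration; V8's real heuristics are more involved.
const HOT_THRESHOLD = 3;

function makeTieredFunction(source) {
  let compiled = null;     // nothing is compiled until the first call
  let callCount = 0;
  let tier = "unparsed";

  return {
    call(...args) {
      if (compiled === null) {
        // Stands in for "parse and generate bytecode" on the first call.
        compiled = new Function(`return (${source});`)();
        tier = "interpreted";
      }
      callCount += 1;
      if (callCount >= HOT_THRESHOLD) {
        // Marked hot: a real engine would compile machine code here.
        tier = "optimized";
      }
      return compiled(...args);
    },
    get tier() { return tier; },
  };
}

const never = makeTieredFunction("(a, b) => a + b");
const once = makeTieredFunction("(a, b) => a + b");
const hot = makeTieredFunction("(a, b) => a + b");

once.call(1, 2);
for (let i = 0; i < 5; i++) hot.call(i, i);

console.log(never.tier, once.tier, hot.tier); // unparsed interpreted optimized
```

The function that is never called costs nothing beyond keeping its source text, which is the idea behind the first strategy.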

About hot functions

The TurboFan compiler compiles hot code into more efficient machine code and stores it. The next time the code runs, the stored machine code is executed in place of the bytecode, which greatly improves execution efficiency. When TurboFan decides that a piece of code is no longer hot, it also deoptimizes it: the optimized machine code is thrown away and execution goes back to the interpreter.

Sometimes the information the interpreter collected turns out to be wrong, in which case the machine code that TurboFan generated has to be deoptimized back to bytecode

For example: suppose we define a sum function and call it many times, always passing two integers. The sum function is marked as a hot function, the interpreter sends the type information it has collected to the compiler, and the compiler generates optimized machine code that assumes the arguments are integers. On the next call, that machine code is executed directly.

But if the next call passes a string, the machine code does not know how to handle it, so V8 deoptimizes and goes back to the interpreter to execute the bytecode

Therefore, we should try not to switch a variable's type back and forth, because that hurts the V8 engine's performance
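The sum scenario can be imitated in plain JavaScript with a type guard in front of a fast path. This is a simulation that I wrote to show the idea, not what TurboFan actually emits: makeSpeculativeSum, sumGeneric, and the optimized flag are all hypothetical names, and a real engine makes this decision on machine code rather than on a boolean.

```javascript
// Toy model of speculative optimization with a type guard. This is a
// simulation in plain JavaScript, not what TurboFan actually emits.
function sumGeneric(a, b) {
  return a + b; // generic path: handles numbers, strings, and everything else
}

function makeSpeculativeSum() {
  let optimized = true; // pretend sum was already compiled for integers
  let deoptCount = 0;

  function sum(a, b) {
    if (optimized) {
      if (Number.isInteger(a) && Number.isInteger(b)) {
        return a + b;     // fast path: the assumption "both are integers" holds
      }
      optimized = false;  // guard failed: discard the specialized code
      deoptCount += 1;
    }
    return sumGeneric(a, b); // fall back to the generic path
  }

  return { sum, stats: () => ({ optimized, deoptCount }) };
}

const { sum, stats } = makeSpeculativeSum();
console.log(sum(1, 2));     // 3
console.log(stats());       // { optimized: true, deoptCount: 0 }
console.log(sum("1", "2")); // 12
console.log(stats());       // { optimized: false, deoptCount: 1 }
```

If you want to watch the real thing, running a similar script with node --trace-opt --trace-deopt should print V8's own optimization and deoptimization messages, although what exactly gets logged depends on the Node.js and V8 version.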


This is how V8 executes JS code

I found a picture on the Internet (I will delete it if it infringes) that illustrates this very vividly; unfortunately, I could not upload my own Excalidraw drawing


Thank you very much for reading. Comments are welcome, and if you spot any problems, please point them out. Thank you! 🎈