Introduction | This article introduces some basic concepts: compilation and interpretation, static and dynamic languages, and the basic workflow of the V8 engine. It explains each in turn, in the hope of offering useful experience and help to other developers.

I. Compilation and interpretation

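Source code must be translated into binary instructions, that is, machine code, before the CPU can execute it. There are two ways to do this: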

  • Compilation: The process of converting the source code into object code all at once. The program that performs compilation is called a compiler.
  • Interpretation: The process of converting source code into object code and running it line by line. The program that performs interpretation is called an interpreter. Interpreters are generally virtual machines (VMs), of which there are two types: stack-based and register-based.
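
To make the idea of a stack-based virtual machine more concrete, here is a minimal sketch. The opcode names (PUSH, ADD, MUL) and the runStackVM function are invented for illustration and do not correspond to the instruction set of any real VM.

```javascript
// Toy stack-based VM. Opcodes and names are hypothetical, for illustration only.
function runStackVM(program) {
  const stack = [];
  for (const [op, arg] of program) {
    switch (op) {
      case 'PUSH':
        stack.push(arg); // operands live on the stack
        break;
      case 'ADD': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case 'MUL': {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      default:
        throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return stack.pop(); // the result is whatever is left on top
}

// (2 + 3) * 4 expressed as stack instructions
const program = [['PUSH', 2], ['PUSH', 3], ['ADD'], ['PUSH', 4], ['MUL']];
const vmResult = runStackVM(program);
console.log(vmResult); // 20
```

A register-based VM would instead name its operands explicitly in each instruction, which tends to need fewer instructions at the cost of larger ones.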

Compilation roughly consists of five steps: lexical analysis, syntax analysis, semantic analysis, optimization, and code generation (producing the executable file). It involves complex algorithms and knowledge of the hardware architecture. An interpreter goes through similar steps.
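
As a rough illustration of the first step, lexical analysis, the sketch below splits a small expression into tokens. The tokenize function and the token type names are assumptions made for this example, not those of any real compiler, and only numbers, identifiers, and a few punctuators are handled.

```javascript
// Toy lexer. Function name and token types are hypothetical.
function tokenize(source) {
  const tokens = [];
  // Sticky regex: optional whitespace, then a number, an identifier, or a punctuator.
  const pattern = /\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_$][\w$]*)|([+\-*\/()=;]))/y;
  const text = source.trimEnd();
  while (pattern.lastIndex < text.length) {
    const start = pattern.lastIndex;
    const match = pattern.exec(text);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at or after index ${start}`);
    }
    if (match[1] !== undefined) {
      tokens.push({ type: 'Number', value: match[1] });
    } else if (match[2] !== undefined) {
      tokens.push({ type: 'Identifier', value: match[2] });
    } else {
      tokens.push({ type: 'Punctuator', value: match[3] });
    }
  }
  return tokens;
}

const tokens = tokenize('total = price * 2');
console.log(tokens.map((t) => `${t.type}(${t.value})`).join(' '));
// Identifier(total) Punctuator(=) Identifier(price) Punctuator(*) Number(2)
```

Syntax analysis then consumes this flat token list and builds a tree from it.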

II. Static and dynamic languages

High-level languages can be divided into static and dynamic languages according to how they are executed.

Static languages: languages that are executed by compilation, such as C, C++, and Go. The compiler generates the object code once, and it can then be run any number of times ("compile once, run infinite times"), so programs run faster. Compiled programs are generally not cross-platform: the object code targets a specific operating system and CPU architecture, so it cannot be moved to another platform without recompiling.

Dynamic languages: languages that are executed by interpretation, such as Python, JavaScript, and PHP. Execution requires the source code. As long as an interpreter is available, the source code can run on any operating system, so portability is good: "write once, run anywhere".

Interpreted languages are cross-platform because the interpreter acts as a middle layer. It translates the same source code into different machine code on different platforms, masking the differences between them.

Java and C# are somewhat unusual: they are half-compiled, half-interpreted languages. The source code is first compiled into an intermediate file (a bytecode file), which is then executed by a virtual machine. Java led the way, aiming to be both cross-platform and efficient. C# followed later, but it long remained tied to Windows and saw little adoption on other platforms.

Conclusion:

III. V8 engine

JavaScript is an interpreted language, so the V8 engine plays the role of the interpreter. However, to make JavaScript run faster, V8 also compiles code.

V8 works in two stages: compilation, in which V8 converts JavaScript into bytecode or binary machine code, and execution, in which the interpreter interprets the bytecode or the CPU executes the binary machine code.

(1) JIT

The V8 engine uses both interpretation and compilation. Code is compiled at run time, which is called JIT (just-in-time) compilation.

When V8 executes JavaScript source code, the parser first parses the source into an AST. The interpreter, Ignition, then translates the AST into bytecode and interprets it.

The interpreter also records how many times each piece of code is executed. If the count exceeds a certain threshold, the code is marked as hot code. This information is fed back to the optimizing compiler, TurboFan, which compiles the bytecode into optimized machine code.
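
The counting-and-promotion idea can be sketched in a few lines. This is only a toy model: HOT_THRESHOLD, makeTiered, and the tier names are hypothetical, and V8's real heuristics take much more into account than a simple call counter.

```javascript
// Toy model of tiered execution. All names and the threshold are hypothetical.
const HOT_THRESHOLD = 3;

function makeTiered(baseline, optimized) {
  let calls = 0;
  let tier = 'interpreter';

  function run(...args) {
    if (tier === 'interpreter') {
      calls += 1;
      if (calls >= HOT_THRESHOLD) {
        tier = 'optimized'; // the code is now considered "hot"
      }
      return baseline(...args);
    }
    return optimized(...args);
  }
  run.currentTier = () => tier;
  return run;
}

// Slow and fast versions that compute the same result for non-negative integers.
const square = makeTiered(
  (n) => { let total = 0; for (let i = 0; i < n; i++) total += n; return total; },
  (n) => n * n
);

const tiersSeen = [];
const results = [];
for (let i = 0; i < 5; i++) {
  tiersSeen.push(square.currentTier());
  results.push(square(4));
}
console.log(tiersSeen.join(', '));
// interpreter, interpreter, interpreter, optimized, optimized
console.log(results.join(', '));
// 16, 16, 16, 16, 16
```

The caller never notices the switch: both tiers must produce the same result, and only the speed differs.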

(2) Parser generates the abstract syntax tree

The parser generates the abstract syntax tree (AST) through lexical analysis and syntax analysis, much as tools such as Babel do.
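
To show what such a tree looks like, the sketch below parses a tiny arithmetic expression into nested objects. The node shapes loosely follow the ESTree style, and parseExpression is a hypothetical helper that handles only integers, +, *, and parentheses. It is not how V8's parser is implemented.

```javascript
// Toy recursive-descent parser. Names are hypothetical; characters other than
// digits, +, * and parentheses are silently ignored by the simple tokenizer.
function parseExpression(source) {
  const tokens = source.match(/\d+|[+*()]/g) || [];
  let pos = 0;

  function parsePrimary() {
    const token = tokens[pos++];
    if (token === '(') {
      const inner = parseSum();
      if (tokens[pos++] !== ')') throw new SyntaxError('Expected ")"');
      return inner;
    }
    if (token === undefined || !/^\d+$/.test(token)) {
      throw new SyntaxError(`Unexpected token: ${token}`);
    }
    return { type: 'Literal', value: Number(token) };
  }

  // "*" binds tighter than "+", so products are parsed below sums.
  function parseProduct() {
    let left = parsePrimary();
    while (tokens[pos] === '*') {
      pos++;
      left = { type: 'BinaryExpression', operator: '*', left, right: parsePrimary() };
    }
    return left;
  }

  function parseSum() {
    let left = parseProduct();
    while (tokens[pos] === '+') {
      pos++;
      left = { type: 'BinaryExpression', operator: '+', left, right: parseProduct() };
    }
    return left;
  }

  const tree = parseSum();
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token: ${tokens[pos]}`);
  return tree;
}

const ast = parseExpression('1 + 2 * 3');
console.log(JSON.stringify(ast, null, 2));
```

For 1 + 2 * 3 the root is a + node whose right child is the * node, so operator precedence is encoded in the shape of the tree itself.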

One optimization in generating the AST is lazy parsing: fully parsing all of the source code before execution would not only delay the start of execution but also consume more memory.

Lazy parsing means that a function that is not executed immediately is only pre-parsed, and is fully parsed only when it is called. The pre-parser validates the function's syntax, parses the function declaration, and determines the function's scope, but does not generate an AST.
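
The deferral itself can be imitated in plain JavaScript. In the sketch below, makeLazyFunction is a hypothetical helper that keeps a function body as text and uses the Function constructor as a stand-in for the full parse. It models only the "parse on first call" behavior, not the syntax and scope checks that the pre-parser performs.

```javascript
// Toy model of lazy parsing. makeLazyFunction is a hypothetical helper.
function makeLazyFunction(paramNames, bodySource) {
  let compiled = null; // nothing has been fully parsed yet
  let fullParses = 0;

  function call(...args) {
    if (compiled === null) {
      // The "full parse" happens here, on the first call only.
      compiled = new Function(...paramNames, bodySource);
      fullParses += 1;
    }
    return compiled(...args);
  }
  call.fullParseCount = () => fullParses;
  return call;
}

const lazyAdd = makeLazyFunction(['a', 'b'], 'return a + b;');
const parsesBeforeCall = lazyAdd.fullParseCount();
const firstResult = lazyAdd(2, 4);
const secondResult = lazyAdd(10, 5);
const parsesAfterCalls = lazyAdd.fullParseCount();

console.log(parsesBeforeCall); // 0: declared but never called
console.log(firstResult);      // 6
console.log(secondResult);     // 15
console.log(parsesAfterCalls); // 1: parsed once, on the first call
```

A function that is declared but never called costs only the cheap bookkeeping, which is the saving lazy parsing is after.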

(3) Ignition generates bytecode

Bytecode is an abstraction of machine code and can be viewed as a set of small building blocks. Compared with machine code, bytecode not only takes less memory but is also faster to generate, which improves startup speed.

In addition, bytecode does not depend on any particular CPU's machine code; it is the interpreter that maps it to machine code, which makes it easier to port V8 to different CPU architectures.

You can run the following command to view the bytecode generated for your JavaScript code:

node --print-bytecode index.js

Note that the CPU does not execute bytecode directly. The interpreter carries out each bytecode instruction by running machine code that implements it, because the computer only understands machine code.

(4) TurboFan

Ignition executes the bytecode generated in the previous step and records information such as how many times the code has run. If the same code has run many times, it is marked as hot code and sent to the compiler, TurboFan.

TurboFan compiles it into more efficient machine code and stores the result. The next time that code is executed, the stored machine code runs in place of the bytecode, which makes execution much faster.

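In addition, optimized code can be abandoned. When an assumption made during optimization no longer holds (for example, a function that has always received numbers is suddenly called with a string), V8 performs deoptimization: it throws away the optimized machine code and returns to Ignition.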
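
One common reason for falling back to the interpreter is a change in the types a function sees. The sketch below sets up such a situation. Whether V8 actually optimizes and then deoptimizes this code depends on its version and internal heuristics, so treat it as an experiment rather than a guarantee. Running it with node --trace-opt --trace-deopt prints V8's own log of these decisions.

```javascript
// Sketch of a type change that may trigger deoptimization.
// Whether it really does depends on the V8 version and its heuristics.
function add(a, b) {
  return a + b;
}

let sum = 0;
for (let i = 0; i < 100000; i++) {
  sum = add(sum, 1); // always numbers: a good candidate for optimization
}
console.log(sum); // 100000

// A call with strings breaks the "numbers only" pattern seen so far.
const joined = add('V', '8');
console.log(joined); // V8
```

The program's output is the same whether or not the optimized code is discarded. Deoptimization affects speed, never results.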

TurboFan's optimizations include inlining and escape analysis.

Inlining replaces a call to a function with the body of that function, which removes the call overhead and reduces running time. For example:

function add(a, b) {
  return a + b
}
function foo() {
  return add(2, 4)
}

After inlining:

function fooAddInlined() {
  var a = 2
  var b = 4
  var addReturnValue = a + b
  return addReturnValue
}

// Because a and b are constants, this can be reduced further to:
function fooAddInlined() {
  return 6
}

Escape analysis determines whether an object's lifetime is confined to the current function; if it is, the object can be optimized away. For example:

function add(a, b){
  const obj = { x: a, y: b }
  return obj.x + obj.y
}
 

Because obj never leaves the function, this can be optimized into:

function add(a, b){
  const obj_x = a
  const obj_y = b
  return obj_x + obj_y
}

IV. Overall process

Author’s brief introduction

Yang Guowang

Tencent front-end development engineer.