1. Introduction

This week's intensive reading covers the V8 article on lazy parsing, which shows what the V8 engine does to optimize parsing performance.

The optimization described in the article is the preparser: it improves performance by skipping the compilation of functions that are not needed yet.

2. Overview & Intensive reading

Parsing JS sits on the critical path of page startup, so faster parsing means a faster page.

However, not all JS needs to run at initialization, so there is no need to parse all of it up front. Compiling JS unnecessarily has three costs:

  1. Compiling unneeded code uses CPU time and delays the code that startup actually needs.
  2. The compiled code occupies memory until it is garbage-collected.
  3. The compiled code is cached on disk and takes up disk space.

As a result, all major browsers implement lazy parsing: functions that are not needed right away are only preparsed, and a function is fully parsed only when it is actually called.

Pre-parsing challenges

On the surface, preparsing is simple: the parser only has to decide whether a function will execute immediately, and only functions that execute immediately need a full parse.

What complicates preparsing is variable allocation. The article uses the call stack to illustrate why.

JS code executes on a stack, as with the following functions:

function f(a, b) {
  const c = a + b;
  return c;
}

function g() {
  return f(1, 2);
  // The return instruction pointer of `f` now points here
  // (because when `f` `return`s, it returns here).
}

The call stack looks like this:

First the receiver (the this value of f, here globalThis) is pushed onto the stack, then the called function f, then the arguments a and b. When f starts executing, the position in g to resume at is saved as the return instruction pointer, and the position of the caller's frame on the stack is saved as the frame pointer. Finally, the local variable c is allocated.

This causes no trouble as long as the values live on the stack. Things change when an inner function references a variable declared in an outer function:

function make_f(d) { // ← declaration of 'd'
  return function inner(a, b) {
    const c = a + b + d; // ← reference to 'd'
    return c;
  };
}

const f = make_f(10);

function g() {
  return f(1, 2);
}

Here d is declared in make_f and used in the returned function inner. By the time f is called, make_f has already returned and its stack frame is gone, so d cannot live on the stack.

We need to create a context to store the value of d so that f can still read it.

That is, if a variable defined inside a function is used by an inner scope, the JS engine needs to recognize the situation and store the value of the variable in a context.

So for each variable and parameter declared in a function, we need to know whether it is referenced by an inner function. In other words, the preparser has to work out, with as little analysis as possible, which variables are referenced by inner functions.
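The effect of context allocation can be observed from plain JS. The following is a minimal sketch; make_counter and count are hypothetical names chosen for illustration and do not come from the article. It shows a variable that is still readable and writable after the function that declared it has returned, which is only possible if the value is kept somewhere other than that function's stack frame.

```javascript
// Hypothetical example: `count` is referenced by inner functions,
// so it has to outlive the call to make_counter.
function make_counter(start) {
  let count = start; // captured by the two methods below
  return {
    increment() {
      count += 1;
      return count;
    },
    current() {
      return count;
    },
  };
}

// make_counter has returned here, yet `count` is still alive.
const counter = make_counter(10);
counter.increment();
counter.increment();
console.log(counter.current()); // 12
```

Both methods see the same count, so the value is shared through one context rather than copied into each closure.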

Indistinguishable references

Tracking variable declarations and references in the preparser is complicated because JS syntax is ambiguous: the meaning of a partial expression often cannot be determined until the rest has been read. Take the following function:

function f(d) {
  function g() {
    const a = ({ d }

At this point we don't know whether the d on line 3 refers to the d on line 1. It might:

function f(d) {
  function g() {
    const a = ({ d } = { d: 42 });
    return a;
  }
  return g;
}

It may also be the parameter of an arrow function, unrelated to the d above:

function f(d) {
  function g() {
    const a = ({ d }) => d;
    return a;
  }

  return [d, g];
}
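Running both completions side by side makes the difference visible. This is a small sketch; the function names assigns and shadows are made up for the demonstration. In the first case the inner function writes to the outer d, so d must be context-allocated; in the second the inner d is a fresh parameter and the outer d is untouched.

```javascript
// Case 1: `{ d } = ...` is a destructuring assignment to the outer `d`.
function assigns(d) {
  function g() {
    const a = ({ d } = { d: 42 }); // writes 42 into the outer parameter
    return a;
  }
  g();
  return d;
}

// Case 2: `({ d }) => d` declares a new parameter that shadows the outer `d`.
function shadows(d) {
  function g() {
    const a = ({ d }) => d;
    return a;
  }
  const arrow = g();
  return [d, arrow({ d: 'inner' })];
}

console.log(assigns(1)); // 42
console.log(shadows(1)); // [ 1, 'inner' ]
```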

Lazy parsing

When a script runs, only the outermost functions that are called are fully parsed into an AST and compiled; inner functions are only preparsed.

// This is the top-level scope.
function outer() {
  // preparsed
  function inner() {
    // preparsed
  }
}

outer(); // Fully parses and compiles `outer`, but not `inner`.

To allow functions to be compiled lazily, the context pointer points to a ScopeInfo object (as the V8 source shows, ScopeInfo holds information about the context, such as whether it has a function name and whether it is inside a function). When an inner function is compiled later, ScopeInfo provides what is needed to continue compiling it and its subfunctions.

But to decide whether a lazily compiled function itself needs a context, its inner functions have to be preparsed again: we need to know, for example, whether a subfunction references a variable defined by the outer function.

This results in a recursive traversal.

Because real code always contains some nesting, and build tools emit deeply nested expressions such as IIFEs (immediately invoked function expressions), this recursion performs poorly.

There is a way to reduce this to linear time: the preparser serializes the locations of variable allocations into a dense array. When a function is later parsed lazily, its variables are recreated in the original order and the recorded data is applied to them, so the parser can skip inner functions instead of recursively preparsing every subfunction just because one of them might reference an outer variable.

Optimized this way, the time complexity is linear.
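A toy model makes the difference in cost concrete. This is a sketch under stated assumptions, not V8's implementation: functions are modelled as plain objects, cost is counted as the number of function bodies scanned, every function is assumed to be called once, and the names makeChain, scanSubtree, naiveCost and skippingCost are invented for the illustration.

```javascript
// Builds f0 { f1 { f2 { ... } } } as nested objects (hypothetical model).
function makeChain(depth) {
  let node = null;
  for (let i = depth - 1; i >= 0; i--) {
    node = { name: `f${i}`, inner: node ? [node] : [] };
  }
  return node;
}

// Scans a function and everything nested in it;
// returns the number of function bodies visited.
function scanSubtree(fn) {
  let scanned = 1;
  for (const child of fn.inner) scanned += scanSubtree(child);
  return scanned;
}

// Without saved data: compiling a function scans its own body
// and preparses all of its nested functions again.
function naiveCost(root) {
  let cost = scanSubtree(root); // initial preparse of everything
  const pending = [root];
  while (pending.length > 0) {
    const fn = pending.pop();
    cost += scanSubtree(fn);
    pending.push(...fn.inner);
  }
  return cost;
}

// With saved data: the initial preparse records what is needed,
// so compiling a function scans only its own body.
function skippingCost(root) {
  let cost = scanSubtree(root); // initial preparse of everything
  const pending = [root];
  while (pending.length > 0) {
    const fn = pending.pop();
    cost += 1; // inner functions are skipped
    pending.push(...fn.inner);
  }
  return cost;
}

console.log(naiveCost(makeChain(4)), skippingCost(makeChain(4))); // 14 8
console.log(naiveCost(makeChain(100)), skippingCost(makeChain(100))); // 5150 200
```

In this model a chain of n nested functions costs n + n(n+1)/2 scans when inner functions are re-preparsed and 2n scans when they are skipped.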

Optimization for modular packaging

Since modern code is almost always written as modules, build tools wrap module code in an IIFE (a closure that is invoked immediately) during packaging to simulate a module environment, for example (function () {… })().

This code looks as if it should be compiled lazily, but module wrappers run right away, so they should be compiled eagerly from the start. Otherwise they are preparsed first and then fully parsed, which hurts performance. V8 therefore uses two heuristics to identify functions that are likely to be called immediately:

  1. If the function is wrapped in parentheses, for example (function(){...}), V8 assumes it will be called immediately.
  2. Since V8 v5.7 / Chrome 57, V8 also recognizes the pattern !function(){...}(),function(){...}(),function(){...}() generated by UglifyJS.
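For reference, the two source shapes described in the list look like this in runnable form. The snippet is only an illustration; the order array is a hypothetical helper used to show that all three functions run immediately. How V8 treats them internally cannot be observed from script code.

```javascript
const order = [];

// Heuristic 1: a function expression wrapped in parentheses.
(function () {
  order.push('parenthesized');
})();

// Heuristic 2: the UglifyJS-style chain of `!function(){}()`
// followed by comma-separated `function(){}()` calls.
!function () {
  order.push('bang');
}(), function () {
  order.push('comma');
}();

console.log(order.join(', ')); // parenthesized, bang, comma
```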

However, a browser engine's parsing environment is complex and full pattern matching on function source is impractical, so only a simple check at the start of the function is made. As a result, browsers do not recognize an anonymous function like the following as immediately executed:

// pre-parser
function run(func) {
  func()
}

run(function(){}) // Execute it here, full Parser

The code above looks fine, but because the browser only detects functions wrapped in parentheses, the anonymous function is not treated as immediately executed. It is preparsed first and then fully parsed again when it runs.

Some code transformation tools help V8 recognize such functions, for example optimize-js, which transforms code as follows.

Before conversion:

!function (){}()
function runIt(fun){ fun() }
runIt(function (){})

After the transformation:

!(function (){})()
function runIt(fun){ fun() }
runIt((function (){}))

However, V8 v7.5+ has largely solved this problem, so the optimize-js library is no longer needed.

4. Summary

JS engines do a lot of work to optimize parsing performance, but they also have to handle the IIFE closures generated by build tools so that immediately executed closures are not parsed twice.

Finally, don't wrap every function in parentheses: doing so forces eager compilation everywhere and defeats lazy compilation.

Parsing V8 Engine Lazy Parsing · Issue #148 · DT-fe /weekly

If you'd like to join the discussion, see the issue above. A new topic is published every week, on weekends or Mondays. Front-end Intensive Reading: helping you filter the right content.


Copyright Notice: Freely reproduced – Non-commercial – Non-derivative – Remain signed (Creative Commons 3.0 License)