Chen Jinhong, front end engineer of Micro Medical Front End Technology Department, a girl who likes to contact with nature!

JavaScript is a programming language with automatic garbage collection. The execution environment is responsible for managing memory while the code is executing. When running JavaScript, you need an engine to handle it, either in a browser or in a Node.js environment. And it must be the perfect choice for you. Let’s take a look at V8’s efficient garbage collection.

How is junk data generated

  • First, let’s look at how junk data is generated. Look at the following code:
 let test = new Object()
 test.a = new Array(10)
Copy the code

When JavaScript executes this code, it first adds a test variable to the global scope and creates an empty object in the heap with its address pointing to Test.

We then create an array of size 10 and point the property address to test.a. The memory layout is as follows:

If, at this point, I assign another object to a, the code looks like this:

test.a = new Object()

The memory layout is as follows:

We can see that the a property, which was referring to an Array object in the heap, now points to another empty object, and then the Array object in the heap is garbage, because we can’t walk from a root object to the Array object, and then the Array object is garbage.

Garbage collection algorithm

The implementation of garbage collection can be divided into the following three steps:

Step 1: Accessibility

From the GC Roots object, traverse all objects in GC Root:

  • Reachable objects: Objects that are iterated through by GC Root are considered reachable, and must be guaranteed to remain in memory.
  • Unreachable objects: Objects that are not traversed by GC Roots are unreachable and tagged, and may be reclaimed.

In a browser environment, there are many GC roots, usually including (but not limited to) the following: the global Window object (located in each iframe); The document DOM tree, which consists of all the native DOM nodes that can be reached by traversing the document; Store variables on the stack.

Step 2: Reclaim memory occupied by unreachable objects

  • All objects marked as recyclable in memory are cleaned uniformly after all tags are completed.

Step 3: Memory defragment

  • After objects are reclaimed frequently, there will be a large amount of discontinuous space in memory, which is called memory fragmentation. When you have a large amount of memory fragmentation, you run out of memory if you need to allocate large contiguous memory, so the last step is to defragment the memory. However, this step is not necessary, for example, the paralgarbage collector we will introduce next does not generate memory fragmentation.

This is the general garbage collection process. V8 currently uses two garbage collectors, the main garbage collector and the secondary garbage collector. Let’s take a look at how the two collectors collect garbage in detail.

Secondary garbage collector and primary garbage collector

  • In V8, the heap is divided into new generation (which typically supports only 1-8m capacity) and old generation (which holds short-lived objects), and old generation (which holds long-lived objects).

Secondary garbage collector

  • Responsible for the garbage collection of the new generation, most of the small objects will be allocated to the new generation, garbage collection is relatively frequent.
  • Insane garbage data from the new generation is processed using the Scavenge algorithm. Divided into two areas: object area, free area. As shown below:

  • Garbage collection process:

    New objects are stored in the object area. When the object area is nearly full, a garbage cleanup operation is performed.

    • 1. Garbage marking and cleaning: first to mark the garbage in the object area; After the marking is complete, the garbage cleaning phase is entered, as shown below:

> < span style = "max-width: 100%; clear: both; min-height: 1em;Copy the code
  • 2. Role reversal: After the replication, perform role reversal. Change the original object area to free area, and change the original free area to object area, as shown below:

Main garbage collector

  • Is responsible for garbage collection in the old generation. Most of the objects that occupy large space and live long will be allocated to the old generation.

  • Garbage data in the old generation is collected by — mark-clean algorithm. Because objects in the old generation are usually large, it is time-consuming to copy large objects, which will lead to low efficiency of collection execution, so the mark-clean method is adopted.

  • Garbage collection process:

    • 1. Marking: The marking stage is to start from a group of root elements and recursively traverse this group of root elements. In this traversal process, the elements that can be reached are called active objects, and the elements that cannot be reached can be judged as junk data.

    • 2. Garbage collection: It is completely different from the garbage collection process of the secondary garbage collector. The main garbage collector will directly clean up the data marked as garbage, as shown below:

  • 3. Collation: As can be seen from the figure above, a large number of discontinuous memory fragments will be generated after clearing. Excessive fragments will lead to large objects unable to allocate enough continuous memory, so another algorithm — mark-collation needs to be introduced.

Optimize the garbage collector

  • Because JavaScript runs on the main thread, the execution of JavaScript scripts will be blocked during garbage collection, resulting in page stuttering and other problems, resulting in poor user experience.

  • To address these issues, the V8 team introduced parallel, concurrent, and incremental garbage collection technologies that address garbage collection efficiency in two main ways:

    • 1. Divide a complete garbage collection task into several smaller tasks to solve the problem of long garbage collection time.
    • 2. Transfer tasks such as marking objects and moving objects to background threads to reduce the time of the main blocking thread.

Next, let’s look at how to optimize the specific technologies.

Parallel recovery

  • If there is only one main thread for garbage collection, the pause time is too long. Therefore, the V8 team introduced the main garbage collection task to introduce multiple helper threads in parallel to speed up the garbage collection execution, as shown in the following figure:

  • The subgarbage collector adopts the parallel strategy. During the garbage collection process, it starts multiple threads to clean up the garbage in the new generation, and these threads simultaneously move the data in the object space to the free area. Because the address of the data has changed, Pointers referencing these objects also need to be updated synchronously

The incremental recovery

  • Although parallel collection can increase the efficiency of garbage collection, it is still a blocking way to garbage collection. Imagine that there is a large object in the old generation, but it will still cause a long pause.

  • Incremental collection uses the marking work to decompose garbage collection into smaller blocks, and only a small part of garbage collection is carried out each time to reduce the main thread blocking time, as shown in the following figure:

Concurrent collector

  • Although incremental collection has greatly reduced the blocking time on our main thread, all of the marking and clearing remains on the main thread. Is there a way to do this without blocking the main thread? From this V8 came concurrent collection.

  • Concurrent collection means that during the execution of JavaScript by the main thread, the auxiliary thread can complete the operation of garbage collection in the background, as shown below:

In practice, these three recycling mechanisms are often used together.

The resources

  • The illustration Google V8