When it comes to garbage collection, we have to think about:

  1. Why is it being collected and to whom?
  2. How does V8 garbage collection work?
  3. What techniques are used to implement and optimize V8 garbage collection?

After reading this article, I believe you will be clear about these problems.

JavaScript’s memory management determines who gets garbage collected

Basic data types: null, undefined, String, Boolean, Number, Symbol, BigInt. Call by value. Values are stored in the Stack.

Reference data types, such as arrays, objects, functions, etc., are called by reference, the reference/pointer is stored in stack memory, and the reference to the data is stored in Heap memory. Strictly speaking, objects are accessed by reference, because arrays belong to objects and functions belong to objects.

As mentioned in MDN, functions are objects of the first class.

In JavaScript, functions are first-class objects, because they can have properties and methods just like any other object. What distinguishes them from other objects is that functions can be called. In brief, they are Function objects.

For stack memory, the operating system automatically allocates and frees memory.

The heap memory needs to be freed manually by the JS engine.

Question: Why is the stack faster than the heap?

A: The stack is the data structure provided by the machine system, and the computer provides support for the stack at the bottom: special registers are allocated to store the address of the stack, and special instructions are executed to push and unload the stack, which determines the high efficiency of the stack.

The heap is provided by the C/C++ function library, and its mechanism is complex. For example, to allocate a block of memory, the library functions follow a certain algorithm to search the heap for contiguous memory space of sufficient size to be available for allocation.

So, to sum up two points:

  1. From the point of view of allocation and release, the allocation/release of heap memory requires a call to the function (malloc, free), which takes a certain amount of time, while the stack does not.

  2. To access a specific unit of the heap, read a pointer from the stack, and then read data from the heap according to the pointer. Access to the stack only needs to be accessed once.

Chrome V8 garbage collection mechanism

Chrome V8 is one of the JS engines. It is an open source high-performance engine written in C++ and developed by Google Denmark as part of Google Chrome, which is also used in node.js.

V8 divides heaps into two categories: new generation and old generation.

The Minor GC-scavenge. Responsible for the recycling of the new generation of garbage.

Major GC-Mark-sweep & Mark-Compact: Is responsible for garbage collection of the entire heap.

Minor GC

The new generation reactor is divided into two regions, from-space and to-space.

Scavenge algorithm:

  1. Marks active and inactive objects
  2. Copy from-space live objects into contiguous memory blocks of to-space
  3. Free from-space space
  4. Swap from-space and to-space roles
  5. Updates a pointer that refers to the original object that has been moved. The newly created object is assigned to the next free address in from-space

The Cenozoic era is also divided into the Nursery and Intermediate progeny. The first time an object is allocated memory, it is allocated to the Nursery child, and if it survives a round of garbage collection, it is moved to the Intermediate child. If it survives another round of recycling, it will be moved to old age.

When objects survive the GC, they move generation by generation.

The Nursery activity object moves To the to-space and becomes Intermediate.

The Intermediate activity object is moved To the old generation, and the Nursery activity object is moved To the to-space.

Question: How does GC distinguish between live and inactive objects?

A: Object Reachability. The pointer to the root node (window, global) is used to search for the child nodes. The child nodes are reachable and marked, and the search process is recursed until all the child nodes have been traversed. An unmarked node indicates that there is no chain of references from the root node to the node, which is an object to be reclaimed by the garbage collector. The object to which the root node pointer points is called a root set.

Major GC

When the size of the active objects in a legacy generation exceeds heuristic-derived limits, garbage collection of the whole heap is performed.

It is divided into two steps:

1. Mark-Sweep

Mark: Marks all active objects

Sweep: Cleans all inactive objects, resulting in memory fragmentation gaps

The inactive objects are cleaned up and the remaining memory gaps are added to the data structure of the free list. After the marking is complete, the contiguous memory gaps are added to the appropriate free list so that the appropriate memory space can be found for storage when new objects are allocated.

2. Mark-Compact

Objective: To solve the memory fragmentation gap problem

Move all live objects into a free contiguous memory space, and when you’re done, you can clean up the memory fragmentation. (Essentially copy-delete)

The stop-the-world main thread is paused

An important measure of garbage collection time is the time it takes for the main thread to pause while GC is being performed.

Orinoco, V8 garbage collector

Parallel

Parallelism is where the main thread and the worker thread perform roughly the same amount of work simultaneously.

This is still a stop-the-world approach, but now The total pause time is divided by The number of threads participating (plus some synchronization overhead).

Difficulty: When JavaScript is not running, the JavaScript heap is suspended (because there are no new objects allocated in the heap), so each worker thread needs to make sure it can synchronously access any objects that other worker threads may access.

Incremental

Deltas are a small fraction of the total amount of work required for the main thread to perform GC intermittently, with JavaScript executed between two incremental working segments.

This means that the state of the heap has changed, which may invalidate work previously performed incrementally.

As you can see from the figure, this does not reduce the time spent on the main thread (in fact, it usually increases it slightly), and it expands over time. But it’s still a good technique for dealing with main thread latency.

Advantages: Applications can still respond to user input and progress on animation by allowing JavaScript to execute intermittently and still continue to perform garbage collection tasks.

Delta = delta mark + lazy cleanup

Incremental markup is only for active and inactive objects, lazy cleanup is used to really free up memory. When incremental tag is completed, if the current available memory to rapidly perform code, then there is no need to immediately clean up memory, can delay the cleaning process, let JavaScript logic code execution, also need not disposable clean up all the active objects, garbage collector will clear demand one by one, until cleared.

Concurrent

Concurrency is when the main thread executes JavaScript continuously while the helper thread performs full GC in the background.

Difficult points:

  1. Anything on the JavaScript heap can change at any time, invalidating previous work.
  2. There can be read/write contention issues where the worker thread and the main thread read or modify the same object at the same time.

Advantages: The main thread is free to execute JavaScript.

State of GC in V8

Scavenging

During the new generation garbage collection, V8 uses parallel cleanup to distribute the cleanup across multiple worker threads and the main thread.

Regardless of thread, each copied object has a forward address that updates the original pointer to a new location.

Major GC

In V8, the Major GC starts with the concurrency tag. When the heap approaches the dynamic computation limit, the concurrent marking task is started. When JavaScript is running on the main thread, the concurrent markup is done entirely in the background (the worker thread). Write Barriers are used to track new references to objects created by the main thread during concurrent marking by the worker thread.

The main thread performs a quick tag termination step when the concurrent tag completes, or when the dynamic allocation limit is reached. The main thread pause begins at this stage, which represents the total pause time for the Major GC.

The main thread scans the root set again to make sure all live objects are marked, then compresses and updates Pointers with many worker threads. The main thread starts a concurrent cleanup task during suspension. They run concurrently with the parallel compression task and the main thread itself, and can continue to run even if JavaScript is running on the main thread.

Idle time GC

V8 provides an embeddable mechanism to trigger garbage collection.

The GC can publish Idle Tasks, which are optional Tasks that will eventually be triggered. An embed like Chrome might have some notion of idle time. In Chrome, for example, at 60 frames per second, the browser takes about 16.6 milliseconds to render each frame of the animation. If the animation completes early, Chrome can choose to run GC to create some idle tasks during the idle time before the next frame.