How and why React Fiber uses linked lists to traverse the component tree

The How and Why on React’s usage of linked list in Fiber to walk The component’s tree

Originally written by Max Koretskyi

Translation from: Aliyun Translation Group

Text link: github.com/dawn-plex/t…

Translator: Zhao Tian

Proofreader: Trees, sleeping clouds

The React scheduler’s main algorithm for the work loop

speech

In order to educate myself and the community, I have spent a lot of time in Web technology reverse engineering and writing about my findings. Over the past year, I have focused on Angular source code, releasing angular-In-Depth, the largest Angular publication on the web. Now I’m focused on React. Change detection has become a major area of my expertise in Angular, and with a bit of patience and a lot of debugging validation, I hope to reach this level in React soon. In React, the change-detection mechanism is often referred to as “coordination” or “rendering,” and Fiber is the latest implementation. Thanks to its underlying architecture, it provides the ability to implement many interesting features, such as performing non-blocking rendering, performing updates according to priority, pre-rendering content in the background, and so on. These features are called time sharding in the Concurrent React philosophy.

In addition to solving practical problems for application developers, the internal implementation of these mechanisms has broad appeal from an engineering perspective. Having such a wealth of knowledge in the source code can help us grow as better developers.

If you Google “React Fiber” today, you’ll see a lot of articles in the results. But with the exception of Andrew Clark’s notes, all of the articles are fairly high-level readings. In this article, I’ll use Andrew Clark’s notes to elaborate on some particularly important concepts in Fiber. Once we’re done, you’ll have enough knowledge to understand the work cycle illustrated in Lin Clark’s excellent presentation at ReactConf 2017. This is a speech you need to see. However, after you have spent a little time reading this article, it will be easier for you to understand.

This article kicks off a series of articles on React Fiber’s internal implementation. I know about 70% from internal implementations, plus three articles on coordination and rendering mechanics.

Let’s get started!

basis

Fiber’s architecture has two main phases: coordination/rendering and commit. In the source code, the coordination phase is often referred to as the “render phase.” This is the React phase of traversing the component tree and:

Update status and properties
Call the lifecycle hook
Get component’schildren
Compare them to the previous oneschildrenTo compare
And calculate the DOM updates that need to be performed

All of this activity is called work inside Fiber. The type of work that needs to be done depends on the type of React Element. For example, a Class needs to be instantiated for Class Component React, but not for Functional Component. If you’re interested, here you can see all types of work targets in Fiber. These activities are exactly what Andrew is talking about here:

The problem with the UI is that if you do too much work at once, you can lose frames for the animation…

What exactly is too much execution at once? Well, basically, if React were to synchronously traverse the entire component tree and perform tasks for each component, it would probably run for more than 16 milliseconds for the application code to execute its logic. This will result in frame loss, resulting in poor visual performance.

So is there a good way?

Newer browsers (and React Native) implement apis that help solve this problem…

The new API he mentioned is the requestIdleCallback global function, which can be used to queue functions that are called when the browser is idle. Here’s how you’ll use it:

requestIdleCallback((deadline) = >{
    console.log(deadline.timeRemaining(), deadline.didTimeout)
});
Copy the code

If I open the console now and execute the code above, Chrome will print 49.9 false. It basically tells me that I have 49.9ms to do whatever work I need to do, and THAT I haven’t used up all the allotted time, otherwise deadline.didTimeout will be true. Keep in mind that timeRemaining may change immediately after the browser is assigned some work, so you should keep checking.

The requestIdleCallback was actually a bit too strict and not implemented often enough for smooth UI rendering, so the React team had to implement its own version.

Now, if we put all the activities React performs on the component into the function performWork and use requestIdleCallback to arrange the work, our code might look like this:

requestIdleCallback((deadline) = > {
    // Perform work for a part of the component tree when we have time
    while ((deadline.timeRemaining() > 0|| deadline.didTimeout) && nextComponent) { nextComponent = performWork(nextComponent); }});Copy the code

We perform work on one component and then return a reference to the next component to be processed. This would work if it were not for the fact that you cannot process the entire tree of components synchronously, as shown in the previous coordination algorithm implementation. That’s what Andrew is talking about here:

To use these apis, you need a way to break down the rendering effort into incremental units

Therefore, to solve this problem, React had to re-implement the algorithm for traversing trees, moving from a synchronous recursive model that relied on built-in stacks to an asynchronous model with linked lists and Pointers. That’s what Andrew wrote here:

If you rely only on the [built-in] call stack, it will continue to work until the stack is empty… Wouldn’t it be nice if we could interrupt the call stack at will and manipulate the stack frame manually? That’s what React Fiber is all about. Fiber is a reimplementation of the stack, specifically for the React component. You can think of a single Fiber as a virtual stack frame.

That’s what I’m going to talk about now.

One thing to say about the stack

I’m assuming you’re all familiar with the concept of a call stack. If you pause the code at the breakpoint, you can see this in the browser’s debugging tools. Here are some quotes and charts from Wikipedia:

In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program… The main reason the call stack exists is to track where each active subroutine should return control when it completes execution… The call stack consists of stack frames… Each stack frame corresponds to a call to a subroutine that has not yet returned a termination. For example, if a subroutine called DrawLine called by the subroutine DrawSquare is currently running, the top of the call stack might look like the picture below.

Why is the stack related to React?

As we defined in the first part of this article, React walks through the component tree in the coordination/rendering phase and does some work for the component. Previous implementations of the coordinator used a synchronous recursive model that relied on built-in stacks to traverse the tree. The official documentation on coordination describes this process and talks a lot about recursion:

By default, when recursing over the child nodes of a DOM node, React iterates over both lists of child nodes at the same time, generating mutations when differences occur.

If you think about it, each recursive call adds a frame to the stack. And it’s synchronous. Suppose we have the following component tree:

Represented as an object by the Render function. You can think of them as component instances:

const a1 = {name: 'a1'};
const b1 = {name: 'b1'};
const b2 = {name: 'b2'};
const b3 = {name: 'b3'};
const c1 = {name: 'c1'};
const c2 = {name: 'c2'};
const d1 = {name: 'd1'};
const d2 = {name: 'd2'};

a1.render = (a)= > [b1, b2, b3];
b1.render = (a)= > [];
b2.render = (a)= > [c1];
b3.render = (a)= > [c2];
c1.render = (a)= > [d1, d2];
c2.render = (a)= > [];
d1.render = (a)= > [];
d2.render = (a)= > [];
Copy the code

React requires iteration trees and performs work for each component. To simplify things, the job is to print the name of the current component and get its children. Here’s how we do it recursively.

The recursive traversal

The main function to loop through the tree is called walk, which is implemented as follows:

walk(a1);

function walk(instance) {
    doWork(instance);
    const children = instance.render();
    children.forEach(walk);
}

function doWork(o) {
    console.log(o.name);
}
Copy the code

Here is my output:

a1, b1, b2, c1, d1, d2, b3, c2

If you have no faith in recursion, check out my in-depth article on recursion.

The recursive method is intuitive and perfect for traversing a tree. But as we are discovering, it has limitations. The biggest one is that we can’t break down work into incremental units. We cannot pause work on a particular component and resume it later. With this approach, React can only iterate until it runs out of components and the stack is empty.

So how does React implement an algorithm that traverses a tree without recursion? It uses a single list tree traversal algorithm. It makes it possible to pause traversal and prevent stack growth.

List traversal

I was lucky enough to find the main points of the algorithm outlined here by SebastianMarkbage. To implement the algorithm, we need a data structure with three fields:

Child – a reference to the first child node
Sibling – Reference to the first sibling node
Return – a reference to the parent node

In the context of React’s new coordination algorithm, the data structure that contains these fields is called Fiber. Underneath it is a React Element that represents keeping work queues. More on that in my next article.

The following diagram shows the hierarchy of objects linked by a linked list and the types of connections between them:

We first define the constructor for our custom node:

class Node {
    constructor(instance) {
        this.instance = instance;
        this.child = null;
        this.sibling = null;
        this.return = null; }}Copy the code

And a function that takes an array of nodes and links them together. We use it to link the child nodes returned by the Render method:

function link(parent, elements) {
    if (elements === null) elements = [];

    parent.child = elements.reduceRight((previous, current) = > {
        const node = new Node(current);
        node.return = parent;
        node.sibling = previous;
        return node;
    }, null);

    return parent.child;
}
Copy the code

This function iterates through the array of nodes starting with the last node, linking them together in a separate linked list. It returns a reference to the first sibling node. Here’s a simple demonstration of how it works:

const children = [{name: 'b1'}, {name: 'b2'}];
const parent = new Node({name: 'a1'});
const child = link(parent, children);

// The following two lines of code execute true
console.log(child.instance.name === 'b1');
console.log(child.sibling.instance === children[1]);
Copy the code

We will also implement a helper function to do some work for the node. In our case, it will print the name of the component. But in addition, it also takes the children of the component and links them together:

function doWork(node) {
    console.log(node.instance.name);
    const children = node.instance.render();
    return link(node, children);
}
Copy the code

Ok, now we are ready to implement the main traversal algorithm. This is a parent-first, depth-first implementation. Here is the code with useful comments:

function walk(o) {
    let root = o;
    let current = o;

    while (true) {
        // Perform work for the node to get and connect its children
        let child = doWork(current);

        // If child is not empty, set it to the current active node
        if (child) {
            current = child;
            continue;
        }

        If we return to the root node, exit the function
        if (current === root) {
            return;
        }

        // Iterate until we find sibling nodes
        while(! current.sibling) {If we return to the root node, exit the function
            if(! current.return || current.return === root) {return;
            }

            // Set the parent node to the current active node
            current = current.return;
        }

        // If sibling nodes are found, set sibling nodes to the current active nodecurrent = current.sibling; }}Copy the code

While the code implementation isn’t particularly hard to understand, you may need to run the code a little to understand it. Let’s do it here. The idea is to keep a reference to the current node and reassign it as we walk down the tree until we reach the end of the branch. We then use the return pointer to return the root node.

If we now examine the call stack for this implementation, here is what we would see:

As you can see, the stack does not grow as we go down the tree. But if we now put the debugger into the doWork function and print the node name, we’ll see the following image:

** It looks like a call stack in the browser. ** So with this algorithm, we are effectively replacing the browser’s call stack implementation with our implementation. This is what Andrew describes here:

Fiber is a reimplementation of the stack, specifically for the React component. You can think of a single Fiber as a virtual stack frame.

So we now control the stack by keeping a reference to the node that acts as the top stack frame:

function walk(o) {
    let root = o;
    let current = o;

    while (true) {... current = child; . current = current.return; . current = current.sibling; }}Copy the code

We can always stop traversing and resume later. This is exactly what we want to achieve to be able to use the new requestIdleCallback API.

Work loops in React

Here is the code that implements the work loop in React:

function workLoop(isYieldy) {
    if(! isYieldy) {// Flush work without yielding
        while(nextUnitOfWork ! = =null) { nextUnitOfWork = performUnitOfWork(nextUnitOfWork); }}else {
        // Flush asynchronous work until the deadline runs out of time.
        while(nextUnitOfWork ! = =null&&! shouldYield()) { nextUnitOfWork = performUnitOfWork(nextUnitOfWork); }}}Copy the code

As you can see, it maps nicely to the algorithm I mentioned above. The nextUnitOfWork variable serves as the top frame, preserving a reference to the current Fiber node.

The algorithm synchronously traverses the tree of components and performs work for each Fiber point in the tree. This is usually the case with so-called interactive updates caused by UI events (clicks, inputs, etc.). Or it can asynchronously traverse the component tree to check if there is any time left after Fiber node work is performed. The function shouldYield returns the result based on the deadlineDidExpire and Deadline variables, which are constantly updated when React works for the Fiber node.

Here’s an in-depth look at the peformUnitOfWork function.

I’m writing a series of articles that delve into the details of implementing Fiber’s change detection algorithm in React.

Please continue toTwitterandMediumFollow me on Twitter and I’ll tweet as soon as it’s ready.

Thanks for reading! If you liked this article, please click the like button below 👏. It means a lot to me and helps others see this article.