This is the 22nd day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

If we don’t know how React 15 works, we can’t grasp its limitations; Without a firm grasp of React 15’s limitations, we can’t fundamentally understand the design motivations behind the React 16 overhaul. Therefore, one must learn history well before following the trend of The Times.

Reconciliation process and Diff algorithm

The React website explains the concept of the virtual DOM as follows:

The Virtual DOM is a programming concept. In this concept, the UI is kept in memory in an idealized, or “virtual,” representation, synchronized with the “real” DOM through libraries such as ReactDOM. This process is called harmonization.

Synchronizing the virtual DOM with the “real” DOM through libraries such as ReactDOM is a process called harmonization.

Harmonization refers to the process of mapping the virtual DOM to the real DOM. So, strictly speaking, the harmonic process cannot be equated with Diff. Reconciliation is the process of conformity, whereas Diff is the process of finding difference. It is only one step in the process of conformity.

This is illustrated by the React source code structure, which breaks it down into Core, Renderer, and Reconciler from large blocks. Including the Reconciler (harmonic) source is located in the SRC/renderers/Shared/stack/Reconciler this path, is the work done by a series of harmonic, including component mounted, unloading, update process, the update process involves calls to Diff algorithm. So reconcile! Diff’s conclusion holds water.

But! In today’s popular perception, when we talk about reconciliation, we’re talking about Diff.

Such a recognition is reasonable, because Diff is indeed the most representative part of the reconciliation process: the reconciliation process can be divided into “stack reconciliation” represented by React 15 and “Fiber reconciliation” since React 16 according to the different implementation forms of Diff. During the actual interview, when the interviewer asks questions about the Reconciliation, it’s mostly to find out how well the candidate knows Diff.

Design idea of Diff strategy

The Diff algorithm is a process of “finding different”. In the field of computer science, to find the difference between two tree structures, the traditional calculation method is to compare tree nodes one by one through cyclic recursion, and the algorithm complexity of this process is O (N3). Although the algorithm itself has been continuously optimized by generations of programmers, O (N3) is still a performance disaster for browsers with limited computing power.

Combining some design-level derivations, the React team summarized the following two rules to establish the general premise for converting O (n3) complexity to O (n) complexity:

  • If two components are of the same type, they will have the same DOM tree structure;
  • A set of child nodes at the same level can be uniquely identified by a key to maintain the stability of each node in different rendering processes.

There is also a more practical rule that provides inspiration for efficient Diff in React: there aren’t many cross-level operations between DOM nodes, and same-level operations are the norm.

Grasp the three “essentials” to illustrate Diff logic

There are three key points to understand when splitting and interpreting Diff logic:

  1. The key point of Diff algorithm performance breakthrough lies in “hierarchical comparison”;
  2. Nodes of the same type are necessary to continue Diff;
  3. The key attribute is set to help us reuse as many nodes as possible within the same hierarchy.

1. The decisive idea for changing the order of time complexity: layered comparison

In combination with the rule that “there are not many cross-level operations between DOM nodes, and same-level operations are the mainstream”, the React Diff process directly abandons cross-level node comparison and only compares nodes at the same level, as shown in the figure below. In this way, the comparison of the entire tree can be done in a single walk from top to bottom, which is one of the most important designs for reducing the order of complexity.

It is important to note that although stack harmonization optimizes the traditional tree comparison algorithm for hierarchical comparison, the whole algorithm still operates in a recursive form, and hierarchical recursion is also recursive.

What if A cross-hierarchy node operation (such as moving A subtree with node B as the root from below node A to below node C, as shown in the figure below) actually happens? Unfortunately, as a “secondary contradiction”, React cannot determine the behavior of “move” in this case. It can only assume mechanically that the component moved out of the subtree layer disappears and the corresponding subtree needs to be destroyed. The layer that moves into the subtree adds a new component and needs to create a new subtree for it.

React also advises developers not to operate across hierarchies and to keep the DOM structure as stable as possible.

2. A “one size fits all” strategy to reduce recursion: consistency of types determines the need for recursion

Combined with the rule that “if two components belong to the same type, they will have the same DOM tree structure”, we cannot directly deduce that “different types of components have different DOM structure”, but in most cases, this conclusion is valid. After all, the probability of encountering two components with exactly the same DOM structure but not the same type in real development is really low.

According to the basic principle of “principal contradiction”, React believes that only components of the same type need to be further compared. If the two components participating in Diff are of different types, the comparison is simply abandoned and the old node is replaced in place, as shown in the figure below. React tries to Diff deeper into the DOM tree (or subtree) of the component only after confirming that the component type is the same. In this way, redundant recursive operations in the Diff process can be greatly reduced.

3. Re-use nodes: The React “remember” node with the key attribute

Where does the conclusion come from that the key attribute helps maintain node stability? First, React defines the key property:

Key is used to help React identify what has been changed, added, or removed. The key needs to be written inside the element rendered in an array and given a stable value. Stability is important here, because React triggers a rerendering of the UI if the key changes. This is a very useful feature.

It tries to solve the reuse problem of nodes at the same level. Before we start our analysis, let’s consider a situation, as shown in the figure below, based on our understanding of the Diff process so far:

In the figure, component A inserts A new node (C) between two child nodes (B and D) while keeping the type and other attributes unchanged. According to the known Diff principle, the Diff process between two trees should look like this:

  • Firstly, the nodes located at the first layer are compared, and the node types of the two trees are found to be the same (both are A), so further Diff is performed.
  • Start to compare nodes at the second layer, and the first one to accept the comparison is the position B. After comparison, it is found that nodes at this position of the two trees are BOTH B. There is no problem, so let it go.
  • The second position to be compared is D. After comparing D and C, the type before and after is found to be inconsistent, so D is directly deleted to reconstruct C.
  • The third position to accept the comparison is E. After comparing E and D, the type before and after is found to be inconsistent, so E is directly deleted to rebuild D.
  • The last one to accept the comparison is the E node in tree 2, which is empty in tree 1, which means that E in tree 2 is a new node, so a new E is added.

A strange thing happened: C, D, and E are all directly usable. Originally added a node can be done, but now it is deleted and rebuilt for a long time, and is not quite the same as moving nodes across the hierarchy, the latter is originally a low-frequency operation, with reasonable best practice constraints can be completely avoided; However, the form of inserting nodes shown in the diagram is a very high frequency operation, which cannot be avoided. Frequent addition and deletion of nodes will inevitably drag down performance, so we need to invoke the key attribute to help us reuse nodes.

We’re all familiar with the form of the key attribute. When generating nodes dynamically from arrays, we typically add a key attribute to each node (here is a code example) :

const todoItems = todos.map((todo) = >
  <li key={todo.id}>
    {todo.text}
  </li>
)
Copy the code

When we don’t set the key, the Diff process is just as brutal as the one described above. However, as long as you install an appropriate key, it will act as a token that helps React “remember” a node so that it can be traced in subsequent updates. For example, in the virtual DOM tree, if we give each child node at level 2 a key value, as shown below:

When C is inserted between B and D, React does not assume that C, D, and E slots need to be rebuilt — it identifies the ID of each node. Realize that D and E have not changed (D’s ID is still 1, E’s ID is still 2), but have simply been reordered. React can then easily reuse it to “trace” old nodes, move D and E to new locations, and complete the insertion of C. As a result, the operation cost of elements at the same level is greatly reduced.