Diff update algorithm

Because Vue3 has done a lot of performance optimization, it does not update all nodes diff. Currently, diFF updates are performed in the following two situations:

  • v-forContainer node
  • Since the writerender()function

There is also a special case for sequential updates without diff, which are in all-replace mode and are time-consuming:

  • There is nokeyThe value of thev-forStatement, is typedUNKEYED_FRAGMENTtag

Note that Vue3 does not actively provide fragments. Fragments are generated only when written as follows:

  • A component with multiple root nodes generates a fragment wrap, which is a stable fragment (STABLE_FRAGMENT)
  • v-forStatement that generates a fragment package
  • v-ifStatement, with multiple child nodes or not a single text node, generates a fragment wrapped around (STABLE_FRAGMENT)

The arguments above are based on the browser environment, not NodeJS environment (SSR). The code presented in this article has been appropriately simplified.

In our case with template, updates are basically updated by a block — that is, the node updates itself

Update the diff

Diff updates the internal patchKeyedChildren method called, and its general process is divided into three steps:

  1. Compare the nodes with the same pointer in the head of the old and new nodesdiffIf no, go to the next step.
  2. Compare the old and new node head pointer to the node, if the samediffIf no, go to the next step.
  3. In this case, the remaining old and new nodes may be out of order, removed, or newly added.

Here is the initial initialization of the function, where c1/c2 represents an array of children of the old and new nodes. I indicates that the two old and new nodes point to synchronized Pointers, which are synchronized; E1 /e2 represents Pointers to the end of two child node arrays respectively:

let i = 0
const l2 = c2.length

// Change the last index of the former node
let e1 = c1.length - 1 // prev ending index

// The last index of the changed node
let e2 = l2 - 1 // next ending index
Copy the code

The patch() function is used to update nodes, and the isSameVNodeType() function is used to determine whether two nodes have the same type. In this case, the two nodes must have the same type and the same key value.

The type here refers to, for example, component, which is the configuration object of the component, and element, which is the tag of the element.

1. Compare the allelic pointer nodes in the header

It starts with the header pointer and compares whether the old and new nodes are the same:

// 1. sync from start
// 1. Synchronize from the start position
// (a b) c
// (a b) d e
while (i <= e1 && i <= e2) {
  const n1 = c1[i]
  const n2 = c2[i]

  // If the node does not change, perform patch
  if (isSameVNodeType(n1, n2)) {
    patch(n1, n2)
    // End immediately on different nodes
  } else {
    break
  }
  i++
}
Copy the code

An example is given in the comment if there are old and new queues as shown below:

Since the vNodes of the two Pointers are the same, patch() can be directly updated if it can be reused. We then move the pointer one bit to the right to compare the positions of two arrays with subscript 1:

Same, same node, update and move pointer back:

At this point, the two nodes are different, so the head comparison ends here. So far we have excluded the reusable nodes at the head position, then we must exclude the reusable nodes at the tail position.

2. Compare the tail allelic pointer nodes

The end of the reusable node check method is the same as the head, except that they have Pointers to the end of the child node array. So we use two Pointers e1/e2.

// 2. sync from end
// 2
// a (b c)
// d e (b c)
while (i <= e1 && i <= e2) {
  const n1 = c1[e1]
  const n2 = c2[e2]
  if (isSameVNodeType(n1, n2)) {
    patch(n1, n2)
  } else {
    break
  }
  e1--
  e2--
}
Copy the code

I won’t repeat it here, but I’ll use the example in the notes:

The current pointer points to the same node, the node can be reused, directly updated, and the pointer moves forward together:

The current pointer points to the same node, the node can be reused, directly updated, and the pointer moves forward together:

At this point, the two Pointers to the node are no longer the same, so stop comparing here.


After the previous two steps, the old and new queues have been further shortened, and the remaining nodes may have the following three situations:

  • A node is added. Procedure
  • A node has been deleted
  • Same node, but moved

In a real scenario, there are only three remaining cases:

  • Only new nodes are addedi > e1)
  • Only node deletion (certain at this pointi > e2)
  • Out of order, there must be moving nodes, which may contain new or deleted nodes (at this time there must bei <= e2andi <= e1)

Vue deals with the first two cases separately and then the three mixed cases separately according to the ease of processing.

3.1 Handling the case of newly added nodes

If the new node is judged to be based on the old queue, the situation is as follows:

Or the new node is in the header:

E1 < I and e2 >= I, so we just need to update the node between e1 => I, so we have this code:

// 3. common sequence + mount
// 3. A node is added
// (a b)
// (a b) c
// i = 2, e1 = 1, e2 = 2
// (a b)
// c (a b)
// i = 0, e1 = -1, e2 = 0
if (i <= e2) {
  if (i > e1) {
    // The node after the current node
    const nextPos = e2 + 1

    // We will use this node as the anchor point to add elements before it, if not to add elements at the end of the parent node
    const anchor = nextPos < l2 ? c2[nextPos].el : parentAnchor
    while (i <= e2) {
      patch(null, c2[i])
      i++
    }
  }
}
Copy the code

3.2 Handling the individual case of node deletion

If there is no new node, it will determine whether the node is only deleted. In this case, the following two situations may occur:

Delete node at end:

Delete node at end:

If I > e2, we simply delete all nodes between I => e1 in the old array (else if statement follows above) :

// 4. common sequence + unmount
// 4. The node is removed
// (a b) c
// (a b)
// i = 2, e1 = 2, e2 = 1
// a (b c)
// (b c)
// i = 0, e1 = 0, e2 = -1
else if (i > e2) {
  while (i <= e1) {
    // Remove the original node
    unmount(c1[i])
    i++
  }
}
Copy the code

3.3 Out of order, but there must be moving nodes

Finally a case is more complex, the Vue do processing, check the old node sequence after sequence into a new node, whether some old node in a sequence of nodes or according to the current order (intermittent), at this time only for the rest of the changes of node, to have the minimum amplitude of DOM manipulation.

Now, instead of finding the longest increasing sequence of old nodes, here you can think why not ask for the minimum edit distance, because the time complexity of the minimum edit distance is higher in most cases

3.3.1 Mapping the Key of a new node to its subscripts

First of all, Vue traverses the new node array and establishes a mapping between nodes with key values and their subscripts in the new node array, which is stored in keyToNewIndexMap for easy searching during reuse:

const s1 = i // prev starting index
const s2 = i // next starting index

// 5.1 build key:index map for newChildren
5.1 Generate a key map
const keyToNewIndexMap = new Map(a)// Iterate over new nodes, out-of-order parts, and store these nodes with keys into the map
for (i = s2; i <= e2; i++) {
  const nextChild = c2[i]
  if(nextChild.key ! =null) {
    keyToNewIndexMap.set(nextChild.key, i)
  }
}
Copy the code

3.3.2 Remove old nodes that do not exist in the new node queue and update the reused nodes

After that, we iterate through the old node array, and through the newly created Map, if the current old node does not exist in the new node array, then we need to remove it.

The whole process is complicated because it requires pre-processing to prepare if the node needs to be moved later:

// 5.2 loop through old children left to be patched and try to patch
// matching nodes & remove nodes that are no longer present
// 5.2 Iterate over old nodes, patch matched nodes, remove nodes that are no longer in the node
let j

// Number of nodes currently processed
let patched = 0

// Number of nodes that require patch
const toBePatched = e2 - s2 + 1

// Whether the node needs to be moved
let moved = false

// used to track whether any node has moved
// Records whether the node has been moved
let maxNewIndexSoFar = 0

// works as Map<newIndex, oldIndex>
// Note that oldIndex is offset by +1
// and oldIndex = 0 is a special value indicating the new node has
// no corresponding old node.
// Note that the value of the old subscript is always +1, because 0 means that there is no corresponding old node
// used for determining longest stable subsequence
// Map of new and old subscripts
// The new subscript here takes s2 position 0 and the old subscript takes the old subscript value +1
const newIndexToOldIndexMap = new Array(toBePatched)

// Initialize to 0
for (i = 0; i < toBePatched; i++) newIndexToOldIndexMap[i] = 0
Copy the code

Here I describe three variables. The first is newIndexToOldIndexMap, which is used to record the mapping of the new coordinates of the node to the old coordinates (if the node is reusable of course). In order to calculate the longest increasing subsequence in the future, its new coordinate takes S2 as the starting point, and its length is the total number of nodes that need to be processed at present, and the subscript of the old node stored in it should be + 1 on the basis of the original value (because 0 means that the current node has no corresponding old node). For example, the update sequence looks like this:

Then after storing node E into newIndexToOldIndexMap:

// display, not assignment statement
newIndexToOldIndexMap = [4.0.0.0]
Copy the code

The remaining nodes are as follows:

// display, not assignment statement
newIndexToOldIndexMap = [4.3.2.0]
Copy the code

The Moved variable indicates whether any nodes need to be moved based on the value of maxNewIndexSoFar.

MaxNewIndexSoFar represents the current distance of the reusable node to S2 (i.e. the furthest distance of the first out-of-order node). If there are corresponding reusable nodes, in each iteration of nodes, if the distance between s2 of the current node and maxNewIndexSoFar exceeds maxNewIndexSoFar, maxNewIndexSoFar will be updated as the subscript of the current node in the new queue. When it is less than maxNewIndexSoFar, it marks moved = true.

// Determine if you need to move the current node. Imagine if each node increases in order.
// Then the if statement is entered each time
if (newIndex >= maxNewIndexSoFar) {
  // The current node is not moved, update the subscript
  maxNewIndexSoFar = newIndex

  // If you enter the else statement, the nodes are crossed before there are any
} else {
  moved = true
}
Copy the code

Imagine that if the nodes in the old and new sequences are incrementing in the same order, then maxNewIndexSoFar will also be incrementing continuously, that is, each iteration newIndex >= maxNewIndexSoFar, then there is no need to move the nodes; However, if newIndex < maxNewIndexSoFar in one iteration, it indicates that the current node has moved from the previous position to the current position.

Take the figure just out of order as an example. In the first iteration, newIndex of node C = 4, maxNewIndexSoFar = 4; In the second iteration of node D, its newIndex = 3. At this time, newIndex < maxNewIndexSoFar indicates that the position between C/D nodes has crossed between the old and the new, so we need to move at least one of them.

With that in mind, we can now take a look at the code in earnest. Since nodes need to be uninstalled, the traversal should be performed based on the nodes between S1 <-> E1. The overall traversal code is as follows:

// Iterate over the old node
for (i = s1; i <= e1; i++) {
  // The old node of the current subscript
  const prevChild = c1[i]

  // If the number of current patch nodes exceeds the total number of patches to be added, unmount the node
  // Unmount the node directly because the unnecessary nodes are no longer needed
  if (patched >= toBePatched) {
    // all new children have been patched so this can only be a removal
    unmount(prevChild)
    continue
  }

  // Try to find if there are any new nodes that are correct
  let newIndex

  // If the old node has a key, get the index of the node with the same key value
  if(prevChild.key ! =null) {
    newIndex = keyToNewIndexMap.get(prevChild.key)

    // If there is no key, check whether a node of the same type exists
  } else {
    // key-less node, try to locate a key-less node of the same type
    // The current search scope is between nodes that need patch in the new node
    for (j = s2; j <= e2; j++) {
      if (
        // 0 indicates that there is no node under the corresponding subscript (note that the current subscript s2 is 0)
        // Confirm that the current new subscript position does not correspond to the old subscript, in case it is already in the map node

        newIndexToOldIndexMap[j - s2] === 0 &&
        isSameVNodeType(prevChild, c2[j])
      ) {
        newIndex = j
        break}}}// If no node is found, the node does not exist. Unmount the node directly
  if (newIndex === undefined) {
    unmount(prevChild)

    // Move the position when it is found and patch
  } else {
    // Store the location of the old node with subscript +1. The new node starts with s2, i.e. 0
    newIndexToOldIndexMap[newIndex - s2] = i + 1

    // Determine if you need to move the current node. Imagine if each node increases in order.
    // Then the if statement is entered each time
    if (newIndex >= maxNewIndexSoFar) {
      // The current node is not moved, update the subscript
      maxNewIndexSoFar = newIndex

      // If you enter the else statement, the nodes are crossed before there are any
    } else {
      moved = true
    }

    // patch The node
    patch(prevChild, c2[newIndex])
    patched++
  }
}
Copy the code

First, when patched patched the current patched nodes can be patched with a patched +1 value. When patched nodes exceed the length of new nodes out of order, the remaining nodes must be unpatched nodes (because the sequence of new nodes has been processed) :

if (patched >= toBePatched) {
    // all new children have been patched so this can only be a removal
    unmount(prevChild)
    continue
  }
Copy the code

After that, Vue tries to find out if the current old node has been reused, that is, it has been moved to another location in the new node sequence. First, if the current node has a key, it will try to look it up directly from keyToNewIndexMap. If not, it iterates through the current sequence of all new nodes, compares whether they are the same as the current node in turn, and multiplexes them when compounding nodes of the same type.

// Try to find if there are any new nodes that are correct
let newIndex

// If the old node has a key, get the index of the node with the same key value
if(prevChild.key ! =null) {
  newIndex = keyToNewIndexMap.get(prevChild.key)

  // If there is no key, check whether a node of the same type exists
} else {
  // key-less node, try to locate a key-less node of the same type
  // The current search scope is between nodes that need patch in the new node
  for (j = s2; j <= e2; j++) {
    if (
      // 0 indicates that there is no node under the corresponding subscript (note that the current subscript s2 is 0)
      // Confirm that the current new subscript position does not correspond to the old subscript, in case it is already in the map node
      newIndexToOldIndexMap[j - s2] === 0 &&
      isSameVNodeType(prevChild, c2[j])
    ) {
      newIndex = j
      break}}}Copy the code

A newIndexToOldIndexMap[J-s2] === 0 condition exists when the reusable node is directly searched by type, which means that the current new node has no corresponding old node subscript (0 means no, in the following code, if the corresponding newIndex is found, It will be stored in newIndexToOldIndexMap). This prevents old and new nodes from being reused or being processed again.

If the newIndex of the old node is found, there are two cases:

  1. If no, the current node has been deletedDOMnode
  2. There are, reuse, update node attributes
// If no node is found, the node does not exist. Unmount the node directly
if (newIndex === undefined) {
  unmount(prevChild)

  // Move the position when it is found and patch
} else {
  // Store the location of the old node with subscript +1. The new node starts with s2, i.e. 0
  newIndexToOldIndexMap[newIndex - s2] = i + 1

  // Determine if you need to move the current node. Imagine if each node increases in order.
  // Then the if statement is entered each time
  if (newIndex >= maxNewIndexSoFar) {
    // The current node is not moved, update the subscript
    maxNewIndexSoFar = newIndex

    // If you enter the else statement, the nodes are crossed before there are any
  } else {
    moved = true
  }

  // patch The node
  patch(prevChild, c2[newIndex])
  patched++
}
Copy the code

In the case of reuse, there are the scenarios we just mentioned where the compute nodes are crossed (whether they need to be moved) :

// Determine if you need to move the current node. Imagine if each node increases in order.
// Then the if statement is entered each time
if (newIndex >= maxNewIndexSoFar) {
  // The current node is not moved, update the subscript
  maxNewIndexSoFar = newIndex

  // If you enter the else statement, the nodes are crossed before there are any
} else {
  moved = true
}
Copy the code

I won’t go into it here.

Note that the patch() update is made to the reused node, so it is only necessary to move the node.

3.3.3 Handling New nodes and Moving Nodes

At this point, only new and mobile nodes are left unprocessed.

First of all, it will need a mobile node (version = true) according to the newly created newIndexToOldIndexMap generated a longest sequence increasingNewIndexSequence incrementing the new node.

// move and mount
// 5.3 Move and mount
// generate longest stable subsequence only when nodes have moved
// There are nodes that need to be moved to generate long-term stable sub-sequence, and only the nodes that have been moved are processed
const increasingNewIndexSequence = moved
  ? // Get the subscript array of the longest increment subsequence
    getSequence(newIndexToOldIndexMap)
  : EMPTY_ARR
Copy the code

So what does this sequence do? It is used to assist in moving nodes, and it is used to move nodes at a minimum number of times. Since newIndexToOldIndexMap is created from a mapping between old and new nodes, its subscripts naturally represent the order of the new node array out of order, and the elements stored in its corresponding subscripts also represent the subscripts of the old node reused by the new node, here we can see two sequences:

  1. The sequence of new node subscripts (incremented, because we created the array against it)
  2. newIndexToOldIndexMapA sequence of subscripts of old nodes, either increasing or out of order

On how to find the longest increasing subsequence, please learn by yourself, here is not clear.

At this point, if the sequence formed by the subscripts of the old node also shows an increasing trend, then we can operate those non-increasing nodes to achieve the purpose of changing to the new node sequence. And the longer the increasing sequence, the fewer nodes we have to move. Here’s an example:

Observing from the figure, we can clearly see that c/ D nodes maintain an increasing relationship before and after, so we only need to move e node and create H node at this time.

At this point, its newIndexToOldIndexMap is:

newIndexToOldIndexMap = [4.2.3.0]

// This array returns the index of the corresponding element in the newIndexToOldIndexMap
// Instead of the actual old node subscript
increasingNewIndexSequence = [1.2]
Copy the code

Can see nodes 2, 3, and the new increasing subscript relationship is consistent, the longest increasing subsequence (increasingNewIndexSequence) [1, 2], 4/0 at this time we only need to manipulate the two nodes.

IncreasingNewIndexSequence returned results for the corresponding elements in newIndexToOldIndexMap subscript subscript is not a real old node

Now let’s look at the code that handles this:

// Get the tail of the increment sequence
j = increasingNewIndexSequence.length - 1

// looping backwards so that we can use last patched node as anchor
// Loop backwards so that we can use a patch node as an anchor point
for (i = toBePatched - 1; i >= 0; i--) {
  // The subscript of the current node to be processed and its node
  const nextIndex = s2 + i
  const nextChild = c2[nextIndex]

  // Get the next node, or its parent if not
  const anchor = nextIndex + 1 < l2 ? c2[nextIndex + 1].el : parentAnchor

  // If the Map does not find the old node information of the current location of the new node,
  // Indicates a new node
  if (newIndexToOldIndexMap[i] === 0) {
    // mount new
    patch(null, nextChild)

    // Moved Indicates that a node needs to be moved, by shaping an incrementing sequence, the nodes in the incrementing sequence will
    // Do not move, only move the remaining nodes, thus reducing the node movement
  } else if (moved) {
    // Move if the following conditions exist:
    // 1. No stable subsequence
    // 2. The current node is not in the stable subsequence
    // move if:
    // There is no stable subsequence (e.g. a reverse)
    // OR current node is not among the stable sequence
    if (j < 0|| i ! == increasingNewIndexSequence[j]) { move(nextChild, container, anchor) }else {
      j--
    }
  }
}
Copy the code

In this traversal, the out-of-order sequence (S2 <-> e2) of the new node is used as the benchmark and the traversal is carried out in reverse. The reason for the reverse is that when it adds or updates a Node, it can update the Node as an anchor (think of the node.insertbefore ()/Node.appendChild() arguments).

There are three things that happen in each iteration:

  • If the current node has no corresponding subscript of the old node, it is a new node
  • The node needs to be moved
  • This node is in ascending order with the old node sequence and can be skipped directly.

Cases 2 and 3 are mutually exclusive in the full iteration and do not occur simultaneously in the entire iteration process.

The first case, which is simpler, will not be explained here:

// If the Map does not find the old node information of the current location of the new node,
// Indicates a new node
if (newIndexToOldIndexMap[i] === 0) {
  // mount new
  patch(null, nextChild
}
Copy the code

The second case is going to determine ‘Moved = True,’ and we’ve already explained how that works. What we want to focus on here is the body of the function. Based on our understanding of the incrementing subsequence, it should move or not move nodes in the following cases:

  • The current node is in the longest increasing subsequence — skip
  • The current node does not exist in the longest increasing subsequence — move
    • There are no nodes to skip in the increasing sequence but any nodes that exist need to be updated (both real and non-existent)
    • The current node does not have the longest increasing subsequence
// Move if the following conditions exist:
// 1. No stable subsequence (in fact, as in case 2, any node is returned as the longest sequence in reverse order)
// 2. The current node is not in the stable subsequence
// move if:
// There is no stable subsequence (e.g. a reverse)
// OR current node is not among the stable sequence
if (j < 0|| i ! == increasingNewIndexSequence[j]) {// Move the current node to the front of the anchor node or the last of the container node (when there is no anchor)
  move(nextChild, container, anchor)
} else {
  j--
}
Copy the code

Again, take this picture:

Again, increasingNewIndexSequence storage nodes in newIndexToOldIndexMap subscript, not real old node subscript

Then generate the following increasing sequence:

newIndexToOldIndexMap = [4.2.3.0]

// This array returns the index of the corresponding element in the newIndexToOldIndexMap
// Instead of the actual old node subscript
increasingNewIndexSequence = [1.2]
Copy the code

Then in the first update, it is found that node H is not in newIndexToOldIndexMap, so it is processed according to the new node and inserted before the new node F:

In the second update, it is found that d node exists in monotone increment sequence, so this update can be skipped:

The same is true for node C, so I won’t show the graph here. After processing node C, it looks like this:

If node E is reusable, move it to the front of node C.

That’s the end of the diff.

For the third case, since the reusable nodes maintain an increasing relationship before and after, we do not need to process the nodes repeatedly at this time, so we can directly skip the reusable nodes:

At this point, all node updates have been completed.

Contrast Vue2

As is known to all, the time complexity of diff algorithm in Vue2 is O(n), while the diff algorithm in Vue3 is based on the longest increasing subsequence up to O(nlogn). So why switch to an algorithm?

Personally, I think it is to reduce the DOM operation. Although in most cases, the DOM operation times of Vue3 and Vue2 are almost the same, there are also many cases where Vue3 has less DOM operation times than Vue2, such as the following example:

In the example above, the number of operations on Vue2 is 3; The number of operations for Vue3 is two.

Learn the DIFF of Vue2 by yourself

The advantage of this DOM manipulation becomes more apparent the longer the increment sequence is.

Afterword.

Vue3 has many static optimizations for template compilation, such as blocks, closure cache nodes, cache cache nodes, and so on. It is not recommended that you write render() yourself, you will need to optimize it manually.

Blocks in particular you can’t simulate because there’s no open API. Under Block, only dynamic nodes will be collected. When a value of the dynamic node is updated, it will directly traverse the dynamic nodes in the Block for self-update through fast Path (i.e., no pipe nodes).

Thank you for your questions.

Diff update algorithm