The article has been published simultaneously on the wechat official account JasonGaoH. I have drawn nearly 100 pictures to understand the red-black tree, and the article has been slightly revised.

I shared the working principle of red-black tree in the company group before. Today, I sorted out and distributed it, hoping it could be helpful to everyone and a summary of my knowledge point.

This article is my blog to write the public account since the drawing of the most one article, no one, I hope to use pictures as much as possible to vividly describe the operation of the red black tree before and after the transformation principle, to help you understand the working principle of the red black tree, below, multi-graph warning began.

Before we talk about red-black trees, let’s look at the following concepts: binary trees, sorted binary trees, and balanced binary trees.

Binary tree

Binary trees are ordered trees with a maximum of two words per node. Usually the left subtree is called the left subtree and the right subtree is called the right subtree. Ordered trees emphasize that the order of the left and right subtrees of a binary tree cannot be arbitrarily reversed.

A simple diagram of a binary tree is as follows:

Code definition:

class Node {
    T data;
    Node left;
    Node right;
}
Copy the code

Sort binary trees

The so-called sorted binary tree, as the name implies, sorted binary tree is ordered, it is a special structure of the binary tree, we can sort and retrieve all nodes in the tree.

The nature of the

  • If its left subtree is not empty, the values of all nodes in the left subtree are less than the values of its root node.
  • If her right subtree is not empty, the values of all nodes in the right subtree are greater than the values of its root node.
  • With recursion, the left and right subtrees of sorted binary trees are also sorted binary trees.

Simple diagram of sorting binary tree:

Sorted binary trees degenerate into linked lists

The value of all nodes in the left subtree of sorted binary tree is less than the value of the root node, and the value of all nodes in the right subtree is greater than the value of the root node. When we insert a group of elements that are exactly ordered, sorted binary tree will degenerate into a linked list.

Normally, a sorted binary tree would look like this:

However, when the inserted set of elements happens to be in order, the sorted binary tree looks like this, as shown in the following figure:

Under normal circumstances, the sorting binary tree retrieval efficiency is similar to binary search, the time complexity of binary search is O(log n), but if the sorting binary tree degenerated into a linked list structure, then the retrieval efficiency becomes linear O(n), so compared with O(log n), the retrieval efficiency must be much worse.

Think about it, binary search and normal sorted binary trees are order log n in time, so why order log n?

Analysis about O (log n) explained the following article is very good, interested can look at the time complexity of binary search, this article is the article take the example of binary search, balanced binary tree and binary search is the same time complexity, understand the time complexity of binary search, then becomes easy to understand the balanced binary tree, I won’t go into it here.

Returning to our topic, in order to solve the problem that sorted binary trees degenerate into linked lists in special cases (linked lists are O(n) less efficient than normal binary trees), someone invented balanced binary trees that are similar to red-black trees.

Balanced binary trees

Balanced binaries are also known as AVL trees, which take their name from the initials of their authors, G.M. Adelson-Velsky and E.M. Landis.

Official definition: It is either an empty tree, or a sorted binary tree with the property that the difference in depth between its left and right subtrees (the balance factor) has an absolute value of no more than 1, and that both its left and right subtrees are balanced binary trees.

Two conditions:

  • A balanced binary tree must be a sorted binary tree, that is, the values of all nodes in the left subtree of a balanced binary tree must be less than the values of the root node, and the values of all nodes in the right subtree must be greater than the values of the root node.
  • The absolute value of the depth difference between the left and right subtrees is no more than 1.

Red and black tree

With all these concepts in mind, it’s time for the red-black tree.

Why red black trees?

In fact, red-black tree is similar to the above balanced binary tree, which is essentially to solve the problem that the retrieval efficiency of sorted binary tree is greatly reduced when it degenerates into a linked list in extreme cases. Red-black tree was first invented by Rudolf Bayer in 1972.

A red-black tree must first be a sorted binary tree, which adds a memory bit to each node to indicate the color of the node, which can be RED or BLACK.

The general structure of red-black tree in Java is shown as follows:

Properties of red-black trees

  • Property 1: Each node is either red or black.
  • Property 2: The root node is always black.
  • Property 3: All leaf nodes are null and are black.
  • Property 4: Two children of each red node are black. (There are no two consecutive red nodes on the path from each leaf to the root.)
  • Property 5: The path from any node to every leaf node in the subtree contains the same number of black nodes.

For the five properties above, let’s just say that for properties 1 and 2, this is a constraint on every node in a red-black tree, the root node is black, and all the other nodes are either red or black.

3 to property specified in the red and black tree node of each leaf node is empty, and the leaf node is black, but a Java implementation of a red-black tree can use null to represent an empty node, so we in the traversal Java red-black tree will can’t see the leaf node, and see is each leaf node is red, which need to pay attention to.

For property 5, what we need to note here is that the description here is from any node, from any node to each leaf node of its subtree the number of black nodes is the same, and this number is called the black height of this node.

If our path from the root node to each leaf node contains the same number of black nodes, this number of black nodes is called the black height of the tree. The black height of a tree is different from the black height of a node.

In fact, some of you might say, well, I’ve talked a lot about the properties of red-black trees, but does that mean that if you make sure that the nodes of a red-black tree are alternating red and black, you can make sure that the tree is balanced?

That’s not the case. Here’s a picture:

All the subtrees on the left are black nodes, but this red-black tree is still balanced, it satisfies all five properties.

The tree has a black height of 3, and the shortest path from root to leaf is 2. The path is full of black nodes, including the leaf. The longest path from root to leaf is 4, and red nodes are inserted between each black node.

By property 4 and property 5 above, it actually guarantees that no path is twice as long as any other path, so this red-black tree is balanced.

And this is kind of a corollary, red black tree in the worst case, the longest path is never twice as long as the shortest path. In fact, red black tree is not really balanced binary tree, it can only be roughly balanced, because the height of red black tree does not increase indefinitely, in practical application, red black tree statistical performance is higher than balanced binary tree, but the extreme performance is slightly worse.

Red-black tree insertion

To fully understand red-black trees, in addition to understanding the properties of red-black trees mentioned above, it is necessary to understand the insertion operations of red-black trees.

The red-black tree insertion and ordinary binary tree insertion, sorting the requirement of the binary tree is a left son all nodes in the tree should be smaller than the root node, all nodes are right subtree than with node, when inserting a new node, the first thing to find the current to insert to be put in order binary tree nodes which position, then insert the current node. Red-black trees differ from sorted binary trees in that they need to adjust the structure of the tree at the insertion point to keep the tree balanced.

Normally, new nodes added to a red-black tree are red, so why should new nodes added to a red-black tree be red?

This problem can understand it, we know from nature of 5, the current red and black tree from the root node to the number of black nodes each leaf node is the same, at this time if the new black node, inevitable destruction rules, but not necessarily, but to join the red node unless its parent node is red, so to join the red node, the possibility of a smaller damage rules.

Let’s focus on how a red-black tree is balanced when new nodes are inserted.

Given the following red-black tree:

When we insert the node with the value of 66, the schematic diagram is as follows:

Obviously, at this time, the structure still follows the above five characteristics, and there is no need to start the automatic balancing mechanism to adjust the node balance state.

If I insert a node of 51, then the red-black tree looks like this.

So this structure actually doesn’t satisfy property four, the red two children have to be black, and this red node right here, 49, now has a red node 51 connected to it.

At this point we need to adjust the structure of the tree to make sure that the red-black tree is balanced.

First try setting the node 49 to black, as shown in the diagram below.

The path 60-56-45-49-51-NULL has 4 black nodes and the other paths have 3 black nodes.

To adjust the red-black tree, we try again to set the node 45 to red, as shown below:

At this time, we found the problem again, 56-45-43 are red nodes, the problem of connecting red nodes appeared.

So we need to make 56 and 43 black again, as shown below.

So we set the red node 68 to black.

For this red-black tree with nodes inserted, we can keep the tree balanced by simply changing colors. But it’s not always so lucky, and when color-changing doesn’t work, another option to consider is rotation.

So here’s the example, again, of this red-black tree.

Now this red-black tree, we’re going to insert node 65.

We try to set the 66 node to black, as shown in the figure below.

The path 60-68-64-NULL has 3 black nodes, while the path 60-68-64-66-NULL has 4 black nodes.

Or we could set 68 to black and 64 to red, as shown below:

However, the black height of the above red-black tree is still inconsistent. 60-68-64-NULL and 60-68-64-66-NULL are still inconsistent.

You can’t balance a red-black tree just by changing colors.

Red black tree rotation

Now, let’s talk about rotations of red-black trees, and rotations can be left or right.

left-handed

Text description: Rotate two nodes counterclockwise so that one node is replaced by its right child node, which becomes the left child node of the right child node.

The text description is too abstract. Let’s look at the picture presentation.

Firstly, the relationship between node PL and its right child node G is disconnected, and the reference of its right child node points to node C2. Then disconnect the relationship between node G and left child node C2, and point the application of the left child node of G to node PL.

I’m going to put down this GIF, hopefully to help you understand it a little bit better, from the web.

Right hand

Text description: Rotate two nodes clockwise so that one node is replaced by its left child, which becomes the right child of the left child.

Pictures of dextral rotation:

Firstly, the relationship between node G and its left child node PL is disconnected, and the reference of its left child node points to node C2. Then disconnect node PL from right child node C2, and point the application of the right child node of PL to node G.

Right-handed GIF display (image from web) :

Having introduced the basic left and right rotation operations, let’s take a look at several rotation scenarios for red-black trees in detail.

Left-left node rotation (the parent of the insert node is the left node, and the insert node is also the left node)

In the red-black tree shown below, we insert a node of 65.

The operation procedure is as follows: Right rotation around grandfather node 69, combined with discoloration, as shown below:

Left and right node rotation (the parent of the insert node is the left node and the insert node is the right node)

Again, this red-black tree, let’s insert 67.

In this case, we can do left-rotation around the parent 66, then right-rotation around the parent 69, and finally set 67 to black and 69 to red, as shown in the figure below.

Right left node rotation (insert node parent is right node, insert node left node)

In this case, we insert node 68.

In this case, we can rotate right around the parent node 69, then left around the parent node 66, and finally set node 68 to black and node 66 to red, as shown below.

Rotation of right and right nodes (the parent of the insert node is the right node, and the insert node is also the right node)

As an example, we insert node 70 into the red-black tree.

We can rotate left around the grandfather 66, set the rotated root 69 to black, and set 66 to red. See the following figure for details:

Red-black tree implementation in Java

The red-black tree implementation class in Java is TreeMap, so let’s try to explain how TreeMap works line by line from a source point of view.

TreeMap static final class Entry<K,V> implements Map.Entry<K,V> {K key; V value; Entry<K,V> left; Entry<K,V> right; Entry<K,V> parent; boolean color = BLACK; . }Copy the code

TreeMap put method.

Public V put(K key, V value) {// Save the root Entry of the linked list with t <K,V> t = root; // If t=null, the TreeMap contains no Entry as rootif (t == null) {
            compare(key, key); // typeRoot = new Entry<>(key, value, null); size = 1; ModCount++;returnnull; } int cmp; Entry<K,V> parent; // split comparator and comparable paths Comparator<? super K> cpr = comparator; // If the comparator CPR is not null, the custom sort is usedif(cpr ! = null) {do{// Use the Entry parent referenced by t after the last loop. // Compare the new key with t's key CMP = cpr.pare (key, t.key); // If the new key is smaller than the key of t, t is equal to the node to the left of tif(cmp < 0) t = t.left; // If the new key is greater than the key of t, t is equal to the node to the right of telse if (cmp > 0)
                    t = t.right;
                else// If the two keys are equal, the new value overwrites the original value and returns the original valuereturn t.setValue(value);
            } while(t ! = null); }else {
            if (key == null)
                throw new NullPointerException();
            @SuppressWarnings("unchecked")
                Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while(t ! = null); } // Make the newly inserted node the child node Entry of the parent node <K,V> e = new Entry<>(key, value, parent); // If the new key is smaller than the parent key, e is the left child of the parentif(cmp < 0) parent.left = e; // If the new key is smaller than the parent key, e is the parent's right child nodeelseparent.right = e; // Fix red-black tree fixAfterInsertion(e); size++; modCount++;return null;
    }
Copy the code
Private void fixAfterInsertion(Entry<K,V> x) {x.color = RED; // Until the parent of x is not root and the parent of x is redwhile(x ! = null && x ! Color == RED) {// If the parent of x is the left child of its parentif(parentOf(x) == leftOf(parentOf(x)))) {// Get the sibling Entry of the parent node of x <K,V> y = rightOf(parentOf(parentOf(x))); // If the sibling of the parent node of x is redif(colorOf(y) == RED) {// Set x's parent to blacksetColor(parentOf(x), BLACK); // Set the sibling of the parent node of x to blacksetColor(y, BLACK); // Set the parent of x to redsetColor(parentOf(parentOf(x)), RED); x = parentOf(parentOf(x)); } // If the sibling of the parent node of x is blackelse{//TODO corresponds to the second case, the left and right nodes rotate // if x is the right child of its parent nodeif(x == rightOf(parentOf(x))) {// Set x = parentOf(x); // Rotate rotateLeft(x); } // Set the parent of x to blacksetColor(parentOf(x), BLACK); // set the parent of x to redsetColor(parentOf(parentOf(x)), RED); rotateRight(parentOf(parentOf(x))); }} // If the parent of x is the right child of its parentelse<K,V> y = leftOf(parentOf(parentOf(x))); // The colors-only case corresponds to the original example, no rotation, but multiple transformations // If the parent of x's sibling is redif(colorOf(y) == RED) {// Set x's parent to blacksetColor(parentOf(x), BLACK); // Set the sibling of the parent node of x to blacksetColor(y, BLACK); // Set the parent node (G) of X's parent node to redsetColor(parentOf(parentOf(x)), RED); ParentOf (parentOf(x)); } // If the sibling of the parent node of x is blackelse{// If x is the left child of its parentif(x == leftOf(parentOf(x))) {x = parentOf(x); // Rotate rotateRight(x); } // Set the parent of x to blacksetColor(parentOf(x), BLACK); // Set the parent of x to redsetColor(parentOf(parentOf(x)), RED); rotateLeft(parentOf(parentOf(x))); }} // Force the root node to be BLACK root.color = BLACK; }Copy the code

TreeMap insertion nodes are the same as normal binary trees, except that the insertion method fixAfterInsertion(e) is called to rebalance the red-black tree.

Red-black tree fixAfterInsertion(e) method for red-black tree insertion (e)

Scenario 1: Just change color to balance

Using the same red-black tree as an example, now we insert node 51.

When we insert node 51, TreeMap put results in the following figure.

I then call the fixAfterInsertion(e) method, as shown in the code flow below.

When you enter the loop for the first time, you get the red-black tree structure below.

After reassigning x, we re-enter the while loop with x at 45.

After executing the above process, you get the red-black tree structure shown below.

At this point x is reassigned to 60, which exits the while loop because 60 is the root node. After exiting the sequence, the root node is set to black again, resulting in the resulting structure as shown below.

Finally, after executing the while loop twice, our red-black tree is adjusted to the current structure, which is balanced so that the paths have the same black heights and no red nodes are connected.

The second scene rotates and changes color to maintain balance

Next we will demonstrate the second scenario, which requires a combination of color change and rotation to maintain balance.

Given the following red-black tree:

Now we insert node 66, resulting in the following tree structure.

Similarly, we go to the fixAfterInsertion(e) method.

Finally, the red-black tree structure we obtained is shown in the figure below:

With this configuration our red-black tree is balanced again.

The TreeMap process uses these two scenarios as examples, but not the rest.

Delete red black tree

Because the previous share only sorted out the insertion part of the red-black tree, I originally thought that the deletion of red-black tree would not be sorted out. Someone told me that the deletion of red-black tree was more complicated, so I just sorted out the deletion of red-black tree again.

Compared with insertion, deletion is indeed a bit more complicated, but it is complicated because there are many kinds of operation cases of node deletion, but insertion is different. When inserting a node, the position of the node is actually determined. After the node is successfully inserted, it only needs to adjust the balance of the red-black tree.

However, when deleting a node, we cannot simply set the node to NULL, because if the node has children, we cannot simply set the current deleted node to NULL. The position of the deleted node needs to be filled by a new node. So you have to deal with multiple cases.

The deleted node is the root node

Delete the root node.

The left and right children of the deleted node are empty

Delete the current node.

The deleted node has a child node that is not empty

In this case, you need to replace the current node with a child node, and then delete the child node.

Given the following tree, when we need to delete node 69.

Replace the current node with a child node, and then delete the child node.

The resulting red-black tree structure is shown below. We do not need to change color + rotate to keep the red-black tree balanced, because the tree is already balanced after removing the child nodes.

Another scenario is that the nodes to be deleted are black. After the black nodes are deleted, the black height of the tree will be inconsistent. At this time, the structure needs to be adjusted.

Taking the example of the red-black tree above, we now need to delete node 67.

Since the two child nodes of the node 67 are null, it is directly deleted and the structure as shown in the figure below is obtained:

At this point our tree’s black height is inconsistent, the left black height is 3, and the right black height is 2, so we need to set the 64 nodes to red to maintain balance.

Delete node Neither child node is empty

The case that the two child nodes of the deleted node are not empty is similar to the case that the above node is not empty. It is also necessary to find a node that can replace the current node. After finding the node, copy the value of the node that can replace the deleted node, and then delete the replacement node.

  • Find the replacement node first, that is, the precursor node or the successor node
  • Then copy the precursor node or successor node to the location of the node to be deleted, and then delete the precursor node or successor node.

So what is the precursor, what is the successor? The precursor is the largest node in the left subtree, and the successor is the smallest node in the right subtree.

The precursor or successor are the nodes closest to the current node. When we need to delete the current node, that is, to find the node that can replace the current node, the node that can replace the current node must be the node closest to the current node.

In the scenario where the two child nodes of the deleted node are not empty, we need to subdivide the node again, which is mainly divided into the following three situations.

First, the precursor node is black and has a non-empty node

For a tree like this, we need to delete node 64:

First find the precursor node and copy the precursor node to the current node:

Then delete the precursor node.

At this point, 63 and 60 are red, and we try to make 60 black to balance the whole red-black tree.

In the second case, the precursor node is black and the child nodes are empty

The precursor node is black, and the child nodes are empty. At this time, the operation steps are basically similar to the above.

The operation steps are as follows:

Because we need to delete node 64, then find the precursor node 63, copy the 63 node to the current position, and then delete the precursor node 63. When the black height is inconsistent after the color change, finally set the 63 node to black, set the 65 node to red, so as to ensure the balance of the red-black tree.

Third, the precursor node is red node, and the child nodes are empty

Given the following red-black tree, we need to delete node 64.

Similarly, we find 63, the precursor of 64, and assign 63 to 64.

Then delete the precursor node.

Removing nodes does not require discoloration or rotation to keep the tree balanced.

Finally the basic principles of the red and black tree part finished, with a lot of schematic diagram, this article is on the share of PPT before sorting out again, I think I should be the basic operation make myself clear, finishing this article took nearly a week or so back and forth, because work at ordinary times, basically only have time at the weekend have time to arrange, please leave a message if you have any questions for discussion.

If you think your writing is ok, please give me a thumbs-up. Your thumbs-up is really the biggest support for me and also the motivation for me to continue writing. Thank you.

The original link

Many of the articles refer to some of the diagrams of the following articles. Thank you very much for the following articles.

What does the time complexity O(log n) actually mean?

Java Enhancement -TreeMap

What about the r-b tree principle