1. Ngx_rbtree_t red-black tree


Ngx_rbtree_t (red-black tree) is a very efficient data structure, which is used by the core modules in NGINx (such as timer management, file caching module) for fast retrieval. Red-black tree is actually a kind of self-balanced binary search tree. The ordinary binary search tree may degenerate into a single linked list in extreme cases, and the operation efficiency is low. So what is a self-balanced binary search tree? When nodes are continuously added and deleted to binary search tree species, binary search tree itself always keeps a certain degree of balance through morphological transformation, that is, self-balanced binary search tree. Red-black tree is a binary search tree with good self-balance, which supports fast and fast retrieval, insertion and deletion operations, as well as range query and traversal operations. It is a high-level data structure with wide application scenarios.

2. Origin and characteristics of red-black tree


2.1 Introduction to red-black trees

Concept of red-black tree in introduction to algorithms:

1. Each node is either black or red.

2. The root node is black.

3. Each leaf node (the last empty node) is black

4. If a node is red, its children are black.

5. From any node to the leaf node, the black node is the same.

Such an introduction does not make it clear how red-black trees come from, but rather gives a particularly rigid definition that is too formal for people to understand.

In Algorithm 4, the inventor of red-black trees directly bypassed the definition in the introduction of five algorithms and first explored another tree, 2-3 trees.

In fact, red black trees are equivalent to two or three trees, and when you understand the equivalence, and look at the five basic properties of red black trees, it’s perfectly natural.

 

2.2 What is a 2-3 tree

1. Firstly, the basic properties of binary search tree are satisfied.

2. Nodes can hold one or two elements.

A node with one element is called a two-node, and a node with two elements is called a three-node.

2.3 The 2-3 tree is a perfectly balanced tree

The number of nodes passing from the root node to any leaf node is the same. So how do 2-3 trees maintain absolute equilibrium

Next I’ll show you how to add nodes to a 2-3 tree.

2-3 rule for adding nodes to trees: never add nodes to an empty location, only fuse with the last found leaf node.

Example: add 42,37, 12, 18,6, 11,5 in sequence

1. First, add 42 nodes. For an empty 2-3 tree, it is very easy to add a single node as the root node.

2. Add the second node 37.

3. Add the third node 12.

Since the left subtree of 37 is empty, the last leaf node found in the process of adding is a three-node node, which is still fused. Temporarily form a four node. You can’t have four nodes for a 2-3 tree, so you split it into a subtree.

A four node becomes a balanced tree of three two nodes.

4. Add four nodes 18.

5. Add the fifth node 6.

If a leaf node is itself 3 nodes, add a leaf node that becomes 4 nodes and is not a perfectly balanced 2-3 tree when disassembled down. So it merges up, 12 and 37 merge directly into a 3 node.

Still a perfectly balanced 2-3 tree.

6. Add the sixth node 11.

7. Add the seventh node 5.

Step 1: After the node 5 is fused with its parent node, a new four node is formed.

Step 2: Transform the three elements of the four nodes into three two nodes.

Step 3: At this time, because it is not in absolute equilibrium, it continues to fuse upward to form a new four nodes

Step 4: A 2-3 tree cannot have four nodes, so it is necessary to split the four nodes. For this 2-3 tree, it is now all two nodes, keeping absolute balance.

2.4. Equivalence between red-black tree and 2-3 tree

With the familiar tree structure, each node can store only one element, because it is much easier to manipulate the nodes, including the tree, and write the code. (If you are interested, you can write code for 2-3 trees.)

We still store only one element per node, which implements the logic of 2-3 trees. In fact, this tree structure is a red-black tree.

As shown: it’s easy for two nodes, one black node represents two nodes. The complicated thing is the three nodes, we use two nodes to represent the three nodes,

In a 2-3 tree, the 3 nodes (b, C) are together in a parallel relationship, while in a red-black tree, to represent this relationship, we set the edge between B and C to red to represent this relationship.

However, for our common tree (binary search tree) structure, there is no special class to represent the edge object, which will be more complex and increase the complexity. So how do you do that?

First, each node has only one parent. In other words, each node has only one edge connected to its parent. So we can store the edge information on the node, so we can set node B to red.

That is, node B, and its parent, node C, represent the three nodes in the original 2-3 tree.

All the red nodes are tilted to the left. This is actually a definition, not a conclusion. For the three nodes, because we chose to define it in such a way,

The left element is treated as the child node of the right element, while the left node is the red node. So the red node is tilted to the left. Here we’re talking about left-leaning red-black trees.

If you understand the above statement, then for a two or three tree, it’s natural to be represented as a red black tree.

If it still feels abstract, take a look at the picture on the right below. I think you’ve seen the equivalence between two and three trees and red black trees.

2.5. Basic properties and complexity analysis of red-black trees

After we understand the equivalence relation between red-black tree and 2-3 tree, we look up based on the properties in the introduction to algorithms.

1: It’s either red or black.

2: The root node is black. For a red-black tree, this is itself equivalent to a 2-3 tree, so the root node is either 2-minus or 3-minus.

3: Each leaf node (the last empty node, not the one where both the left and right subtrees are empty) is black. This property is a definition, a node defined as empty in a red-black tree is black.

Also consistent with the above property, for an empty tree, its root node is empty, the root node is black, and the empty node is red.

4: If a node is red, its child nodes are black.

In a red-black tree, the red node, that’s the element to the left of the three nodes of the 2-3 tree, this red node, its children are either 2-minus nodes or 3-minus nodes.

When it’s a 2-minus node, it’s obviously black.

When it’s a 3- node, this is actually the same thing as if the root node is black. It must be connected to a black node first.

5: In a red-black tree, from any node to the leaf node, the black node must pass through the same.

A 2-3 tree is a perfectly balanced tree, which means that there are the same number of nodes to go from any node to the leaf node.

And because in a 2-3 tree, whether it’s a 2-node or a 3-node, when you convert it to a red-black tree, there’s always a black node.

So if you go from any node in the red-black tree, every time you go through a black node, that’s the same thing as any node in the original 2 or 3 tree.

So a red-black tree is a binary tree with black balance: for a red-black tree, there are the same number of black nodes going from the root node to any leaf node.

The maximum height is 2logN, and the time complexity is o(logN). In the extreme case, from the root node to the deepest leaf node, logN levels of black nodes are experienced, and the left subtree of each black node is red node

Paths correspond to 2-3 trees, with each node having 3 nodes. So we have logN of red nodes, so the maximum height is 2logN.

2.6. Discoloration and rotation of red-black trees

Red-black tree itself as a binary search tree, so its task is used for dynamic table data insertion and deletion operations. During this operation, the structure of the red-black tree will be damaged. In this case, you need to adjust the tree to make it a red-black tree again. You can adjust the tree from the following two aspects:

  • Adjust the pointer structure of some nodes in the tree;

  • Adjust the color of some nodes in the tree;

2.6.1 levorotation

Left rotation process:

There is a slight problem here. If Node itself is red in color and X. color is red, the basic properties of a red-black tree may not be maintained.

Now, there’s a caveat here, because left rotation is a subprocess that doesn’t maintain the basic properties of a red-black tree, and then there’s processing.

2.6.2 Color reversal

2.6.3 right

3. Ngx_rbtree_t red-black tree to add and delete nodes source code analysis


This is a particular kind of red-black tree, the left-leaning red-black tree. But red-black trees don’t have to be all left-leaning red-black trees, they can have right-leaning red-black trees, or they can have two red children, which means that red-black trees can also be equivalent to 2-3-4 trees with 2-, 3-, and 4- nodes.

Ngx_rbtree_t is the red-black equivalent of a two-three-four tree.

3.1 Nginx red-black tree related structure

Structure:

Red black tree operation method:

3.2 Nginx red-black tree insert new node

When creating a red-black tree or inserting new data into an existing red-black tree, 3 steps are required:

  • Because red-black tree itself is a binary search tree, so when inserting new nodes, the position of new nodes can be found according to the method of binary search tree insertion.

  • Initialize the newly inserted node node, set the color to red and insert it to the specified position; (Insertion of new nodes initialized to red does not break article 5 of red-black tree)

  • Since the insertion of new nodes may destroy the property of article 4 of red-black tree (if the parent node color is red, the property of red-black tree is damaged), it is necessary to adjust the binary search tree and find a way to make it become red-black tree again by rotating and modifying the color of nodes in the tree.

The first and second steps of node insertion are very simple, while the last step is complicated. Node insertion in red-black tree can be divided into the following three situations according to different insertion positions:

  1. Insert into the root of the entire tree. Solution: The root node is black.

  2. The color of the parent node at the insertion position is black. Solution: No work needs to be done in this case, and the insertion of red nodes does not break the properties of the red-black tree.

  3. The color of the parent node at the insertion position is red.

    Solution: Since the color of the inserted node is red, its parent node is also red, which breaks the property of the fourth red-black tree. At this time, the state of the grandfather node and another child node of the grandfather node (called “uncle node” here) need to be combined, which can be divided into three cases:

  • The parent node of the current node is red, and the “uncle node” is also red: the fourth property of red-black tree is broken, the solution is: change the parent node color to black; Change the color of the uncle node to black. Change the grandparent’s color to red; The next step is to recognize the grandfather node as the current node (that is, to continue the upward fusion, like 2-3 trees) and continue the judgment. The processing results are shown in the figure below:

In this case, the color of the parent node and the current node are both red, so to avoid conflict, change the color of the parent node to black. However, while avoiding the destruction of article 4, it causes the black height of the path to increase by 1, and destroys the property of Article 5. However, by changing the color of the grandfather node to red and the color of the uncle node to black, this molecular tree does not break the property of Article 5. However, since the color of the grandfather node is changed, it is necessary to determine whether the structure of the upper tree is damaged. Therefore, it is necessary to regard the grandfather node as the current node and continue to judge.

  • The parent of the current node is red, the uncle is black, and the current node is the right child of the parent. Solution: Left-rotate the parent as the current node.

Tip: After a left-handed operation with the parent as the current node, this case changes to case 3, and the process is synchronized with case 3.

  • The parent of the current node is red, the uncle is black, and the current node is the left child of the parent. Solution: Change the color of the parent to black and the color of the grandparent to red, right-handed from the grandparent. As shown below:

Analysis: In such a case, because the current nodes and the parent F S color to red, has violated the nature of the red-black tree 4, the S color can be changed to black, has violated the nature of 5, because all through the path of the black S highly increased by 1, so you need to color to red immediately after his grandfather nodes a right-handed, So this molecular tree has a red black tree. (The graph in the figure above does not look like a red-black tree, but it is only a part of the whole tree. The subtree with S as root must be a red-black tree.)

3.3 Nginx red black tree deletes new nodes

To delete nodes in a red-black tree, the idea is simpler, and only two steps are needed:

  1. Delete the specified nodes in red-black tree according to the method of deleting nodes in binary search tree;

  2. Re-adjust the tree after deleting nodes to make it a red black tree again; (Again, rotate and recolor)

When nodes are deleted in binary search tree, there are three cases:

  • If the deleted node itself is a leaf node, it can be deleted directly.

  • If there is only one child node (left child or right child), the deleted node is replaced by its child node.

  • If there are two children, replace the node with the smallest leaf in the right subtree (or the largest leaf in the left subtree), and then delete the leaf with the smallest value.

In the above three cases, a node needs to be deleted eventually. At this time, it is necessary to determine whether deleting this node will destroy the properties of red-black tree. The judgment is based on:

1. If the color of the deleted node is red, it will not be damaged.

2. If the color of the deleted node is black, the fifth property of red-black tree will definitely be destroyed. At this point, the tree needs to be adjusted.

  • The color of the sibling node of the deleted node is red, and the adjustment measures are as follows: change the color of the sibling node to black, and the parent node to red. The parent node is left-handed, and the sibling node of the deleted node is updated at the same time (the left-handed sibling node changes), as shown in the figure below:

  • The sibling nodes of the deleted node and its children are all black. The adjustment measures are as follows: set the sibling node of the deleted node to red, and mark the parent node of the deleted node as a new node, and continue to judge; As shown below:

  • The sibling of the deleted node is black, its left child is red, and its right child is black. The adjustment measures are: set the sibling node to red, set the left child node of the sibling node to black, take the sibling node as the criterion for dextral operation, and finally update and delete the sibling node; As shown below:

  • Delete nodes brother is black, the right child (left child no matter what color is red, adjustment measures: to delete the parent nodes color assignment to his brother, and then set the parent color is black, black brother node’s right child node, according to its parent node l do operation, the final set to replace delete nodes to root node;

    As shown below:

4. At the end


At this point, we have basically covered the basic principles and implementation of red-black trees. We’ll start with 2-3 trees, and then show that a (left-leaning) red-black tree is really a BST representation of 2-3 trees. In fact, left-sided red-black trees are defined, and there can be right-sided red-black trees, and there can be left and right subtrees with red nodes, so that’s equivalent to a two-three-four tree, and a red-black tree in Nginx is equivalent to a two-three-four tree, and then we’ll talk about insertion and deletion algorithms.

Insertions and deletions inevitably destroy the basic properties of a red-black tree, so you need to rotate and change color to make it a red-black tree again.

Finally, if you are also learning about red black trees, I hope this article will help you. Due to the complexity of the red-black tree itself, and my limited level, it is inevitable to make some mistakes. If there is any mistake, I hope we can point it out and discuss it together.

reference

  • Algorithms, fourth edition

  • Red black tree – Wikipedia

Highlights of the past

The realization and application of hashing table in Redis

【Swoole second play 】 Swoole Server analysis 【Nginx source code research 】 primary Nginx HTTP processing process

In-depth understanding of Swoole coroutine implementation


Interesting knowledge is waiting for you

Scan code for more highlights