Art is long, life is long

Introduction to the

The following is a detailed description of the B-tree. A M-order B-tree has the following characteristics:

  1. The root node has at least two children.

  2. Each intermediate node contains K-1 elements and k children, where m/2 <= k <= m

  3. Each leaf node contains k-1 elements, where m/2 <= k <= m

  4. All the leaf nodes are in the same layer.

  5. The elements in each node are arranged from small to large, and k-1 elements in the node are exactly the range division of the elements contained by K children.

Let’s take a third order B-tree as an example to see the specific structure of the B-tree:

In this tree, let’s focus on node (2,6), which has two elements, two and six, and three children, one, three, five, eight. Among them, 1 is less than 2, (3,5) is between 2 and 6, and 8 is greater than (3,5), which conforms to the characteristics listed just now.

B- tree query

Disk I/O for the first time:

Locate in memory (compared with 9) :

Second disk I/O:

Locate in memory (compared with 2,6) :

Third disk I/O:

Locate in memory (compared with 3 and 5) :

Through the whole process, we can see that the comparison times of B-tree in query are not less than that of binary search tree, especially when there are many elements in a single node. However, compared to the disk I/O speed, the memory comparison time is almost negligible, so only the tree height is low enough, I/O times are small enough, can improve the search performance. By contrast, it doesn’t matter if there are more elements inside the node, just a few more in-memory interactions, as long as the size of the disk page is not larger, which is one of the advantages of b-trees.

B- tree insertion

If we want to insert 4, here’s a demonstration:

Looking at the node position of 4 from the top down, we find that 4 should be inserted between node elements 3 and 5.

Nodes 3 and 5 are already two-element nodes and cannot be added. Parent nodes 2 and 6 are also two-element nodes and cannot be added. The root node 9 is a single-element node that can be upgraded to a two-element node. Split nodes 3,5 and 2,6, and upgrade root node 9 to two-element node 4,9. Node 6 is independently the second child of the root node.

The insertion process of B-tree is quite complicated. As can be seen from the above example, the insertion of a new element 4 leads to a chain reaction of so many nodes in the whole B-tree. However, because of this, B-tree can always maintain multi-path balance, which is also a major advantage of B-tree: self-balance.

B- tree deletion

If we delete 11, here’s another example:

Finds the node position of element 11 from the top down.

After 11 is deleted, node 12 has only one child, which does not meet b-tree specifications. So find the median of 12,13, and 15,13, and replace node 12, which itself moves down to become the first child. (This process is called sinistral rotation)

Application of B-tree

B-trees are mainly applied to file systems and indexes of some databases, such as the famous non-relational database MongoDB.