preface

A friend interviewed Ali. In the interview process, the interviewer asked B+ tree, and the uncle kept nodding his head when answering. Finally, he nodded too often and was taken away by the police. Today, I will talk about B+ tree in detail. \

\

Balanced binary trees

Balanced Binary Tree is also called AVL Tree (different from AVL algorithm), and has the following properties: it is an empty Tree or the absolute value of the height difference between the left and right subtrees is not more than 1, and both subtrees are Balanced Binary trees. This scheme solves the problem of binary search tree degenerating into linked list well, and keeps the time complexity of insertion, search and deletion in O(logN) at best and at worst. However, frequent rotation causes insertion and deletion to sacrifice order (logN) time, but it is much more stable than binary search trees.

\

\

B tree (B-tree)

B tree and balanced binary tree slightly different is that B tree is a multi-fork tree, also known as balanced multi-path search tree (search path is not only two), database indexing technology in a large number of users of B tree and B+ tree data structure  

\

M order B tree characteristics

First, there are at most M subtrees for non-leaf nodes; Second, the root node has at least two subtrees, and the non-root and non-leaf nodes have at least M /2 subtrees. Third, the number of keywords stored in non-leaf nodes is equal to the number of node point trees -1, that is, if a node has three subtrees, it must contain two keywords; Fourth, the size of keywords in non-leaf nodes is orderly. In the left node in the figure, the two elements 37 and 51 are orderly. Fifth, the keywords in the left subtree of each keyword in the node are smaller than the keyword, and the keywords in the right subtree are larger than the keyword. In the figure, the left subtree of keyword 51 has 42 and 49, both smaller than 51, while the node of the right subtree has 59, greater than 51. Sixth, all leaf nodes are in the same layer.

\

\

advantages

\


\

\

The advantage of the B-tree over the B-+ tree is that if the frequently accessed data is close to the root node, the non-leaf nodes of the B-tree store the address of the keyword data, so the data retrieval will be faster than the B-+ tree. \

\

\

B + tree

   

\

\

M order B+ tree definition

  • 1. Each node can have a maximum of M elements;
  • 2. Each node has at least (m/2) elements except the root node;
  • 3. If the root node is not a leaf node, then it has at least 2 child nodes;
  • 4. All leaf nodes are in the same layer;
  • 5. A non-leaf node with K child nodes has (K-1) elements, in ascending order;
  • 6. All the elements in the left subtree of an element are smaller than it, and all the elements in the right subtree are greater than or equal to it.
  • 7. Non-leaf nodes only store keywords and indexes pointing to the next child node, and records are only stored in leaf nodes.
  • 8. Adjacent leaf nodes are connected with Pointers.

\

advantages

\


\

B+ tree has fewer levels: compared with B+ tree, EACH non-leaf node stores more key words, and the tree has fewer levels, so data query is faster;

\

B+ tree is more stable in query speed: B+ all keyword data addresses are stored on leaf nodes, so the search times are the same, so the query speed is more stable than B tree.

\

B+ tree has the natural sorting function: all the leaf node data of B+ tree forms an ordered linked list, which is more convenient for querying the data in the size range, with high data tightness and higher cache hit ratio than B tree.

\

B+ tree traversal of all nodes is faster: A B+ tree traversal of the entire tree only requires traversal of all leaf nodes, rather than traversal of each layer like a B tree, which is conducive to the database to perform full table scan.

\

\

\

To adapt to the scene

\

Typically used in databases and operating system file systems.

\

Node splitting


\

\

The full node is split. After the full node, M/2 node generates a new node, and the first element of the new node points to the parent node. \

\

The parent node is full, and the parent node is split.

\

If the root node is full, the root node needs to be classified, and the height of the tree increases.

\

\

advantages

\


\

It can keep data stable and orderly, and its insertion and modification have relatively stable logarithmic time complexity

\

\

conclusion

\


\

The B+ tree was pretty much busted in the interview. In addition to the balanced binary, B, and B+ trees mentioned in this article, the application scenarios of B+ trees are highly topical, such as the B+ tree structure used in MySQL and some file systems.