Let’s review the concepts of B+ tree, B- tree, balanced binary tree and red black tree

Balanced binary trees

  • Balanced binary trees are also known as AVL trees
  • A balanced binary tree is an empty tree or the absolute value of the difference in height between the left and right subtrees is no more than 1, and the left and right subtrees are also balanced trees
  • The value of the non-leaf node is greater than the value of the left child and less than the value of the right child
  • A non-leaf node can have up to two child nodes

Red and black tree

  • Each node is either red or black
  • The root node is black
  • Each leaf node (NIL) is black
  • The two children of each red node must be black
  • The path from any node to each leaf node contains the same number of black nodes
  • If a node has a sunspot node, the node must have two children

b-tree

  • Each node contains data (key and data fields) and Pointers, separated from each other
  • The leaf nodes have the same depth and the pointer to the leaf node is null
  • The data index in the node is arranged in ascending order from left to right
  • All index elements are not duplicated

B + tree

  • Non-leaf nodes do not store data, only indexes, and can store more indexes
  • The leaf node contains all index fields
  • Leaf nodes contain data (key and data fields) and Pointers
  • Leaf nodes are connected by Pointers to improve the performance of interval access

Insufficient balanced binary trees

  • Given the extreme case where each insert is larger than the last, the balanced binary tree stores the data in a linear fashion with O(n) time. It is common for mysql to store millions of pieces of data in a single table when there is a large amount of data. This will cause the tree to become deeper and consume a lot of IO when mysql reads.
  • When mysql reads data from disk, it reads data in the unit of pages. Each node represents a page. However, the balanced binary tree stores one keyword per node, resulting in a waste of storage space.

B- trees have advantages over balanced binary trees

  • Each node stores multiple keywords and makes proper use of space. Each time mysql reads data, it prereads data from the node and stores it in memory
  • Each node in a B-tree stores more keywords, resulting in the storage of the same amount of data. The depth of a B-tree is shallower than that of a balanced binary tree, reducing the search times and complexity of data

B+ trees versus red black trees

  • Red-black trees are mostly used for internal sorting, that is, all in memory
  • B+ trees are also known as disk-friendly data structures when they are mostly used for external storage
  • Red-black trees and balanced binary trees have the same disadvantages. Each node stores a keyword. When the data volume is large, the red-black tree is very deep, and mysql consumes a lot of IO for each read

Advantages of a B+ tree over a B- tree

  • Non-leaf nodes of a B+ tree store only key values, while a B- tree stores key and data values, so that the B+ tree can read more keys each time
  • When mysql performs interval access, since the leaf nodes of B+ tree are connected by Pointers, only all leaf nodes need to be traversed. A B-tree requires a middle-order traversal
  • Non-leaf nodes of a B+ tree store only key values, while A B- tree stores key and data values. As a result, the B+ tree has fewer levels and higher query efficiency
  • All key word addresses of B+ tree exist on leaf nodes, so the number of queries is the same each time, which is more stable than that of B- tree

Why does a height 3 B+ tree store tens of millions of data?

To explain the premise of this problem, mysql uses the InnoDB engine and the default mysql page file size is 16K

Suppose we have a row of data size of 1K, then a page can store 16 data, that is, a leaf node can store 16 data

If the primary key ID is bigint, then the length is 8B. If the pointer size is 6B in InnoDB engine, then the total size is 14B. Then 16K /14B=1170 can be stored in one page.

In other words, the B+ tree of height 2 can store 117016=18720 data; A B+ tree of height 3 can store 11701170*16=21902400(ten million pieces of data)

This is why mysql can support tens of millions of data levels, right

The article will be updated continuously. You can search “Maimo Coding” on wechat to read it for the first time. Every day to share quality articles, large factory experience, school recruitment experience, help the school recruitment interview, is worth paying attention to every programmer platform.