preface

Anyone with a background in databases should know that databases can use indexes to find data quickly.

It is possible to further understand that indexes (InnoDB) are implemented using B+ trees.

But why use B+ trees?

Of all the trees in the world, why love this one?

It all started as an online ‘love story’.

The story of IO

Mr. Ai, who lives in the city, met a woman named Ou through the Internet. After months of getting to know each other, Ai discovered that he loved Ou and confessed his love to her.

Ms. Ou said, “I live in Hard Disk town, if you can find me, I will promise you.”

Ms. Ou left an address (0X00000001) for Mr. Ou.

Mr. Ou took the high-speed train to the town of Hard disk, found that the houses here are very similar, want to find Ms. Ou a little difficult.

Fortunately, the first place you get off the bus is 0X0000001, but only a box is left inside.

Ou opened the box and found a piece of paper inside with the name 0X00000005, so He continued to look for the address in Hard Drive town.

After a while, I finally found it, but it was the same box with the name 0X00000099.

After N period of time, finally found this address, also successfully found Ms. Ou.

So Mr. Ai took Ms. Oh back to the city.

The amount of time it took Mr. O to find Ms. O is IO time, and the amount of time it took to go from one box to another.

As you can see, the fastest way to find Ms. O is to shorten the search.

Therefore,InnoDB must take the I/O count into account when considering data structures.

Index data structure selection

Hash

Hash structure equivalent search can achieve O(1) effect, but light does not support range search is dead.

Binary tree

In the search for tree structure data, each sub-tree query can be regarded as an IO. Therefore, the height of the tree and the IO count are equal.

But since the root node of the tree is usually stored in memory, the root node does not consume IO.

Generally, the query of a tree with tree height of 3 requires a maximum of two I/OS.

  • Binary search tree

The left-hand side is smaller than the root node and the right-hand side is larger than the root node, but inserting new data does not make it rotate.

This leads to the problem that the data inserted consecutively degrades into a linked list, and any search is a sweep of the table.

  • AVL tree search

Also called strongly balanced binary search tree, the height difference between the left and right subtrees cannot be more than 1.

This causes it to rotate frequently, affecting write performance.

  • Red and black tree

Children of nodes that feature red must be black.

Each node must pass through the same black dot to its children

This ensures you don’t rotate too much, but you can’t control the height of the tree.

Many tree

  • B tree

The characteristic of a B-tree is that each node holds data.

There is no IO to load the data, but there are fewer nodes to record.

  • B + tree

B+ trees have a feature that only leaves hold data, which allows the tree to record as many nodes as possible.

InnoDB adds a ‘billion’ point detail to the B+ tree. Each head and tail node records a pointer to one or the previous node, so that range queries can be handled perfectly.

conclusion

Any program that doesn’t take into account hardware features is a rogue.

InnoDB uses B+ tree for stable I/O count and adds Pointers for range search.

The next article will continue to cover how B+ trees store index data, if you find it helpful, feel free to do so.

This article image source tool: dynamic data structure website