Compared with the traditional Page based database storage method, OceanBase uses the LSM Tree, which is very popular now, as the basic data structure of the storage engine to store data, which is rare in the distributed general relational database. Today we will explain the technical principle of LSM Tree in detail.

First of all, one of the main reasons for the emergence of LSM Tree technology is that the random disk write speed is much lower than the sequential write speed, and the database has to face many write intensive scenarios, so many database products have introduced the IDEA of LSM Tree into the database field. LSM Tree, as The name suggests, is The abbreviation of The Log-structured Merge-tree. There are a few key messages from the name:

Log-structred, log-structred, log-structred, log-structred, log-structred, log-structred, log-structred, log-structred, log-structred, log-structred

It’s not really a tree, it’s not really a concrete data structure, it’s really an idea of data preservation and updating. In simple terms, the data is sorted by key (in a database, that is, the primary key of the table), and then formed into a small tree structure, or not a tree structure, a small table can be used. The data is usually called baseline data. After that, every data change (i.e. log) is recorded and sorted by primary key. Then, changes in the log are periodically merged into the baseline data. The following figure describes the basic structure of an LSM Tree.

C0 in the figure represents the data cached in memory. When the data in memory reaches a certain threshold, the data in memory is sorted and saved to disk. This results in C1 level incremental data on disk (also sorted by primary key), which is often called dump. When C1 level data also reaches a certain threshold, will trigger another merger (the merger process can be thought of as a merge sort), the formation of C2 levels of data, by analogy, if the structure of this step defines the k layer, and finally the first k layer is the last of the baseline data, this process is usually referred to as a merger.

In a word, LSM Tree is a data storage idea based on merge sort. As you can see from the above structure, the LSM Tree is very friendly to write intensive applications, because most write operations are sequential. However, for many read operations, some performance is lost because the data may have multiple versions on the disk. Therefore, the storage engine using THE LSM Tree usually stores the data of multiple versions in the memory and builds the required data version according to the query requirements. In the database field, many products use LSM Tree structure as the database storage engine, such as OceanBase, LevelDB, HBase, and so on.