Index and algorithm

Binary search method

The records are arranged in order (increasing or decreasing), and the search process is conducted in a leap-forward way, that is, the midpoint position of the ordered sequence is first taken as the comparison object. If the element value to be found is less than the midpoint element, the search sequence is reduced to the left half. The search interval is halved by a single comparison.

Balanced binary trees

Binary search tree: The left subtree’s key is always less than the root’s, and the right subtree’s key is always greater than the root’s.

Balanced binary tree: first meets the definition of binary tree, secondly must satisfy the height of the left and right subtrees of any node maximum difference of 1. Balanced binary trees are indeed fast for queries, but maintaining a balanced binary tree is very expensive, with one or more left and right rotations to achieve the balance of inserted or updated trees.

B + tree

A B+ tree is a balanced search tree designed for disks or other direct access assistive devices. In a B+ tree, all the record nodes are stored in the leaf nodes of the same layer in order of the size of the key values, and the Pointers to the leaf nodes are connected.

B + tree index

The essence of B+ tree index is the realization of B+ tree in the database. In the database, the height of B+ tree is generally at 2 or 3 levels, that is, it only takes 2 or 3 IO times at most to find a row record of a certain key value. The B+ tree index in the database can be divided into clustered index and auxiliary clustered index, which are all B+ trees, that is, highly balanced, leaf nodes store all the data. Clustered indexes differ from non-clustered indexes in whether leaf nodes hold a whole row of information.

Clustered index

Clustered index is to construct a B+ tree according to the primary key of each table, and the leaf node stores the row record data of the whole table, so it also makes the leaf node of clustered index become data page.

Clustered index storage is not physically contiguous, but logically contiguous. Pages are linked by a bidirectional linked list, in order of primary keys; Records in each page are also maintained through a bidirectional linked list and can also be stored physically without primary keys.

Another benefit of a clustered index is that it is very fast in sorting and range lookups for primary keys. The leaf node data is the data we want to query.

Secondary index

Secondary indexes, also known as non-clustered indexes, do not contain all data for rows at the leaf level. In addition to the leaf nodes containing key values, each leaf level index row contains a bookmark that tells the InnoDB storage engine where to find the row corresponding to the index. Since InnoDB storage engine tables are index organized tables, the bookmarks of InnoDB storage engine secondary indexes are clustered index keys for the corresponding row data.

The presence of secondary indexes does not affect the organization of data in clustered indexes, so there can be multiple secondary indexes on each table. When looking for data through the secondary index, InnoDB storage engine iterates through the secondary index and obtains the primary key to the primary key index through leaf level Pointers, and then finds a complete row record through the primary key index.

The lock

InnoDB storage engine implements the following two standard row-level locks:

  • A shared Lock (S Lock) allows a transaction to read a row of data.
  • An X Lock allows a transaction to delete or update a row of data.

When one transaction has acquired the shared lock on row R, another transaction can immediately acquire the shared lock on row R because the read does not change the data on row R. This situation is called lock compatibility. If a transaction wants to acquire an exclusive lock on row R, it must wait for the transaction to release the shared lock on row R, which is called lock incompatibility.

InnoDB storage engine supports two types of intent locks:

  • A transaction wants to acquire a shared Lock for rows in a table.
  • A transaction wants to acquire an exclusive Lock for rows in a table.

The transaction

Transactions in the InnoDB storage engine fully conform to ACID properties:

  • Atomicity: The entire database transaction is an indivisible unit of work. The entire transaction succeeds only if all database operations in the transaction are successfully executed. If any SQL statement in the transaction fails, the SQL statement that has successfully executed must also be destroyed, and the database state should revert to the state before the transaction was executed.
  • Consistency: A transaction changes a database from one state to the next consistent state. Database integrity constraints are not broken before and after a transaction.
  • Isolation: The impact of one transaction is not visible to other transactions until the transaction commits, which is achieved by locking.
  • Durability: Once a transaction is submitted, its results are permanent. Even in the event of a failure such as an outage, the database can recover data.

Transaction isolation level:

  • READ UNCOMMITTED
  • READ COMMITTED
  • REPEATABLE READ (InnoDB Storage Engine default isolation level)
  • SERIALIZABLE

InnoDB storage engine