4.1. Index organization table

In InnoDB, all tables are sorted by primary key. In InnoDB’s design, every table has a primary key. If a table is created without an explicit primary key, InnoDB uses the following rules:

  • Determine if there is a Unique index (Not Null, Unique), if so, the new primary key.
  • If these conditions are not met, InnoDB will automatically create a 6-byte pointer as the primary key.

For the first point, the primary key is chosen based on the order in which the index is created, not the order in which the columns are created.

4.2. InnoDB logical storage structure

2. The table space

The table space (top left) is the highest level of InnoDB’s logical storage structure. All data is stored in the table space, but it can be set so that each table has its own table space. Even then, indexes and bitmap, undo information, insert buffer index page, system transaction information, double write buffer data are still in the shared tablespace.

4.2.2. Period

Table Spaces are composed of multiple segments and segments are managed automatically by InnoDB

4.2.3 area

Extents consist of contiguous pages. In any case, the size of each extents is 1MB, 1024KB. To ensure continuity of pages, InnoDB usually requests 4 or 5 extents from the disk in succession. Because by default, one page =16KB, there are 64 consecutive pages in an extent.

Later versions of InnoDB introduced the concept of compressed pages, each of which could be 8KB or smaller, increasing the number of pages in an extent, but always 1MB in size.

In addition, in order to save space, 32 page size (32 * 16KB) fragmentation pages will be used to save the table when it is created, and then extones will be used if the space is insufficient.

Page 4.2.4.

Pages are the smallest unit InnoDB manages. You can set the page size, but once it’s set, you can’t change it unless you use the mysqldump import and export to reset it.

Common page types are:

  • Data page (B-tree Node)
  • Undo Log Page
  • System Page
  • Transaction System Page
  • Insert Buffer page
  • Insert Buffer Free List
  • Uncompressed BOLB Page Uncompressed BOLB Page
  • Compressed binary large object Page

Line 4.2.5.

InnoDB is column-oriented, which means data is stored in rows. There are also requirements for per-line records, such as 7992((16KB / 2) -200) lines at most.

4.3. InnoDB row record format

InnoDB’s row format means that the page holds rows of records.

4.3.1. Compact row record format

Compact, introduced with MySQL5.0+, is a format where the more data you can store on a page, the better the performance. Take a look at its composition format:

Now to the parts:

  • Variable-length field length list: Record the length of each variable-length field in reverse order. If the length is less than 255, use 1 byte, otherwise 2 bytes, so the length must not be greater than 65535. In fact, MySQL states that the total length of all variable-length columns should not be greater than 65535, rather than a single column.
  • For example, if the second and fourth columns of the data in column 5 are empty, the value 00001010 indicates that there is empty data. In addition, the bit size must be N bytes, in reverse order to record the empty column position.
  • Record header information, fixed to 5 bytes, each of the meaning of the table below.
  • Actual column data. Note that the NULL column does not take up any space, it only takes up a flag bit. In addition, there are two hidden columns: the transaction ID and the rollback pointer columns, which are 6/7 bytes respectively, and it is possible to add a row_ID column, which is 6 bytes, without the primary key.

Here’s a good article

N_owned and next_record are related to the sparse slot in the page_directory (see below), which is also related to the next record. The data in each page is sorted logically, not physically. In other words, each row of data is stored in a node, and the nodes are connected by a pointer. Traversal the list can get the primary key order, but the nodes before and after are not necessarily placed on adjacent disks. They are kept in order by the pointer, so it is logical order.

4.3.2. Redundant row record format

This format is supported by older versions of MySQL, and no tables are omitted here.

4.3.3. Row overflow data

If some column data is too large, it is called row overflow data and stored somewhere other than in the data page.

InnoDB generally stores engine data in the leaf of the primary key index (the data page) (the leaf of the primary key index is an index page, but it is still a data page because it holds data). When row overflow occurs, data is stored in an Uncompressed BOLB Page.

Now that you know what to do when a row overflow occurs, the question is, what does the data page put when a row overflow occurs?

We can see that the data page holds the first N data, and then a pointer to the BOLB page. What is the value of N? According to the experiment, this threshold is 8098. In addition, because each page of B+Tree has at least two records (only one becomes a linked list and loses the meaning of B+Tree), the basis for judging whether to put in data page or BOLB is whether the data page can put two records.

4.3.4. Compressed and Dynamic row formats

The new version of InnoDB introduces two new row record formats: Compressed and Dynamic. Both formats use full row overflow logging for BLOB data. That is, the data page only holds the 20-byte pointer, and the actual data is stored in the OffsetPage.

On the other hand, Compressed uses the Zlib algorithm to compress row data, thus storing large data efficiently.

Row structure store of type CHAR

Just to mention, InnoDB internally treats the storage of multi-byte CHAR data types as varied-length characters. That is, in the variable-length list, the length will be recorded. In the case of multi-byte characters, there is little difference between CHAR and VARCHAR.

4.4. InnoDB Data page structure

The b-Tree Node page stores the actual data of the rows in the table. These are the data pages. In general, the data page consists of the following sections:

The header, header, and end of the file are used to record page information, such as page index level in B+ tree, user record, free space, page directory, etc., is used to record the actual row record, so the size is dynamic.

4.4.1. FileHeader

Take a look at the components of the header:

4.4.2. PageHeader

Look at its composition:

4.4.3. Infimum and Supremum Record

These are two virtual records on each page that act as boundary Pointers, with Infimum being smaller than the smallest primary key on the current page and Supremum being larger than the largest. These two Pointers are created at page creation and are never deleted.

Page The data in the Page is out of order, but in logical order.

4.4.4. User Record and Free Space

UserRecord stores the actual row records, and every time the record is deleted, its space is released, and then added to FreeSpace, FreeSpace, FreeSpace, record which space is available.

4.4.5. Page Directory

The page directory, as the name suggests, is used to find the records in the current page. The Page Directory holds the relative location of the records. Inside this is A structure called A slot, which is A sparse directory, that is, each slot contains the location of multiple records (by holding A pointer to the record), not every record has A slot, maybe A slot also holds records B, C, D.

At the same time, the records placed in the slot are logically ordered, and the slot is also ordered between the slot, so you can find the page through B+ tree, find the slot through binary search, and find the required record through the next_record traversal of each record in the slot. Also, because each record has an N_OWNED field, this field indicates how many records are following the record, so you iterate to find it.

The official document explains it this way:

The Page Directory part of a page has a variable number of record pointers. Sometimes the record pointers are called “slots” or “directory slots”. Unlike other DBMSs, InnoDB does not have a slot for every record in the page. Instead it keeps a sparse directory. In a fullish page, there will be one slot for every six records.

The slots track the records’ logical order (the order by key rather than the order by placement on the heap). Therefore, if the records are ‘A”B”F”D’ the slots will be (pointer to ‘A’) (pointer to ‘B’) (pointer to ‘D’) (pointer to ‘F’). Because the slots are in key order, and each slot has a fixed size, it’s easy to do a binary search of the records on the page via the slots.

(Since the Page Directory does not have a slot for every record, binary search can only give a rough position and then InnoDB must follow the “next” record pointers. InnoDB’s “sparse slots” policy also accounts for the n_owned field in the Extra Bytes part of a record: n_owned indicates how many more records must be gone through because they don’t have their own slots.)

4.4.6. FileTrailer

This section is used to record whether the page has been fully written to disk,

4.5. The constraint

4.5.1. Data integrity

In general, there are three ways to ensure data integrity:

  • Entity integrity: There is a primary key in the table.
  • Field integrity (or field integrity) : Guarantees that the values of each column meet specific conditions.
  • Referential integrity: Ensures the relationship between two tables.

4.5.2. Creation and lookup of constraints

There are two types of constraint creation: select when creating the TABLE and update the ALTER TABLE statement.

For PRIMARY keys, the default constraint name is PRIMARY. For Unique keys, the default constraint name is the same as the column name.

4.5.3. Differences between constraints and indexes

Constraints are more of a logical concept to ensure data integrity; But an index is more like a data structure, with both logical concepts and physical organization.