This article is based on MySQL 8

In the previous article, we introduced what InnoDB row storage is and Compact row format. In this article, we continue to introduce the other three row formats.

Redundant row format

This is the oldest, simplest and most crude line format, and is now largely obsolete because it takes up the most space, resulting in the most memory fragmentation, and is the least efficient line format. In most cases, updating a VARCHAR field to change its length is to mark the original row as deleted and then create a new record where there is enough space. Redundant

InnoDB Record High-Altitude Picture InnoDB Record High-Altitude Picture InnoDB Record High-Altitude Picture

Create a table like the one in the previous article and insert the same data:

CREATE TABLE `record_test_2` (
  `id` bigint(20) DEFAULT NULL,
  `score` double DEFAULT NULL,
  `name` char(4) DEFAULT NULL,
  `content` varchar(8) DEFAULT NULL,
  `extra` varchar(16) DEFAULT NULL,
  `large_content` varchar(1024) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
Copy the code
INSERT INTO 'record_test_2' (' id ', 'score', 'name', 'content', 'extra', 'large_content') VALUES (1, 78.5,'hash'.'wodetian'.'nidetiantadetian'.'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmno pqrstuvwxyz'); INSERT INTO 'record_test_2' (' id ', 'score', 'name', 'content', 'extra', 'large_content') VALUES (65536, 17983.9812,'zhx'.'shin'.'nosuke'.'lex'); INSERT INTO 'record_test_2' (' id ', 'score', 'name', 'content', 'extra', 'large_content') VALUES (NULL, -669.996,'aa', NULL, NULL, NULL);
INSERT INTO `record_test_2`(`id`, `score`, `name`, `content`, `extra`, `large_content`) VALUES (2048, NULL, NULL, 'c'.'jun'.' '); INSERT INTO 'record_test_2' (' id ', 'score', 'name', 'content', 'extra', 'large_content') VALUES (-1, 26.75,'xxxx'.'aaaa'.'bbbb'.'cccc');

Copy the code

Let’s look directly at what the underlying data looks like:

All field lengths: 00 C1 00 3F 00 2F 00 27 00 23 00 1B 00 13 00 0C 00 06 Record header information: 00 00 10 12 01 65 Hide column DB_ROW_ID: 00 00 00 00 09 00 DB_TRX_ID: 00 00 00 03 CB 08 DB_ROLL_PTR: A8 00 00 01 1C 01 10 80 00 00 00 00 00 00 00 01 Score (78.5) : 00 00 00 00 00 00 00 00 a0 53 40hash77 6F 64 65 74 69 61 6E column extra(Nidetiantadetian) : 6e 69 64 65 74 69 61 6e 74 61 64 65 74 69 61 6e Column data large_content (abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxy Zabcdefghijklmnopqrstuvwxyz) : 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7A All field lengths: 34 31 2b 27 23 1B 13 0C 06 Record header information: 00 00 18 13 01 A8 Hide column DB_ROW_ID: 00 00 00 00 09 01 DB_TRX_ID: 00 00 00 03 CB 09 DB_ROLL_PTR: A9 00 00 02 01 01 10 column id: 65536 80 00 00 00 00 00 00 01 00 00 score(17983.9812) : B5 15 FB CB fe 8F D1 40 column data name(ZHX) : 7a 68 78 20 column data content(Shin) : 73 68 69 6E column data EXTRA (NOSUKE) : 6E 6F 73 75 6B 65 column data large_Content (lex) : 6C 65 78 A7 A7 A7 27 23 9B 13 0C 06 Record header information: 00 00 00 13 01 de Hide columns DB_ROW_ID: 00 00 00 09 02 Hide columns DB_TRX_ID: 00 00 00 03 CB 0e Hide columns DB_ROLL_PTR: Ac 00 00 01 00 01 10 row data ID (NULL) : 00 00 00 00 00 00 00 00 00 00 score(-669.996) : 87 16 D9 CE F7 EF 84 C0 name(AA) : 61 61 20 20 All field lengths: AB 2b 28 A7 A3 1B 13 0C 06 Record header information: 00 00 28 13 02 18 Hide columns DB_ROW_ID: 00 00 00 00 09 03 Hide columns DB_TRX_ID: 00 00 00 03 CB 0F DB_ROLL_PTR: AD 00 00 01 21 01 10 column ID (2048) : 80 00 00 00 00 00 00 00 08 00 column SCORE (NULL) : 00 00 00 00 00 00 name(NULL) : 00 00 00 00 00 content(c) : 63 extra(jun) : 6a 75 6e 33 2F 2b 27 23 1B 13 0C 06 Record header information: 00 00 30 13 00 74 Hide column DB_ROW_ID: 00 00 00 00 09 04 Hide column DB_TRX_ID: 00 00 00 03 CB 10 DB_ROLL_PTR: AE 00 00 01 22 01 10 column data ID (-1) : 7F ff ff ff ff ff ff ff ff FF SCORE (26.75) : 00 00 00 00 00 C0 3A 40 column data name(XXXX) : 78 78 78 78 62 62 62 62 LarGE_content (CCCC) : 63 63 63 63 63Copy the code

Redundant – List of all field lengths

Unlike the Compact row format, Redundant starts with a list of all field lengths rather than a list of variable-length fields + a list of NULL values. The format for this list of field lengths is:

  • Records length offsets for all fields, including hidden columns. The offset is, if the first field is of length A and the second field is of length B, then the first field in the list is a and the second field is a + b.
  • All fields are in reverse order

Redundant rules for length storage, whether it’s one byte or two bytes, and what to store are quite special:

  • Depending on the length of the entire line, one or two bytes are used for each field,Does each field use one byte or two bytes? There is a tag in the record header:
    • If the length is less than 128, it is stored in one byte
    • If greater than or equal to 128, two bytes are used for each field
  • For one-byte storage, the highest bit marks whether the field is NULL, if NULL, the highest bit is 1, otherwise 0. The remaining seven bits are used to store length, so 127 at most
  • For two-byte storage, the highest bit still marks whether the field is NULL. The second flag indicates whether the record is on the same page, 0 if it is, and 1 if it is not, which actually refers to the overflow page later. The remaining 14 bits represent the length, so the maximum is 16,383

To extrapolate the list of field lengths for the first line:

Since the first line actually stores more than 128 bytes, two bytes are required. The lengths of the first to last columns are: Db_roll_ptr-7 db_roll_ptr-7 db_roll_ptr-7 db_roll_Ptr-7 db_roll_Ptr-8 Column data content-vARCHAR – becomes 8 bytes long, column data extra-vARCHAR – becomes 14 bytes long, and large_content- becomes 130 bytes long. After converting to offset: 0x06, 0x0C, 0x13, 0x1B, 0x23, 0x27, 0x2F, 0x3F, 0xC1. 00 C1 00 3F 00 2F 00 27 00 23 00 1B 00 13 00 0C 00 06 in reverse order.

For r the third row, which contains NULL columns, the record length is less than 128 and is stored in one byte. The lengths of the first to last columns are: Db_roll_ptr-7 db_roll_ptr-7 db_roll_ptr-7 db_roll_Ptr-7 db_roll_Ptr-8 Column data content-vARCHAR – Variable length 0 bytes, column data extra-vARCHAR – variable length 0 bytes, and large_content- variable length 0 bytes. 0x06, 0x0C, 0x13, 0x1B, 0x23, 0x27, 0x27, 0x27. Since the first and last three columns are NULL, set the highest bit of 0x1b, the last three 0x27, 0x27, and 0x27 to 1, becoming 0x9b, 0xA7, 0xA7, and 0xA7. In reverse order: A7, A7, A7, 27, 23, 9b, 13, 0C, 06

Redundant – Records header information

The Redundant row format has more header information (48 bits) than Compact (40 bits) :

The name of the Size (bits) describe
A useless 2 It’s not available yet
deleted_flag 1 Whether the record is deleted
min_rec_flag 1 Minimum record mark of non-leaf node in B+ tree
n_owned 4 This record corresponds to the number of records in the slot
heap_no 13 The sequence number of the record in the heap can also be interpreted as the location information in the heap
n_field 10 The number of columns for the record, ranging from 1 to 1023
1byte_offs_flag 1 1 means that each field is 1 byte in length and 0 means 2 bytes
next_record pointer 16 The relative position of the next record in a page

The difference between the Redundant row format and Compact row format is that the record_type bit is missing and the n_field and 1byte_offs_flag bits are added.

N_field is used to indicate the number of columns for the record, ranging from 1 to 1023. Each row here is nine columns, so n_field is nine, which is 0000001001. 1byte_offs_flag indicates the number of bytes occupied by each column in the field length list. 1 indicates that each column is 1byte long, and 0 indicates that each column is 2 bytes long. There’s only two bytes in the first line, so this bit in the first line is 0

0000 10 12 01 65 Converted to base 2:00000000 0000000000000010010 00000001 01100101 N_field: 000 0001001 1byte_offS_flag: 0 The second line records header information: 00 00 18 13 01 A8 Converts to base 2: 00000000 00000000 00011000 00010011 00000001 10101000 N_field: 000 0001001 1BYte_offS_FLAG: 1 The third line records header information: 000000 13 01 DE Converted to base 2:00000000 00000000 0001100000010011 00000001 11011110 N_field: 000 0001001 1byte_offS_FLAG: 1 Line 4 record header information: 0000 28 13 02 18 Convert to base 2:00000000 00000000 00101000 00010011 00000010 00011000 N_field: 000 0001001 1byte_offs_flag: 1 The fourth line records header information: 00 00 30 13 00 74 Convert to base 2: 00000000 00000000 00110000 00010011 00000000 01110100 N_field: 000 0001001 1byte_offS_flag: 1Copy the code

Redundant – Specific column record storage is different from Compact

1. Handling NULL values

For NULL, unlike Compact, which has a list of NULL values, only the highest bit of each field length in the field length list is marked 1 to indicate that the field is NULL.

For fixed-length fields, the same amount of byte space is occupied, and each byte is padded with 00, for example, lines 3 and 4:

All field lengths: A7 A7 27 23 9B 13 0C 06 Record header information: 00 00 00 13 01 DE Hide columns DB_ROW_ID: 00 00 00 00 09 02 Hide columns DB_TRX_ID: 00 00 00 03 CB 0E DB_ROLL_PTR: AC 00 00 01 00 01 10 column ID (NULL) : 00 00 00 00 00 00 00 00 00 00 00 score(-669.996) : 87 16 D9 CE F7 EF 84 C0 Column Name (AA) : 61 61 20 20 Length of all fields: AB 2b 28 A7 A3 1B 13 0C 06 Record header information: 00 00 28 13 02 18 Hide columns DB_ROW_ID: 00 00 00 00 09 03 Hide columns DB_TRX_ID: 00 00 00 03 CB 0f Hide columns DB_ROLL_PTR: AD 00 00 01 21 01 10 column ID (2048) : 80 00 00 00 00 00 00 00 08 00 Score (NULL) : 00 00 00 00 00 00 00 00 00 00 00 00 name(NULL) : Extra (jun) : 6a 75 6ECopy the code

When bigInt is empty, it fills 8 bytes of 0x00. When double is empty, eight bytes of 0x00 are padded. When char(4) is NULL, four bytes of 0x00 are filled. Thus, changes to these fixed-length fields, whether from NULL to non-NULL, from non-NULL to NULL, or updated to a different length (but within the original limit), do not mark the original record as deleted. After looking for new space to rebuild the updated record, directly in the original record above the modification. For Compact, changing from NULL to non-NULL or from non-NULL to NULL requires this cumbersome update approach because NULL takes up no space.

For variable-length fields, Redundant and Compact are the same; NULL takes no space. As long as the length is changed, the original record will be marked as deleted, and the updated record will be reconstructed in a new space.

2. The storage type is CHAR

Regardless of whether the field is NULL or of any length, char(M) takes up as many bytes as the maximum length of the M * byte encoding. If it is NULL, 0x00 is filled in. If the length is insufficient, 0x20 is added at the end.

For example, line 4 above:

Column data name(NULL) : 00 00 00 00Copy the code

And the second line:

Column data Name (ZHX) : 7A 68 78 20Copy the code

We change the encoding of name to UTF-8:

ALTER TABLE `record_test_2` 
MODIFY COLUMN `name` char(4) CHARACTER SET utf8 NULL DEFAULT NULL AFTER `score`;
Copy the code

If you look at the data in the fourth row, it becomes:

Column data name(NULL) : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00Copy the code

Since utF8 has a maximum byte footprint of 3 bytes, 12 bytes are used here. Similarly, the second line:

Column data Name (ZHX) : 7a 68 78 20 20 20 20 20 20 20 20 20 20 20 20 20Copy the code

There is a clear difference between Compact and Redundant for handling different encodings in that Compact doesn’t take up that many bytes and is handled like a VARCHar in some cases:

  • NULL still takes no space
  • If all the characters in the field occupy 1 byte, fill the following 0x20 with the size of 1 byte
  • If there are other characters with different length of bytes, the characters are stored according to the actual size of bytes

For example, change the name column of the Compact row format table from the previous section to UTF8 and change the data:

ALTER TABLE `record_test_1` 
MODIFY COLUMN `name` char(4) CHARACTER SET utf8 NULL DEFAULT NULL AFTER `score`;
update `record_test_1` set name = "我们" where id = 2048;
Copy the code

Select * from ‘2048’ where id = 2048

Name (us) : E6 88 91 E4 BB ACCopy the code

Like vARCHAR, it takes 6 bytes, which is exactly the size of the data to store.

Other rows are stored unchanged, for example:

Column data Name (ZHX) : 7A 68 78 20Copy the code