Small knowledge, big challenge! This article is participating in the creation activity of “Essential Tips for Programmers”.


Mysql has some hidden columns, such as row ids, transaction ids, rollback Pointers, etc. How do you actually see the values of these hidden columns?

In this article, we will focus on the row identifier DB_ROW_ID. In fact, calling the row identifier a hidden column is inaccurate because it is not a real column. DB_ROW_ID is actually an alias for a non-empty unique column. Before we demystify it, here’s what the official documentation says:

If a table has a PRIMARY KEY or UNIQUE NOT NULL index that consists of a single column that has an integer type, you can use _rowid to refer to the indexed column in SELECT statements

If there is a primary key or non-empty unique index in the table and it consists of only one column of integer type, then the SELECT statement can be used to query _ROwid directly, and the value of _ROwid will refer to the value of the index column.

Taking a look at some of the keywords mentioned in the documentation — primary key, unique index, non-empty, single column, numeric type — we’ll explore the mysterious hidden field _rowid from those perspectives.

1. Primary keys exist

Select * from primary key; select * from primary key; select * from primary key;

CREATE TABLE `table1` (
  `id` bigint(20) NOT NULL PRIMARY KEY ,
  `name` varchar(32) DEFAULT NULL
) ENGINE=InnoDB;
Copy the code

After inserting the three test data, execute the following query statement to query _rowid directly in the SELECT query statement:

select *,_rowid from table1
Copy the code

_rowid can be queried normally:

You can see that _rowid refers directly to the value of the primary key field when the primary key is set and the primary key field is numeric. This case, which can be queried by the SELECT statement, can be called an explicit ROWID.

Review a few keywords in the document mentioned earlier and analyze them separately. Select * from primary key (s) where primary key (s) are non-null;

CREATE TABLE `table2` (
  `id` varchar(20) NOT NULL PRIMARY KEY ,
  `name` varchar(32) DEFAULT NULL
) ENGINE=InnoDB;
Copy the code

SQL > select * from table2; SQL > select * from table2; SQL > select * from table2;

2. There is no primary key and a unique index

After testing both types of primary keys above, let’s look at what happens when there is no primary key in the table, but there is a unique index. First test is not empty unique index on numerical type fields, to build table is as follows:

CREATE TABLE `table3` (
  `id` bigint(20) NOT NULL UNIQUE KEY,
  `name` varchar(32)
) ENGINE=InnoDB;
Copy the code

The query executes normally, and _rowid references the value of a column with a unique index:

Unique indexes differ from primary keys in that the field in which a unique index resides can be NULL. In table3 above, we add the NOT NULL constraint on the unique index column. If we remove this constraint, can we still explicitly query _rowid? Create a new table with no non-null constraint on the column where the unique index resides:

CREATE TABLE `table4` (
  `id` bigint(20) UNIQUE KEY,
  `name` varchar(32)
) ENGINE=InnoDB;
Copy the code

Execute the query, in which case there is no way to explicitly query _rowid:

Similar to primary keys, we test the case where a unique index is added to a field that is not of a numeric type. Add a unique index to a field of character type and add a non-null constraint:

CREATE TABLE `table5` (
  `id` bigint(20),
  `name` varchar(32) NOT NULL UNIQUE KEY
) ENGINE=InnoDB;
Copy the code

Also unable to display query to _rowid:

Based on the test results in the above three cases, it can be concluded that when there is no primary key, but a unique index exists, _ROwid can be explicitly queried only if the unique index is added to a numeric field with a non-null constraint, and _ROwid references the value of the unique index field.

There is a federated primary key or federated unique index

In the above tests, we used a primary key or unique index on a single column, so what happens if we use a joint primary key or a joint unique index? Let’s take a look at the official documentation:

_rowid refers to the PRIMARY KEY column if there is a PRIMARY KEY consisting of a single integer column. If there is a PRIMARY KEY but it does not consist of a single integer column, _rowid cannot be used.

In simple terms, the value of _rowid refers to the primary key if it exists and consists only of a column of numeric type. If the primary key is composed of multiple columns, _rowid will not be available.

SQL > create table with two columns of numeric type as joint primary keys:

CREATE TABLE `table6` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  PRIMARY KEY(`id`,`no`)
) ENGINE=InnoDB;
Copy the code

_rowid:

Similarly, the same theory applies to unique indexes. If a non-empty unique index is not composed of a single column, then _rowid cannot be queried directly. This test process omitted, interested partners can try their own hands.

4. There are multiple unique indexes

In mysql, there can only be one primary key per table, but there can be multiple unique indexes. So if there are multiple unique indexes that conform to the rule, which one will be referenced as the value of _rowid? As usual, the answer to the official document is:

Otherwise, _rowid refers to the column in the first UNIQUE NOT NULL index if that index consists of a single integer column. If the first UNIQUE NOT NULL index does not consist of a single integer column, _rowid cannot be used.

If the first non-empty unique index in a table consists of only one field of integer type, then _rowid references the value of that field. Otherwise, if the first non-empty unique index does not satisfy this condition, then _rowid will not be available.

In the following table, create two unique indexes that both match the rule:

CREATE TABLE `table8_2` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY(no),
  UNIQUE KEY(id)
) ENGINE=InnoDB;
Copy the code

Look at the result of executing the query:

You can see that the value of _rowid is the same as the value of the no column, proving that _rowid strictly takes the first unique index created as its reference.

So, can _ROwid be explicitly queried if the first unique index created in the table does not meet the reference rules for _ROwid and the second unique index does? Create a table where the first index is a joint unique index and the second index is a single unique index.

CREATE TABLE `table9` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL,
  `name` varchar(32),
  UNIQUE KEY `index1`(`id`,`no`),
  UNIQUE KEY `index2`(`id`)
) ENGINE=InnoDB;
Copy the code

SQL > select * from rowid where rowid = 1; SQL > select * from rowid;

If you reverse the order of the statements that created the unique index above, you can query explicitly to _rowid normally.

5. There are both primary keys and unique indexes

From the above example, you can see that the order in which unique indexes are defined determines which index is applied to _ROwid. Does the definition order affect references to both primary keys and unique indexes?

Create two tables in the same order as primary key and unique index:

CREATE TABLE `table11` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL.PRIMARY KEY(id),
  UNIQUE KEY(no)
) ENGINE=InnoDB;

CREATE TABLE `table12` (
  `id` bigint(20) NOT NULL,
  `no` bigint(20) NOT NULL.UNIQUE KEY(id),
  PRIMARY KEY(no)
) ENGINE=InnoDB;
Copy the code

View the run result:

It can be concluded that when there are both qualified primary keys and unique indexes, _rowid will reference the primary key field values first, regardless of the order of creation.

6. No primary key or unique index is available

Above, we call the _rowid that can be queried directly by the SELECT statement explicit. In other cases, _rowid is always present even though it cannot be queried explicitly. In this case, we can call it implicit _rowid.

In fact, innoDB generates an unsigned number of 6 bytes as an auto-growing _rowid without a default primary key, so it has a maximum of 2^48-1 and starts counting from 0. Let’s explore the implicit _rowid by creating a table without primary keys and unique indexes.

CREATE TABLE `table10` (
  `id` bigint(20),
  `name` varchar(32)
) ENGINE=InnoDB;
Copy the code

Mysql > select pid from pid;

ps -ef | grep mysqld
Copy the code

Mysql process PID = 2068

Before we start, a global variable dictsys.row_id is maintained in innoDB. Tables that do not define a primary key share this row_id and insert data using this global row_ID as their primary key and increment this global variable by 1.

Next we need to use the techniques associated with GDB debugging. GDB is a debugging tool under Linux that can be used to debug executable files. # yum install GDB # yum install GDB # yum install GDB

gdb -p 2068 -ex 'p dict_sys->row_id=1' -batch
Copy the code

Result:

Insert 3 rows into an empty table:

INSERT INTO table10 VALUES (100000001.'Hydra');
INSERT INTO table10 VALUES (100000002.'Trunks');
INSERT INTO table10 VALUES (100000003.'Susan');
Copy the code

_roWID = 1~3;

Then run the GDB command to set row_id to the maximum value 2^48, at which the dictsys.row_id reaches its maximum value:

gdb -p 2068 -ex 'p dict_sys->row_id=281474976710656' -batch
Copy the code

Result:

Insert three more data into the table:

INSERT INTO table10 VALUES (100000004.'King');
INSERT INTO table10 VALUES (100000005.'Queen');
INSERT INTO table10 VALUES (100000006.'Jack');
Copy the code

If you look at all the data in the table, you can see that two of the three data inserted for the first time are overwritten:

Why does data coverage occur? We analyze this result. First, _ROwiD is 1 before the first data insertion, and the _ROwiD corresponding to the three data inserts is 1, 2, and 3. As shown below:

If you manually set _ROWID to the maximum value, the _ROwiD of the next data insertion starts from 0. Therefore, the _ROWID of the second three data inserts should be 0, 1, and 2. The data to be inserted is as follows:

When the same _ROWID occurs, the newly inserted data will overwrite the original data according to _ROwiD, as shown in the figure:

So innoDB uses an implicit _rowid when the primary key or unique index in the table does not meet the requirements we mentioned earlier. Although 2^48 is a large value, it is possible to run out. When the _ROwiD runs out, the previous records will be overwritten. From this point of view, it is important to create a primary key when building a table, otherwise data overwriting may occur.

This article is tested based on mysql 5.7.31

The official document: dev.mysql.com/doc/refman/…

The last

If you think it is helpful, you can like it and forward it. Thank you very much

Public number agriculture ginseng, add a friend, do a thumbs-up friend ah