Why's THE Design is a series of articles about design decisions in the field of computing. In each article, we pose a specific question and discuss the pros and cons of a design and its impact on implementation from different perspectives. If there is a question you would like to know more about, leave a comment below.

When we use a relational database, the primary key is a concept we cannot avoid. The primary key acts as the identifier of a record, and we can use it to locate a unique record in a table. For interested readers, the author has explained why meaningful fields should never be used as unique identifiers in Why do I always need a meaningless ID.

In a relational database, we choose the minimal subset of fields that uniquely identifies a record in a table [1]. By the relational definition of a primary key, we can choose a single column or several columns as the primary key, but the primary key must be present in every record and must be unique. The most common approach is of course to use MySQL's default auto-increment ID as the primary key. Other strategies for setting the primary key are legal, but they are neither common nor recommended.

Figure 1 – Primary key of MySQL

In most cases, MySQL's default AUTO_INCREMENT attribute keeps primary keys continuous. We can run SHOW CREATE TABLE to see the current value of the AUTO_INCREMENT attribute in the table definition. When a record is inserted, MySQL uses this value as its primary key and increments the counter each time a value is taken.

CREATE TABLE `trades` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  ...
  `created_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=17130 DEFAULT CHARSET=utf8mb4
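
The counter can also be inspected directly. The following is a minimal sketch, assuming an empty scratch database and a simplified, hypothetical version of the trades table (the elided columns are not reproduced):

```sql
-- Hypothetical, simplified trades table; run in an empty scratch database.
CREATE TABLE trades (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_at TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- With default auto-increment settings these two rows take ids 1 and 2.
INSERT INTO trades (created_at) VALUES (NOW()), (NOW());

-- The AUTO_INCREMENT=... clause in the output is the next value to be assigned.
SHOW CREATE TABLE trades;

-- The same counter as reported by information_schema.
SELECT AUTO_INCREMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'trades';
```

In MySQL 8.0 the information_schema figure comes from cached table statistics and may lag behind, so SHOW CREATE TABLE is the more direct check.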

Many developers assume that MySQL's primary key increases monotonically, but in practice we run into two problems: the primary keys of records are not continuous, and the same primary key may be assigned to more than one record. We will explain why MySQL's primary key is neither monotonic nor continuous from the following two angles:

  • Older versions of MySQL store the AUTO_INCREMENT value in memory. After the instance restarts, the value is reset according to the data in the table.
  • Obtaining an AUTO_INCREMENT value does not use a transaction lock, and a concurrent insert transaction may fail because some of its fields conflict.

Note that this article discusses InnoDB, the most common storage engine in MySQL. How other engines such as MyISAM implement AUTO_INCREMENT is outside its scope.

Delete records

Although the AUTO_INCREMENT attribute is common in MySQL, its implementation in earlier versions is relatively primitive. The InnoDB engine keeps an integer in memory that holds the next ID to be allocated. When a client inserts data into a table, the AUTO_INCREMENT value is fetched and incremented.

Figure 2 – Use of AUTO_INCREMENT

Since this value is stored in memory, after each restart of the MySQL instance, when a client inserts a record into table table_name for the first time, MySQL uses the following SQL statement to find the maximum id in the table. It increments that value and uses the result both as the primary key of the record being inserted and as the initial value of the table's AUTO_INCREMENT counter [2].

SELECT MAX(ai_col) FROM table_name FOR UPDATE;

If the author had been asked to implement AUTO_INCREMENT, he would also have started with this method. The implementation is very simple, but if users do not strictly follow the design norms of a relational database, data inconsistencies like the following can occur:

Figure 3 – AUTO_INCREMENT in MySQL 5.7 and earlier

The trades record with id 10 was deleted, and then the MySQL instance was restarted, so the in-memory AUTO_INCREMENT counter was rebuilt from the maximum id left in the table. When a new trades record was inserted, 10 was used as its primary key again, so the primary key was not monotonic. From then on, the record in the executions table that referred to the deleted trade incorrectly referred to the new one. This is a serious error.
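
The scenario can be reproduced with a few statements. This is a sketch under stated assumptions: an empty scratch database, hypothetical simplified trades and executions tables, default auto-increment settings, and smaller ids (1 to 3) than in the figure. The server restart has to be done outside of SQL.

```sql
-- Hypothetical schema modeled on the trades/executions example.
-- executions.trade_id is deliberately NOT declared as a foreign key.
CREATE TABLE trades (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_at TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE executions (
  id BIGINT NOT NULL AUTO_INCREMENT,
  trade_id BIGINT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Three trades take ids 1, 2, 3; one execution points at trade 3.
INSERT INTO trades (created_at) VALUES (NOW()), (NOW()), (NOW());
INSERT INTO executions (trade_id) VALUES (3);

-- Delete the newest trade; the execution row still stores trade_id = 3.
DELETE FROM trades WHERE id = 3;

-- (restart the MySQL server here, then reconnect)

INSERT INTO trades (created_at) VALUES (NOW());

-- MySQL 5.7 and earlier: the counter is rebuilt from MAX(id), the new trade
-- gets id 3, and this join wrongly pairs it with the old execution row.
-- MySQL 8.0: the persisted counter hands out id 4 and the join is empty.
SELECT e.id AS execution_id, t.id AS trade_id, t.created_at
FROM executions e
JOIN trades t ON t.id = e.trade_id;
```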

However, this is not entirely MySQL's problem. If we strictly follow the design norms of relational databases and use foreign keys to handle the relations between tables, the problem can be avoided: because the trades record is still referenced from another table, the foreign key prevents it from being deleted. Most DBAs within companies, however, discourage or prohibit the use of foreign keys, so there is a real potential for such problems.
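
A minimal sketch of the foreign key guard, assuming an empty scratch database; the table definitions and the constraint name fk_executions_trade are hypothetical:

```sql
CREATE TABLE trades (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_at TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- The default referential action (RESTRICT) blocks deletes of referenced rows.
CREATE TABLE executions (
  id BIGINT NOT NULL AUTO_INCREMENT,
  trade_id BIGINT NOT NULL,
  PRIMARY KEY (id),
  CONSTRAINT fk_executions_trade
    FOREIGN KEY (trade_id) REFERENCES trades (id)
) ENGINE=InnoDB;

INSERT INTO trades (created_at) VALUES (NOW());   -- id 1
INSERT INTO executions (trade_id) VALUES (1);

-- Rejected with error 1451 (cannot delete or update a parent row),
-- because an executions row still references trade 1.
DELETE FROM trades WHERE id = 1;
```

Since the referenced trade cannot be deleted, its id can never be handed out again after a restart, which is why the inconsistency cannot arise in this setup.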

In MySQL 8.0, however, the behavior of the AUTO_INCREMENT counter changed: every change to the counter is written to the redo log, and the value is saved to an engine-private system table at each checkpoint. The MySQL 8.0 reference manual describes it as follows:

"In MySQL 8.0, this behavior is changed. The current maximum auto-increment counter value is written to the redo log each time it changes and is saved to an engine-private system table on each checkpoint. These changes make the current maximum auto-increment counter value persistent across server restarts."

When the MySQL service restarts or goes through crash recovery, it can restore the latest AUTO_INCREMENT counter from the persisted checkpoint and the redo log, which avoids non-monotonic primary keys and solves the problem described here.

Concurrent transactions

To improve transaction throughput, MySQL can execute multiple transactions concurrently, but concurrently executed SQL statements that insert new records can leave the primary keys discontinuous. As shown in the figure below, transaction 1 inserts a record with id = 10 into the database, and transaction 2 inserts two records with id = 11 and id = 12:

Figure 4 – Concurrent transaction execution

However, if transaction 1 ends up rolling back because the record it inserted causes a unique key conflict, while transaction 2 commits normally, the primary keys in the table become discontinuous, and records inserted later will never use 10 as their primary key.

Figure 5 – Discontinuous primary key
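
The gap does not require two sessions to observe. The sketch below is an illustration under assumptions: a hypothetical accounts table in an empty scratch database, default auto-increment settings, and innodb_autoinc_lock_mode set to 1 or 2. It shows that a failed or rolled-back insert does not return its value to the counter. When run as a script, the client must be told to continue past the expected duplicate-key error (for example with the mysql client's --force option).

```sql
-- Hypothetical table with a unique key that can conflict.
CREATE TABLE accounts (
  id BIGINT NOT NULL AUTO_INCREMENT,
  email VARCHAR(64) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uk_email (email)
) ENGINE=InnoDB;

INSERT INTO accounts (email) VALUES ('a@example.com');  -- gets id 1

-- Fails with a duplicate-key error, but id 2 has already been allocated.
INSERT INTO accounts (email) VALUES ('a@example.com');

INSERT INTO accounts (email) VALUES ('b@example.com');  -- gets id 3

-- A rolled-back transaction loses its value in the same way.
START TRANSACTION;
INSERT INTO accounts (email) VALUES ('c@example.com');  -- allocates id 4
ROLLBACK;

INSERT INTO accounts (email) VALUES ('d@example.com');  -- gets id 5

-- Remaining rows: ids 1, 3 and 5.
SELECT id, email FROM accounts ORDER BY id;
```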

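The reason behind this is simple. A lock is taken when an AUTO_INCREMENT value is obtained, but it is held for the duration of the statement at most, not for the whole transaction. Its purpose is to keep threads from racing when they obtain AUTO_INCREMENT values, not to keep MySQL's primary keys continuous.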

This behavior is controlled by the innodb_autoinc_lock_mode setting of the InnoDB storage engine, which determines the lock that must be acquired when a value is taken from the AUTO_INCREMENT counter. The setting has three modes: traditional, consecutive, and interleaved. Consecutive mode was the default through MySQL 5.7, and interleaved mode has been the default since MySQL 8.0:

  • Traditional mode, innodb_autoinc_lock_mode = 0;
    • When data is inserted into a table that contains an AUTO_INCREMENT column, all INSERT statements acquire the table-level AUTO_INCREMENT lock, which is released when the current statement finishes;
  • Consecutive mode, innodb_autoinc_lock_mode = 1;
    • Bulk inserts such as INSERT ... SELECT, REPLACE ... SELECT, and LOAD DATA acquire the table-level AUTO_INCREMENT lock, which is released when the current statement finishes;
    • Simple inserts (statements that know in advance how many records they insert) only take a mutex on the AUTO_INCREMENT counter, which is released as soon as the primary keys are obtained, without waiting for the statement to complete;
  • Interleaved mode, innodb_autoinc_lock_mode = 2;
    • No insert statement acquires the table-level AUTO_INCREMENT lock, so concurrent statements can allocate values at the same time. The values remain unique, but when the number of rows a statement inserts is not known in advance, the values it receives may not be consecutive, and statement-based replication is no longer safe;
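
To make the statement categories concrete, here is a small sketch, assuming an empty scratch database and a hypothetical simplified trades table. It reads the current mode and shows one statement of each kind:

```sql
-- 0 = traditional, 1 = consecutive, 2 = interleaved.
-- The variable is not dynamic: it can only be changed in the server
-- configuration (for example innodb_autoinc_lock_mode = 1 under [mysqld]),
-- followed by a restart.
SELECT @@innodb_autoinc_lock_mode;

CREATE TABLE trades (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_at TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Simple insert: the number of rows (two) is known before execution,
-- so in consecutive mode only the short counter mutex is needed.
INSERT INTO trades (created_at) VALUES (NOW()), (NOW());

-- Bulk insert: the number of rows depends on the SELECT result, so in
-- consecutive mode the table-level lock is held until the statement ends.
INSERT INTO trades (created_at)
SELECT created_at FROM trades;
```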

None of these modes solves the discontinuity of MySQL's auto-increment primary key: in all three, the values taken by a transaction that rolls back are lost. The ultimate solution is to execute all transactions that contain inserts serially, that is, to use the database's highest isolation level, Serializable. Changing the isolation level of the database directly is of course rather crude. Implementing fully serial inserts on top of MySQL or another storage system can also guarantee the continuity of primary keys at insert time, but it still cannot avoid the discontinuity caused by deleting data.
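
One way to picture "fully serial inserts on top of MySQL" is a counter row that every inserter must lock. This is not the article's method, only a sketch of a common pattern under assumptions: the tables trade_seq and trades_serial and the variable @next are hypothetical, and the statements run in an empty scratch database.

```sql
-- Hypothetical counter table: one row per sequence.
CREATE TABLE trade_seq (
  name VARCHAR(32) NOT NULL,
  next_id BIGINT NOT NULL,
  PRIMARY KEY (name)
) ENGINE=InnoDB;
INSERT INTO trade_seq (name, next_id) VALUES ('trades', 1);

-- The target table does NOT use AUTO_INCREMENT.
CREATE TABLE trades_serial (
  id BIGINT NOT NULL,
  created_at TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

START TRANSACTION;
-- The row lock makes concurrent inserters wait here until this
-- transaction commits or rolls back.
SELECT next_id INTO @next FROM trade_seq WHERE name = 'trades' FOR UPDATE;
INSERT INTO trades_serial (id, created_at) VALUES (@next, NOW());
UPDATE trade_seq SET next_id = next_id + 1 WHERE name = 'trades';
COMMIT;
```

Because the counter update is part of the same transaction as the insert, a rollback undoes both and no value is skipped. The price is that inserts can no longer run concurrently, which is exactly the throughput that AUTO_INCREMENT is designed to preserve.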

Conclusion

The non-monotonic and discontinuous primary keys of early MySQL both result from engineering trade-offs. If relational design norms are followed strictly, the problems caused by MySQL's original design are unlikely to appear: data consistency is affected only when a deleted primary key is still referenced by an external system. Today's different usage patterns, however, increase the chance of errors, and MySQL 8.0 persists AUTO_INCREMENT to avoid the problem.

Discontinuous primary keys in MySQL are another example of engineering yielding to performance: the continuity of primary keys is sacrificed to support concurrent inserts, which ultimately improves the throughput of the MySQL service. The author ran into this problem when he first started using MySQL a few years ago, but did not investigate the cause at the time. It is interesting to revisit the design decisions behind it today. Let's briefly summarize the article and return to today's question, why MySQL's auto-increment primary key is neither monotonic nor continuous:

  • In MySQL 5.7 and earlier, the AUTO_INCREMENT counter is stored in memory and is reset from the data in the table after the instance restarts, so deleting records and then restarting can cause a primary key to be reused. MySQL 8.0 solves this by persisting the counter through the redo log, which keeps the primary key monotonic;
  • When MySQL obtains an AUTO_INCREMENT value for an insert, it uses a mutex or a statement-level lock rather than a transaction lock. A concurrent insert transaction may fail because of a field conflict and leave a gap. To keep primary keys continuous, insert statements would have to be executed serially.

Finally, here are some more open-ended questions that interested readers can ponder:

  • How do MyISAM and other storage engines store the AUTO_INCREMENT counter?
  • What are auto_increment_increment and auto_increment_offset used for in MySQL?

If you have questions about the content of this article or want to learn more about the reasons behind some design decisions in software engineering, you can leave a comment below on this blog. The author will respond to the questions in this article and select the appropriate topics for subsequent content.

Recommended reading

  • Why do I always need a meaningless ID
  • Why does MySQL use B+ trees
  • “Shallow in shallow out” MySQL and InnoDB

  1. Wikipedia: Primary key en.wikipedia.org/wiki/Primar…

  2. InnoDB AUTO_INCREMENT Counter Initialization · MySQL 5.7 Reference Manual / 14.6.1.6 AUTO_INCREMENT Handling in InnoDB dev.mysql.com/doc/refman/…

  3. InnoDB AUTO_INCREMENT Counter Initialization · MySQL 8.0 Reference Manual / AUTO_INCREMENT Handling in InnoDB dev.mysql.com/doc/refman/…

  4. Auto increment primary brigade gaps in counting stackoverflow.com/questions/1…

  5. InnoDB AUTO_INCREMENT Lock Modes · MySQL 8.0 Reference Manual / AUTO_INCREMENT Handling in InnoDB dev.mysql.com/doc/refman/…

Reprint requests

This work is licensed under a Creative Commons Attribution 4.0 International License.
