Transactions in MySQL

A transaction is a set of operations that access a data resource in a controlled manner. A transaction has four defining properties: atomicity, consistency, isolation, and durability, often referred to collectively as the ACID properties.

The MySQL server layer does not manage transactions; transactions are implemented by the underlying storage engine. MySQL uses AUTOCOMMIT mode by default. That is, if a transaction is not explicitly started, each query is committed as its own transaction. In the current connection, you can enable or disable automatic commit by setting the AUTOCOMMIT variable: 1 or ON enables it, 0 or OFF disables it.

mysql> SHOW VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set


When AUTOCOMMIT=0, all statements run inside a transaction until an explicit COMMIT or ROLLBACK is issued; the transaction then ends and a new one begins.
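
The effect of autocommit versus an explicit transaction can be sketched with Python's standard-library sqlite3 module (used here purely for illustration of the concept; MySQL itself is not involved, and the `account` table is invented):

```python
import sqlite3

# In-memory database; isolation_level=None puts the connection in
# autocommit mode, mirroring the spirit of MySQL's AUTOCOMMIT=1.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account(id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")  # committed immediately

# Explicit transaction: every statement since BEGIN is undone by ROLLBACK.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("ROLLBACK")

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100: the rolled-back UPDATE left no trace
```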

Isolation level

MySQL can set the isolation level with the following command. The new isolation level takes effect when the next transaction starts.

mysql> SET SESSION TRANSACTION ISOLATION LEVEL <isolation level>;

There are four isolation levels defined in the SQL standard, each of which specifies which changes made within a transaction are visible and which are not, both within and between transactions. Lower levels of isolation generally allow for higher concurrency and lower system overhead.

  • READ UNCOMMITTED: Changes made in a transaction, even if they are not committed, are visible to other transactions. Transactions can Read uncommitted data, also known as Dirty reads. It is rarely used in practical applications.
  • READ COMMITTED: Any changes made by a transaction between its start and its commit are not visible to other transactions. This level is sometimes called nonrepeatable read, because executing the same query twice may return different results.
  • REPEATABLE READ: This level guarantees that reading the same record multiple times within one transaction yields consistent results. In theory, however, repeatable read still does not solve the phantom read problem: when a transaction reads records in a range and another transaction inserts new records into that range, re-reading the range produces phantom rows. The InnoDB storage engine addresses the phantom read problem with multiversion concurrency control (MVCC). REPEATABLE READ is MySQL's default transaction isolation level.
  • SERIALIZABLE: The highest isolation level. It avoids phantom reads by forcing transactions to execute serially. SERIALIZABLE places a lock on every row it reads, which can cause many timeouts and heavy lock contention. This level is rarely used in practice and is worth considering only when data consistency must be guaranteed absolutely and no concurrency is acceptable.

The transaction log

Transaction logging helps make transactions more efficient. With a transaction log, the storage engine only needs to modify the in-memory copy of a table's data and record the modification in a log that persists on disk, rather than persisting the modified data itself on every change. Transaction logging is append-only, so writing the log is sequential I/O within a small area of the disk, unlike random I/O, which requires moving the disk head between multiple locations; this makes transaction logging comparatively fast.

Transaction implementation

Isolation is implemented through locks, described later in this article. Atomicity, consistency, and durability are achieved through the database's redo and undo logs.

Redo logs

In the InnoDB storage engine, the transaction log is implemented through the redo log file and the InnoDB log buffer. When a transaction starts, its LSN (Log Sequence Number) is recorded; while it executes, log records are inserted into the InnoDB log buffer; when it commits, the log buffer must be written to disk. Once the transaction log has been persisted, the modified data in memory can be flushed back to disk gradually in the background. This is commonly known as write-ahead logging (WAL): a data change requires writing to disk twice.

The InnoDB storage engine guarantees transaction integrity through write-ahead logging. This means the data pages on disk may be out of sync with the pages in the in-memory buffer pool: a change to a page in the buffer pool is written first to the redo log file and only later, asynchronously, to the data file on disk.
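
A minimal sketch of the write-ahead idea, with invented names (`redo_log`, `disk_pages`, `cache`) that are illustrative rather than InnoDB internals:

```python
# A toy write-ahead log: changes hit the in-memory page cache and the
# append-only log first; the "data file" is only updated lazily. After a
# crash, replaying the log restores the committed state.

redo_log = []          # sequential, append-only (fast to persist)
disk_pages = {"A": 1}  # durable data file, flushed lazily in background
cache = dict(disk_pages)

def commit(page, value):
    cache[page] = value
    redo_log.append((page, value))  # persisted at commit time

commit("A", 2)
commit("B", 7)

# Crash: the page cache is lost before the background flush ran.
cache = None

# Recovery: start from the stale data file and replay the redo log.
recovered = dict(disk_pages)
for page, value in redo_log:
    recovered[page] = value

print(recovered)  # {'A': 2, 'B': 7}
```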

Rollback (undo) logs

The redo log records the actions of a transaction and is a good way to "redo" them, but sometimes a transaction also needs to be undone, and that is what undo is for. Undo is the opposite of redo: when you modify the database, not only is redo generated, but a certain amount of undo as well. If the transaction or statement fails for some reason, or you roll back with a ROLLBACK statement, this undo information can be used to restore the data to what it was before the modification. Unlike redo, which is stored in the redo log files, undo is stored in a special segment inside the database called the undo segment. Undo segments are stored in the shared tablespace.

There is a common misconception that undo is used to physically restore the database to where it was before a statement or transaction was executed — but this is not the case. The database simply logically reverts to its original state and all changes are logically cancelled, but the data structure itself can be quite different after a rollback, as there can be tens, hundreds, or even thousands of concurrent transactions in a multi-user concurrent system. The primary task of a database is to coordinate concurrent access to data records. For example, a transaction is modifying some records on the current page, while other transactions are modifying other records on the same page. Therefore, you cannot roll back a page to the beginning of a transaction, because this will affect the work being done by other transactions.

For example, suppose our transaction executed an INSERT of 100,000 records. If we then issue ROLLBACK, the inserted rows are rolled back: when the InnoDB storage engine rolls back, it effectively does the opposite of each earlier operation. For each INSERT it performs a DELETE; for each DELETE it performs an INSERT; and for each UPDATE it performs a reverse UPDATE that puts back the row as it was before the change.
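
The compensating-operation idea can be sketched as follows (a toy model; the undo record format and function names are invented for illustration):

```python
# Rollback by executing the logical inverse of each change, newest first:
# INSERT is undone by DELETE, UPDATE by a reverse UPDATE with the old row.

table = {}   # primary key -> row
undo = []    # compensating operations, applied in reverse on rollback

def insert(pk, row):
    table[pk] = row
    undo.append(("delete", pk, None))       # inverse of INSERT

def update(pk, row):
    undo.append(("restore", pk, table[pk]))  # remember the old image
    table[pk] = row

def rollback():
    while undo:
        op, pk, old = undo.pop()
        if op == "delete":
            del table[pk]
        else:
            table[pk] = old

insert(1, "alice")
insert(2, "bob")
update(1, "ALICE")
rollback()
print(table)  # {}: every change logically cancelled
```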

Mix storage engines in transactions

The MySQL server layer does not manage transactions; transactions are implemented by the underlying storage engine. Therefore, it is not reliable to use multiple storage engines in the same transaction. If you use a mix of transactional and non-transactional tables (such as InnoDB and MyISAM tables) in a transaction, you won’t have a problem with normal commits. But if the transaction needs to be rolled back, the changes on the non-transactional table cannot be undone, leaving the database in an inconsistent state that is difficult to fix and the end result of the transaction uncertain. Therefore, it is important to choose the right storage engine for each table.

Implicit and explicit locking

InnoDB uses the two-phase locking protocol. Locks can be acquired at any point during a transaction, but they are released only when a COMMIT or ROLLBACK is executed, and all locks are released at the same time. The locks described here are implicit: InnoDB acquires them automatically as needed, according to the isolation level.
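
A minimal sketch of this behavior, with exclusive locks only and no blocking or wait queues (a real engine would make the conflicting transaction wait):

```python
# Two-phase locking sketch: locks are acquired as rows are touched
# (growing phase) and all released together at COMMIT (shrinking phase).

held = {}  # row -> transaction id holding its lock

def lock(txn, row):
    owner = held.get(row)
    if owner is not None and owner != txn:
        return False          # conflict: would block in a real engine
    held[row] = txn
    return True

def commit(txn):
    for row in [r for r, t in held.items() if t == txn]:
        del held[row]         # all locks released at once

assert lock("T1", "r1")
assert lock("T1", "r2")       # growing phase: locks taken as needed
assert not lock("T2", "r1")   # T2 must wait until T1 finishes
commit("T1")
assert lock("T2", "r1")       # released only at commit, all together
```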

Distributed (XA) transactions

The InnoDB storage engine supports XA transactions, through which distributed transactions can be implemented. A distributed transaction allows multiple independent transactional resources to participate in one global transaction. Transactional resources are typically relational database systems, but can be other kinds of resources. A global transaction requires that all participants either commit or roll back, which raises the ACID requirements to the level of the global transaction. Note also that when using distributed transactions, the transaction isolation level of the InnoDB storage engine must be set to SERIALIZABLE.

Distributed transactions consist of one or more Resource Managers, a Transaction Manager, and an Application Program.

  • Resource manager: Provides access to transactional resources. Usually a database is a resource manager.
  • Transaction manager: Coordinates transactions that participate in a global transaction. You need to communicate with all resource managers participating in a global transaction.
  • Application: Defines transaction boundaries and specifies operations within the global transaction.

Distributed transactions use two-phase commit. In the first phase, all nodes participating in the global transaction begin preparing, telling the transaction manager that they are ready to commit. In the second phase, the transaction manager tells the resource manager whether to ROLLBACK or COMMIT. If any node shows that it cannot commit, all nodes are told to roll back.
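
The two phases can be sketched like this (class and function names are illustrative, not any real XA API):

```python
# Two-phase commit sketch: the transaction manager asks every resource
# manager to PREPARE; only if all vote yes does it send COMMIT,
# otherwise it sends ROLLBACK to everyone.

class Resource:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):                  # phase 1: vote
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(resources):
    if all(r.prepare() for r in resources):   # every node voted yes
        for r in resources:
            r.commit()                        # phase 2: global commit
        return True
    for r in resources:
        r.rollback()                          # phase 2: global rollback
    return False

ok = two_phase_commit([Resource(), Resource()])
bad = two_phase_commit([Resource(), Resource(can_commit=False)])
print(ok, bad)  # True False
```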

In fact, there are two kinds of XA transactions in MySQL: on the one hand, MySQL can participate in external distributed transactions; on the other hand, XA transactions are also used internally to coordinate the storage engines and the binary log.

Internal XA transactions

MySQL's plug-in architecture makes internal XA transactions necessary. Storage engines in MySQL are completely independent and unaware of one another, so a transaction that spans storage engines needs an external coordinator. Without the XA protocol, a cross-engine commit would simply ask each storage engine to commit in turn; if a system crash occurred partway through, the atomicity of the transaction would be broken.

If you think of the binary log written by MySQL as a separate "storage engine," it is easy to see why XA transactions are needed even when only one storage engine is involved: when the engine commits, the "commit" record must also be written to the binary log, which makes it a distributed transaction whose other participant is MySQL itself. XA transactions cause a significant performance penalty for MySQL.

External XA transactions

MySQL can act as a participant in an external distributed transaction. However, its XA support is incomplete: for example, the XA standard requires that multiple connections can be associated with a single transaction, which the current version of MySQL does not support.

External XA transactions are more costly than internal ones because of communication latency and the possibility that participants themselves fail. XA transactions used over a WAN often fail because of unpredictable network performance. It is best to avoid XA transactions when there are too many uncontrollable factors, such as unstable networks or users waiting too long before committing. Anything that delays a transaction's commit is costly, because it affects not only that transaction but all participants.

In general, there are other ways to implement high-performance distributed transactions. For example, data can be written locally, put on a queue, and then automatically distributed in a smaller, faster transaction. You can also use MySQL’s own replication mechanism to send data. We see many applications that can completely avoid using distributed transactions.

XA transactions are a way to synchronize data between multiple servers. Try it if for some reason replication of MySQL itself cannot be used, or if performance is not a bottleneck.

Locks

In a real database system, locking occurs all the time. When a user modifies a part of the data, MySQL locks the data to prevent other users from reading the same data. Most of the time, the internal management of MySQL locks is transparent.

Lock types

Shared and exclusive locks

Concurrency control issues arise whenever multiple queries need to modify data at the same time. When dealing with concurrent reads and writes, the problem can be solved with a lock system consisting of two types of locks. These two types are commonly referred to as shared and exclusive locks, or read and write locks.

  • Read locks are shared, or do not block each other. Multiple customers can read the same resource at the same time without interfering with each other.
  • Write locks are exclusive: a write lock blocks other write locks and read locks. This is a safety measure that ensures only one user can write at a given time, and prevents other users from reading the resource while it is being written.
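
The compatibility rule for the two lock types reduces to a single check, sketched here with "S" for shared and "X" for exclusive:

```python
# Shared/exclusive compatibility: any number of shared (read) locks can
# coexist, but an exclusive (write) lock conflicts with everything.

def compatible(held, requested):
    return held == "S" and requested == "S"

assert compatible("S", "S")        # readers don't block each other
assert not compatible("S", "X")    # a writer must wait for readers
assert not compatible("X", "S")    # a write lock blocks readers
assert not compatible("X", "X")    # and blocks other writers
```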

Intent locks

The InnoDB storage engine supports multi-granularity locking, which allows row-level and table-level locks to coexist. To support locking at different granularities, InnoDB provides an additional kind of lock called the intent lock. Intent locks are table-level locks whose main purpose is to announce the type of row-level lock a transaction will request next. InnoDB supports two kinds of intent locks:

  • Intent shared lock (IS lock): the transaction intends to acquire shared locks on certain rows of a table.
  • Intent exclusive lock (IX lock): the transaction intends to acquire exclusive locks on certain rows of a table.

Because the InnoDB storage engine supports row-level locking, intent locks do not actually block any request other than a full-table lock request (such as LOCK TABLES ... WRITE).
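
This becomes concrete with the standard table-level compatibility matrix once intent locks are added (a sketch; InnoDB's internal representation differs):

```python
# Table-level lock compatibility including intent locks. IS/IX only
# announce intended row locks, so they are compatible with each other;
# a full-table S or X lock is what they conflict with.

COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

# Two transactions can row-lock different rows of the same table:
assert COMPATIBLE[("IX", "IX")]
# But a full-table exclusive request conflicts with the intent lock:
assert not COMPATIBLE[("IX", "X")]
```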

Locking granularity

One way to improve concurrency for shared resources is to make locking more selective. Try to lock only some of the data that needs to be modified, not all of the resources. Ideally, only the pieces of data that will be modified will be precisely locked. The less data that is locked on a given resource at any one time, the more concurrency the system can have, as long as there is no conflict between them.

Table lock

Table locking is the most basic locking strategy in MySQL and the one with the lowest overhead: it locks the entire table. Before a user can write to a table (INSERT, DELETE, UPDATE, and so on), it must obtain the write lock, which blocks all other users' reads and writes on that table. Only when no write lock is held can other users obtain a read lock; read locks do not block each other.

Table locks can still perform well in specific scenarios; for example, the READ LOCAL table lock allows certain types of concurrent writes. In addition, write locks have higher priority than read locks, so a write lock request may be inserted ahead of read requests in the lock queue (write requests can jump ahead of queued read requests, but read requests can never jump ahead of write requests). Although storage engines manage their own locks, MySQL itself also uses various effective table locks for its own purposes; for instance, the server uses a table lock for statements such as ALTER TABLE, bypassing the storage engine's locking mechanism.

Row lock

Row-level locking allows the greatest degree of concurrency (and carries the greatest locking overhead). Row-level locks are implemented in storage engines such as InnoDB, at the storage engine layer only, not at the MySQL server layer; the server layer knows nothing about the locks implemented inside the storage engine.

Multi-version concurrency control

Most of MySQL's transactional storage engines do not use simple row-level locking alone. To improve concurrency, they generally implement multiversion concurrency control (MVCC) as well. MVCC is not unique to MySQL: other database systems such as Oracle and PostgreSQL also implement it, though their mechanisms differ. MVCC can be considered a variant of row-level locking, but in many cases it avoids locking altogether and is therefore cheaper: reads are non-blocking, and writes lock only the rows they need. MVCC works by keeping a snapshot of the data as of a point in time, so each transaction sees consistent data no matter how long it runs. Depending on when they started, different transactions may see different data in the same table at the same moment.

InnoDB’s MVCC is implemented by storing two hidden columns at the end of each row. These two columns, one holds the creation time of the row and the other holds the expiration (or deletion) time of the row. Of course, the storage is not the actual time value, but the system version number. The system version number is automatically incremented each time a new transaction is started. The system version number at the start of the transaction is used as the version number of the transaction and is compared with the version number of each row of records queried. Let’s take a look at how MVCC operates in the REPEATABLE READ isolation level.

SELECT

InnoDB checks each row based on two criteria:

  • InnoDB only looks for rows whose version is earlier than the current transaction version (that is, rows whose system version number is less than or equal to the transaction’s system version number). This ensures that the transaction reads rows that either existed before the transaction started or were inserted or modified by the transaction itself.
  • The deleted version of the row is either undefined or greater than the current transaction version number. This ensures that rows read by the transaction are not deleted before the transaction begins. Only records that meet the preceding two conditions can be returned as the query result.

INSERT

InnoDB stores the current system version number as the row version number for each newly inserted row.

DELETE

InnoDB stores the current system version number as a row deletion identifier for each row that is deleted.

UPDATE

InnoDB inserts a new record, saves the current system version number as the row version number, and saves the current system version number to the original row as the row deletion mark.
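
The four operations above can be combined into a toy MVCC model (a simplified sketch following the visibility rules just described; the row format and names are invented, not InnoDB's actual layout):

```python
# Each row version carries a creation version and a deletion version.
# A transaction sees a version only if it was created at or before the
# transaction's version and not yet deleted at that version.

INF = float("inf")
system_version = 0
rows = []  # list of dicts: {key, value, created, deleted}

def begin():
    global system_version
    system_version += 1
    return system_version          # the transaction's version number

def visible(row, txn_version):
    return row["created"] <= txn_version < row["deleted"]

def select(key, txn_version):
    for row in rows:
        if row["key"] == key and visible(row, txn_version):
            return row["value"]
    return None

def insert(key, value, txn_version):
    rows.append({"key": key, "value": value,
                 "created": txn_version, "deleted": INF})

def update(key, value, txn_version):
    for row in rows:
        if row["key"] == key and visible(row, txn_version):
            row["deleted"] = txn_version   # old version expires here
    insert(key, value, txn_version)        # new version is created

t1 = begin()
insert("x", "old", t1)
t2 = begin()                # t2's snapshot includes t1's insert
t3 = begin()
update("x", "new", t3)      # t3 sees its own update; t2 does not
print(select("x", t2), select("x", t3))  # old new
```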

Summary

Saving these two extra system version numbers allows most read operations to avoid locking. This design makes reads simple and fast, and guarantees that only rows satisfying the visibility rules are read. The downsides are the extra storage per row, the additional checking work, and some extra maintenance overhead. MVCC works only at the REPEATABLE READ and READ COMMITTED isolation levels. The other two levels are incompatible with MVCC: READ UNCOMMITTED always reads the newest row version rather than the version matching the transaction, and SERIALIZABLE places locks on every row it reads.

Auto-increment and locking

Auto-increment is a very common column attribute in databases and the preferred way to generate primary keys in many of them. In the InnoDB storage engine's in-memory structures, each table with an auto-increment column has an auto-increment counter. The counter is initialized when rows are first inserted into the table, and each insert assigns the counter value plus one to the auto-increment column. The locking mechanism behind this is called AUTO-INC locking. To improve insert performance, the lock is not held until the end of the transaction; it is released as soon as the SQL statement performing the auto-increment insert completes.

Although AUTO-INC locking improves concurrent insert efficiency to some degree, problems remain. First, concurrent insert performance on auto-increment columns is still poor, because each insert must wait for the previous one to finish (though not for its transaction). Second, for bulk inserts such as INSERT ... SELECT that insert large amounts of data, inserts in other transactions are blocked, hurting insert performance.

Starting with MySQL 5.1.22, the InnoDB storage engine provides a lightweight, mutex-based auto-increment mechanism that greatly improves the performance of auto-increment inserts. It is controlled by the parameter innodb_autoinc_lock_mode, whose default value is 1. There are three possible values:

  • 0: The pre-5.1.22 behavior, i.e. AUTO-INC locking for every insert. Given the newer implementations, this should not normally be your first choice.
  • 1 (default): For simple inserts, where the number of rows to be inserted is known in advance, the in-memory counter is incremented under a mutex; for bulk inserts such as INSERT ... SELECT, AUTO-INC locking is still used.
  • 2: In this mode, auto-increment values for all INSERT statements are generated under the mutex rather than with AUTO-INC locking. This is clearly the highest-performing mode, but it can cause problems: because inserts run concurrently, the auto-increment values produced by a given insert may not be contiguous, and, most importantly, statement-based master-slave replication becomes unsafe. With this mode you should therefore use row-based replication, which gives maximum concurrency while keeping replicas consistent.

Deadlocks

A deadlock occurs when two or more transactions occupy resources and each requests a lock on a resource the other holds, creating a circular wait. Deadlocks can arise when multiple transactions try to lock resources in different orders, or when multiple transactions lock the same resources simultaneously.

The InnoDB storage engine detects circular lock dependencies and immediately returns an error. This works well; without detection, deadlocks would surface as very slow queries. InnoDB currently resolves a deadlock by rolling back the transaction that holds the fewest row-level exclusive locks (a relatively simple victim-selection algorithm).
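
Detection via a wait-for graph, with the victim chosen as the transaction holding the fewest locks, can be sketched as follows (names and the lock counts are illustrative):

```python
# Build a wait-for graph (T_a -> T_b means T_a waits for a lock T_b
# holds) and look for a cycle; a cycle is a deadlock.

def find_cycle(wait_for):
    def dfs(node, path):
        if node in path:
            return path[path.index(node):]
        for nxt in wait_for.get(node, []):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None
    for start in wait_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2's lock and T2 waits for T1's: a classic deadlock.
wait_for = {"T1": ["T2"], "T2": ["T1"], "T3": []}
cycle = find_cycle(wait_for)

# Pick the victim holding the fewest row-level exclusive locks.
locks_held = {"T1": 5, "T2": 1}
victim = min(cycle, key=lambda t: locks_held[t])
print(cycle, victim)  # ['T1', 'T2'] T2
```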

Reference

MySQL Technology Insider: InnoDB Storage Engine