“ACID”

Atomicity: ensure that a set of operations will either all succeed or all fail. If an error occurs during an operation, the database needs to return the data to the state before the operation was performed. (Combined with undolog)

(2) Isolation: access to the database of different state transformation, the corresponding database operation execution order has a certain rule, to ensure that each other does not affect (RE: combined with MVCC)

(3) consistency: data state in the database to maintain consistency, atomicity and isolation are a means to ensure consistency.

(4) Persistence: State transitions to the database should remain on disk. (RE: redolog, double write cache)

We refer to one or more database operations that meet the four properties of ACID as a transaction.


The redo log 】

When we access pages, we need to cache the pages in disk into the buffer pool in memory for operation. That is to say, the transaction first operates on the buffer pool in memory, and then it is flushed to disk. However, if the data is lost due to a sudden failure after the transaction is committed, the transaction operation has not been written to disk, and there is no guarantee of persistence. Therefore, the redo log is used to record the changes made by transactions to the database. Even if the data in the buffer pool is lost, mysql can use the redo log to recover the transaction. The redo log also records changes made to the change buffer. The change buffer is persistent, so the redo log is used to restore data that has not been persisted.

Redo log: When a log is inserted or updated, InnoDB writes the log to the Redo log buffer first. After a commit or transaction, InnoDB writes the commands in the buffer to the redo log file. Redo logs are written in a circular fashion. They record write ranges by writing and erasing Pointers. If the two Pointers overlap, InnoDB runs out of redo log space. InnoDB must stop and flush some data to disk to empty the redo log before starting receiving data.


The undo log 】

In order to achieve atomicity of transactions, we need to restore the data when the transaction fails, so we need undo log. A change to each record produces an Undo log that is used to roll back the transaction. (RE: Too many…)

Transaction Isolation Level

MySQL is a client server architecture software, a server can be connected to several clients at the same time. So the server may concurrently access a single piece of data with multiple transactions being processed at the same time. Transactions are isolated, which means that multiple transactions are accessed sequentially. This has a significant impact on performance, and we want to maintain transaction isolation while improving the performance of the server to handle transactions. You can’t have your cake and eat it. The transaction isolation level is also a trade-off between isolation and performance.

Uncommitted read:

Dirty read: a transaction reads uncommitted modified data from another transaction.

Submitted for reading:

Non-repeatable reads occur: a transaction reads data repeatedly with inconsistent results. That is, it reads values modified by other transactions.

Repeatable read (MySQL default)

Phantom read: a transaction reads an inconsistent number of records from a table, that is, new records inserted by other transactions are read. (If the number of records that have not been read before is not phantom read.)

How to resolve phantom reading: Phantom reading is when new records in a transaction affect the execution of the transaction. That is, records cannot be locked because they do not exist at the time of the transaction’s first read, so a gap lock is required. A gap lock on one record will lock this record to the gap directly on the previous record, blocking the insertion operation.

Before MySQL5.0, the binlog only records statements and not row numbers. Therefore, when the Master performs delete first and then delete, the binlog may be recorded as insert first and then delete, causing primary and secondary data errors. So MySQL uses repeatable reads by default. Now the binlog format is based on the row format, there is no execution order problem.

Why should Internet projects use submitted read?

When a conditional column does not match the index, committed reads lock only the columns that do not meet the criteria, while repeatable reads lock the entire table.

If no update/ DELETE record is hit, repeatable reads will be locked in a gap, increasing the probability of deadlock. Committed reads have semi-consistent reads to improve update concurrency: When InnoDB reads rows locked by update, InnoDB determines whether the last version of data is consistent with the update conditional statement. If they are the same, MySQL directly initiates a read operation to read the new version after modification. Repeatable reads only wait for updates.

serializable

【 MVCC 】

MVVC actually refers to the process of accessing the current record version chain while performing ordinary query operations using transactions at read Committed and REPEATable Read isolation levels.

For InnoDB, each cluster index record contains two hidden columns TRx_ID and roll_pointer. Every time a change is made to a cluster index record, an old version will be written into the undo log and linked with all previous versions into a version chain. The record roll_pointer is the pointer to the latest node in the version chain, and trx_id is the transaction ID of the change. In other words, we can directly get the old version and transaction ID of a record when querying, which is the premise of MVCC multi-version control.

For different isolation levels, the range of data that can be read is equivalent to which version in the record version chain can be read. So InnoDB provides a ReadView concept,

ReadView records the list of all transaction numbers that are present in InnoDB during the execution of a transaction, the minimum transaction number, and the next transaction number that should be allocated. The transaction number here refers to the virtual transaction number corresponding to each record in the version chain. When the value of the record is changed, the transaction number +1 remains unchanged when it is read.

When InnoDB queries a record, it compares it to ReadView along the record version chain. If a match is found, it finds the correct record. So for read Committed and REPEATable Read, the timing of ReadView generation becomes a major factor in determining the difference.

For read Committed reads, a new ReadView is generated before each read in a transaction, and the new ReadView matches the latest record in the version chain that belongs to the transaction, solving the non-repeatable problem. However, the generation of a new ReadView also means that it matches the records generated by the current transaction updated in the version chain, so it will scan more records generated by other transactions before this ReadView, that is, it will not solve the phantom read.

For repeatable Read operation, only a ReadView is generated when reading data for the first time during the whole transaction execution. All subsequent reading operations reuse this ReadView, that is, reuse the transaction ID in it, thus preventing new data submitted by other transactions from being read. Solved illusionary! So we want to achieve illusionary, then we need to use shared read lock!

“Lock”

Implementation principle of lock:

A lock is actually a memory structure, and when a transaction makes a change to a record, it first looks to see if there is a lock structure associated with the record. If the operation is not performed directly; Determines the is_WAITING attribute of the lock. A value of true indicates that the lock is occupied and other transactions need to wait on this record. When this transaction completes, it sets the IS_WAITING property to false and wakes up other transactions that are waiting. If the value is false, the lock is successful, the value is set to true, and the operation is performed.

InnoDB row lock (InndDB row lock) : InnoDB row lock (InndDB row lock) : InnoDB row lock (InndDB row lock)

Record lock: Locks a single record.

Gap lock: The key-value gap between this record and its predecessor (i.e. the record corresponding to the predecessor of the current leaf node in the linked list, their key-value interval is the key-value gap). For the right open range, there are two pseudo records in each data page, which are corresponding to the infinitesimal record and the infinite record of the page respectively. Then locking the infinite record means locking the right open range.

Key lock: combination of record lock and gap lock, will lock the current record left open right closed area, [solve magic read!

Insert intentional locks, implicit locks: (RE: too many…)

MySQL deadlock (RE geek)

① Show engine Innodb status Check deadlock logs

(2) In the RR isolation level, if no update/delete record is hit, A gap lock will be added to the left and right interval, and the single statement in each transaction will not release the lock immediately after completion, so the insert lock required by A transaction will be occupied by the temporary key lock of B transaction, and the same is true for B transaction.

③ Lock timeout mechanism, enable deadlock detection and roll back transactions (construct a directed graph with transaction as the fixed point and lock as the edge to determine whether there is a ring in the graph)

Multi-node transactions (RE geeks)

The implementation principle of gap lock, such as age=20, is the lock ID or age? : The principle of answer lock + gap lock + B+ tree to obtain records, index condition push, server layer judge size : Directly locks the cluster index, starts the transaction to save the consistency view, and restores the state before locking using MVCC.