A, the transaction

A transaction is a logical processing unit consisting of a set of SQL statements. It is a set of operations that meet ACID characteristics. A transaction can be committed by Commit or rolled back using Rollback. A transaction has the following four properties, often referred to simply as the ACID property of the transaction:

  • Atomicity: A transaction is an atomic unit of operation in which all or none of the modifications to data are performed. For example, SQL statements in the same transaction either all execute successfully or all execute fail. Rollback can be done using logs, which record the changes made by the transaction and perform them in reverse during rollback.
  • Consistent: Data must be Consistent at the beginning and completion of a transaction. This means that all relevant data rules must be applied to transaction modifications to preserve data integrity; At the end of the transaction, all internal data structures (such as b-tree indexes or bidirectional linked lists) must also be correct. For example, if A transfers money to B, assume that the sum of the two users’ money before the transfer is 2000, then after A transfers money to B, no matter how the two accounts are transferred, the sum of user A’s money and user B’s money together is still 2000. This is the consistency of the transaction.
  • Isolation: The database system provides some Isolation mechanism to ensure that transactions are executed in a “separate” environment that is not affected by external concurrent operations. Isolation means that when multiple users concurrently access the database, for example, when operating on the same table, the transactions initiated by the database for each user cannot be disturbed by the operations of other transactions, and multiple concurrent transactions must be isolated from each other. The effect is that for any two concurrent transactions T1 and T2, from transaction T1’s point of view, T2 either ends before T1 begins or starts after T1 ends, so that each transaction does not feel that any other transaction is executing concurrently.
  • Durable: Changes to data after transactions are finished are permanent and can be maintained even in the case of system failures. You can back up and restore the database. When the system crashes, you can use the backup database to restore data.

MySQL uses automatic commit mode by default. That is, if a TRANSACTION is not explicitly started using the START TRANSACTION statement, each query is automatically committed as a TRANSACTION.

These features are not a hierarchical relationship:

  • The execution result of a transaction is correct only if consistency is satisfied.
  • In the absence of concurrency, transactions are executed sequentially and isolation must be satisfied. As long as atomicity can be satisfied, consistency must be satisfied.
  • In the case of concurrency, where multiple transactions are executed concurrently, transactions must satisfy not only atomicity but also isolation to satisfy consistency.
  • Transaction persistence is designed to cope with database crashes.

Second, concurrency consistency problem

Lost Update

Both T1 and T2 transactions modify a data, T1 first, T2 later, and T2 overrides T1.

For example, two programmers modify the same Java file. Each programmer changes his copy independently, and then saves the changed copy, overwriting the original document. The editor who last saves a copy of his changes overwrites the changes made by the previous programmer.

This problem can be avoided if another programmer cannot access the same file until one programmer completes and commits the transaction.

Ⅱ, dirty reads

Bottom line: Transaction B reads the data that transaction A has modified but has not yet committed, and acts on that data. If transaction A rolls back, data read by transaction B is invalid and does not meet consistency requirements.

Solution: Change the transaction isolation level of the database to READ_COMMITTED

T1 modifies a piece of data, which T2 then reads. If T1 reverses the change, then T2 reads dirty data.

ⅲ. Non-repeatable Reads

Read the same data multiple times within a transaction. The same data is accessed by another transaction while the transaction is still active. So, between the two reads of the data in the first transaction. The data read by the first transaction may not be the same due to the modification of the second transaction, so that the data read by the first transaction is not the same, and therefore is called non-repeatable read, i.e. the original read is not repeatable.

In a word: two identical queries within a transaction scope return different data.

Transaction 1 reads the data during transaction 2 operation and after submission respectively, and the records read are inconsistent. Non-repeatable reads are when two identical queries return different results within the same transaction.

Workaround: You can avoid this problem if the data can only be read after the modification transaction has fully committed. Change the transaction isolation level of the database to REPEATABLE_READ

T2 reads a piece of data, and T1 modifies it. If T2 reads this data again, it will not read the same data as it did the first time.

Ⅳ, phantom reads

The phenomenon that one transaction T1 re-reads previously retrieved data according to the same query condition only to find that another transaction T2 inserts new data that meets its query condition is called “phantom read”. (Similar to repeatable reads, but transaction T2 only inserts and deletes data, not modifies data, and reads an inconsistent number of records.)

Transaction A reads the new data submitted by transaction B and does not comply with isolation.

Workaround: This problem can be avoided if no other transaction can add new data until the operational transaction has finished processing the data. Change the transaction isolation level of the database to SERIALIZABLE_READ.

T1 reads data in a range, T2 inserts new data in that range, and T1 reads data in that range again, with a different result from the first read.

Transaction isolation level

“Dirty read”, “unrepeatable read” and “phantom read” are actually database read consistency problems, which must be solved by the transaction isolation mechanism provided by the database.

The more stringent the transaction isolation of the database, the less concurrency side effects, but the higher the cost, because transaction isolation essentially serializes transactions to a certain extent, which is obviously contradictory to “concurrency.” At the same time, different applications have different requirements on read consistency and transaction isolation. For example, many applications are not sensitive to “unrepeatable reads” and “phantom reads” and may be more concerned with the ability of concurrent data access.

MYSQL: show variables like ‘tx_isolation’;

ⅰ, Read Uncommitted

The lowest isolation level, which allows other transactions to see uncommitted data, causes dirty reads.

ⅱ, Read Committed

The data being read can be modified by other transactions, which may result in unrepeatable reads. In other words, a transaction obtains a read lock and releases it immediately after reading (without waiting for the end of the transaction), whereas a write lock is released after the transaction commits. After the read lock is released, data can be modified by other transactions. This level is also the default isolation level of SQL Server.

Repeatable Read (Repeatable Read)

All data retrieved by the Select cannot be modified to avoid inconsistent reads before and after a transaction. However, there is no way to control phantom reading, because other transactions can not change the selected data, but can increase the data, i.e. the previous transaction had a read lock but no scope lock. Why is it called repeatable read level? That’s because this level solves the following unrepeatable read problem. (Extension: now mainstream databases use MVCC concurrency control, RR (repeatable read) isolation level is no phantom phenomenon.)

MYSQL default is REPEATABLE-READ.

ⅳ. Serializable

The execution of all affairs one by one, so that you can avoid phantom read (the phantom read), for the database to achieve concurrency control based on the lock, serialization requirements when perform range queries, need to get range locks, if not based on the database to achieve concurrency control lock, check to the violation of serial operation of transaction, The transaction needs to be rolled back.

Ⅴ,

  • Read uncommitted: Changes made by a transaction can be seen by other transactions before it is committed.
  • Read Commit: After a transaction commits, its changes are seen by other transactions.
  • Repeatable read: The data seen during the execution of a transaction is always the same as the data seen when the transaction is started. Of course, at the repeatable read isolation level, uncommitted changes are also invisible to other transactions.
  • Serialization: As the name implies, for the same row, “write” will add “write lock”, “read” will add “read lock”. When a read-write lock conflict occurs, the last accessed transaction must wait for the previous transaction to complete before execution can continue.

The four levels are progressively enhanced, with each level solving one problem. The higher the transaction level, the worse the performance. Most environments (Read Committed)

Isolation level Read Data Consistency Dirty read Unrepeatable read The phantom read
Uncommitted read The lowest level Square root Square root Square root
Submit to read statement-level x Square root Square root
Repeatable read The transaction level x x Square root
Through a serial of The highest level, transaction level x x x

Write in the last

  • First: after reading the “like”, thank you for your recognition of the author;
  • .
  • Second: readily forward, share knowledge, so that more people learn;
  • .
  • Third: remember to pay attention, update every day!!
  • .