Foreword

  • This article is a summary and reorganization of others' work plus my own understanding. It serves only as a personal learning record and is not for any commercial purpose!
  • If you find any errors or infringing content in the article, please point them out. Thank you!

Preceding content

  • Analyzing MySQL and Related Interview Questions, Part 1

Transactions

What is a transaction
  • A transaction is a logical unit of operations that either all execute or none execute
Why transactions are needed
  • Imagine a scenario (see the sketch after this list):
    • If A transfers 1000 yuan to B, then A's balance should decrease by 1000 yuan and B's balance should increase by 1000 yuan. If the transfer system crashes between the decrease and the increase, A's balance has decreased while B's has not increased, which is obviously unacceptable
    • A transaction guarantees that the two operations either both complete or both fail
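  • A minimal sketch of the transfer as a single transaction (the account table and column names are hypothetical):
    START TRANSACTION;
    UPDATE account SET balance = balance - 1000 WHERE name = 'A';
    UPDATE account SET balance = balance + 1000 WHERE name = 'B';
    COMMIT;  -- both updates take effect together; ROLLBACK would undo both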
Four characteristics of transactions
  • Atomicity
    • A transaction is the smallest unit of execution and is indivisible. Atomicity ensures that either all of its operations are executed or none of them are
  • Consistency (Consistency)
    • Database integrity constraints are not violated before or after a transaction executes
  • Isolation (Isolation)
    • When accessing the database concurrently, one transaction is isolated from the other transactions
  • Durability
    • Once a transaction completes, its effect on the data is permanent, even if the database subsequently fails
Problems caused by concurrent transactions
  • Why problems arise
    • When multiple transactions execute concurrently, they may operate on the same data, which leads to the problems below
  • Dirty read
    • When transaction A updates a record in the database but has not yet committed, transaction B reads that record and sees transaction A's uncommitted result. If transaction A is then rolled back, the data transaction B read is dirty
  • Non-repeatable read
    • Transaction A reads the same data multiple times, and transaction B updates and commits the data between those reads, so transaction A's reads return inconsistent results
  • Phantom read
    • A phantom read is similar to a non-repeatable read in that two reads within the same transaction return inconsistent results, but the difference is that a phantom read concerns rows being added or deleted (the number of rows), while a non-repeatable read concerns rows being modified (their contents)
Transaction isolation levels
  • Read uncommitted (READ_UNCOMMITTED)
    • Uncommitted data can be read, which may cause dirty reads, non-repeatable reads, and phantom reads
  • Read committed (READ_COMMITTED)
    • Only committed data can be read, which solves the dirty read problem; however, non-repeatable reads and phantom reads remain possible
  • Repeatable read (REPEATABLE_READ)
    • Multiple reads within a transaction return the same result; that is, while the transaction executes, other transactions are prevented from modifying the data it reads. This solves dirty reads and non-repeatable reads, but phantom reads remain possible
    • This is the default isolation level of the InnoDB storage engine in MySQL
  • Serializable (SERIALIZABLE)
    • A transaction reads the same rows from a table repeatedly, and other transactions are forbidden to insert into, update, or delete from that table while it executes, which solves dirty reads, non-repeatable reads, and phantom reads
  • Pay attention to:
    • Transaction isolation level and data-access concurrency are at odds: the higher the isolation level, the worse the concurrency
    • Under the REPEATABLE READ isolation level, the InnoDB storage engine uses the Next-Key Lock algorithm, which avoids phantom reads. It therefore fully satisfies the isolation requirements of the level while preserving good concurrent performance (see the sketch below)
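  • A quick way to inspect and change the level (the system variable is transaction_isolation on MySQL 8.0; older versions use tx_isolation):
    SELECT @@transaction_isolation;                          -- REPEATABLE-READ is the InnoDB default
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- affects only the current session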
Solutions to the concurrency problems
  • Dirty reads, non-repeatable reads, and phantom reads are all read-consistency problems of the database, which must provide a transaction isolation mechanism to solve them. Two approaches are available
    • One is locking: lock the data before it is read, preventing other transactions from modifying it; shared and exclusive locks work this way
    • The other is MultiVersion Concurrency Control (MVCC or MCC), also known as a multiversion database
MVCC (TODO: deeper understanding)

Locking mechanism

  • In simple terms, a database locking mechanism is a set of rules the database uses to keep all kinds of shared resources ordered under concurrent access and thereby ensure data consistency
Classification of locks

  • Concurrency control generally uses three methods: optimistic locking, pessimistic locking, and the MVCC mentioned above
Optimistic locking
  • An optimistic lock assumes that the data it uses will not be modified by other threads, so it does not lock the data; only when updating does it check whether the data was modified by another thread in the meantime. If not, the write succeeds; if so, what happens next depends on the implementation
    • To be clear, optimistic locking is just an idea; CAS is one way to implement it, but CAS has problems of its own, which leads to the two implementations below
    • For more information on CAS, see my other article, CAS
  • Implementation: there are two common forms of optimistic locking (see the sketch after this list)
    • Add a version field to the database table
      • That is, give the data a version identifier, usually by adding a numeric version column to the table
      • When reading the data, also read and record the version value; when submitting the update, read the version again and compare it with the recorded value. If they are equal, the update may proceed; otherwise it is rejected
      • Every successful update increments version by 1
    • Use a per-row update timestamp
      • Similar to the check above: record the row's update time A before the operation, read the update time B again when submitting, and verify that B still equals the previously read A
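  • A minimal sketch of the version-field approach (table and column names are hypothetical):
    SELECT balance, version FROM account WHERE id = 1;    -- suppose this returns version = 5
    UPDATE account SET balance = balance - 100, version = version + 1
    WHERE id = 1 AND version = 5;                         -- succeeds only if nobody updated the row in between
    -- 0 affected rows signals a write conflict: re-read and retry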
  • advantages
    • Optimistic concurrency control does not go through the database's locking mechanism and takes no actual locks, so it adds no extra overhead and cannot deadlock. It suits read-heavy concurrent scenarios, where the absence of overhead can greatly improve database performance
  • disadvantages
    • Optimistic concurrency control is not suitable for write-heavy, read-light concurrency, because version-field write conflicts become frequent, causing repeated waits and retries; in that case its overhead actually exceeds that of pessimistic locking
Pessimistic locking
  • For concurrent operations on the same data, a pessimistic lock assumes the data is bound to be modified by other threads, so it locks the data when acquiring it, ensuring that no other thread can modify it in the meantime

  • Implementation: In the database, the pessimistic lock process is as follows

    • Before modifying any record, try to place an exclusive lock on it (more on exclusive locks later)
    • If locking fails, the record is being modified elsewhere, and the current query may wait or throw an exception (depending on the developer's needs)
    • If locking succeeds, the record can be modified, and the lock is released when the transaction completes
    • In the meantime, any other operation that would modify the record or take an exclusive lock on it waits for us to unlock, or throws an exception
  • Pay attention to

    • If you want to use pessimistic lock in InnoDB storage engine, you must turn off MySQL database autocommit (set autocommit = 0).
    • What does the SQL look like?
      -- 0. Start the transaction (choose one of the three)
      BEGIN; -- or BEGIN WORK; or START TRANSACTION;
      -- 1. Query the product information
      SELECT status FROM t_goods WHERE id = 1 FOR UPDATE;
      -- 2. Generate an order based on the product information
      INSERT INTO t_orders (id, goods_id) VALUES (NULL, 1);
      -- 3. Change the commodity status to 2
      UPDATE t_goods SET status = 2 WHERE id = 1;
      -- 4. Commit the transaction
      COMMIT; -- or COMMIT WORK;
    • With SELECT ... FOR UPDATE, MySQL InnoDB uses row-level locks by default. Row-level locking is based on indexes: if the SQL statement does not use an index, InnoDB does not use a row-level lock but falls back to a table-level lock, locking the entire table
  • advantages

    • Pessimistic locking is a conservative "lock first, access later" policy that guarantees safe data processing, so it suits write-heavy, read-light workloads
  • disadvantages

    • Because locks must be taken, lock conflicts and even deadlocks may occur, adding extra system overhead, reducing efficiency, and lowering the system's parallelism
  • Application scenarios of optimistic and pessimistic locks

    • Optimistic locking
      • Read-heavy, write-light workloads: if writes are frequent, write conflicts become likely and the business layer has to retry repeatedly, which significantly degrades performance
      • Scenarios where data-consistency requirements are modest but response speed matters greatly
      • It cannot prevent dirty reads, phantom reads, or non-repeatable reads, but it does solve the lost-update problem
    • Pessimistic locking
      • Write-heavy, read-light workloads: pessimistic locks are taken on read, so read-heavy scenarios would lock frequently and spend much time waiting, whereas under severe write conflicts pessimistic locking keeps the data consistent
      • Scenarios with high data-consistency requirements
      • It can prevent dirty reads, phantom reads, and non-repeatable reads, as well as both type-1 and type-2 lost updates
Shared locks
  • A shared lock is also called a read lock and is used for read-only data access. Shared locks are non-exclusive: multiple concurrent transactions may read the locked resource at the same time
  • Properties
    • Multiple transactions can hold shared locks on the same page
    • No transaction may modify the page; if a transaction holding a shared lock tries to modify the data, a deadlock becomes likely
    • A shared lock is usually released as soon as the page has been read
    • If transaction T places a shared lock on data A, other transactions can only place further shared locks on A, not exclusive locks
  • Usage
    • If you append LOCK IN SHARE MODE to a query statement, MySQL places a shared lock on each row in the result set. If no other thread holds an exclusive lock on any row in the result set, the shared locks are acquired successfully; otherwise, the query blocks
  • Example
    • Transaction A appends LOCK IN SHARE MODE to the end of the query statement
    -- To use pessimistic locking, turn off autocommit first
    SET AUTOCOMMIT = 0;
    SELECT * FROM customers WHERE customer_id = 163 LOCK IN SHARE MODE;
    -- Note: the COMMIT command is not executed
    • Transaction B now tries to UPDATE a row from transaction A's result set
    UPDATE customers SET city = 'HangZhou1' WHERE customer_id = 163;
    • Execution result: the UPDATE blocks and eventually times out

    • Transaction C now selects the rows from transaction A's query
    SELECT * FROM customers WHERE customer_id = 163;
    • Execution result: the query succeeds

    • The example shows that once data holds a shared lock, other transactions can add further shared locks to it but not an exclusive lock; multiple transactions can share-lock the same page
Exclusive locks
  • An exclusive lock, also called a write lock, is used for writing data. If a transaction places an exclusive lock on an object, no other transaction can place any lock on it
  • Properties
    • Only one transaction may lock the page, and only that transaction can read and write it
    • Other transactions can still read the page, but not write it
    • An exclusive lock blocks all other exclusive and shared lock requests
  • Usage
    • If you append FOR UPDATE to a query statement, MySQL places an exclusive lock on each row in the query result
  • Example
    • With autocommit turned off, transaction A appends FOR UPDATE to the query statement
    SET AUTOCOMMIT = 0;
    SELECT * FROM customers WHERE customer_id = '163' FOR UPDATE;
    -- The COMMIT command is not executed
    • Transaction B now tries to UPDATE a row from transaction A's result set
    UPDATE customers SET city = 'HangZhou' WHERE customer_id = 163;
    • Execution result: the UPDATE blocks and eventually times out

    • Transaction C now selects transaction A's rows with LOCK IN SHARE MODE appended
    SELECT * FROM customers WHERE customer_id = '163' LOCK IN SHARE MODE;
    • Execution result: the locking query fails with a lock-wait timeout

Table-level locks, row-level locks, and page locks
  • Table-level locks
    • Low overhead, fast locking; no deadlocks; the largest locking granularity, so the probability of lock conflicts is the highest and concurrency is the lowest (the MyISAM and MEMORY storage engines use table-level locking)
  • Row-level locks
    • High overhead, slow locking; deadlocks can occur; the smallest locking granularity, so the probability of lock conflicts is the lowest and concurrency is the highest (the InnoDB storage engine supports both row-level and table-level locking, defaulting to row-level)
  • Page-level locks
    • Overhead and locking time fall between table locks and row locks; deadlocks can occur; locking granularity is between table locks and row locks, and concurrency is middling
  • Applicable scenario
    • Table-level locking suits query-dominated applications where only small amounts of data are updated by index conditions
    • Row-level locking suits applications with large numbers of concurrent updates to small amounts of distinct data by index conditions, together with concurrent queries (see the sketch below)
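  • To see which kind of contention a running server actually suffers, these stock MySQL status counters help:
    SHOW STATUS LIKE 'Table_locks%';      -- a high Table_locks_waited indicates table-lock contention
    SHOW STATUS LIKE 'Innodb_row_lock%';  -- Innodb_row_lock_waits and _time_avg reveal row-lock waits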
MyISAM table locks
  • MyISAM has two modes of table locking
    • Table Read Lock: It does not block Read requests from other users to the same Table, but blocks write requests to the same Table
    • Table Write Lock: Prevents other users from reading or writing to the same Table
  • In MyISAM, reads and writes to a table are serialized with respect to each other, and writes are serialized with respect to other writes
    • When a thread acquires a write lock on a table, only the thread holding the lock can update the table; all other threads' read and write operations wait until the lock is released
  • By default, write locks have a higher priority than read locks
    • When a lock is released, it is given first to the waiting acquire requests in the write lock queue and then to the waiting acquire requests in the read lock queue
  • MyISAM lock and unlock mechanism
    • Before a query statement (SELECT, etc.) executes, read locks are automatically placed on all tables involved
    • Before an update operation (UPDATE, DELETE, INSERT, etc.) executes, a write lock is automatically placed on the tables involved, with no user intervention required (locks can also be taken explicitly; see the sketch below)
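  • The same table locks can be taken explicitly; a minimal sketch (the table name is hypothetical):
    LOCK TABLES t_film READ;       -- other sessions can still read t_film, but their writes block
    SELECT COUNT(*) FROM t_film;
    UNLOCK TABLES;                 -- release the table lock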
InnoDB row locks
  • InnoDB implements the following two types of row locks
    • Shared locks: allow the transaction holding the lock to read a row, and prevent other transactions from acquiring an exclusive lock on the same data set
    • Exclusive locks: allow the transaction holding the lock to update data, and prevent other transactions from acquiring shared read locks or exclusive write locks on the same data set
  • Pay attention to (see the sketch below)
    • InnoDB row locks are taken on index entries, not on the records themselves, so the index must remain usable; otherwise the row lock is escalated to a table lock
    • Row-level locking is based on indexes: if an SQL statement does not use an index, row-level locks cannot be used and a table-level lock is used instead
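  • A minimal illustration of the index dependence (t_user is hypothetical: id is the primary key, name has no index):
    SELECT * FROM t_user WHERE id = 1 FOR UPDATE;        -- hits the primary key: only this row is locked
    SELECT * FROM t_user WHERE name = 'Tom' FOR UPDATE;  -- no usable index: every row is locked, a table-level effect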
InnoDB locking mechanism
  • InnoDB has three algorithms for row locking
  • Record Locks
    • A lock on a single index record: when a row matching the condition is locked through its index, other transactions cannot modify or delete the locked entry
    SELECT * FROM t_user WHERE id = 1 FOR UPDATE;
    • This places a record lock on the row with id = 1, preventing other transactions from updating or deleting it
    -- the id column is a primary key or unique index column (the table name is illustrative)
    UPDATE t_user SET age = 50 WHERE id = 1;
    • A record lock is likewise placed on a row when an UPDATE reaches it through the primary key or a unique index
  • Gap Locks
    • When we use a range condition rather than an equality condition to retrieve data and request a shared or exclusive lock, InnoDB locks the index entries of the existing records that meet the condition; for key values that lie within the range but do not exist, InnoDB also locks that gap
    • For example: the emp table contains 100 records with empid values 1-100
    SELECT * FROM emp WHERE empid >= 100 FOR UPDATE;
    • InnoDB locks not only the record with empid = 100, but also the gap for empid > 100, even though no records exist there
    • The purpose of gap locks is to solve the phantom-read problem and thereby satisfy the REPEATABLE_READ isolation level. For example:
      • In the example above, if the gap were not locked, another transaction could insert a record with empid > 100; when the first transaction re-executed the statement, it would experience a phantom read
    • Pay attention to
      • InnoDB also uses a gap lock when an equality condition requests a lock on a record that does not exist
      • Gap locking blocks concurrent inserts of any key value within the qualifying range, which can cause severe lock waits. In real development, especially in applications with many concurrent inserts, optimize the business logic to access and update data through equality conditions and avoid range conditions where possible
  • Next-key Locks
    • A next-key lock combines a record lock and a gap lock: its scope covers both the index record and the range in front of it. It can be seen as a special kind of gap lock, or simply as an algorithm (see the sketch below)
    • Each data row carries next-key locks on its non-unique index columns; when a transaction holds a row's next-key lock, it locks a left-open, right-closed interval of key values
    • InnoDB implements row-level locks on indexes, and next-key locks are tied only to non-unique index columns; unique index columns (including the primary key) carry no next-key locks
    • The main purpose of next-key locking is to avoid phantom reads; if the transaction isolation level is lowered to READ COMMITTED, next-key locking is disabled as well
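  • A sketch of the interval a next-key lock covers (hypothetical: age carries a non-unique index and the existing values are 10, 20, 30):
    SELECT * FROM t_user WHERE age = 20 FOR UPDATE;
    -- under REPEATABLE READ this locks the interval (10, 20] plus the gap (20, 30),
    -- so another transaction's INSERT of age = 15 or age = 25 blocks until this one ends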
Deadlocks (TODO: A deeper understanding)
  • What is a deadlock
    • A deadlock occurs when each process in a set holds resources it will not release while waiting for resources held, and not released, by the other processes in the set, so all of them wait forever
  • The four conditions that produce a deadlock
    • Mutual exclusion: a resource can be used by only one process at a time
    • Hold and wait: a process blocked while requesting resources does not release the resources it already holds
    • No preemption: resources a process already holds cannot be taken away from it before it finishes using them
    • Circular wait: several processes form a cycle, each waiting for a resource held by the next
  • Releasing a current deadlock
    • Find the process ID that caused the deadlock and kill it (see the sketch after this list)
  • How do I avoid deadlocks
    • Objects are accessed in the same order
    • Avoid user interaction in transactions
  • Pay attention to
    • With automatic locking, MyISAM always acquires all the locks an SQL statement needs at once, which is why MyISAM tables never deadlock
    • Deadlocks affect performance rather than causing serious errors, because InnoDB automatically detects the deadlock and rolls back one of the affected transactions
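  • A minimal sketch for locating and killing the offending connection (the id 12345 is hypothetical):
    SHOW ENGINE INNODB STATUS;   -- the LATEST DETECTED DEADLOCK section lists the transactions involved
    SHOW PROCESSLIST;            -- find the id of the connection holding the lock
    KILL 12345;                  -- terminate that connection, releasing its locks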
Related Interview Questions
  • Q: Optimistic and pessimistic database locks?
    • With optimistic locking, the database assumes that no other transaction will interfere during a transaction's operation, so it takes no lock
    • With pessimistic locking, the database assumes other transactions will interfere with its operation, so it locks before operating
  • Q: What are the types of locks in MySQL?
    • Optimistic lock and pessimistic lock
    • Shared and exclusive locks
    • Record locks, gap locks, and next-key locks
  • Q: How is InnoDB engine row locking implemented in MySQL?
    • The InnoDB storage engine implements row locks through three mechanisms: record locks, gap locks, and next-key locks
    • A record lock locks an index record matched by a query; once locked, the record cannot be updated or deleted by others
    • When the query condition is a range rather than an equality and requests an exclusive or shared lock, the storage engine places gap locks around the matching index records
    • A next-key lock combines a record lock and a gap lock; like the gap lock, its role is to prevent phantom reads in the InnoDB engine at the REPEATABLE_READ isolation level
  • Q: Do you know how to write a SQL statement that causes a deadlock? How to resolve a deadlock? Does MySQL provide any mechanism to resolve a deadlock
    • A deadlock is a standoff in which multiple transactions each hold resources while requesting resources occupied by the others, so none of them can proceed
    -- Transaction A, execution order 1
    BEGIN;
    DELETE FROM user_table WHERE id = 10;    -- in theory, the id ranges (-∞, 10] and (10, 20) are locked

    -- Transaction A, execution order 3
    INSERT INTO user_table (id) VALUES (24); -- transaction A tries to insert, but the interval is locked by transaction B, so it blocks until B releases its lock

    -- Transaction B, execution order 2
    BEGIN;
    DELETE FROM user_table WHERE id = 30;    -- the id ranges (20, 30] and (30, +∞) are locked by transaction B

    -- Transaction B, execution order 4
    INSERT INTO user_table (id) VALUES (13); -- transaction B tries to insert, but the interval is locked by transaction A, so it blocks until A releases its lock
    • Transaction A and transaction B each wait for the other to release its lock, producing a deadlock
    • MySQL provides a monitoring mechanism for InnoDB that periodically (every 15 seconds) logs InnoDB's running status; it is disabled by default
    • InnoDB currently handles a deadlock by rolling back the transaction that holds the fewest row-level exclusive locks

Performance tuning (TODO: A deeper understanding)

SQL optimization
  • Overall, SQL statement optimization covers the following aspects
    1. Avoid scenarios where indexes cannot be used
    2. Other SELECT statement optimizations
    3. Optimization of DML statements (insert, delete, update)
    4. Query condition optimization
    5. Table design optimization
  • reference
    • SQL optimization 2020: the most complete summary - MySQL

Partitioning, table splitting, and database splitting

Partitioning
  • concept
    • Partitioning does not create new tables; it spreads one table's data evenly across different disks, file systems, or server storage media according to rules, while the table logically remains a single table
  • When to consider partitioning
    • A table's queries have become too slow to be usable
    • The table's data is naturally segmented
    • Operations usually touch only part of the table's data
  • Partitioning strategies
    • RANGE partitioning: rows are assigned to a partition based on column values falling within a given contiguous interval
    • LIST partitioning: similar to RANGE partitioning, and each partition must be explicitly defined. The main difference is that a LIST partition is selected because a column's value belongs to one of a discrete list of values, whereas a RANGE partition is selected because the value falls in a contiguous interval
    • HASH partitioning: the partition is selected based on the return value of a user-defined expression computed over the column values of the row being inserted. The expression may be any expression valid in MySQL that yields a non-negative integer
    • KEY partitioning: similar to HASH partitioning, except that only one or more columns may be supplied and the MySQL server provides the hashing function itself, so the columns are not required to be integers
  • Implementation (RANGE partitioning)
    CREATE TABLE sales (
        id INT AUTO_INCREMENT,
        amount DOUBLE NOT NULL,
        order_day DATETIME NOT NULL,
        PRIMARY KEY (id, order_day)
    ) ENGINE = InnoDB
    PARTITION BY RANGE (YEAR(order_day)) (
        PARTITION p_2010 VALUES LESS THAN (2010),
        PARTITION p_2011 VALUES LESS THAN (2011),
        PARTITION p_2012 VALUES LESS THAN (2012),
        PARTITION p_catchall VALUES LESS THAN MAXVALUE
    );
  • advantages
    • Partitions can store more data than a single file system or disk could hold
    • Partitioning balances a table's data across different locations, improving retrieval efficiency and reducing frequent database I/O
  • disadvantages
    • The partition key design is not very flexible; queries that do not use the partition key can easily end up hitting every partition, or even locking the whole table
Table splitting
  • concept

    • A table is decomposed, by some rule, into N physical tables each with independent storage space. On reads and writes, the system derives the corresponding sub-table name from the defined rule and operates on that sub-table
  • When to consider splitting tables

    • A table's queries have slowed to the point of being unusable
    • The SQL has already been optimized
    • The data volume is large
    • Frequent inserts or joint queries are slowing down
  • Splitting strategies (see the sketch after this list)

    • Horizontal splitting: a table is cloned into several tables with the same structure, and the data is divided by some rule and stored separately across them, keeping any single table from growing too large and improving performance. These identically structured tables can live in the same database or in different databases

    • Vertical splitting: the primary, frequently used columns are kept together in a main table according to how the business uses them, and the less commonly used columns are grouped into separate secondary tables by their business attributes

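  • A minimal sketch of a horizontal split (t_order, its columns, and the modulo-2 routing rule are all hypothetical):
    CREATE TABLE t_order_0 LIKE t_order;   -- same structure as the original table
    CREATE TABLE t_order_1 LIKE t_order;
    -- business code or middleware routes each row by user_id % 2,
    -- so a row with user_id = 7 lands in t_order_1:
    INSERT INTO t_order_1 (order_id, user_id, amount) VALUES (1001, 7, 99.90);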

  • Problems solved

    • After tables are divided, the concurrency of a single table, disk I/O performance, and write efficiency are improved
    • Query time is shorter
    • Data is distributed to different files, improving disk I/O performance
    • The amount of data affected by read/write locks becomes smaller
    • Less data is inserted into the database that needs to be re-indexed
  • implementation

    • The business system must be migrated or upgraded, which is a heavy workload
Database splitting
  • concept
    • When one database holds too many tables, the sheer volume of data degrades system performance, so tables originally stored in one database are split out and stored across multiple databases. Usually the tables are divided by functional module and degree of relationship, and deployed to different databases
  • When to consider splitting databases
    • A single database's storage space is insufficient
    • As query volume grows, a single database server can no longer keep up
  • Splitting strategies
    • Horizontal splitting
    • Vertical splitting
    • Read/write separation
  • Problems solved
    • The main purpose is to break through the I/O limits of a single database server and solve the problem of database scalability
Problems caused by splitting tables and databases
  • Join operation

    • After horizontal splitting, the data is spread over multiple tables; a join against other tables must be performed as several queries in business code or database middleware, with the results merged afterwards
  • The count operation

    • After horizontal splitting, the data physically spans multiple tables, but some business logic still treats them as one table, for example paging through records or showing a total. Before splitting this was a single count() call; afterwards it is not that simple. Two approaches are common (see the sketch at the end of this list)
      • Run count() on each sub-table and sum the results
      • Maintain a separate table that records the count-related data
  • The order by operation

    • Sorting can no longer be completed inside one database: the business code or database middleware must fetch the data from each sub-table, then merge and sort the combined result
  • Once transactions are involved, splitting databases and tables turns them into distributed transactions

  • Guaranteeing strong consistency across a distributed system inevitably introduces latency, lowering performance and raising system complexity

  • Solution: usually third-party database middleware (Atlas, Mycat, TDDL, DRDS) is used to solve these problems
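  • A sketch of the "sum the counts" approach over the hypothetical sub-tables above:
    SELECT (SELECT COUNT(*) FROM t_order_0)
         + (SELECT COUNT(*) FROM t_order_1) AS total_orders;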

Master-slave replication

  • concept
    • MySQL master-slave replication means one server acts as the master database server while one or more servers act as slave database servers; data on the master is automatically replicated to the slaves
    • Asynchronous replication is used by default, so a slave does not have to stay permanently connected to the master to refresh its data; updates can happen over a remote connection
  • use
    • Read/write separation: in development work, some SQL statements need to lock tables, temporarily making read services unavailable and affecting the existing business. With master-slave replication, the master handles writes while the slaves handle reads
    • Data backup: when the database on one node fails, it can be restored easily
    • Architecture scaling: when a single database cannot hold the data volume, master-slave replication relieves pressure on a single instance, reduces the frequency of disk I/O per machine, and improves per-machine performance
Master-slave replication architectures
  • One master, one slave
  • One master, multiple slaves
    • One master with one slave, or with several slaves, is the most common architecture: simple and effective, it provides not only high availability but also read/write separation, improving the cluster's concurrency capability
  • Multiple masters, one slave
    • Several MySQL databases can be backed up onto a single server with good storage performance
  • Dual-master replication
    • Each master is simultaneously the other server's slave, so changes made on either side are applied to the other's database through replication
  • Cascading replication
    • Some slaves synchronize not from the master but from another slave
    • A master with too many slaves loses performance to replication, so three to five slaves can connect to the master while the other slaves connect to those as second- or third-tier nodes; this relieves pressure on the master without harming data consistency
Principle of master-slave replication
  • MySQL master-slave replication involves three threads: one runs on the master node (the Log Dump Thread) and the other two run on the slave node (the I/O Thread and the SQL Thread)

  • The master's Log Dump Thread

    • When a slave connects, the master creates a Log Dump Thread to send it the contents of the bin-log. While reading the bin-log, the thread locks it; the lock is released as soon as the read finishes, even before the events are sent to the slave
    • The bin-log records the master database's update events
  • The slave's I/O Thread

    • After the start slave command is executed on the slave node, the slave creates an I/O Thread that connects to the master and requests the updated bin-log. The I/O Thread receives the updates sent by the master's Log Dump Thread and saves them to the local relay log
  • The slave's SQL Thread

    • The SQL Thread reads the contents of the relay-log, parses them into concrete operations, and executes them, keeping the master and slave data consistent
  • instructions

    • Every master-slave replication channel needs these three threads to work. When one master has multiple slaves, the master creates a Log Dump Thread for each currently connected slave, and every slave has its own I/O Thread and SQL Thread
    • The binary log (bin-log) must be enabled on the Master side, otherwise replication is impossible: the Slave works precisely by fetching the Master's log and executing every operation recorded in it, in complete order, on its own side
  • Replication process (a setup sketch follows this list)

    1. The I/O thread on the slave connects to the master and requests the log contents from a specified position in a specified log file (or from the very beginning of the log)
    2. When the master receives the slave's I/O request, its dump thread reads the log from the requested position and returns it to the slave; besides the log events themselves, the response carries the current bin-log file name and bin-log position
    3. On receiving the content, the slave's I/O thread appends it to the local relay log and records the bin-log file name and position it has read up to in the master-info file, so the next request knows where to resume in the master's log
    4. When the slave's SQL thread detects new content in the relay-log, it parses it into the concrete operations that were performed on the master and executes them on the slave, achieving master-slave consistency
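  • A minimal setup sketch (host, user, password, log file, and position are hypothetical; MySQL 8.0.22+ prefers CHANGE REPLICATION SOURCE TO / START REPLICA):
    -- master my.cnf: log-bin=mysql-bin and a unique server-id must be set
    -- on the slave:
    CHANGE MASTER TO
        MASTER_HOST = '192.168.1.10',
        MASTER_USER = 'repl',
        MASTER_PASSWORD = 'repl_password',
        MASTER_LOG_FILE = 'mysql-bin.000001',
        MASTER_LOG_POS = 154;
    START SLAVE;
    SHOW SLAVE STATUS\G   -- Slave_IO_Running and Slave_SQL_Running should both be Yes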
Master-slave replication policies
  • Synchronous policy: the Master waits for all Slaves to acknowledge before committing, which seriously hurts master-slave synchronization performance
  • Semi-synchronous policy: the Master waits for at least one Slave to acknowledge before committing
  • Asynchronous policy (default): the Master commits without waiting for any Slave's acknowledgement
  • Delayed policy: a Slave deliberately lags behind the Master by at least a specified delay
    • Different business requirements call for different policies, but eventual consistency, not strong consistency, is generally chosen; after all, strong consistency seriously affects performance
Master/slave replication implementation (TODO: not implemented yet)

Database connection Pool

References and Thanks

  • MySQL 30,000-word essence summary + 100 interview questions, more than enough for the interviewer (collection series)
  • 100 classic MySQL interview questions
  • Relearning MySQL series I: Talking about the MySQL architecture
  • Hash indexes
  • MySQL indexes: B+Tree
  • Why B+ trees have a unique advantage over red-black trees for database indexes
  • Covering indexes, composite indexes, and index condition pushdown
  • MySQL index classification that 90% of developers don't know
  • Is count(1) really more efficient than count(*)?
  • The difference between IN and EXISTS usage in SQL
  • The execution order of SQL query statements in detail
  • Interviewer: Do you know InnoDB's solution to phantom reads?
  • [Database lock mechanisms] Understanding the implementation of optimistic locking, pessimistic locking, and CAS
  • Fully understanding the MySQL (InnoDB) lock mechanism and troubleshooting problems
  • MySQL Index Condition Pushdown (ICP)
  • SQL optimization 2020: the most complete summary - MySQL
  • Exploring the principle of master-slave replication in MySQL