
About master-slave delay: one article explains it clearly!

Preface

In a real production environment, a single standalone MySQL instance cannot meet practical requirements, whether in terms of security, high availability, or high concurrency.

Therefore, data is synchronized through master-slave replication, and a proxy layer such as mysql-proxy is deployed in front of the databases to improve their capacity for concurrent load.

MySQL master/slave cluster

  • Improve the database's load capacity: the primary database handles writes (insert, delete, update) while the standby database handles only queries.
  • Improve the system's read/write performance, scalability, and high availability.
  • Provide data backup and disaster recovery: with the standby database in a remote location, if the primary database fails, the standby can take over immediately with no recovery time.

Master-slave synchronization cannot be discussed without the binlog, so let's start with the binlog.

Binlog

What’s a binlog? What does it do?

The binlog records, in binary form on disk, the write operations performed by the database (queries are not included). It can be loosely thought of as a record of the SQL statements that changed data.

The binlog is MySQL's logical log and is written by the Server layer, so it is recorded no matter which storage engine the database uses.

In practice, binlog can be used in two main scenarios:

  • The binlog is used for master-slave replication. In a master-slave setup, the master sends its binlog to the slave as a record of operations, and the slave saves what it receives into its relay log.

  • The binlog is used for data recovery. After a full backup is taken, the binlog records all subsequent changes, so the database can be restored to any point after the backup (see the sketch below).
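
As a rough illustration of the recovery use case, changes recorded after a full backup can be replayed with the mysqlbinlog tool introduced below. This is only a sketch; the file name, time range, and account are placeholders:

# Replay the changes recorded in a binlog file between two points in time
# (file name, datetime range, and user are placeholders for illustration)
mysqlbinlog --start-datetime="2023-01-01 00:00:00" \
            --stop-datetime="2023-01-01 12:00:00" \
            mysql-bin.00001 | mysql -u root -p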

Log format

There are three binlog formats: STATEMENT, ROW, and MIXED.

Before MySQL 5.7.7, the default format was STATEMENT; from MySQL 5.7.7 onward, the default is ROW.

The log format is specified by the binlog_format parameter.

  • STATEMENT: statement-based replication. Every SQL statement that modifies data is recorded in the binlog.
  • ROW: row-based replication. The binlog records how each affected row was changed rather than the statement itself.
  • MIXED: a combination of STATEMENT and ROW, with the format chosen per operation; for example, ordinary data changes may be logged in ROW format while table structure changes are logged as statements.
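
As a quick check, the format currently in effect can be inspected from the client and, given sufficient privileges, changed at runtime. A minimal sketch:

-- Check which binlog format is currently in effect
SHOW VARIABLES LIKE 'binlog_format';

-- Switch the current session to ROW format (requires privileges)
SET SESSION binlog_format = 'ROW';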

You can view the contents of a binlog file with the mysqlbinlog tool that ships with MySQL, for example:

mysqlbinlog mysql-bin.00001 | more

Binlog files grow in size and number over time; the numeric suffix of the file name increases in sequence, for example mysql-bin.00002.
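
The files currently kept by the server can be listed from the client, which is a quick way to see how many there are and how large they have grown:

-- List the binlog files on the server and their sizes
SHOW BINARY LOGS;

-- Show the binlog file and position currently being written
SHOW MASTER STATUS;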

Principle of master-slave replication

MySQL master-slave replication involves three threads: the binlog dump thread on the master, and the I/O thread and SQL thread on the slave.

  • Binlog dump thread: when data on the master is updated, the master writes the corresponding event, in the configured binlog format, to its binlog file, and the log dump thread notifies the slave of the update. When the slave's I/O thread requests log content, the dump thread sends the binlog events together with the current binlog file name and position.

  • I/O thread: this thread on the slave connects to the master, requests binlog content from the log dump thread starting at a specified file and position, and stores what it receives in the local relay log.

  • SQL thread: when this thread detects that the relay log has been updated, it reads the relay log and replays the events against the local database, so that the events that happened on the master are re-executed on the slave and the two stay in sync.
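
For context, the slave's I/O thread is pointed at the master with a CHANGE MASTER TO statement. A minimal sketch, in which the host, account, and starting file/position are placeholders:

-- On the slave: tell the I/O thread where to connect and where to start reading
CHANGE MASTER TO
  MASTER_HOST = '192.168.0.10',        -- placeholder master address
  MASTER_USER = 'repl',                -- placeholder replication account
  MASTER_PASSWORD = '***',
  MASTER_LOG_FILE = 'mysql-bin.00001',
  MASTER_LOG_POS = 4;

-- Start the I/O thread and the SQL thread
START SLAVE;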

Summary of basic process

  1. The master executes writes and records them in its binlog. During this step MySQL writes each transaction to the binary log sequentially, even though statements from different transactions may have executed interleaved.
  2. After the events are written to the binary log, the master notifies the storage engine to commit the transaction.
  3. The I/O thread on the slave connects to the master and requests binlog content starting from a specified file and position.
  4. When the master receives the request from the slave's I/O thread, its binlog dump thread reads the binlog in batches according to the requested file and position and returns it to the slave's I/O thread.
  5. After the slave's I/O thread receives the log content, file name, and position sent by the master, it appends the binlog content to the end of the slave's relay log and records the new binlog file name and position in the master-info file, so that the next request can tell the master exactly which file and position to start reading from.
  6. The SQL thread on the slave detects new content in the relay log in real time, parses the relay log into SQL events, and executes them in order to bring the slave's data up to date.
  7. The slave records in the relay-log.info file the name and position of the relay log it has applied, for use in the next round of replication.
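
The file names and positions tracked in these steps can be observed on a running slave; the field names below come from the standard SHOW SLAVE STATUS output:

-- On the slave:
--   Master_Log_File / Read_Master_Log_Pos        : how far the I/O thread has read
--   Relay_Log_File / Relay_Log_Pos               : where that content sits in the relay log
--   Relay_Master_Log_File / Exec_Master_Log_Pos  : how far the SQL thread has applied
SHOW SLAVE STATUS\G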

Parallel replication

Prior to MySQL 5.6, the slave had only two replication threads: the I/O thread and the SQL thread.

The I/O thread receives the binary log and the SQL thread replays it. When parallel replication (introduced in MySQL 5.6) is enabled, the SQL thread becomes a coordinator thread.

The key to parallel replication is that the coordinator thread no longer replays the log itself: it only reads the relay log and distributes transactions, and the actual replay is done by the worker threads.

The coordinator thread and the worker threads form a typical producer-consumer model.

The core idea of the improved scheme is that replay on the slave parallels execution on the master: transactions that executed in parallel on the master can also be replayed in parallel on the slave. There is then no per-database restriction on parallelism and no special requirement on the binary log format.

To remain compatible with MySQL 5.6's database-based parallel replication, MySQL 5.7 introduced a new variable, slave_parallel_type, which can take the following values:

  • DATABASE: the default value; database-based parallel replication
  • LOGICAL_CLOCK: parallel replication based on group commit

The two parallel replication modes are described below.

Database-based parallel replication

Each worker thread maintains a hash table holding the databases involved in the transactions currently in that worker's execution queue. The key of the hash table is the database name, and it is used to decide how transactions are distributed. The advantage of this strategy is that hash values are cheap to compute, only the database name is needed, and there is no requirement on the binlog format.

However, this strategy works well only when the master has multiple databases and the load on each is roughly balanced. If all the tables on the master live in the same database, or if the hot spots are concentrated in a single database, it makes little difference.

Group-commit optimization

It relies on the following two properties:

  1. Transactions that commit in the same group do not modify the same rows;

  2. Transactions that can execute in parallel on the master can therefore also be replayed in parallel on the slave.

How it is implemented:

  1. All transactions committed in the same group share the same commit_id, and the next group's commit_id is incremented by one. The commit_id is written directly into the binlog.

  2. On the slave, transactions with the same commit_id are distributed to multiple workers and executed in parallel; only after all transactions of that commit_id have finished does the coordinator fetch the next batch.
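
The size of each commit group, and therefore how much parallelism the slave can exploit, can be influenced on the master by delaying group commit slightly. A hedged sketch using two MySQL 5.7 parameters (the values are illustrative only):

-- On the master: wait up to 100 microseconds, or until 10 transactions have
-- queued, before flushing a commit group to the binlog. Larger groups give
-- the slave more to replay in parallel, at the cost of slightly higher commit latency.
SET GLOBAL binlog_group_commit_sync_delay = 100;
SET GLOBAL binlog_group_commit_sync_no_delay_count = 10;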

For more details, see the official documentation: dev.mysql.com/doc/refman/…

Now let's get to the main topic: master-slave delay.

Master-slave delay

What is master-slave delay?

As the replication process above shows, it takes time for a change on the master to be replayed on the slave, so the data on the two is inconsistent for a period of time. This lag is the so-called master-slave delay.

Let's look at the points in time involved in the master-slave delay:

  • The master, call it A, completes a transaction and writes it to the binlog; call this moment T1.
  • The moment the slave, call it B, receives the binlog is T2.
  • The moment slave B finishes executing the transaction is T3.

So the master-slave delay is, for the same transaction, the difference between the time the slave finishes executing it and the time the master finished it, namely T3 - T1.

Seconds_behind_master is returned by executing show slave status on the slave library, indicating how many seconds the slave library is currently delayed.

How is seconds_behind_master calculated?

  • The binlog of each transaction carries a time field recording when it was written on the master.
  • The slave takes the time field of the transaction it is currently executing and subtracts it from the current system time; the result is seconds_behind_master. This is exactly the T3 - T1 described above.
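
For example, if the time field of the transaction currently being replayed is 10:00:00 (when it was written on the master) and the slave's current system time is 10:00:05, seconds_behind_master reports 5 seconds.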

Causes of master-slave delay

Why does master-slave delay happen?

Normally, if there is no network latency, the time it takes for the log to travel from the master to the slave is very short, so T2 - T1 can largely be ignored.

What matters most is how long it takes the slave to consume the relay log. The common causes of delay are as follows:

1. The machine performance of the slave library is worse than that of the master library

For example, 20 masters are spread across four machines while their slaves are all placed on a single machine. When a burst of updates triggers a large number of read operations, the many slave instances on that one machine compete for resources, causing master-slave delay.

However, most deployments today are on machines with the same specifications for master and slave.

2. The slave is under heavy query pressure

According to the normal policy, read and write are separated, with the master library providing write capability and the slave library providing read capability. Placing a large number of queries on the slave library consumes a large amount of CPU resources on the slave library, which in turn affects synchronization speed and leads to master-slave latency.

In this situation, you can use one master with multiple slaves to share the read pressure, or ship the binlog to an external system such as Hadoop and let that system provide the query capability.

3. Execution of big transactions

When a large transaction runs, the master must wait until the transaction completes before it can write it to the binlog, so the slave only starts replaying it after that point.

For example, the master performs a huge insert ... select operation that produces hundreds of gigabytes of binlog to be transferred to the read-only node, delaying the application of the binlog on that node.

That is why DBAs often warn developers not to delete a large amount of data with a single delete statement, but to do it in batches if possible (see the sketch below).
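
A minimal sketch of the batched approach; the table name, condition, and batch size are placeholders:

-- Delete in small batches instead of one huge transaction; run repeatedly
-- (e.g. from a script, with a short sleep between batches) until no rows remain.
DELETE FROM orders_history
WHERE created_at < '2020-01-01'
LIMIT 1000;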

4. DDL (alter, drop, create)

DDL is replicated serially between the master and the read-only node: if a DDL statement takes a long time on the master, it will take roughly the same time on the slave. For example, adding a column to a table with 5 million rows that takes 10 minutes on the master will also take about 10 minutes to replay on the slave.

Another case: a long-running query on the slave will block the DDL replicated from the master; the table stays locked until the query finishes, causing data delay on the slave.

5. Lock conflicts

Lock conflicts can also slow down the SQL thread on the slave, for example a select ... for update executed on the slave, or the use of the MyISAM engine, and so on.

6. Replication capability of the slave library

Typically, if the slave falls a few minutes behind for some transient reason, it catches up with the master once the cause goes away. But if the slave replays more slowly than the master writes, and the master is under sustained pressure, the delay keeps growing; this usually points to the slave's replication capability.

The apply logic on the slave, executed by the SQL thread, was single-threaded before version 5.6, which caused significant master-slave delay when the master ran with high concurrency and high TPS.

Since version 5.7, MySQL has therefore supported group-commit-based parallel replication. On the slave you can set slave_parallel_workers to a value greater than 0 and set slave_parallel_type to LOGICAL_CLOCK:

mysql> show variables like 'slave_parallel%';
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| slave_parallel_type    | DATABASE |
| slave_parallel_workers | 0        |
+------------------------+----------+
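
A hedged sketch of switching a 5.7 slave to group-commit-based parallel replication; the worker count is illustrative, and both variables require the SQL thread to be stopped before they can be changed:

-- On the slave: stop the applier, switch to LOGICAL_CLOCK, add workers, restart
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = 4;
START SLAVE SQL_THREAD;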

How to reduce master-slave latency

Master-slave synchronization is always a trade-off between consistency and performance, and the right balance depends on the application scenario. To reduce master-slave delay, you can do the following:

  1. Reduce the concurrency of large, multi-threaded transactions and optimize the business logic.
  2. Optimize SQL, avoid slow SQL, and reduce batch operations; batch scripts are best written in an update-then-sleep pattern.
  3. Improve the hardware configuration of the slave machine to narrow the gap between the master's write speed and the slave's binlog apply speed.
  4. Keep the network path short: place the master and slave servers as close together as possible and provide enough bandwidth, to reduce the network latency of binlog transmission.
  5. For reads with strong real-time requirements, force them to go to the master, and use the slave only for disaster recovery and backup.