Three modes of binlog

1. The statement level pattern

Every SQL that modifies data is logged in the master’s bin-log. When the slave is replicated, the SQL process parses the same SQL as the original SQL executed by the master and executes it again. Advantages: The advantages of the statement level are as follows: The statement level eliminates the disadvantages of the row level. It does not need to record the changes of each row, reducing the amount of bin-log logs, saving I/O, and improving performance. Because he only needs to record the details of the statement being executed on the master, and information about the context in which the statement was executed. Faults: since it is recording the execution of the statement, so in order to make these statements on the slave side can also be executed correctly, then he must also record each statement at the time of execution of some relevant information, namely the context information, to ensure that all statements on the slave side is executed to be implemented and in the master when the same results. In addition, due to the rapid development of mysql, a lot of new functions are added, which makes the replication of mysql encounter no small challenge. The more complex the content involved in natural replication, the more likely bugs will appear. At the statement level, there are many cases of mysql replication problems, mainly caused by the use of certain functions or functions when modifying data. For example, sleep() cannot be copied correctly in some versions.

2. Rowlevel mode

The modified data of each row is recorded in the log, and the same data is modified on the slave side. Advantages: Bin-log does not record the context-related information of the EXECUTED SQL statement, but only records which record is modified and how it is modified. So the contents of the row level log clearly record the details of each row. And there are no specific cases where stored procedures, or functions, or trigger calls and triggers cannot be copied correctly. Disadvantages: At row level, all executed statements are logged as changes per row, which may generate a large amount of log content, such as the update statement: Update product set member_id=’d’ where owner_member_id=’a’; The log does not record the events that correspond to the update statement (mysql records bin-log logs as events), but the changes to each record that is updated by the update statement, thus recording many records as many events that are updated. Naturally, the volume of bin-log logs is large.

3. The mixed mode

In mixed mode, mysql will choose between statement and row according to the specific SQL statement it executes. Statement level is the same as before and only records statements executed. Not all changes are recorded in row level mode. For example, when a table structure changes, the statement mode is recorded in statement mode. If the SQL statement is an update or delete statement, the row level mode is optimized. All row changes will still be logged.

What format should we use for binlog

Binlog_format STATEMENT can save I/O and speed up synchronization in some scenarios. However, InnoDB is a transaction engine. When READ-COMMITTED or READ-UNCOMMITTED isolation level or innodb_locks_unsafe_for_binlog is set to ON, write to binlog_format=statement is prohibited. For binlog_format=mixed, which is the default statement format for non-transaction engines and other isolation levels, only row format will be recorded.

> select @@tx_isolation;
+----------------+
| @@tx_isolation |
+----------------+
| READ-COMMITTED |
+----------------+

> create table t(c1 int) engine=innodb;

> set binlog_format=statement;

> insert into t values(1);
ERROR 1665 (HY000): Cannot execute statement: impossible to write to binary log since BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine limited to row-based logging. InnoDB is limited to row-logging when transaction isolation level is READ COMMITTED or READ UNCOMMITTED.

> set binlog_format='mixed';

> show binlog events in 'mysql-bin.000004'\G
*************************** 3. row ***************************
   Log_name: mysql-bin.000002
        Pos: 287
 Event_type: Gtid
  Server_id: 3258621899
End_log_pos: 335
       Info: SET @@SESSION.GTID_NEXT= 'ed0eab2f-dfb0-11e7-8ad8-a0d3c1f20ae4:9375'
*************************** 4. row ***************************
   Log_name: mysql-bin.000002
        Pos: 335
 Event_type: Query
  Server_id: 3258621899
End_log_pos: 407
       Info: BEGIN
*************************** 5. row ***************************
   Log_name: mysql-bin.000002
        Pos: 407
 Event_type: Table_map
  Server_id: 3258621899
End_log_pos: 452
       Info: table_id: 124 (test.t)
*************************** 6. row ***************************
   Log_name: mysql-bin.000002
        Pos: 452
 Event_type: Write_rows_v1
  Server_id: 3258621899
End_log_pos: 498
       Info: table_id: 124 flags: STMT_END_F
*************************** 7. row ***************************
   Log_name: mysql-bin.000002
        Pos: 498
 Event_type: Xid
  Server_id: 3258621899
End_log_pos: 529
       Info: COMMIT /* xid=18422 */
Copy the code

Why cannot statement binlog be used when READ-COMMITTED(RC) or READ-UNCOMMITTED? This is because while the statement is executing in a transaction, you can see the data committed or being written by other transactions. After the transaction is committed, the binlog is written, and then played back from the slave library, and you see that the data is not the same as when the master library was written. For example: have a table:

+------+------+
| a    | b    |
+------+------+
|   10 |    2 |
|   20 |    1 |
+------+------+
Copy the code

We do the following:

  1. Session1 does update in transaction,UPDATE t1 SET a=11 where b=2;A record in line (10,2) that meets the condition is not committed.
  2. Session2 also does an update, updating row (20,1) to row (20,2) and committing.
  3. The previous sesssion1 then commits the update to row (10,2).

If the Statement is recorded in the binlog format, the update in session2 will update the row (20,1) to (20,2) due to the reference and playback first during slave playback. UPDATE T1 SET a=11 where b=2; The statement will update the actions (10,2) and (20,2) (11,2). This results in master library behavior (11,2), (20,2), slave (11,2).

Third, problem analysis

The above is explained by a specific example. The essential reason is that the RC transaction isolation level does not meet the requirements of serialized transaction execution and does not solve the problem of non-repeatability and phantom read.

For repetable-read and Serializable isolation levels, the Statement format records. This is because for RR and Serializable, repeatable reads are guaranteed. In addition to locking corresponding rows, a GAP Lock is applied when it is possible to insert rows that meet the condition. When session1 updates a row b =2, it locks all rows and ranges, so that session2 waits for updates. From the perspective of the isolation level, Serializable allows serialization of transactions, so the binlog serial record transaction statement format is acceptable. At the same time, InnoDB’s RR isolation level has actually solved non-repeatable reads and phantom reads, meeting the transaction isolation requirements of ANSI SQL standard.

The read-committed, read-uncommitted binlog_format limit applies to all transaction engines.

Fourth, expand the content

Can we guarantee binlog Statement format for InnoDB RR and Serializable isolation levels? Not necessarily. Innodb_locks_unsafe_for_binlog Controls GAP Lock. This parameter is OFF by default:

mysql> show variables like 'innodb_locks_unsafe_for_binlog'; +--------------------------------+-------+ | Variable_name | Value | +--------------------------------+-------+ | Innodb_locks_unsafe_for_binlog | OFF | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- + 1 row in the set (0.01 SEC)Copy the code

RR level and above will add GAP Lock in addition to row Lock. However, if this parameter is set to ON, GAP Lock is not added to current reads, that is, next-key Lock must be added to current reads to be COMMITTED in RR isolation. So if this parameter is set to ON, even if the transaction isolation level used is repetable-read, the correctness of the slave database data is not guaranteed.

Five, the summary

For online services, if transaction engines such as InnoDB are used, do not set binlog_format to STATEMENT unless RR or higher isolation levels are written. Otherwise, services cannot be written. If binlog_format is Mixed, the transaction engines below RR isolation level must also write ROW events.