Preface

Recently I have wanted to get a deeper understanding of MySQL, in particular how to keep the cache and the database consistent, since the business I am responsible for also needs optimization in this area. Several articles mention implementing this by monitoring the MySQL binlog, and I wanted to try it myself, so this article came out of that curiosity.

  • MySQL + Canal + Kafka

1. Kafka: kafka.apache.org/quickstart (…
2. Canal: github.com/alibaba/can…
3. MySQL version is as follows:

+-------------------------+------------------------+
| Variable_name           | Value                  |
+-------------------------+------------------------+
| innodb_version          | 8.0.12                 |
| protocol_version        | 10                     |
| slave_type_conversions  |                        |
| tls_version             | TLSv1,TLSv1.1,TLSv1.2  |
| version                 | 8.0.12                 |
| version_comment         | Homebrew               |
| version_compile_machine | x86_64                 |
| version_compile_os      | osx10.14               |
| version_compile_zlib    | 1.2.11                 |
+-------------------------+------------------------+

MySQL binlog overview

  • Binlog is a binary log maintained by MySQL Server. It is completely different from the redo/undo logs kept by storage engines such as InnoDB: it records SQL statements that update (or could potentially update) MySQL data, and it is stored on disk in the form of transactions. The binlog is therefore used for:

1. Primary/secondary (master/slave) replication
2. Data recovery
3. Incremental backup

  • View the binlog configuration: show variables like '%log_bin%';
+---------------------------------+-----------------------------------+
| Variable_name                   | Value                             |
+---------------------------------+-----------------------------------+
| log_bin                         | ON                                |
| log_bin_basename                | /usr/local/var/mysql/binlog       |
| log_bin_index                   | /usr/local/var/mysql/binlog.index |
| log_bin_trust_function_creators | OFF                               |
| log_bin_use_v1_row_events       | OFF                               |
| sql_log_bin                     | ON                                |
+---------------------------------+-----------------------------------+
  • List the binary log files: show binary logs;
+---------------+-----------+
| Log_name      | File_size |
+---------------+-----------+
| binlog.000036 |       155 |
| binlog.000037 |      1066 |
| binlog.000038 |      3075 |
+---------------+-----------+
  • Check the binlog status: show master status; this shows the binary log file currently being written to and the current position.
+---------------+----------+--------------+------------------+-------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000038 |     3075 |              |                  |                   |
+---------------+----------+--------------+------------------+-------------------+
  • So let's look at the content of binlog.000038: ./mysqlbinlog /usr/local/var/mysql/binlog.000038.
  • ROW-format events are base64-encoded and need the decode option to be readable: ./mysqlbinlog --base64-output=decode-rows -v /usr/local/var/mysql/binlog.000038.
# at 3311
#200410 17:29:47 server id 1  end_log_pos 3386 CRC32 0xac866698 	Write_rows: table id 65 flags: STMT_END_F
### INSERT INTO `zacblog`.`t_zb_article`
### SET
###   @1=5
###   @2=22121
###   @3='dada'
###   @4='dadwad'
###   @5=0
###   @6='2020-04-10 17:29:31'
###   @7=1586510981
  • The binlog format itself is configurable; the binlog_format options are ROW, STATEMENT, and MIXED.

Cache and database consistency

  • First of all, according to the CAP theorem, a distributed system cannot satisfy C, A, and P at the same time. Giving up C does not mean giving up consistency altogether; it means giving up strong consistency and pursuing eventual consistency.
  • An earlier project of mine also ran into the question of how to keep the cache consistent with the database. There are all kinds of answers online, but experienced engineers often point to the Cache-Aside pattern article, which describes updating the database first and then deleting the cache:

The order of the steps is important. Update the data store before removing the item from the cache. If you remove the cached item first, there is a small window of time when a client might fetch the item before the data store is updated. That will result in a cache miss (because the item was removed from the cache), causing the earlier version of the item to be fetched from the data store and added back into the cache. The result will be stale cache data.

In short: if the cache is deleted first, there is a short window in which clients read the old value from the database (every request for that key goes to the database, because the cache entry is gone), and that old value gets written back into the cache.
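To make the ordering concrete, here is a minimal Cache-Aside sketch in Java; ArticleService, ArticleDao, and Cache are illustrative names rather than any particular library's API.

```java
// Minimal Cache-Aside sketch. ArticleDao and Cache are hypothetical stand-ins
// for the real data-access layer and cache client.
public class ArticleService {

    interface ArticleDao {
        Article findById(long id);
        void update(Article article);
    }

    interface Cache {
        Article get(long id);
        void put(long id, Article value);
        void delete(long id);
    }

    record Article(long id, String title, String content) {}

    private final ArticleDao dao;
    private final Cache cache;

    ArticleService(ArticleDao dao, Cache cache) {
        this.dao = dao;
        this.cache = cache;
    }

    // Read path: try the cache, fall back to the database, then backfill the cache.
    Article getArticle(long id) {
        Article cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        Article fromDb = dao.findById(id);
        if (fromDb != null) {
            cache.put(id, fromDb);
        }
        return fromDb;
    }

    // Write path: update the data store FIRST, then remove the cached item,
    // which is exactly the order the quoted passage stresses.
    void updateArticle(Article article) {
        dao.update(article);
        cache.delete(article.id());
    }
}
```

Deleting rather than updating the cached entry on write keeps the write path simple and avoids concurrent writers leaving different versions in the cache; the next read repopulates it.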

  • Even so, after updating the database the cache delete can still fail (the probability is small, but not zero), and then there is still a problem. Two common ways to handle that:

1. Use an MQ retry mechanism: when the cache delete throws an exception, push the key to MQ and retry the delete asynchronously.
2. Use monitoring to catch the failed delete and retry it.
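A sketch of option 1 follows; the cache-delete-retry topic name, the local broker address, and the Runnable placeholders are assumptions for illustration, not something prescribed here.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// "Update the database, then delete the cache; if the delete fails, queue the
// key so a separate consumer can retry the delete asynchronously."
public class CacheDeleteWithRetry {

    private static final String RETRY_TOPIC = "cache-delete-retry"; // illustrative name

    private final KafkaProducer<String, String> producer;

    public CacheDeleteWithRetry() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    // dbUpdate and cacheDelete are placeholders for the real DAO and cache calls.
    public void updateThenEvict(Runnable dbUpdate, Runnable cacheDelete, String cacheKey) {
        dbUpdate.run();                 // 1. update the database first
        try {
            cacheDelete.run();          // 2. then delete the cached item
        } catch (Exception e) {
            // 3. on failure, hand the key to Kafka; a consumer of RETRY_TOPIC
            //    keeps retrying the delete until it succeeds.
            producer.send(new ProducerRecord<>(RETRY_TOPIC, cacheKey, cacheKey));
        }
    }
}
```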

  • Writing this, I almost feel I have gone off topic; people argue this part in all directions online, and the above is purely my personal view. To satisfy my curiosity, there are also articles about listening to the MySQL binlog and deleting the cache asynchronously through MQ; if a delete fails, it is put back into MQ and retried.

Introduction to Canal

  • In order to monitor the MySQL binlog, I searched and found the article Meituan DB data synchronization to data warehouse architecture and practice, which describes using Canal to feed the binlog into Kafka.
  • Canal, which literally translates as "channel", provides incremental data subscription and consumption based on parsing the MySQL database's incremental logs.
  • Services built on log-based incremental subscription and consumption include:

1. Database mirroring
2. Real-time database backup
3. Index construction and real-time maintenance (split heterogeneous indexes, inverted indexes, etc.)
4. Business cache refresh
5. Incremental data processing with business logic

  • Next, we start everything following the QuickStart; most of it is exactly as the official docs describe, I just ran into a few pits.
  • I won't repeat the specific commands. Roughly, the order is: edit canal.properties and instance.properties, which hold the database and user information, the Kafka cluster address, and the topic, then start Canal.

Pits encountered

  • Key configuration in canal.properties:
...
canal.serverMode = kafka
...
canal.mq.servers = localhost:9092,localhost:9093,localhost:9094
  • Key configuration in instance.properties:
canal.instance.master.address = 192.168.1.20:3306
# username/password: the database username and password
canal.instance.dbUsername = canal
canal.instance.dbPassword = canal
...
# mq config
canal.mq.topic=example
  • caching_sha2_password Auth failed
Caused by: java.io.IOException: connect /127.0.0.1:3306 failure
    at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.connect(MysqlConnector.java:83) ~[canal.parse.driver-1.1.4.jar:na]
    at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.connect(MysqlConnection.java:89) ~[canal.parse-1.1.4.jar:na]
    at com.alibaba.otter.canal.parse.inbound.mysql.MysqlEventParser.preDump(MysqlEventParser.java:86) ~[canal.parse-1.1.4.jar:na]
    at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:183) ~[canal.parse-1.1.4.jar:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: java.io.IOException: caching_sha2_password Auth failed
    at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.negotiate(MysqlConnector.java:257) ~[canal.parse.driver-1.1.4.jar:na]
    at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.connect(MysqlConnector.java:80) ~[canal.parse.driver-1.1.4.jar:na]
    ... 4 common frames omitted
  • Logging in to the database once with mysql -hlocalhost -P3306 -ucanal -p123 -Dzacblog and then restarting Canal made the error go away. (MySQL 8.0 defaults to the caching_sha2_password authentication plugin, which is what Canal's bundled driver is failing on here; another commonly used workaround is to switch the canal account to mysql_native_password.)
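As an aside, the standard Connector/J 8.x driver does support caching_sha2_password, so a quick JDBC check like the one below (a sketch using the canal account from instance.properties above; adjust host, database, and password to your own setup) can confirm the credentials themselves are fine and the problem really is on the Canal side.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// Connectivity check for the canal account. Assumes mysql-connector-java 8.x on
// the classpath, which handles MySQL 8's caching_sha2_password plugin.
public class CanalUserCheck {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:3306/zacblog?useSSL=false";
        try (Connection conn = DriverManager.getConnection(url, "canal", "canal")) {
            System.out.println("Connected to " + conn.getMetaData().getDatabaseProductVersion());
        }
    }
}
```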

The final result

  • When a table in the database is modified, Canal publishes a message to Kafka, and the consumer receives it in the formats shown below (a minimal consumer sketch follows the examples).
  • For an INSERT statement, data is the key field:
{
    "data":[
        {
            "article_id":"3",
            "author_id":"1121212",
            "article_title":"dadad",
            "article_content":"dadad",
            "status":"0",
            "create_time":"2020-04-10 16:30:53",
            "update_time":"2020-04-10 16:31:00"
        }
    ],
    "database":"zacblog",
    "es":1586507462000,
    "id":1,
    "isDdl":false,
    "mysqlType":{
        "article_id":"bigint(20)",
        "author_id":"bigint(20)",
        "article_title":"varchar(50)",
        "article_content":"text",
        "status":"tinyint(3)",
        "create_time":"datetime",
        "update_time":"timestamp"
    },
    "old":null,
    "pkNames":["article_id"],
    "sql":"",
    "sqlType":{
        "article_id":-5,
        "author_id":-5,
        "article_title":12,
        "article_content":2005,
        "status":-6,
        "create_time":93,
        "update_time":93
    },
    "table":"t_zb_article",
    "ts":1586507463069,
    "type":"INSERT"
}
  • For an UPDATE statement, the old values of the updated fields are given in the old list:
{... "old":[ { "article_title":"dadad", "update_time":"2020-04-10 16:31:00" } ], ... "type":"UPDATE" }Copy the code
  • For a DELETE statement, data again carries the deleted row:
{..."data":[
        {
            "article_id":"5"."author_id":"22121"."article_title":"dada"."article_content":"dadwad"."status":"0"."create_time":"The 2020-04-10 17:29:31"."update_time":"The 2020-04-10 17:29:41"."ext":null
        }
    ],
    "database":"zacblog"."es":1586517722000,
	...
    "type":"DELETE"
}
Copy the code
  • ALTER TABLE
{
    "data":null,
    "database":"zacblog",
    "es":1586511807000,
    "id":6,
    "isDdl":true,
    "mysqlType":null,
    "old":null,
    "pkNames":null,
    "sql":"ALTER TABLE `zacblog`.`t_zb_article` 
ADD COLUMN `ext` varchar(255) NULL AFTER `update_time`",
    "sqlType":null,
    "table":"t_zb_article",
    "ts":1586511808016,
    "type":"ALTER"
}
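Putting the pieces together, below is a minimal sketch of a consumer that turns these flat messages into cache deletions: it subscribes to the example topic configured in canal.mq.topic, skips DDL messages, and deletes one cache key per affected row built from database, table, and the primary key named in pkNames. Jackson and the Kafka client are assumed to be on the classpath, and CacheClient is a placeholder for a real Redis/Memcached client; none of this is part of the Canal client library itself.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Consumes Canal flat messages from Kafka and invalidates cache entries.
public class BinlogCacheInvalidator {

    interface CacheClient { void delete(String key); } // stand-in for a real cache client

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "cache-invalidator");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        CacheClient cache = key -> System.out.println("DEL " + key); // placeholder: print instead of deleting

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("example")); // the topic from canal.mq.topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    handle(record.value(), cache);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Parse the flat-message fields shown above and delete one cache key per
    // affected row, e.g. "zacblog:t_zb_article:5".
    static void handle(String json, CacheClient cache) throws Exception {
        JsonNode msg = MAPPER.readTree(json);
        if (msg.path("isDdl").asBoolean() || !msg.path("data").isArray()) {
            return; // DDL such as ALTER carries no row data to invalidate
        }
        String database = msg.path("database").asText();
        String table = msg.path("table").asText();
        String pk = msg.path("pkNames").path(0).asText("id");
        for (JsonNode row : msg.path("data")) {
            cache.delete(database + ":" + table + ":" + row.path(pk).asText());
        }
    }
}
```

On the write path this replaces the synchronous delete entirely: the application only updates the database, and cache invalidation rides on the binlog, which is the approach this article set out to explore.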

Summary

  • This article started from the MySQL binlog, took a detour through cache and database consistency, and finally, to satisfy my curiosity, simply set up listening to binlog changes and turning them into Kafka messages.
  • In actual development we can also borrow this message format: when rows are inserted, updated, or deleted, publish a corresponding business message. I think the format shown above is a useful reference.

References

  • zhuanlan.zhihu.com/p/33504555
  • zhuanlan.zhihu.com/p/59167071
  • docs.microsoft.com/en-us/azure…
  • www.usenix.org/system/file…
  • tech.meituan.com/2018/12/06/…
  • github.com/alibaba/can…