From: juejin.cn/post/684490…

How to ensure data consistency between MySQL and Redis?

In high-concurrency business scenarios, the database is usually the weakest link under concurrent access. Redis is therefore placed in front of it as a cache, so that requests hit Redis first instead of going directly to MySQL or another database, which greatly reduces the load on the database. Redis cache data can be loaded in two modes: lazy loading and active loading. The following sections describe how to handle data consistency in each mode.

Lazy loading

Reading from the cache is generally unproblematic. Updates are different: because both the database and the cache must change, the two can easily become inconsistent. Whether you write the database first and then delete the cache, or delete the cache first and then write the database, inconsistency is possible. For example:

  1. If you delete the Redis cache first, another thread may read before you have written to MySQL. It finds the cache empty, reads the old value from the database, and writes it into the cache, so the cache now holds dirty data.
  2. If you write the database first and the writing thread crashes before it deletes the cache, the cache keeps the old value and the data is again inconsistent.

Because reads and writes run concurrently, their order cannot be guaranteed, which can leave the cache and the database inconsistent. How can this be solved?

Combining the two deletion orders from the example above leads to delayed double deletion together with lazy loading. So what is lazy loading? The cache is loaded from the storage layer only when a service reads the data, rather than being refreshed actively after every update. A read first checks the cache and, on a miss, reads the database and backfills the cache; an update only invalidates the cached entry.
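The lazy loading flow can be sketched as follows. This is a minimal illustration, not the article's implementation: the dictionaries `db` and `cache` are in-memory stand-ins for MySQL and Redis, and the function names `read` and `write` are hypothetical.

```python
# In-memory stand-ins (assumptions): a real system would use MySQL and Redis.
db = {"user:1": "alice"}
cache = {}


def read(key):
    """Lazy load: try the cache first, fall back to the DB and backfill."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value  # backfill the cache on a miss
    return value


def write(key, value):
    """Update the DB and invalidate the cached entry; do not refresh it."""
    db[key] = value
    cache.pop(key, None)
```

The cache is only ever filled by `read`, which is what makes the mode "lazy": a write leaves the key absent until the next read asks for it.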

With the lazy loading mechanism in mind, we can now explain how delayed double deletion works.

Delayed double delete

redis.del(key) is executed both before and after the database write, and the second deletion is delayed.

Scheme 1 (a rough idea, not rigorous). The specific steps are:

1) Delete the cache first;

2) Then write to the database;

3) Sleep for 500 ms (the exact value depends on how long your business reads take);

4) Delete the cache again.

So how is this 500 ms chosen, and how long should the sleep be?

You need to measure how long the read path of your own project takes. The purpose of the sleep is to let in-flight read requests finish, so that the write request can delete any dirty data those reads put into the cache.

The sleep should also account for the time Redis and the database need for master-slave synchronization. The final sleep time on the write path is therefore the time the read logic takes plus a few hundred milliseconds, for example 1 second in total.
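Scheme 1 can be sketched as follows. This is only an illustration under assumptions: `db` and `cache` are in-memory stand-ins for MySQL and Redis, the 300 ms default margin is an arbitrary example value, and the `sleep` parameter exists only so the wait can be replaced in tests.

```python
import time

# In-memory stand-ins (assumptions): a real system would use MySQL and Redis.
db = {}
cache = {}


def estimate_delay_ms(read_ms, margin_ms=300):
    """Sleep time = duration of the read logic + a few hundred ms of margin."""
    return read_ms + margin_ms


def update_with_double_delete(key, value, delay_ms, sleep=time.sleep):
    cache.pop(key, None)     # 1) delete the cache first
    db[key] = value          # 2) then write to the database
    sleep(delay_ms / 1000)   # 3) wait for in-flight reads to finish
    cache.pop(key, None)     # 4) delete the cache again
```

For instance, if the read path takes about 700 ms, `estimate_delay_ms(700)` gives 1000 ms. Note that the calling thread is blocked for the whole delay, which is the main cost of this scheme.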

Scheme 2: asynchronous delayed deletion:

1) Delete the cache first;

2) Then write to the database;

3) Asynchronously publish a message to a serialized MQ (alternatively, take a distributed lock on key+version);

4) The MQ consumer receives the message and deletes the cache again.

Asynchronous deletion does not block online requests, and serial processing ensures that deletions are applied correctly under concurrency.
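A minimal sketch of scheme 2, assuming a `queue.Queue` with a single consumer thread as a stand-in for a serialized MQ. The names `delete_queue`, `update`, `consume_deletes`, and `start_consumer` are hypothetical, and `db` and `cache` are in-memory stand-ins for MySQL and Redis.

```python
import queue
import threading

# In-memory stand-ins (assumptions): a real system would use MySQL, Redis, an MQ.
db = {}
cache = {}
delete_queue = queue.Queue()


def update(key, value):
    cache.pop(key, None)    # 1) delete the cache first
    db[key] = value         # 2) then write to the database
    delete_queue.put(key)   # 3) publish the key; the request returns immediately


def consume_deletes():
    """Single consumer: handles deletions one at a time, in arrival order."""
    while True:
        key = delete_queue.get()
        if key is not None:
            cache.pop(key, None)  # 4) delete the cache again
        delete_queue.task_done()
        if key is None:           # None is a sentinel that stops the consumer
            break


def start_consumer():
    worker = threading.Thread(target=consume_deletes, daemon=True)
    worker.start()
    return worker
```

Unlike scheme 1, the request thread never sleeps; the delay comes from the queue and the consumer, and a real MQ would typically add its own delivery delay.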

Why double delete?

  • A DB update has two phases: before and after. The deletion before the update is easy to understand. While the DB update is in progress, concurrent reads may write the old value back into the cache, so a second deletion after the update is needed.

How to handle a double-delete failure?

1. Set the cache expiration time

In theory, setting an expiration time on cached entries guarantees eventual consistency. The database is the source of truth for all writes, so once a cached entry expires, subsequent reads naturally fetch the new value from the database and backfill the cache.

With the double-delete policy combined with a cache expiration time, the worst case is that data stays inconsistent for at most the expiration period.
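The effect of an expiration time can be illustrated with a small in-memory cache. This is a sketch under assumptions: `TTLCache` and `read` are hypothetical names, and the injectable `clock` is there only to make expiry easy to demonstrate. In Redis the equivalent is setting an expiry on the key, for example with `SET key value EX seconds` or `EXPIRE key seconds`.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.store[key]  # expired: treat it as a cache miss
            return None
        return value


def read(key, cache, db):
    """Lazy-load read: a miss (or an expired entry) falls back to the DB."""
    value = cache.get(key)
    if value is None:
        value = db.get(key)
        if value is not None:
            cache.set(key, value)
    return value
```

Even if both deletions fail and a stale value stays in the cache, it is served only until it expires; the next read after that reloads the current value from the database.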

2. Retry scheme

The retry scheme can be implemented in two ways: in the business layer or in middleware.

The business-layer retry works as follows:

1) Update database data;

2) Deleting the cache fails for some reason;

3) Send the key to be deleted to the message queue;

4) Consume messages and obtain the key to be deleted;

5) Retry the deletion until the deletion succeeds.
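The five steps above can be sketched as follows. This is an illustration under assumptions: `retry_queue` is a `queue.Queue` standing in for a message queue, `delete_cache` is any callable that raises `ConnectionError` on failure, and the `max_attempts` bound is added here so the loop cannot spin forever (the steps themselves say "until the deletion succeeds").

```python
import queue

retry_queue = queue.Queue()  # stand-in (assumption) for a message queue


def update(key, value, db, delete_cache):
    db[key] = value              # 1) update the database
    try:
        delete_cache(key)        # 2) the deletion may fail
    except ConnectionError:
        retry_queue.put(key)     # 3) send the key to the message queue


def drain_retries(delete_cache, max_attempts=5):
    """4) Consume queued keys and 5) retry each deletion.

    Returns the keys that still failed after max_attempts tries.
    """
    failed = []
    while not retry_queue.empty():
        key = retry_queue.get()
        for _ in range(max_attempts):
            try:
                delete_cache(key)
                break
            except ConnectionError:
                continue
        else:
            failed.append(key)  # all attempts failed; report the key
    return failed
```

The retry logic lives inside the business function `update`, which is exactly the kind of intrusion into business code that this approach is criticized for.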

However, this scheme is highly intrusive to business code. This leads to the second approach: start a subscriber that subscribes to the database binlog and extracts the data that was changed. A separate program in the application then takes this information from the subscriber and deletes the cache.

The middleware retry works as follows:

1) Update database data;

2) The database writes the operation to the binlog;

3) The subscriber extracts the required data and key;

4) A separate piece of non-business code obtains this information;

5) It tries to delete the cache and the deletion fails;

6) It sends the information to the message queue;

7) It retrieves the data from the message queue and retries the operation.
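The middleware flow can be sketched without touching business code. Everything here is an assumption made for illustration: binlog events are modeled as plain dictionaries with `table` and `pk` fields, the cache key format `table:pk` is hypothetical, a `deque` stands in for the message queue, and `max_rounds` bounds the retries.

```python
from collections import deque


def key_from_event(event):
    """3) Turn a (hypothetical) row-change event into a cache key."""
    return f"{event['table']}:{event['pk']}"


def process_binlog(events, delete_cache, retry_queue):
    """4)-6) Non-business consumer: delete each key, queue it on failure."""
    for event in events:
        key = key_from_event(event)
        try:
            delete_cache(key)
        except ConnectionError:
            retry_queue.append(key)


def retry_pending(delete_cache, retry_queue, max_rounds=3):
    """7) Take keys back off the queue and retry; return keys still pending."""
    for _ in range(max_rounds):
        for _ in range(len(retry_queue)):
            key = retry_queue.popleft()
            try:
                delete_cache(key)
            except ConnectionError:
                retry_queue.append(key)  # still failing: queue it again
        if not retry_queue:
            break
    return list(retry_queue)
```

The business code only updates the database; cache invalidation and its retries are driven entirely by the change events.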

Active loading

In active loading mode, the cache is updated synchronously or asynchronously whenever the DB is updated. A common pattern is as follows:

Write process:

Step 1.1 deletes the cache, step 1.2 updates the DB, and step 1.3 asynchronously refreshes the cache with the new data.

Read process:

Step 2.1 reads from the cache. On a cache miss, step 2.2 reads the DB and step 2.3 asynchronously refreshes the cache with the data.

This mode is simple and easy to use, but it has a fatal flaw: under concurrency it produces dirty data.

Write/write concurrency: imagine multiple servers or threads executing step 1.2 (update the DB) at the same time. Once the DB updates complete, each of them refreshes the cache asynchronously. Asynchronous operations across servers have no ordering guarantee, so the refreshes can overwrite each other: the request that updated the DB first may refresh the cache last, and by then its data is stale.

Read/write concurrency: imagine server A performing a read while server B performs a write, with B's update landing just after A has completed step 2.2 (read the DB). If A's step 2.3 runs after B's step 1.3 has completed, A writes the pre-update value into the cache, and the final data is again wrong.
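The write/write case can be replayed deterministically. This is a toy model, not real concurrency: `replay` is a hypothetical helper that applies `(target, value)` steps in a fixed order, and the interleaving shown is one possible ordering of two writers A and B.

```python
def replay(steps):
    """Apply (target, value) steps in order and return the final state."""
    state = {"db": None, "cache": None}
    for target, value in steps:
        state[target] = value
    return state


# Writer A sets v1 and writer B sets v2. B's DB update happens last,
# but A's asynchronous cache refresh arrives last.
write_write = [
    ("db", "v1"),     # A: update the DB
    ("db", "v2"),     # B: update the DB
    ("cache", "v2"),  # B: refresh the cache
    ("cache", "v1"),  # A: refresh the cache, arriving late
]

final = replay(write_write)
print(final)  # the DB holds v2 while the cache holds the stale v1
```

If both writers deleted the key instead of refreshing it, every ordering would leave the cache empty, which is why deletion is safe to reorder and refreshing is not.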

The dirty data arises because actively refreshing the cache in this mode is not idempotent. There are two ways to solve this:

  1. Use the double-delete scheme described earlier, which is idempotent because each deletion is stateless.
  2. Process the refresh operations serially.

Here is a description of the refresh operation scheme based on serial processing:

Write process:

First delete the cache, then update the DB. A listener subscribes to the binlog of the replica database (if resources are limited, the primary also works), parses the binlog to extract the identifier of the data that needs refreshing, and writes that identifier to MQ.

To serialize MQ processing you can rely on Kafka's partition mechanism, where messages in one partition are consumed in order; the details are not covered here.

Read process:

First read the cache. On a cache miss, read the DB and asynchronously write the data identifier to MQ (the same MQ used by the write process). A consumer then takes the message from MQ, parses it, reads the corresponding data from the database, and refreshes the cache.
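A minimal sketch of the serial refresh scheme, under several assumptions: a list of `queue.Queue` objects with one consumer thread each stands in for MQ partitions, routing by `crc32(key) % NUM_PARTITIONS` is a hypothetical partitioning rule, and `write` publishes the key directly where the article uses a binlog listener. `db` and `cache` are in-memory stand-ins for MySQL and Redis.

```python
import queue
import threading
import zlib

NUM_PARTITIONS = 4
db = {}
cache = {}
partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]


def publish_refresh(key):
    """Route a key to a fixed partition so its refreshes stay ordered."""
    index = zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS
    partitions[index].put(key)


def consume(partition):
    """One consumer per partition: load the current DB value into the cache."""
    while True:
        key = partition.get()
        if key is not None:
            if key in db:
                cache[key] = db[key]
            else:
                cache.pop(key, None)
        partition.task_done()
        if key is None:  # None is a sentinel that stops the consumer
            break


def write(key, value):
    cache.pop(key, None)   # delete the cache
    db[key] = value        # update the DB
    publish_refresh(key)   # stand-in for the binlog listener writing to MQ


def read(key):
    if key in cache:
        return cache[key]
    value = db.get(key)
    publish_refresh(key)   # cache miss: request a refresh through the same MQ
    return value


def start_consumers():
    threads = [threading.Thread(target=consume, args=(p,), daemon=True)
               for p in partitions]
    for thread in threads:
        thread.start()
    return threads
```

The message carries only the key, and the consumer reads the value from the database when it processes the message. Because all refreshes for one key run one after another, the last refresh always writes a value at least as new as the ones before it.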

Conclusion

  1. In lazy loading mode, cache consistency can be achieved with double deletion plus TTL expiration.
  2. When a double deletion fails, retry it through MQ, either from business code or from a component that consumes the MySQL binlog and writes to MQ.
  3. Because active loading is not idempotent, the order of loading must be considered. Serializing the refreshes with MQ partitioning achieves eventual consistency between the cache and MySQL. In this setup, the cache loading events of both reads and writes go through the same MQ.