Preface

In a distributed system that has both a cache and a database, which should a write operation touch first: the database or the cache? Think about what could go wrong in each case before reading on. Below I go through several solutions in turn.

Cache maintenance Solution 1

Suppose there is a write operation (thread A) and a read operation (thread B), and the write operates on the cache before the database, as shown in the following flow chart:

1) Thread A initiates a write operation; its first step is to delete the cache entry

2) Thread A's second step is to write the new data to the DB

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B sets the cache at the same time

Seen this way, there is no problem. Now let's look at the second flow chart:

1) Thread A initiates a write operation; its first step is to delete the cache entry

2) Thread B initiates a read operation and gets a cache miss

3) Thread B goes on to read the DB and reads the old data

4) Thread B then writes the old data into the cache

5) Thread A writes the latest data to the DB

The old data is now in the cache, so every subsequent read returns old data. The cache no longer matches the database.
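The second flow chart can be replayed step by step in a few lines of Python. This is only a minimal sketch: the two dictionaries are hypothetical stand-ins for the cache and the database, the key `"k"` is made up, and the interleaving is forced by hand rather than produced by real threads.

```python
def replay_delete_cache_then_write_db():
    """Replay the bad interleaving of Solution 1 in a fixed order."""
    cache = {"k": "old"}   # hypothetical in-memory cache
    db = {"k": "old"}      # hypothetical in-memory database

    # 1) Thread A: delete the cache entry.
    cache.pop("k", None)

    # 2) Thread B: read the cache and miss.
    cached = cache.get("k")

    if cached is None:
        # 3) Thread B: fall back to the DB, which still holds the old data.
        value = db["k"]
        # 4) Thread B: put the old data back into the cache.
        cache["k"] = value

    # 5) Thread A: finally write the new data to the DB.
    db["k"] = "new"

    return cache["k"], db["k"]


cache_value, db_value = replay_delete_cache_then_write_db()
print(cache_value, db_value)
```

After the replay the cache still holds the old value while the database holds the new one, which is exactly the mismatch described above.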

Cache maintenance Solution 2

Double write: operate on the cache first, then on the database.

1) Thread A initiates a write operation and sets the cache

2) Thread A's second step is to write the new data to the DB

3) Thread B initiates a write operation and sets the cache

4) Thread B's second step is to write the new data to the DB

Seen this way, there is no problem, but things do not always go as expected. Let's look at the second flow chart:

1) Thread A initiates a write operation and sets the cache

2) Thread B initiates a write operation; its first step is to set the cache

3) Thread B writes its data to the DB

4) Thread A writes its data to the DB

After both operations complete, the cache holds thread B's data and the database holds thread A's data. The cache and the database are inconsistent.
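The same kind of hand-ordered replay shows the double-write race. Again this is just a sketch under assumptions: the dictionaries stand in for the cache and the database, and the values `"A"` and `"B"` simply mark which thread wrote them.

```python
def replay_double_write():
    """Replay the bad interleaving of Solution 2 in a fixed order."""
    cache = {}   # hypothetical in-memory cache
    db = {}      # hypothetical in-memory database

    cache["k"] = "A"   # 1) Thread A sets the cache
    cache["k"] = "B"   # 2) Thread B sets the cache
    db["k"] = "B"      # 3) Thread B writes to the DB
    db["k"] = "A"      # 4) Thread A writes to the DB

    return cache["k"], db["k"]


cache_value, db_value = replay_double_write()
print(cache_value, db_value)
```

The last writer differs between the two stores: thread B wins in the cache, thread A wins in the database.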

Cache maintenance Solution 3

A write operation (thread A) and a read operation (thread B), where the write operates on the database first and then on the cache.

1) Thread A initiates a write operation; its first step is to write to the DB

2) Thread A's second step is to delete the cache entry

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B sets the cache at the same time

This solution has no obvious concurrency problem, but the cache deletion in step 2 may fail. The probability of that is relatively small, so it is still better than Solutions 1 and 2, and Solution 3 is the one commonly used in daily work.
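The read and write paths of Solution 3 could look roughly like the following. This is a minimal single-process sketch, not a production implementation: `CacheAside` is a hypothetical name, and plain dictionaries stand in for the real cache and database clients.

```python
class CacheAside:
    """Sketch of Solution 3: write the DB first, then delete the cache."""

    def __init__(self, db):
        self.db = db       # hypothetical dict standing in for the database
        self.cache = {}    # hypothetical dict standing in for the cache

    def read(self, key):
        # Cache hit: return the cached value directly.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the DB and populate the cache.
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Step 1: write the new data to the DB.
        self.db[key] = value
        # Step 2: delete the cache entry. With a real cache this call
        # can fail, which is the weakness discussed above.
        self.cache.pop(key, None)


store = CacheAside({"k": "old"})
first = store.read("k")      # miss, loads "old" into the cache
store.write("k", "new")      # DB updated, cache entry removed
second = store.read("k")     # miss again, loads "new"
print(first, second)
```

Deleting the entry instead of overwriting it means the next read repopulates the cache from the database, so two concurrent writers cannot leave their values in different stores the way they do in Solution 2.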

To sum up, we generally adopt Solution 3. But is there a way to fix the weaknesses of Solution 3?

Cache maintenance Solution 4

This is an improvement on Solution 3: it still operates on the database first and then on the cache. Let's look at the flow chart:

The database's binlog is used to evict the key asynchronously. With MySQL, for example, Alibaba's Canal can collect the binlog and send it to a message queue (MQ); a consumer then processes each update message and confirms it through the ACK mechanism, deleting the cache entry and keeping the cache consistent with the database.
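The idea of "delete, and only acknowledge once the delete has succeeded" can be illustrated without any real middleware. The sketch below does not use Canal or any MQ client API: a `deque` plays the role of the queue, `FlakyCache` is a hypothetical cache whose first few deletes fail, and `consume` and `max_deliveries` are made-up names for illustration.

```python
from collections import deque


class FlakyCache:
    """Hypothetical cache whose first `failures` delete calls fail."""

    def __init__(self, data, failures):
        self.data = dict(data)
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data.pop(key, None)


def consume(queue, cache, max_deliveries=5):
    """Process change events from the queue.

    An event is acknowledged (removed from the queue) only after the
    cache delete succeeds; otherwise it stays at the head of the queue
    and is delivered again. Returns the number of deliveries made.
    """
    deliveries = 0
    while queue and deliveries < max_deliveries:
        event = queue[0]            # peek: not acknowledged yet
        deliveries += 1
        try:
            cache.delete(event["key"])
        except ConnectionError:
            continue                # no ACK, so the event is redelivered
        queue.popleft()             # ACK: the delete went through
    return deliveries


# One binlog-derived change event for a made-up key.
queue = deque([{"key": "user:1"}])
cache = FlakyCache({"user:1": "old"}, failures=2)
deliveries = consume(queue, cache)
print(deliveries, "user:1" in cache.data, len(queue))
```

Because the message is kept until the delete succeeds, a transient cache failure delays the eviction instead of silently losing it, which is the gap left open by Solution 3.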

But this raises another question: what about master-slave databases?

Cache maintenance Solution 5

Master-slave DB: synchronization between the master and the slave is delayed. If a read hits the slave after the cache entry has been deleted but before the new data has been synchronized to the slave, it reads dirty data. How do we solve this problem? The solution flow chart is as follows:

Summary of Cache Maintenance

To sum up, in a distributed system where a cache and a database coexist, a write operation should operate on the database first and then on the cache, as follows:

(1) On a read, check whether the data is in the cache

(2) If the data is in the cache, return it

(3) If the data is not in the cache, read it from the database, store it in the cache as key->value, and then return it

(4) On an update, update the database first and then delete the cache entry

(5) To make sure the cache deletion in step (4) succeeds, delete asynchronously via the binlog

(6) With a master-slave database, collect the binlog from the slave

(7) With one master and multiple slaves, collect the binlog from every slave, and have the consumer delete the cache entry only after it has received the binlog data from the last slave
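Step (7) can be sketched as a small bookkeeping class on the consumer side. Everything here is an assumption made for illustration: `ReplicaAwareInvalidator`, the slave identifiers, and the `change_id` used to match the same change across slaves are hypothetical, and a dictionary stands in for the cache.

```python
class ReplicaAwareInvalidator:
    """Delete a cache key only after every slave has reported the change."""

    def __init__(self, cache, replica_ids):
        self.cache = cache                    # hypothetical dict-based cache
        self.replica_ids = set(replica_ids)   # all slaves we expect to hear from
        self.seen = {}                        # change_id -> slaves that reported it

    def on_binlog(self, replica_id, change_id, key):
        """Record one binlog event; return True if the key was deleted."""
        reported = self.seen.setdefault(change_id, set())
        reported.add(replica_id)
        # Every slave has replayed this change, so a read from any of
        # them now returns the new data and the cache can be evicted.
        if reported >= self.replica_ids:
            self.cache.pop(key, None)
            del self.seen[change_id]
            return True
        return False


cache = {"k": "old"}
invalidator = ReplicaAwareInvalidator(cache, ["slave-1", "slave-2"])
after_first = invalidator.on_binlog("slave-1", change_id=1, key="k")
still_cached = "k" in cache
after_second = invalidator.on_binlog("slave-2", change_id=1, key="k")
print(after_first, still_cached, after_second, "k" in cache)
```

Waiting for the last slave means that whichever slave serves the next cache miss, it has already applied the change, so the old value cannot be read back into the cache.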