Reference:

Here are a dozen pictures that tell you whether to write the cache or the database first

Resolution of distributed database and cache double-write consistency schemes

1. Update the database first and then the cache

  • Not recommended.

  • Reasons:

    • Frequent cache updates waste resources. In a write-heavy, read-light workload, many cached values are overwritten before they are ever read.

    • If the cached value is expensive to compute (for example, it is derived from several tables), recomputing it on every write wastes performance.

    • Concurrency safety issues. Concurrent writers may update the cache in a different order from the database, leaving dirty data in the cache.
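The ordering problem in the last point can be replayed by hand. The following is a minimal sketch, assuming plain dictionaries stand in for the database and the cache; the names `write_db`, `write_cache`, and the key `"stock"` are hypothetical and exist only for illustration.

```python
# In-memory stand-ins for the database and the cache (assumptions for illustration).
db = {}
cache = {}

def write_db(key, value):
    db[key] = value

def write_cache(key, value):
    cache[key] = value

# One possible interleaving of two "update DB, then update cache" writers, A and B.
write_db("stock", 1)      # A updates the database
write_db("stock", 2)      # B updates the database (B's value is the latest)
write_cache("stock", 2)   # B updates the cache
write_cache("stock", 1)   # A updates the cache last, overwriting B's value

print(db["stock"], cache["stock"])  # prints: 2 1
```

The database ends with the newer value while the cache keeps the older one, and nothing corrects it until the next write or expiry.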

2. Update the database first and then delete the cache

Cache Aside Pattern

  • Usage:

    • Read: read from the cache. On a miss, read from the database, store the result in the cache, and return the response.

    • Write: update the database first, then delete the cache.

  • Existing problems:

    • The database update succeeds but the cache deletion fails. The database then holds the latest data while the cache still holds the old data.

    • Concurrency issue (a read and a write interleave as follows):

      • A read request queries the cache, but the entry has just expired.

      • The read request queries the database and gets the old value.

      • The write request writes the new value to the database and then deletes the cache.

      • The read request writes the old value to the cache.

    • Probability of the concurrency issue:

      • The probability is very low. It requires a cache miss on the read path at the same moment as a concurrent write, and the read must query the database before the write commits yet fill the cache after the write deletes it. In practice a database write is usually much slower than a read, so this ordering rarely occurs.

  • Improvement:

    • Provide a retry mechanism so that the deletion is guaranteed to succeed eventually. There are two schemes.

    • Scheme 1: Use a message queue to retry failed deletions.

      • Process:

        • Update the database.

        • The cache deletion fails.

        • Send the key that needs to be deleted to the message queue.

        • The application consumes the message itself and gets the key that needs to be deleted.

        • Retry the deletion until it succeeds.

      • However, this approach depends on message-oriented middleware, which adds complexity and intrudes into business code. For scenarios with looser consistency requirements, it is unnecessary.

    • Scheme 2: Run a subscriber program that subscribes to the database binlog and extracts the changed data. A separate program in the application then receives this information from the subscriber and deletes the corresponding cache entries.
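The Cache Aside read/write path and the queue-based retry of Scheme 1 can be sketched together. This is only an illustration under stated assumptions: `FlakyCache` is a hypothetical dictionary-backed cache whose `delete()` fails a fixed number of times, a dictionary stands in for the database, and Python's in-process `queue.Queue` stands in for a real message queue. The retry loop is bounded by `max_attempts` to keep the demo finite.

```python
import queue

class FlakyCache:
    """Hypothetical dict-backed cache whose delete() fails a set number of times."""
    def __init__(self, failures=0):
        self.data = {}
        self.failures = failures

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("simulated cache outage")
        self.data.pop(key, None)

db = {"user:1": "old"}           # stand-in for the database
cache = FlakyCache(failures=2)   # the first two deletions will fail
retry_queue = queue.Queue()      # stand-in for a real message queue

def read(key):
    value = cache.get(key)
    if value is None:            # cache miss: load from the database, fill the cache
        value = db.get(key)
        if value is not None:
            cache.set(key, value)
    return value

def write(key, value):
    db[key] = value              # 1. update the database
    try:
        cache.delete(key)        # 2. delete the cache
    except ConnectionError:
        retry_queue.put(key)     # 3. on failure, send the key to the queue

def consume_retries(max_attempts=10):
    """Drain the queue, retrying each deletion until it succeeds or attempts run out."""
    while not retry_queue.empty():
        key = retry_queue.get()
        for _ in range(max_attempts):
            try:
                cache.delete(key)
                break
            except ConnectionError:
                continue

read("user:1")                   # fills the cache with "old"
write("user:1", "new")           # DB updated; cache deletion fails and is queued
stale = cache.get("user:1")      # still "old" until the retry runs
consume_retries()
fresh = read("user:1")           # miss, reloads "new" from the database
print(stale, fresh)              # prints: old new
```

In a real deployment the consumer would typically run as a separate worker and back off between attempts. The binlog approach of Scheme 2 moves the same deletion logic out of the write path entirely.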

3. Delete the cache first and then update the database

Delete the cache first and update the database later.

  • Suppose the cache deletion succeeds but the database update fails. When a new request comes in, it misses the cache, reads the old data from the database, and writes it back into the cache, so the two stay consistent.

  • This approach is suitable only when concurrency is very low.

Delayed dual-delete policy

  • Problem with deleting the cache first and updating the database later:

  • Two requests arrive at the same time: a write request and a read request.

  • The write request first deletes the data in Redis and then goes to the database to update it.

  • The read request checks Redis. Finding no data, it queries the database and writes the result into the cache. (At this point the write request has not yet updated the DB, so the read request gets the old data.)

  • Once the write request's DB update succeeds, the cache (old value) and the database (new value) are inconsistent.

  • With a MySQL read/write splitting architecture, replication lag can cause the same inconsistency:

    • Two requests arrive at the same time: a write request and a read request.

    • The write request deletes the cache and updates the primary database successfully, but the change has not yet been synchronized to the replica.

    • The read request checks Redis. Finding no data, it queries the replica and writes the result into the cache. (This is the old value.)

    • The database then completes primary/replica synchronization and the replica changes to the new value, while the cache still holds the old value.

  • Solution: A delayed dual-delete policy

    • Delete the cache first.

    • Update the database.

    • After an asynchronous delay, delete the cache again. (The delay is there mainly to let in-flight read requests finish, so that the second deletion removes any dirty data they wrote into the cache. You need to evaluate the appropriate delay for your own workload.)

  • This scheme addresses the inconsistency caused by concurrent read and write requests under high concurrency. Reads stay fast, but the cache may hold dirty data for a short time.

  • If the second deletion fails, you need to add a retry mechanism to ensure that the deletion succeeds.
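The delayed dual-delete steps can be sketched as follows. This is a minimal illustration, not a production implementation: dictionaries stand in for Redis and the database, `threading.Timer` from the standard library schedules the second deletion, and the `before_db_update` hook is a hypothetical test aid that replays the concurrent read at the problematic moment. The 0.2 second delay is an arbitrary demo value.

```python
import threading

db = {"item": "old"}   # stand-in for the database
cache = {}             # stand-in for Redis

def read(key):
    value = cache.get(key)
    if value is None:              # cache miss: load from the database, fill the cache
        value = db.get(key)
        if value is not None:
            cache[key] = value
    return value

def write_with_delayed_double_delete(key, value, delay_seconds, before_db_update=None):
    cache.pop(key, None)           # 1. delete the cache
    if before_db_update is not None:
        before_db_update()         # demo hook: a concurrent read lands here
    db[key] = value                # 2. update the database
    # 3. delete the cache again after a delay, without blocking the writer
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.start()
    return timer

timer = write_with_delayed_double_delete(
    "item", "new", delay_seconds=0.2,
    before_db_update=lambda: read("item"),   # the reader refills the cache with "old"
)
dirty = cache.get("item")   # "old": the cache is dirty for a short time
timer.join()                # wait for the second deletion (only needed in this demo)
clean = read("item")        # miss, reloads "new" from the database
print(dirty, clean)
```

The sketch does not retry a failed second deletion. Combining it with a retry mechanism such as a message queue would cover that case.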