Redis and database data consistency problem


Perhaps the most common use of Redis is as a cache for data that is read often and written rarely. When Redis is used as a cache, the general flow looks like this:

If the data exists in Redis (a cache hit), it is returned directly

If there is no corresponding entry in Redis (a cache miss), we query the database, store the result in Redis, and then return the data

In general, we set an expiration time on each cached key. If the key for the queried data has expired, we query the database directly, save the result to Redis with a fresh expiration time, and finally return the data. The pseudo-code is as follows:

/**
 * Get user details based on the user name
 * @author Public account [cicada mu wind]
 */
public User getUserInfo(String userName) {
      // Try the cache first
      User user = redisCache.getName("user:" + userName);
      if (user != null) {
          return user;
      }

      // Search directly from the database
      user = selectUserByUserName(userName);
      // Write data to Redis and set expiration time
      redisCache.set("user:" + userName, user, 30000);
      // Return data
      return user;
}

Consistency problem

However, if a user modifies their personal information while the Redis key has not yet expired, we need to operate on both the database and Redis. We are then faced with two choices:

Operate on Redis first, then on the database
Operate on the database first, then on Redis

Whichever method we choose, ideally the two operations either both succeed or both fail; otherwise Redis and the database will be inconsistent.

Unfortunately, there is currently no framework to ensure complete consistency between Redis data and database data. Depending on the scenario and the amount of code required, we can only take measures to reduce the probability of data inconsistencies and achieve a compromise between consistency and performance.

Let’s discuss some options for keeping Redis and database data consistent.

Scheme selection

Delete the cache or update the cache?

When the database data changes, the Redis data needs a corresponding operation. Should that operation be an “update” or a “delete”?

“Update” calls Redis’s set method to replace the old value with the new one. “Delete” removes the cached entry; the next query then re-reads the database and repopulates Redis.

Conclusion: It is recommended to use the delete operation directly.

With the update operation, you have two options

~~Update the cache first, then the database~~
Update the database first, then the cache

Forget the first option, let’s discuss the “update database first, update cache later” scenario.

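If thread 1 and thread 2 update at the same time and their steps interleave (the case the original figure illustrates), the data can become inconsistent. For example: thread 1 updates the database, thread 2 updates the database, thread 2 updates the cache, and only then does thread 1 update the cache. The database now holds thread 2’s value while the cache holds thread 1’s stale value. From this perspective, we recommend deleting the cache directly.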
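The interleaving can be replayed deterministically. The following is a minimal sketch, assuming two plain in-memory maps as stand-ins for the database and Redis; the class name, key, and values are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class WriteWriteRaceDemo {
    /** Replays the interleaving step by step and returns {databaseValue, cacheValue}. */
    static String[] replay() {
        Map<String, String> database = new HashMap<>(); // stand-in for the database
        Map<String, String> cache = new HashMap<>();    // stand-in for Redis
        String key = "user:alice";

        database.put(key, "v1"); // thread 1 updates the database
        database.put(key, "v2"); // thread 2 updates the database
        cache.put(key, "v2");    // thread 2 updates the cache
        cache.put(key, "v1");    // thread 1 updates the cache last, with a stale value

        return new String[] { database.get(key), cache.get(key) };
    }

    public static void main(String[] args) {
        String[] result = replay();
        // prints "database=v2, cache=v1"
        System.out.println("database=" + result[0] + ", cache=" + result[1]);
    }
}
```

Had both threads deleted the key instead of writing it, the cache would simply be empty at the end, and the next read would load v2 from the database.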

In addition, “delete cache” is recommended for two reasons.

If database writes outnumber reads, the update scheme writes to the cache frequently, wasting performance.
If the cached value is the result of a complex computation, recomputing it after every database write is also a waste of performance.
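To make the cost argument concrete, here is a small sketch that counts how often an expensive value is computed under each strategy. The `render` method is a hypothetical placeholder for a complex computation, and the cache is modelled as a single local variable rather than Redis.

```java
class InvalidateVsUpdateCost {
    static int computations = 0;

    /** Hypothetical stand-in for an expensive computation of the cached value. */
    static String render(int version) {
        computations++;
        return "profile-v" + version;
    }

    /** Update strategy: recompute and rewrite the cache on every database write. */
    static int updateStrategy(int writes) {
        computations = 0;
        String cache = null;
        for (int version = 1; version <= writes; version++) {
            cache = render(version); // cache rewritten after each write
        }
        return computations;
    }

    /** Delete strategy: drop the cache on every write, compute once on the next read. */
    static int deleteStrategy(int writes) {
        computations = 0;
        String cache = null;
        int latestVersion = 0;
        for (int version = 1; version <= writes; version++) {
            latestVersion = version; // database write
            cache = null;            // cache deleted, nothing computed
        }
        if (cache == null) {
            cache = render(latestVersion); // single read repopulates the cache
        }
        return computations;
    }

    public static void main(String[] args) {
        System.out.println("update: " + updateStrategy(100)); // prints "update: 100"
        System.out.println("delete: " + deleteStrategy(100)); // prints "delete: 1"
    }
}
```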

Having settled on deleting the cache, we are left with only two choices:

Update the database first, then delete the cache
Delete the cache first, then update the database

Update the database first, then delete the cache

There are two possible failure cases with this approach:

If the database fails to be updated, the program can catch the exception and return the result directly without continuing to delete the cache, so there will be no data inconsistency problem
The database is updated successfully, but deleting the cache fails. As a result, the database holds the latest data while the cache holds old data, and the two are inconsistent

What about case two? We have two approaches: retry on failure and asynchronous update.

Retry on failure

If the cache deletion fails, we can catch the exception and send the key to a message queue. A consumer that we write ourselves then consumes the message and retries deleting the key until the deletion succeeds.

The disadvantage of this approach is that it intrudes into the business code and introduces a message queue, which adds uncertainty to the system.
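The retry flow can be sketched as follows. This is a minimal illustration under stated assumptions: an in-memory `ArrayDeque` stands in for a real message queue, and `FlakyCache` is a hypothetical cache whose delete fails a fixed number of times. None of these names come from a real library.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class RetryDeleteDemo {
    /** Hypothetical cache whose delete fails a fixed number of times before working. */
    static class FlakyCache {
        final Map<String, String> data = new HashMap<>();
        int failuresLeft;

        FlakyCache(int failures) { this.failuresLeft = failures; }

        void delete(String key) {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IllegalStateException("cache unavailable");
            }
            data.remove(key);
        }
    }

    final Map<String, String> database = new HashMap<>();  // stand-in for the database
    final Queue<String> retryQueue = new ArrayDeque<>();   // stand-in for a message queue
    final FlakyCache cache;

    RetryDeleteDemo(FlakyCache cache) { this.cache = cache; }

    /** Business code: update the database, then delete the cache. */
    void update(String key, String value) {
        database.put(key, value);
        try {
            cache.delete(key);
        } catch (RuntimeException e) {
            retryQueue.add(key); // deletion failed: hand the key to the queue
        }
    }

    /** Consumer: retry queued deletions, giving up after maxAttempts; returns attempts made. */
    int drainRetryQueue(int maxAttempts) {
        int attempts = 0;
        while (!retryQueue.isEmpty() && attempts < maxAttempts) {
            attempts++;
            try {
                cache.delete(retryQueue.peek());
                retryQueue.remove(); // success: drop the key from the queue
            } catch (RuntimeException e) {
                // leave the key queued and try again on the next iteration
            }
        }
        return attempts;
    }
}
```

The `maxAttempts` bound is an addition for this sketch so that the loop always terminates; a real consumer would more likely back off between attempts and alert once a limit is reached.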

Asynchronous update cache

Because every database update is written to the binlog, we can start a service that listens for binlog changes (for example, using Alibaba’s open-source Canal component) and deletes the key from that listener client. If the deletion fails, the key is sent to the message queue.
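The shape of such a listener can be sketched without tying it to any particular binlog client. The code below does not use Canal’s actual API: `RowChange`, `CacheClient`, and the `table:rowId` key scheme are all hypothetical simplifications, and the queue is an in-memory stand-in for a message queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class BinlogCacheInvalidator {
    /** Hypothetical, simplified change event; not Canal's real entry type. */
    static class RowChange {
        final String table;
        final String rowId;

        RowChange(String table, String rowId) {
            this.table = table;
            this.rowId = rowId;
        }
    }

    /** Hypothetical cache client; delete may throw if the cache is unreachable. */
    interface CacheClient {
        void delete(String key);
    }

    private final CacheClient cache;
    final Queue<String> retryQueue = new ArrayDeque<>(); // stand-in for a message queue

    BinlogCacheInvalidator(CacheClient cache) {
        this.cache = cache;
    }

    /** Called by the binlog listener for each changed row. */
    void onRowChange(RowChange change) {
        // Assumed key scheme, e.g. table "user" and row id "alice" give "user:alice"
        String key = change.table + ":" + change.rowId;
        try {
            cache.delete(key);
        } catch (RuntimeException e) {
            retryQueue.add(key); // deletion failed: queue the key for a later retry
        }
    }
}
```

Because the invalidation runs in a separate service driven by the binlog, the business code only has to update the database.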

Conclusion

In summary, when a cache deletion fails, the practice is to retry the deletion until it succeeds. Whether through retries or asynchronous deletion, the underlying idea is eventual consistency.

Delete the cache first, then update the database

There are two possible failure cases with this approach:

If the cache fails to be deleted, the program can catch the exception and return the result directly without continuing to update the database, so there will be no data inconsistency problem
The cache is deleted successfully, but the database update fails or has not yet finished. With multiple threads, data inconsistencies can occur

For example, after one thread deletes the cache but before its database update completes, another thread can miss the cache, read the old value from the database, and write it back to Redis. Redis then holds the old data while the database holds the new data, so the two are inconsistent. In this case, we can adopt the policy of delayed double deletion: after updating the database, wait briefly and delete the cache again.

In pseudocode:

/**
 * Delayed double delete
 * @author Public account [cicada mu wind]
 */
public void update(String key, Object data) throws InterruptedException {
    // Delete the cache first
    redisCache.delKey(key);
    // Update database
    db.updateData(data);
    // Sleep for a period of time, depending on how long it takes to read data
    Thread.sleep(500);
    // Delete the cache again
    redisCache.delKey(key);
}
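The effect of the second delete can be checked by replaying one problematic interleaving step by step. This is a sketch under assumptions: plain in-memory maps replace the database and Redis, the reader and writer "threads" are interleaved by hand rather than run concurrently, and the class and key names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class DelayedDoubleDeleteDemo {
    /**
     * Replays a writer and a reader interleaving and returns
     * {databaseValue, valueSeenByTheNextRead}.
     */
    static String[] replay(boolean secondDelete) {
        Map<String, String> database = new HashMap<>(); // stand-in for the database
        Map<String, String> cache = new HashMap<>();    // stand-in for Redis
        String key = "user:alice";
        database.put(key, "old");
        cache.put(key, "old");

        cache.remove(key);                       // writer: delete the cache first
        String readerValue = cache.get(key);     // reader: cache miss
        if (readerValue == null) {
            readerValue = database.get(key);     // reader: loads the old value
        }
        database.put(key, "new");                // writer: update the database
        cache.put(key, readerValue);             // reader: writes the old value back

        if (secondDelete) {
            cache.remove(key);                   // writer: after the delay, delete again
        }

        String nextRead = cache.get(key);        // a later read
        if (nextRead == null) {
            nextRead = database.get(key);
            cache.put(key, nextRead);
        }
        return new String[] { database.get(key), nextRead };
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", replay(false))); // prints "new,old"
        System.out.println(String.join(",", replay(true)));  // prints "new,new"
    }
}
```

In this replay the second delete helps only because it runs after the reader has written the old value back.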

I leave the reader with two questions:

Why won’t ~~update the cache first, then the database~~ work?
Why should the delayed double delete method sleep for a period of time?

Please leave your comments in the comments section.
