Discussion on Redis cache update strategy


High concurrency

High concurrency is a condition in which a running system receives a large number of requests within a short period of time. It typically happens when visits to a web system are concentrated in a narrow window, for example during events like Taobao’s Double 11 or JD.com’s 618. During that window the system has to perform a large number of operations (resource requests, database operations, and so on).

The common indicators related to high concurrency are response time, throughput, queries per second (QPS), and the number of concurrent users.

Response time: the time the system takes to respond to a request. For example, if the system needs 600 ms to process an HTTP request, then 600 ms is the system's response time.

Throughput: the number of requests processed per unit of time.

QPS: the number of query requests the system responds to per second.

Number of concurrent users: the number of users using the system's functions at the same time. For an instant messaging system, for example, the number of users online at the same time roughly represents its number of concurrent users.
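To make these indicators concrete, here is a minimal sketch that computes them from a small request log. The log format (start and end time in milliseconds) and all function names are assumptions made for illustration. Counting requests in flight is only a rough stand-in for the number of concurrent users.

```python
# Hypothetical request log: (start_ms, end_ms) for each handled request.
request_log = [(0, 600), (100, 400), (500, 900), (1200, 1500)]

def avg_response_time_ms(log):
    """Mean time between a request arriving and its response being sent."""
    return sum(end - start for start, end in log) / len(log)

def throughput_per_second(log):
    """Requests completed per second over the whole observed window."""
    window_ms = max(end for _, end in log) - min(start for start, _ in log)
    return len(log) / (window_ms / 1000)

def peak_in_flight(log):
    """Largest number of requests being processed at the same moment."""
    # +1 when a request starts, -1 when it ends; at equal timestamps the
    # -1 sorts first, so back-to-back requests do not count as overlapping.
    events = sorted([(s, 1) for s, _ in log] + [(e, -1) for _, e in log])
    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return peak

print(avg_response_time_ms(request_log))
print(round(throughput_per_second(request_log), 2))
print(peak_in_flight(request_log))
```

If every request in the log is a query, the throughput figure here is also the QPS for that window.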

In simple terms, high concurrency is measured by how many requests a system can process simultaneously per unit of time. There is no fixed threshold for how much concurrency counts as high. For example, if you develop a system with a maximum concurrency of 1000, then 1001 concurrent requests is high concurrency for you, but that amount is nothing for Taobao.

If high concurrency is not handled properly, the user experience degrades (for example, through long response times), and the system may crash or stop working.

Use the cache

As most of you know, MySQL plus Redis is a classic combination. Putting Redis in front of MySQL as a cache can intercept most query requests before they reach MySQL, which greatly relieves the pressure of concurrent requests on MySQL. Redis is a high-performance key-value database that stores its data in memory. Its high performance comes mainly from two things: simple data structures and in-memory storage. However, memory is volatile storage, so Redis cannot guarantee that data is stored reliably. By design, Redis trades data reliability for high performance. It is exactly these characteristics that make Redis so well suited to serving as a cache in front of MySQL.

Even if we only use Redis as a cache, we must take its unreliability into account when we design the cache. In other words, our system should tolerate data loss in Redis, so that even if Redis loses data, the correctness of the system's data is not affected.

As for cache update strategies, the two most widely used are the Read/Write Through mode and the Cache Aside mode.

The Read/Write Through mode

When querying data, first look in the cache. If the cache is hit, return the data directly. If it is missed, query the database, write the result to the cache, and return it. When updating data, first update the database, and if that succeeds, update the data in the cache.

Is this a good way to use a cache? Most of the time it is probably fine. Under high concurrency, however, there is a certain probability of dirty data: the data in the cache may be incorrectly overwritten with old data.
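The read and write paths described above can be sketched as follows. This is a minimal illustration, not a production implementation: plain dictionaries stand in for Redis and MySQL, and the names `db`, `cache`, `read`, and `update` are assumptions made for this example.

```python
db = {"order:1": "created"}   # stand-in for MySQL (assumption)
cache = {}                    # stand-in for Redis (assumption)

def read(key):
    """Query the cache first; on a miss, load from the database and cache it."""
    value = cache.get(key)
    if value is not None:      # cache hit: return directly
        return value
    value = db.get(key)        # cache miss: go to the database
    if value is not None:
        cache[key] = value     # write the result back to the cache
    return value

def update(key, value):
    """Update the database first, then update the cached copy."""
    db[key] = value
    cache[key] = value

print(read("order:1"))    # first read misses the cache and fills it
update("order:1", "paid")
cache.clear()             # simulate Redis losing all of its data
print(read("order:1"))    # still correct: the database is the source of truth
```

Because every miss falls back to the database, losing the contents of the cache only costs performance, not correctness.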

For example, suppose a read request and a write request for the same order record arrive at the same time and are handled in parallel by two different threads. The read thread tries the cache and misses, so it reads the order data from the database. Meanwhile, the other thread, which is handling the write request, updates both the database and the cache. Now there is a problem: the read thread, still holding the old data, writes it to the cache, overwriting the new value with the old one, and dirty data is generated.
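The interleaving above can be replayed step by step. The sketch below runs the steps sequentially in the problematic order rather than using real threads, so the outcome is deterministic; the dictionaries and the key `order:1` are assumptions standing in for MySQL, Redis, and an order record.

```python
db = {"order:1": "v1"}   # stand-in for MySQL (assumption)
cache = {}               # stand-in for Redis, initially empty (assumption)

# Step 1 (read thread): the cache misses, so read the order from the database.
assert "order:1" not in cache
value_seen_by_reader = db["order:1"]      # the reader now holds "v1"

# Step 2 (write thread): update the database, then update the cache.
db["order:1"] = "v2"
cache["order:1"] = "v2"

# Step 3 (read thread): write the value it read earlier into the cache.
cache["order:1"] = value_seen_by_reader   # overwrites "v2" with stale "v1"

print(db["order:1"], cache["order:1"])    # prints: v2 v1
```

The database holds the new value while the cache keeps serving the old one until the entry expires or is written again.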

That is one case. Concurrent writes to the same order record from two threads can also leave dirty data in the cache. The probability of dirty data is positively correlated with the system's data volume and its level of concurrency; when the data volume is large enough and the concurrency high enough, dirty data of this kind is almost inevitable. Is it possible to avoid or reduce it? Yes, it is. Here is another pattern.
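The concurrent-write case can be replayed in the same way. In this sketch, two hypothetical writers, A and B, update the same record; their database updates land in one order and their cache updates in the opposite order. The steps are executed sequentially to make the interleaving explicit, and all names are assumptions.

```python
db = {"order:1": "v1"}      # stand-in for MySQL (assumption)
cache = {"order:1": "v1"}   # stand-in for Redis (assumption)

# Writer A updates the database first...
db["order:1"] = "v2"
# ...then writer B updates the database, so "v3" is the final value there.
db["order:1"] = "v3"

# The cache updates happen in the opposite order:
cache["order:1"] = "v3"     # writer B updates the cache
cache["order:1"] = "v2"     # writer A, delayed, overwrites it with older data

print(db["order:1"], cache["order:1"])   # prints: v3 v2
```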

The Cache Aside mode

The Cache Aside mode is very similar to the Read/Write Through mode above. The logic for handling read requests is exactly the same. The only small difference is that when updating data, the Cache Aside mode does not attempt to update the cache; instead, it deletes the cached entry, so the next read reloads it from the database.

Updating the cache this way effectively avoids the dirty data caused by concurrent reads and writes. The trade-off is that, although the Cache Aside mode reduces the probability of dirty data compared with the Read/Write Through mode, it is more exposed to cache avalanches under high concurrency, because more reads miss the cache and fall through to the database.
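As a minimal sketch of Cache Aside, assume the write path deletes the cached entry after updating the database, which is the usual form of the pattern. Dictionaries again stand in for Redis and MySQL, and `read`, `write_db`, and `invalidate` are hypothetical helper names. The example replays two concurrent writers whose database updates and cache operations interleave in opposite orders.

```python
db = {"order:1": "v1"}      # stand-in for MySQL (assumption)
cache = {"order:1": "v1"}   # stand-in for Redis (assumption)

def read(key):
    """Same read path as before: cache first, database on a miss."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value

def write_db(key, value):
    """First half of an update: write the new value to the database."""
    db[key] = value

def invalidate(key):
    """Second half of an update: delete the cached entry, never overwrite it."""
    cache.pop(key, None)

# Two writers, A and B, with their steps interleaved:
write_db("order:1", "v2")   # writer A
write_db("order:1", "v3")   # writer B (final value in the database)
invalidate("order:1")       # writer B
invalidate("order:1")       # writer A, delayed; deleting twice is harmless

print(read("order:1"))      # prints: v3
```

Because both writers only delete, the order of their cache operations no longer matters, and the next read repopulates the cache from the database. This sketch covers concurrent writes only; it does not show that every possible read/write interleaving is safe.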

Conclusion

Using Redis as a cache in front of MySQL is a very effective way to improve a system’s ability to handle high concurrency and to reduce response time. In most cases, the Cache Aside mode is the best choice for updating the cache: it is simpler than the Read/Write Through mode and greatly reduces the likelihood of dirty data. The avalanche problem caused by large-scale cache penetration, in particular, has no universal fix; you need to choose a solution that suits your business scenario.

Finally, I would like to thank my girlfriend for her tolerance, understanding and support in work and life.
