
This article explains Redis cache breakdown, penetration, avalanche concepts and solutions.

I. Cache breakdown

Cache breakdown concept

A cache breakdown occurs when a request accesses data that is not in the cache but is in the database.

This generally happens when a cache entry expires. Suppose the entry is a hot key with a lot of concurrent users accessing it: the moment it expires, many requests arrive at the same time, find nothing in the cache, and all go to the database to fetch the data at once. Database traffic surges, the pressure spikes instantly, and the database crashes, right in front of you.

So for a piece of cached data, every request returns quickly from the cache; but at some moment the cache expires, a request fails to find the data there, and we say that request has "broken through" the cache.

Cache breakdown solution

1. Scheme 1: Mutex

The idea behind the mutex scheme is this: when the data is not in Redis, only one thread goes to the database, queries the data, and rebuilds the cache; the other threads wait and fetch it from Redis a little later.

The pseudocode is as follows:

String get(String ycf) {
    String music = redis.get(ycf);
    if (music == null) {
        // SETNX: set key "ycf_lock" with value "y_lock" only if it
        // does not exist yet; the lock expires after 60 seconds
        if (redis.set("ycf_lock", "y_lock", "nx", 60)) {
            // Only this thread queries the database
            music = db.query(ycf);
            // Rebuild the cache, expiring after 24*60*60 seconds
            redis.set(ycf, music, 24 * 60 * 60);
            // Release the mutex once the cache is built
            redis.delete("ycf_lock");
        } else {
            // Other threads sleep 100 ms and retry
            Thread.sleep(100);
            // If the first thread finished within 100 ms, the data is there
            music = redis.get(ycf);
        }
    }
    return music;
}

This solution solves the problem, but while one thread is building the cache, the other threads are sleeping or polling.

And in an age of high concurrency and low latency, you've made your users wait a precious 100 ms. Someone 100 ms faster than you might steal a lot of your users.
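The pseudocode above can be simulated in plain Java, with `ConcurrentHashMap.putIfAbsent` standing in for Redis's SETNX. This is only a sketch of the locking idea (the `redis` and `db` objects in the pseudocode are assumed interfaces; here a map and a stub method take their place):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MutexCacheDemo {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> lock = new ConcurrentHashMap<>();

    // Stand-in for the database query
    static String dbQuery(String key) {
        return "value-of-" + key;
    }

    static String get(String key) {
        String value = cache.get(key);
        while (value == null) {
            // putIfAbsent plays the role of Redis SETNX:
            // it returns null only for the one thread that won the lock
            if (lock.putIfAbsent(key + "_lock", "y_lock") == null) {
                try {
                    value = dbQuery(key);   // only this thread hits the DB
                    cache.put(key, value);  // rebuild the cache
                } finally {
                    lock.remove(key + "_lock"); // release the mutex
                }
            } else {
                // Other threads rest for 100 ms and retry
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                value = cache.get(key);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(get("ycf")); // prints "value-of-ycf"
    }
}
```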

2. Scheme 2: Background renewal

The idea of the backend continuation scheme is to open a scheduled task in the background to actively update the data that is about to expire.

For example, suppose the programmer sets jay as a hot key with an expiration time of 60 minutes. At the 55th minute, a background task re-queries the database, puts the data back in the cache, and resets the expiration to 60 minutes again.
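The idea can be sketched with a `ScheduledExecutorService` that refreshes the key at "the 55th minute" of every 60-minute window. This is a minimal, self-contained simulation: the cache is a map, the database is a stub, and the timescale is shrunk from minutes to milliseconds purely for the demo:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BackgroundRefreshDemo {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    // Daemon thread so the JVM can exit normally
    static final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });

    // Stand-in for the database query
    static String dbQuery(String key) {
        return "hot-value-of-" + key;
    }

    // Warm the cache now, then re-query the DB shortly before each
    // TTL window ends (at 55/60 of the window, per the example above)
    static void keepWarm(String key, long ttlMillis) {
        cache.put(key, dbQuery(key));
        long refreshAt = ttlMillis * 55 / 60;
        scheduler.scheduleAtFixedRate(
                () -> cache.put(key, dbQuery(key)),
                refreshAt, refreshAt, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        keepWarm("jay", 600); // 600 ms standing in for 60 minutes
        System.out.println(cache.get("jay")); // prints "hot-value-of-jay"
    }
}
```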

3. Scheme 3: Never expires

This solution is a little bit more crude.

As the name suggests: looking at the actual scenario, you can almost always tell this key is a hot key, with a huge number of requests for its data. Why set an expiration time on data like that? Just put it in with no TTL. It never expires.

But treat it case by case: no single scheme works everywhere.

For example, what if a key suddenly goes viral on word of mouth? Just like Nezha at the box office, you never expected this key to blow up. How do you handle that?

So, case by case. But the idea should be clear: the final plan is usually a combination or variation of the conventional schemes.

II. Cache penetration

The concept of cache penetration

Cache penetration refers to requests for data that exists neither in the cache nor in the database, sent in a short time and at high density. Every such request falls through to the database and puts pressure on it. Requests like these are generally malicious.

In other words: there is no data in the cache, so the request goes to the database, and it turns out there is no data in the database either.

Cache penetration solution

1. Scheme 1: Cache empty objects

Caching an empty object means that even when the database query comes back empty, we cache that empty result anyway.

With the empty object cached, the next identical request hits it, the cache layer handles the request, and the database feels no strain at all.

This is simple to implement and cheap to develop. However, the following problems must be noted:

First question: What do you do if, at some point, an empty record in the cache has a value in the database?

I know of three solutions:

Solution 1: When setting the cache, set an expiration time at the same time, so that after the expiration, the database will query the latest data and cache it.

Solution 2: If the real-time requirement is very high, then whenever you write the database, write the cache too. This guarantees freshness.

Solution 3: If the real-time requirement is not that high, then when writing to the database, send a message to a message queue and let the consumer update the cache with the latest data from the database.
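Scheme 1 combined with solution 1 (a short expiration on the cached empty object) can be sketched in a few lines. This is a self-contained simulation, not Redis code: the cache is a map of value-plus-expiry entries, the database is a stub, and `NULL_MARKER`, `dbHits`, and the TTL values are all made up for the demo:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullCacheDemo {
    // A cached value plus its absolute expiry time (millis)
    record Entry(String value, long expiresAt) {}

    static final String NULL_MARKER = "<null>"; // stands for "no such row"
    static final Map<String, Entry> cache = new ConcurrentHashMap<>();
    static int dbHits = 0;                      // counts database queries

    // Stand-in for the database: only "ycf" exists
    static String dbQuery(String key) {
        dbHits++;
        return key.equals("ycf") ? "music" : null;
    }

    static String get(String key) {
        Entry e = cache.get(key);
        if (e != null && e.expiresAt() > System.currentTimeMillis()) {
            // Cache hit, including a hit on the cached empty object
            return e.value() == NULL_MARKER ? null : e.value();
        }
        String value = dbQuery(key);
        // Cache even a miss, with a short TTL (60 s) per solution 1;
        // real data gets the usual long TTL (24 h)
        long ttl = value == null ? 60_000 : 24 * 60 * 60 * 1000L;
        cache.put(key, new Entry(value == null ? NULL_MARKER : value,
                System.currentTimeMillis() + ttl));
        return value;
    }

    public static void main(String[] args) {
        get("ghost"); get("ghost"); get("ghost"); // repeated malicious misses
        System.out.println(dbHits);               // prints 1: DB hit only once
    }
}
```

The second and third lookups of the nonexistent key never reach the database, which is exactly the protection this scheme buys.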

Second question: what about malicious attacks where every request uses a different key, and each key is requested only once? Caching those keys is pointless, because each key is seen just once, so every request still hits the database, and the database gets no protection at all.

At this point, you tell him, “Bloom filter, learn about it.”

2. Scheme 2: Bloom filter

What is a Bloom filter?

Essentially, a Bloom filter is a data structure: a clever probabilistic structure with efficient inserts and queries that tells you "this element definitely doesn't exist, or it might exist."

It is more efficient and takes up less space than traditional List, Set, Map and other data structures, but the disadvantage is that the results it returns are probabilistic rather than exact.

When a Bloom filter says a value exists, it may not actually exist; when it says a value doesn't exist, it definitely doesn't exist.

So the results a Bloom filter returns are probabilistic: it relieves the pressure on the database, but it doesn't block every bad request completely. Be clear about that.
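A toy, self-contained version makes the mechanics concrete. This is not Guava's or Redis's implementation, just the idea: `K` hash functions each set one bit in an `M`-bit array on insert, and a lookup answers "possibly present" only if all `K` bits are set (the hash derivation from `hashCode` is an arbitrary choice for the sketch):

```java
import java.util.BitSet;

public class BloomDemo {
    static final int M = 1 << 16;          // number of bits
    static final int K = 3;                // number of hash functions
    static final BitSet bits = new BitSet(M);

    // K cheap hashes derived from hashCode with different odd multipliers
    static int hash(String s, int seed) {
        return Math.abs(s.hashCode() * (seed * 2 + 1) % M);
    }

    static void put(String s) {
        for (int i = 0; i < K; i++) bits.set(hash(s, i));
    }

    // false => definitely absent; true => possibly present
    static boolean mightContain(String s) {
        for (int i = 0; i < K; i++)
            if (!bits.get(hash(s, i))) return false;
        return true;
    }

    public static void main(String[] args) {
        put("ycf");
        System.out.println(mightContain("ycf"));         // true, guaranteed
        System.out.println(mightContain("no-such-key")); // false, almost surely
    }
}
```

A request whose key fails `mightContain` can be rejected before it ever touches the cache or the database; the rare false positive just falls through to the normal path.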

The Guava library provides a Bloom filter out of the box, but it is in-memory, so it is not suitable for a distributed environment.

To use a Bloom filter in a distributed environment, you can implement one on top of Redis.

See, Redis isn’t just for caching. This is a knowledge point.

III. Cache avalanche

The concept of cache avalanche

A cache avalanche is when most of the data in the cache reaches its expiration time at the same moment while query volume is huge. Once again the cache has no data, the requests all land on the database, database traffic surges, pressure spikes instantly, and the database crashes, right in front of you.

This differs from the cache breakdown described above: a cache breakdown means a large number of concurrent queries for the same piece of data.

Cache avalanche is when different data reaches the expiration date, causing the data not to be queried in the cache.

Or the cache service just died, so the cache is gone.

Either way, the requests all go to the database. For the database, it's an avalanche of traffic; the name is very apt.

Cache avalanche solutions

1. Scheme 1: Mutex

If a large number of cache entries expire at the same time, we can use a mutex.

Wait, wasn’t mutex introduced earlier?

Yes, cache avalanches can be viewed as multiple cache breakdowns, so a mutex solution can also be used, which I won’t go into here.

2. Scheme 2: "Staggered peak" expiration

If a large number of cache entries would expire at the same moment, another solution is to add a short random offset to each key's expiration time, so the keys don't all expire at once and trigger an avalanche.

For example, set the expiration time for a class of keys to 10 minutes, plus a random offset of up to 60 seconds, like this:

redis.set(key, value, 10 * 60 + RandomUtils.nextInt(0, 60), TimeUnit.SECONDS)

3. Scheme 3: Cache cluster

If the cache service itself dies, it is usually because it is a single point. We can introduce a Redis cluster, using master-slave replication plus Sentinel. Deploying with Redis Cluster is also very convenient and worth understanding.

Of course, this is a preventive measure: when running a single node, you have to consider the problems a service outage would cause. Deploying a cluster in advance improves service availability.

4. Scheme 4: Rate limiter + local cache

So what if the cluster dies too? This comes down to service robustness:

Robustness means a program fully considers every situation that could crash it and handles each one, so that when it hits abnormal conditions it keeps working properly instead of crashing.

At this point, consider introducing a rate limiter, such as Hystrix, and implementing service degradation.

Say your rate limiter allows at most 5,000 requests per second and 8,000 arrive at that moment: the extra 3,000 go through the degradation path and get a user-friendly reminder.

And to keep a Redis failure from draining the database, the requests that do get through can be served from a local cache, such as Ehcache or Guava Cache.
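The 5,000-in, 3,000-degraded example can be sketched with a simple counter standing in for the rate limiter (Hystrix or a real token bucket would do this properly; `LIMIT`, `window`, and the local-cache map are all invented for the demo):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class DegradeDemo {
    static final int LIMIT = 5000;                 // max requests per window
    static final AtomicInteger window = new AtomicInteger();
    static final Map<String, String> localCache = new ConcurrentHashMap<>();

    static String handle(String key) {
        if (window.incrementAndGet() > LIMIT) {
            // Degrade: a friendly reminder instead of touching any backend
            return "Busy, please try again later";
        }
        // Redis is down in this scenario, so serve from the local cache
        return localCache.getOrDefault(key, "Busy, please try again later");
    }

    public static void main(String[] args) {
        localCache.put("jay", "music");
        int served = 0, degraded = 0;
        for (int i = 0; i < 8000; i++) {
            if (handle("jay").equals("music")) served++; else degraded++;
        }
        // prints "5000 served, 3000 degraded"
        System.out.println(served + " served, " + degraded + " degraded");
    }
}
```

The database is never touched: requests inside the limit hit the local cache, the overflow gets the degradation message.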

5. Scheme 5: Restore as soon as possible

Is there nothing more to say about this?

Hey, big brother, are you out of service? Let’s get the service back on.
