
Preface

Hello, I’m Hurricane

The index of previous articles is as follows:

  • 00 - Redis: do you really understand it?
  • 01 - Redis data types: you know more than that
  • 02 - The ins and outs of the Redis hash table
  • 03 - Why is Redis so fast?
  • 04 - Redis persistence: do you really know AOF?
  • 05 - The RDB mystery of Redis persistence

The previous articles covered Redis fundamentals, with no hands-on problems or real-world scenarios, which can get dry. Today we get practical:

  • Cache avalanche
  • Cache breakdown
  • Cache penetration

Plenty of articles on these three problems already exist, but I still want to cover them, with more diagrams to help the ideas stick. All three are high-frequency interview questions, and explaining them clearly takes some skill.

Before the three problems, let's review the normal request flow. Look at the picture:

The picture above shows roughly this:

In your code, whether it runs in Tomcat or in your RPC service, you first check whether the data you want exists in the cache. If it does, you return it to the caller. If not, you query the database, cache the result, and then return it to the caller; the next query for the same data hits the cache.
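The flow above is the classic cache-aside read path. Here is a minimal sketch of it: the ConcurrentHashMap stands in for Redis, and loadFromDb for the real database query — both are illustrative stand-ins, not any particular client library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside read path: check the cache first, fall back to the DB on a miss,
// then populate the cache so the next query for the same key hits.
public class CacheAside {
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static String loadFromDb(String key) {
        return "db-value-for-" + key;   // pretend DB lookup
    }

    public static String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;               // cache hit
        }
        value = loadFromDb(key);        // cache miss: go to the DB
        cache.put(key, value);          // populate the cache for next time
        return value;
    }

    public static void main(String[] args) {
        System.out.println(get("sku:1")); // miss: loads from the DB
        System.out.println(get("sku:1")); // hit: served from the cache
    }
}
```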

Cache avalanche

Definition

When I worked on a recommendation system, some data was computed by an offline algorithm — for a given product, which similar products to recommend. After each computation, the results were written to HBase and Redis at the same time. If a large number of those keys expire at the same moment, a flood of requests hits the database; since database throughput is limited, this can easily bring the database down. This situation is called a cache avalanche.

This illustrates one cache-avalanche scenario: when a scheduled job sets cache entries in batches, pay attention to the expiration times.

How to prevent an avalanche

The fix is simple: when you set cache entries in batches, add a random offset to each expiration time (for example, up to 10 minutes; in Java you can generate it with Random). That way, large numbers of keys will no longer expire at the same moment. Look at the picture:
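A minimal sketch of the idea, assuming a base TTL in seconds and a jitter of up to 10 minutes — both values are arbitrary here, and the commented-out client call is only a placeholder:

```java
import java.util.Random;

// Spread out expirations by adding a random offset (0..600 seconds here)
// to a base TTL, so keys set in the same batch do not all expire together.
public class TtlJitter {
    static final Random RANDOM = new Random();

    // baseTtlSeconds is the planned TTL; the result adds up to 10 extra minutes.
    public static int ttlWithJitter(int baseTtlSeconds) {
        return baseTtlSeconds + RANDOM.nextInt(600);
    }

    public static void main(String[] args) {
        // with a real client this would be something like:
        // redis.setex(key, ttlWithJitter(3600), value)
        System.out.println(ttlWithJitter(3600));
    }
}
```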

What if there is an avalanche?

If the traffic is modest and the database can absorb it — congratulations, you survived.

If the traffic exceeds what the database can handle, the database goes down — congratulations, you just earned a P0 incident.

If the traffic is high but your database sits behind a rate limiter, requests beyond the configured limit are rejected, protecting the backend DB. A few more words about rate limiting here.

You can limit by requests per second (or by concurrency) reaching the DB. Better still, limit per key: cap the number of requests per second allowed for any single key. The goal is to stop a flood of requests for the same key from reaching the backend database; that intercepts most of the traffic.

Look at the picture:

This way, most requests for the same key are rejected at the limiter, protecting the database.
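To make the per-key idea concrete, here is a minimal fixed-window limiter kept in local memory. This is only a sketch: the one-second window, class names, and limit are made up for illustration, and a production limiter would typically use a sliding window or token bucket, possibly coordinated through Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A fixed-window, per-key rate limiter: each key gets its own counter
// that resets every second.
public class PerKeyLimiter {
    static class Window {
        long windowStart;   // start of the current one-second window
        int count;          // requests seen in this window
    }

    final Map<String, Window> windows = new ConcurrentHashMap<>();
    final int permitsPerSecond;

    PerKeyLimiter(int permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
    }

    // Returns true if the request for this key is allowed through to the DB.
    boolean tryAcquire(String key, long nowMillis) {
        Window w = windows.computeIfAbsent(key, k -> new Window());
        synchronized (w) {
            if (nowMillis - w.windowStart >= 1000) { // a new window begins
                w.windowStart = nowMillis;
                w.count = 0;
            }
            return ++w.count <= permitsPerSecond;
        }
    }

    public static void main(String[] args) {
        PerKeyLimiter limiter = new PerKeyLimiter(2);
        long now = System.currentTimeMillis();
        System.out.println(limiter.tryAcquire("huawei-pro", now)); // allowed
        System.out.println(limiter.tryAcquire("huawei-pro", now)); // allowed
        System.out.println(limiter.tryAcquire("huawei-pro", now)); // rejected
    }
}
```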

There are actually two kinds of rate limiting: local and distributed. In later articles I will cover both, including a distributed limiter implemented with Redis.

Cache breakdown

Definition

Take a flash sale on Double Eleven, or a big marketing campaign: traffic is huge, and one heavily promoted product becomes a hot key. If that key's cache entry becomes invalid for some reason, all of its traffic slams into the database at once. The DB eventually can't take it and goes down, and even normal queries for other data stop working.

Look at the picture:

The key Huawei Pro in Redis suddenly becomes invalid — it may have expired, or been evicted due to memory pressure. All the traffic that was being served by Redis now lands on the DB, which can't hold up and goes down.

How to solve

In the end, the goal is the same: keep the excess traffic away from the DB.

1. Rate limiting

Similar to the approach above, but scoped to a single key: when the key is gone from the cache, only one request is allowed through to the DB; the others are rejected, or wait briefly and query Redis again.

For a diagram, refer to the rate-limiting picture in the cache-avalanche section.

Here, too, there are local and distributed variants.

Local rate limiting restricts traffic for the key within a single instance; the limit applies only to that instance. Distributed rate limiting, in a multi-instance deployment, counts traffic for the key across all instances; once the combined total hits the limit, every instance throttles its traffic to the DB.

2. Use a distributed lock

A quick definition of a distributed lock: in a concurrent program, a lock guarantees mutually exclusive access to a shared resource, keeping things thread-safe. In a distributed system, we likewise need a mechanism that guarantees mutual exclusion across nodes — that mechanism is a distributed lock.

Here the shared resource is the Huawei Pro data in the DB: only one thread (one request) should be allowed to load it, and the distributed lock enforces that.

Look at the picture:

Competing for the lock:

After failing to find the Huawei Pro key in the cache, a large number of requests head for the DB. The DB-loading code is guarded by a distributed lock, so every thread first tries to acquire the lock for Huawei Pro (the figure uses a Redis-based lock; I will cover distributed-lock implementations, not limited to Redis, in a separate article).

After obtaining the lock:

Suppose thread A wins the lock for Huawei Pro. Thread A loads the data from the DB, writes Huawei Pro back to the cache, and returns the data.

The other threads, which failed to get the lock, have options. One is to return null to the caller immediately. Another is to wait 50–100 ms — querying the DB and writing to Redis is fast — and then query the cache again; the value will probably be there by then. If it still isn't, return null or retry. Under heavy concurrency, though, you want to return results quickly and avoid too many retries.
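The acquire-or-wait flow above can be sketched like this. A local ReentrantLock stands in for the distributed lock here (a real distributed version might use Redis `SET key value NX EX`); the DB call and the 50 ms wait are illustrative stand-ins.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// One thread rebuilds the cache entry; the rest wait briefly and re-check.
public class LockedReload {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final ReentrantLock lock = new ReentrantLock();

    static String loadFromDb(String key) { return "db-" + key; } // pretend DB

    public static String get(String key) throws InterruptedException {
        String v = cache.get(key);
        if (v != null) return v;                  // cache hit
        if (lock.tryLock()) {                     // winner rebuilds the cache
            try {
                v = loadFromDb(key);
                cache.put(key, v);
                return v;
            } finally {
                lock.unlock();
            }
        }
        Thread.sleep(50);                         // losers wait ~50 ms...
        return cache.get(key);                    // ...then re-check; may still
                                                  // be null, caller decides
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(get("huawei-pro"));
    }
}
```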

3. Refresh hot keys with a scheduled task

A scheduled task periodically monitors the remaining TTL of hot keys and extends the cache time of any key that is about to expire.

A single thread polls, checking and extending the expiration times.

If one thread cannot keep up with a large number of hot keys, use a thread pool. See the figure below:
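A rough sketch of such a refresher, assuming TTLs tracked in a local map rather than Redis (a real task would call EXPIRE on Redis instead); the key name, intervals, and thresholds are invented for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// A scheduled task that polls hot keys and extends any TTL about to run out.
public class HotKeyRefresher {
    // key -> absolute expiry time in ms (stands in for Redis TTLs)
    static final Map<String, Long> expiries = new ConcurrentHashMap<>();

    // Extend every key expiring within thresholdMs by extensionMs.
    static int refresh(long nowMs, long thresholdMs, long extensionMs) {
        int extended = 0;
        for (Map.Entry<String, Long> e : expiries.entrySet()) {
            if (e.getValue() - nowMs < thresholdMs) {
                e.setValue(e.getValue() + extensionMs);
                extended++;
            }
        }
        return extended;
    }

    public static void main(String[] args) {
        expiries.put("huawei-pro", System.currentTimeMillis() + 5_000);
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        // poll every 10s; extend anything expiring within 30s by 10 minutes
        pool.scheduleAtFixedRate(
            () -> refresh(System.currentTimeMillis(), 30_000, 600_000),
            10, 10, TimeUnit.SECONDS);
        pool.shutdown(); // a real service would keep the pool running
    }
}
```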

Delay queue implementation

Frankly, whether single-threaded or pooled, the approach above polls, wasting CPU on repeated checks. The checks are also imprecise: if a key expires between two checks, a breakdown can still happen — low probability, but possible. We can avoid this with a delay queue (a circular queue / timing wheel can implement one; I won't go into the internals here, you can look them up). With a delay queue, you publish a message with a delivery time of your choosing; the consumer only receives it once that time arrives.

  1. When the program starts, read the expiration time of each key in the hot-key list.
  2. Enqueue each key with a delay; note that the delivery time must be earlier than the key's expiration time.
  3. When a delayed message becomes due, the consumer receives the key.
  4. The consumer extends the key's expiration time in the cache.
  5. The key's new expiration time is sent back to the delay queue, waiting for the next cycle.
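The steps above can be sketched with the JDK's built-in java.util.concurrent.DelayQueue. The key name and delays are invented, and the "extend the TTL in Redis" step is left as a comment:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Each hot key is enqueued with a delay shorter than its TTL; when the delay
// elapses, the consumer takes it, renews the TTL, and re-enqueues it.
public class DelayedKey implements Delayed {
    final String key;
    final long dueAtMs;   // absolute time at which this key should be refreshed

    DelayedKey(String key, long delayMs) {
        this.key = key;
        this.dueAtMs = System.currentTimeMillis() + delayMs;
    }

    @Override public long getDelay(TimeUnit unit) {
        return unit.convert(dueAtMs - System.currentTimeMillis(),
                            TimeUnit.MILLISECONDS);
    }

    @Override public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            other.getDelay(TimeUnit.MILLISECONDS));
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedKey> queue = new DelayQueue<>();
        queue.add(new DelayedKey("huawei-pro", 100)); // refresh due in 100 ms
        DelayedKey due = queue.take();                // blocks until it is due
        // here: extend the key's TTL in Redis, then re-enqueue for next cycle
        queue.add(new DelayedKey(due.key, 100));
        System.out.println("refreshed " + due.key);
    }
}
```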

4. Set keys to never expire

Even so, a key can still be evicted when memory runs short, so think through the cases in which a key might be evicted.

Cache penetration

Definition

Penetration means requesting a key that exists in neither the cache nor the database, so the traffic lands directly on the DB. An attacker can exploit this by hammering your interface with such keys until your DB collapses and your business stops running normally.

How to solve it?

1. Cache null or special values

We can store a null (or a special marker) value in Redis, without expiration, so the next request for the same key gets the null or special value straight from Redis.
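A minimal sketch of null caching, with a map standing in for Redis and an arbitrary sentinel string marking "known missing" (a real version would likely also set a short TTL on the sentinel):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache a sentinel for keys the DB does not have, so repeat lookups for the
// same missing key stop at the cache instead of hitting the DB again.
public class NullCaching {
    static final String NULL_MARKER = "__NULL__";
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    static String loadFromDb(String key) { return null; } // pretend: not in DB

    public static String get(String key) {
        String v = cache.get(key);
        if (NULL_MARKER.equals(v)) return null;      // known missing: skip DB
        if (v != null) return v;                     // real cached value
        v = loadFromDb(key);
        cache.put(key, v == null ? NULL_MARKER : v); // remember "missing" too
        return v;
    }

    public static void main(String[] args) {
        System.out.println(get("no-such-sku")); // first call hits the DB
        System.out.println(get("no-such-sku")); // second call stops at cache
    }
}
```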

This does not solve the root problem: if the attacker can fabricate a large number of distinct useless keys, caching nulls and special values does not help. So what then?

2. Bloom filter

Here is just a brief introduction to the Bloom filter; given the length, a dedicated article will follow. Suppose the database holds tens of millions of SKUs, and the requirement is: if the SKU exists, query Redis; on a Redis miss, query the database and then update Redis. The first idea is to put all the SKU data into a HashMap keyed by SKU — but with that many SKUs, the map would occupy a huge amount of memory and might exhaust it, so the cost outweighs the gain.

How to save memory? We can use a bit array to record whether each SKU exists: 0 means absent, 1 means present. Take a hash function, compute the hash of the SKU, take it modulo the bit-array length to find a position, and set that bit to 1. When a request arrives, compute the same position and check the bit: 1 means the SKU probably exists; 0 means it definitely does not. A Bloom filter has a false-positive rate; increasing the array length and the number of hash functions improves accuracy. I won't go deeper into the details today.

Here's how a Bloom filter is used to prevent cache penetration:

A scheduled task reads the DB and initializes the Bloom filter: allocate the bit array (all zeros, meaning nothing exists), compute the hash positions for each record, and set those bits to 1.
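Here is a toy Bloom filter along the lines described above: each of k hash functions sets one bit per item, and a query returns "definitely absent" or "probably present". The bit-array size and hash seeds are arbitrary illustration values; a real system would use a tuned library implementation.

```java
import java.util.BitSet;

// Minimal Bloom filter: three seeded polynomial hashes over a BitSet.
public class SimpleBloomFilter {
    final BitSet bits;
    final int size;
    final int[] seeds = {7, 31, 131};   // one seed per hash function

    SimpleBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    int hash(String value, int seed) {
        int h = 0;
        for (int i = 0; i < value.length(); i++) {
            h = seed * h + value.charAt(i);
        }
        return Math.floorMod(h, size);  // map into the bit array
    }

    void add(String value) {
        for (int seed : seeds) bits.set(hash(value, seed));
    }

    boolean mightContain(String value) {
        for (int seed : seeds) {
            if (!bits.get(hash(value, seed))) return false; // definitely absent
        }
        return true; // probably present (false positives are possible)
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 20);
        filter.add("sku-1001"); // init step: load existing SKUs from the DB
        System.out.println(filter.mightContain("sku-1001")); // probably present
        System.out.println(filter.mightContain("sku-9999")); // almost surely absent
    }
}
```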

The request flow — see the diagram:

Without the Bloom filter, a key absent from the database wastes two useless I/Os: one query to Redis and one to the DB. With the Bloom filter, both are saved, reducing the waste of backend Redis and DB resources.

Conclusion

Today we covered the high-frequency interview questions — and real production problems — around the Redis cache, along with their solutions.

  • Cache avalanche

Solution:

  1. When setting expiration times, add a random offset, on the order of a few minutes.
  2. If an avalanche happens anyway, contain it with rate limiting.
  • Cache breakdown

Solution:

  1. Rate limiting
  2. A distributed lock
  3. Refreshing hot keys on a schedule, ideally with a delay queue
  4. Setting keys to never expire
  • Cache penetration

Solution:

  1. Cache a null or special value in Redis
  2. Use a Bloom filter

That's all for today. Writing and drawing aren't easy — I look forward to your likes, follows, and shares.

Your likes and follows are the biggest motivation for Hurricane to keep creating.

If you have questions, please leave a comment for discussion or corrections.

Welcome to Github

Add me on WeChat: Zookeeper0