In real-world projects, caching is a key component of any high-concurrency, high-performance architecture. So why is Redis the usual choice for a cache? Start with the two main characteristics a cache needs:

  • In the storage hierarchy it sits in a layer (memory/CPU) with good access performance;
  • When the cache becomes saturated, it has a good data eviction mechanism.

Redis has both of these characteristics naturally: it operates entirely in memory and ships with a complete set of data eviction mechanisms, which makes it very well suited as a cache component.

Being memory-based, its capacity typically ranges from 32 to 96 GB and the average operation takes about 100 ns, so operations are highly efficient. In addition, Redis offers many eviction mechanisms; since Redis 4.0 there are 8 of them, which lets Redis serve as a cache in a wide range of scenarios.

So why does the Redis cache need an eviction mechanism at all, and what are the eight eviction policies?

Data eviction mechanism

The Redis cache is based on memory, so its capacity is limited. When the cache is full, what should Redis do?

When the cache is full, Redis needs a data eviction mechanism: certain eviction rules select some data to delete so that the cache service can keep accepting new data. So which eviction policies does Redis use to filter out and delete data?

Since Redis 4.0 there are 6+2 (eight) cache eviction policies, falling into three broad categories (a configuration sketch follows the list below):

  • No eviction

    • noeviction: when the cache is written full, Redis stops serving writes and returns an error directly.
  • Among the key-value pairs that have an expiration time set:

    • volatile-random: randomly deletes key-value pairs that have an expiration time set
    • volatile-ttl: deletes key-value pairs that have an expiration time set, ordered by expiration time; the earlier the expiration time, the earlier the pair is deleted
    • volatile-lru: uses the LRU (Least Recently Used) algorithm to filter the key-value pairs that have an expiration time set
    • volatile-lfu: uses the LFU (Least Frequently Used) algorithm to filter the key-value pairs that have an expiration time set, evicting the least frequently used ones
  • Among all key-value pairs:

    • allkeys-random: randomly selects and deletes data from all key-value pairs
    • allkeys-lru: uses the LRU algorithm to filter across all data
    • allkeys-lfu: uses the LFU algorithm to filter across all data
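Eviction is driven by configuration. A minimal redis.conf sketch (the memory limit and sample size below are illustrative values, not recommendations):

```
# Cap the memory Redis may use for data before eviction kicks in
maxmemory 2gb

# Evict across all keys using approximate LRU once the limit is reached
maxmemory-policy allkeys-lru

# Number of keys sampled per eviction decision (trades accuracy for CPU)
maxmemory-samples 5
```

The same settings can also be changed at runtime with CONFIG SET.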

Note: classic LRU maintains a doubly linked list. The head and tail of the list are the MRU end and the LRU end, holding the most recently used and the least recently used data respectively.

When the LRU algorithm is implemented literally, a linked list has to track all of the cached data, which adds space overhead. Moreover, every access has to move the accessed entry to the MRU end of the list; if a large amount of data is being accessed, this causes a lot of list manipulation, which is time-consuming and degrades Redis cache performance.
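For comparison, here is a minimal sketch in Java of the textbook LRU cache described above, built on LinkedHashMap (which maintains the doubly linked list internally). It illustrates the classic algorithm only, not Redis's implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Classic LRU cache: the most recently used entry moves to the tail (MRU end)
// and the eldest (LRU end) entry is evicted once capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true: get() reorders entries
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```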

To avoid that cost, Redis bases its LRU and LFU on the lru field of the Redis object structure redisObject (which also carries a refcount reference count):

```c
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* The last time the object was accessed:
                            * LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;          /* reference count */
    void *ptr;
} robj;
```

Redis's approximate LRU uses the lru field of redisObject to record when each key was last accessed. When eviction is needed, Redis randomly selects as many keys as the maxmemory-samples parameter specifies to form a candidate set, then evicts the candidate with the smallest lru value, i.e. the one least recently accessed.
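To make the sampling idea concrete, here is a rough sketch (a simplification with illustrative names, not Redis source code) of picking an eviction victim from a random candidate set:

```java
import java.util.List;
import java.util.Random;

// Approximate-LRU sketch: sample `maxmemorySamples` random entries and evict
// the one with the smallest (oldest) lru clock value.
class ApproxLruSketch {
    static class Entry {
        final String key;
        final long lruClock;           // last-access time, analogous to redisObject.lru
        Entry(String key, long lruClock) { this.key = key; this.lruClock = lruClock; }
    }

    static String pickVictim(List<Entry> allEntries, int maxmemorySamples, Random rnd) {
        Entry victim = null;
        for (int i = 0; i < maxmemorySamples; i++) {
            Entry candidate = allEntries.get(rnd.nextInt(allEntries.size()));
            if (victim == null || candidate.lruClock < victim.lruClock) {
                victim = candidate;    // keep the least recently used candidate so far
            }
        }
        return victim.key;
    }
}
```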

In a real project, how do you choose an eviction policy?

  • Prefer the allkeys-lru policy, so that the most recently accessed data stays in the cache and application access performance improves.
  • If some data (for example pinned, top-ranked content) must always stay in the cache, do not set an expiration time on it and use the volatile-lru policy: only the data that does have an expiration time is then filtered by the LRU rules.

Now that you understand the Redis cache eviction mechanism, how many modes can Redis operate in as a cache?

Redis cache mode

Depending on whether it accepts write requests, the Redis cache can work in two modes: read-only cache and read/write cache.

Read-only cache: only read operations are handled; all update operations go to the database, so there is no risk of data loss.

  • Cache Aside mode

Read/write cache: both read and write operations go to the cache, so data can be lost if the system goes down. There are two ways of writing data back to the database, synchronous and asynchronous:

  • Synchronous: access performance is lower, but data reliability matters more

    • Read-Through mode
    • Write-Through mode
  • Asynchronous: there is a risk of data loss, but the focus is on low-latency access

    • Write-Behind mode

Cache Aside mode

When querying data, read from the cache first; if the data is not in the cache, read it from the database and backfill it into the cache. When updating data, however, update the database first and then invalidate the corresponding cache entry.
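A minimal Cache Aside sketch in Java; the Cache and Db interfaces and the TTL values are illustrative assumptions, not a specific library's API:

```java
// Cache Aside: the application manages the cache explicitly around the database.
interface Cache {
    String get(String key);
    void set(String key, String value, int ttlSeconds);
    void delete(String key);
}
interface Db {
    String query(String key);
    void update(String key, String value);
}

class CacheAside {
    private final Cache cache;
    private final Db db;
    CacheAside(Cache cache, Db db) { this.cache = cache; this.db = db; }

    // Read path: try the cache first, fall back to the database and backfill.
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.query(key);
            if (value != null) cache.set(key, value, 600); // backfill with a TTL
        }
        return value;
    }

    // Write path: update the database first, then invalidate the cached copy.
    void write(String key, String value) {
        db.update(key, value);
        cache.delete(key);
    }
}
```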

Cache Aside mode also carries a concurrency risk: a read misses the cache and queries the database, but before it can put the result into the cache, a concurrent update writes the database and invalidates the cache entry; the read then loads the value it queried (now stale) into the cache, leaving dirty data there.

Read/Write-Through mode

Both queries and updates go directly through the cache service, which synchronously propagates updates to the database. The probability of dirty data is low, but the approach depends heavily on the cache and so places high demands on the stability of the cache service, and the synchronous updates mean lower performance.
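A minimal sketch of the idea, again assuming hypothetical Cache and Db interfaces: the application only talks to the cache layer, which reads through to and writes through to the database synchronously:

```java
// Read-Through / Write-Through: the cache layer owns access to the database.
class ReadWriteThrough {
    interface Cache { String get(String key); void set(String key, String value); }
    interface Db    { String query(String key); void update(String key, String value); }

    private final Cache cache;
    private final Db db;
    ReadWriteThrough(Cache cache, Db db) { this.cache = cache; this.db = db; }

    // Read-Through: on a miss, the cache layer loads from the database and backfills.
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.query(key);
            if (value != null) cache.set(key, value);
        }
        return value;
    }

    // Write-Through: update the cache and the database synchronously in one call.
    void write(String key, String value) {
        cache.set(key, value);
        db.update(key, value);
    }
}
```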

Write-Behind mode

Both queries and updates go directly through the cache service, but the cache service updates the database asynchronously (through asynchronous tasks). This is fast and efficient, but data consistency is weaker, data may be lost, and the implementation logic is more complicated.
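A minimal Write-Behind sketch with hypothetical interfaces; the single-threaded executor stands in for the asynchronous task queue, and a production version would additionally need batching, retries, and ordering guarantees:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Write-Behind: writes land in the cache synchronously and are flushed to the
// database asynchronously, so the write path is fast but data can be lost on a crash.
class WriteBehindSketch {
    interface Cache { void set(String key, String value); }
    interface Db    { void update(String key, String value); }

    private final Cache cache;
    private final Db db;
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();

    WriteBehindSketch(Cache cache, Db db) { this.cache = cache; this.db = db; }

    void write(String key, String value) {
        cache.set(key, value);                        // fast path: cache only
        flusher.submit(() -> db.update(key, value));  // asynchronous write-back
    }
}
```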

In actual project development, the cache mode is chosen according to the requirements of the business scenario. With that in mind, why do we need a Redis cache in an application at all?

Using a Redis cache in an application improves system performance and concurrency, mainly in the following respects:

  • High performance: memory-based lookups, a simple KV structure, and simple logical operations
  • High concurrency: MySQL can only support around 2,000 requests per second, while Redis easily supports more than 10,000. Routing more than 80% of queries to the cache and less than 20% to the database greatly increases overall system throughput.

Although a Redis cache can greatly improve system performance, using a cache introduces problems of its own, such as double-write inconsistency between the cache and the database, cache avalanche, and so on. How do we solve these problems?

Common problems with caching

Using a cache brings some problems, mainly:

  • Double-write inconsistency between the cache and the database
  • Cache avalanche: the Redis cache cannot handle a large number of application requests, which all shift to the database layer and cause a surge of pressure there;
  • Cache penetration: the requested data exists in neither the Redis cache nor the database, so a large number of requests pass straight through the cache to the database, causing a surge of pressure on the database layer;
  • Cache breakdown: the cache can no longer serve a piece of frequently accessed hotspot data, so the high-frequency accesses hit the database directly, causing a surge of pressure on the database layer.

Inconsistency between cache and database

Cache Aside mode

In Cache Aside mode, all reads happen in the cache, and data inconsistency can only arise from delete/update operations (not from additions, because additions are processed only in the database); on such an operation, the cached data is marked invalid and the database is updated. So in the pair of operations "update the database" and "delete the cached value", no matter which one runs first, if either one fails the data becomes inconsistent.

The conclusion: when there is no concurrency, use a retry mechanism (backed by a message queue); under high concurrency, use delayed double delete (after the first delete, sleep for a while and then delete again). The details are as follows:

| Operation order | Concurrent? | Potential problem | Phenomenon | Response |
| --- | --- | --- | --- | --- |
| Delete the cache first, then update the database | No | The cache deletion succeeds, but the database update fails | The old value is read from the database | Retry mechanism (retry the database update) |
| Update the database first, then delete the cache | No | The database update succeeds, but the cache deletion fails | The old value is read from the cache | Retry mechanism (retry the cache deletion) |
| Delete the cache first, then update the database | Yes | After the cache is deleted and before the database is updated, concurrent read requests arrive | The concurrent reads load the old database value back into the cache, so subsequent reads keep returning the old value | Delayed double delete |
| Update the database first, then delete the cache | Yes | The database has been updated but the cache has not yet been deleted | The old value is read from the cache | The inconsistency window is short and has little business impact |

NOTE:

Delayed double delete pseudocode:

```
redis.delKey(X)    // first delete, before updating the database
db.update(X)       // update the database
Thread.sleep(N)    // sleep long enough for in-flight reads to finish backfilling
redis.delKey(X)    // second delete clears any stale value written back in the meantime
```

Read/write cache (Read/Write-Through and Write-Behind modes)

For a read/write cache, write operations go to the cache first and the database is then updated; if either operation fails, the data becomes inconsistent.

The conclusion: when there is no concurrency, use a retry mechanism (backed by a message queue); under high concurrency, use a distributed lock. The details are as follows:

| Operation order | Concurrency | Potential problem | Phenomenon | Response |
| --- | --- | --- | --- | --- |
| Update the cache first, then the database | No | The cache update succeeds, but the database update fails | The latest value is read from the cache, so the short-term impact is small | Retry mechanism (retry the database update) |
| Update the database first, then the cache | No | The database update succeeds, but the cache update fails | The old value is read from the cache | Retry mechanism (retry the cache update) |
| Update the database first, then the cache | Write + read concurrency | Thread A updates the database, thread B then reads the data, and only afterwards does thread A update the cache | B hits the cache and reads the old value | The business is affected only briefly, until A's cache update completes |
| Update the cache first, then the database | Write + read concurrency | Thread A updates the cache, thread B then reads the data (hitting the cache and getting the latest value), and afterwards thread A updates the database | B hits the cache and reads the latest value | Business is not affected |
| Update the database first, then the cache | Write + write concurrency | Threads A and B update the same data concurrently; the database is updated in the order A then B, but the cache is updated in the order B then A | The database and the cache end up inconsistent | Guard write operations with a distributed lock (see the sketch after this table) |
| Update the cache first, then the database | Write + write concurrency | Threads A and B update the same data concurrently; the cache is updated in the order A then B, but the database is updated in the order B then A | The database and the cache end up inconsistent | Guard write operations with a distributed lock |
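The table above recommends guarding concurrent writes with a distributed lock. A minimal sketch of that idea built on Redis's SET ... NX EX command (set-if-absent with an expiry); the RedisClient wrapper below is a hypothetical interface, not a specific client library's API, and a production lock would also need a unique token so a holder cannot release someone else's lock:

```java
// Serialize writes to the same key with a simple Redis-based lock.
class WriteWithLock {
    interface RedisClient {
        boolean setNxEx(String key, String value, int ttlSeconds); // SET key value NX EX ttl
        void delete(String key);
    }

    private final RedisClient redis;
    WriteWithLock(RedisClient redis) { this.redis = redis; }

    void update(String dataKey, Runnable updateCacheAndDb) throws InterruptedException {
        String lockKey = "lock:" + dataKey;
        // Spin until the lock is acquired; the TTL guards against a crashed holder.
        while (!redis.setNxEx(lockKey, "1", 10)) {
            Thread.sleep(50);
        }
        try {
            updateCacheAndDb.run();  // update cache and database in a fixed order
        } finally {
            redis.delete(lockKey);   // release the lock
        }
    }
}
```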

Cache avalanche

Cache avalanche: because a large amount of cached data expires at the same time, or the cache itself goes down, a large number of application requests cannot be handled by the Redis cache and are sent on to the database layer, causing a surge of pressure there and possibly even bringing the database down.

When a large amount of cached data expires at the same time and a flood of requests can no longer be served from the cache, the solutions are as follows:

  • Data preheating: manually trigger the loading of the relevant cache keys before large bursts of concurrent access, so that user requests do not have to query the database
  • Set different expiration times so that cache entries expire as evenly as possible (see the sketch after this list)
  • Two-layer cache policy: add a copy cache on top of the original cache; when the original cache is unavailable, the copy cache can still be accessed. Give the original cache a short expiration time and the copy cache a longer one
  • Service degradation: when a cache avalanche occurs, apply different degradation schemes to different data, for example returning predefined information, null values, or error messages for non-core data
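A tiny sketch of the staggered-expiration idea: add random jitter to the base TTL so that keys written at the same moment do not all expire at the same moment (the numbers are illustrative):

```java
import java.util.concurrent.ThreadLocalRandom;

// Spread expirations by adding 0-300 seconds of random jitter to the base TTL.
class TtlJitter {
    static int ttlWithJitter(int baseTtlSeconds) {
        return baseTtlSeconds + ThreadLocalRandom.current().nextInt(0, 301);
    }
}
```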

For a cache outage, the solution is as follows:

  • Implement a service circuit breaker or request rate limiting in the business system to prevent a flood of accesses from bringing down the database

Cache penetration

Cache penetration: the data exists in neither the database nor the cache, so a query finds no corresponding key in the cache, has to go to the database anyway, and then returns null (effectively two useless lookups).

When there is a large number of such requests, they all bypass the cache and query the database directly, causing a surge of pressure on the database layer and possibly bringing the database down.

For cache penetration, the solutions are:

  • Cache a null or default value: when a query returns no data, cache the empty result as well and give it a short expiration time. The next access is served directly from the cache, which avoids sending a large number of requests to the database and overwhelming it (see the sketch after this list).
  • BloomFilter: hash all the keys that could possibly be queried into a sufficiently large bitmap. On a query, first check the BloomFilter; if the key is not present, return immediately without touching the database. This avoids the surge of pressure on the database layer.
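A sketch of the cache-a-null-value defence, using the same kind of hypothetical Cache and Db interfaces as the Cache Aside sketch above; the sentinel value and the TTLs are illustrative:

```java
// Cache misses as well as hits, so repeated lookups of non-existent keys stop at the cache.
class NullCachingSketch {
    interface Cache { String get(String key); void set(String key, String value, int ttlSeconds); }
    interface Db    { String query(String key); }

    private static final String NULL_SENTINEL = "<null>"; // marks "known to be missing"
    private final Cache cache;
    private final Db db;
    NullCachingSketch(Cache cache, Db db) { this.cache = cache; this.db = db; }

    String read(String key) {
        String value = cache.get(key);
        if (value != null) {
            return NULL_SENTINEL.equals(value) ? null : value; // cached miss short-circuits the DB
        }
        value = db.query(key);
        if (value == null) {
            cache.set(key, NULL_SENTINEL, 60);  // cache the miss with a short TTL
            return null;
        }
        cache.set(key, value, 600);             // cache the hit with a normal TTL
        return value;
    }
}
```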

Cache breakdown

Cache breakdown: a piece of frequently accessed hotspot data expires and becomes invalid, so its accesses can no longer be served from the cache; a large number of requests then go directly to the database, causing a surge of pressure on the database layer and possibly bringing the database down.

For cache breakdown, the solution is:

  • Do not set an expiration time on hotspot data that is accessed frequently.

Conclusion

In most business scenarios, the Redis cache is used as a read-only cache. For a read-only cache, prefer updating the database first and then deleting the cache, to ensure data consistency.

The causes of and solutions to cache avalanche, cache penetration, and cache breakdown are summarized below:

| Problem | Cause | Solution |
| --- | --- | --- |
| Cache avalanche | A large amount of data expires at the same time, or the cache goes down | Data preheating; staggered expiration times; two-layer cache policy; service degradation; service circuit breaking and rate limiting |
| Cache penetration | The data exists in neither the database nor the cache | Cache null or default values; BloomFilter |
| Cache breakdown | Frequently accessed hotspot data expires | Do not set an expiration time on frequently accessed hotspot data |

Source: segmentfault.com/a/119000003…