Cache penetration

  • Cache penetration refers to queries for data that exists in neither the cache layer nor the storage layer. Usually, for fault-tolerance reasons, data that cannot be retrieved from the storage layer is not written back to the cache layer, so every request for non-existent data goes all the way to the storage layer, and the cache loses its purpose of protecting back-end storage. There are two basic causes of cache penetration: first, a problem in the business code or data itself; second, malicious attacks, crawlers, and the like generating a large number of misses.

Cache penetration solutions:

1. Cache empty objects

String get(String key) {
    // Retrieve data from the cache
    String cacheValue = cache.get(key);
    // The cache is empty
    if (StringUtils.isBlank(cacheValue)) {
        // Fetch from storage
        String storageValue = storage.get(key);
        cache.set(key, storageValue);
        // If the stored data is empty, set an expiration time (300 seconds)
        if (storageValue == null) {
            cache.expire(key, 60 * 5);
        }
        return storageValue;
    } else {
        // The cache is not empty
        return cacheValue;
    }
}

2. Bloom filter

  • For cache penetration caused by malicious attacks that request large numbers of non-existent keys, a Bloom filter can also be used as a pre-filter: requests for data that does not exist are generally filtered out and never reach the back end. When a Bloom filter says a value exists, it may not actually exist; when it says a value does not exist, it definitely does not exist.

  • A Bloom filter is a large bit array plus several different unbiased hash functions. "Unbiased" means each hash function spreads the hash values of elements evenly across the array.
  • When a key is added to a Bloom filter, each hash function hashes the key to an integer, which is taken modulo the length of the bit array to get a position; each hash function yields a different position. Adding the key means setting each of those bits to 1.
  • When asking a Bloom filter whether a key exists, the same hash positions are computed and checked. If any of those bits is 0, the Bloom filter definitely does not contain the key. If they are all 1, the key probably exists but is not guaranteed to, because those bits may have been set to 1 by other keys. The sparser the bit array, the higher the accuracy of the answer; the more crowded it is, the lower the accuracy. This approach suits scenarios where the hit rate is low and the data set is relatively fixed and large with low real-time requirements; code maintenance is more complex, but the cache space used is small.
  • Note: a standard Bloom filter cannot delete data. To remove entries, the filter must be rebuilt from scratch.
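
A rough sketch of this filtering flow, using Guava's in-memory BloomFilter for illustration (the sizing numbers, the warmUp step, and the cacheOrStorageGet helper are assumptions for this example; a Redis-backed filter such as Redisson's RBloomFilter follows the same pattern and can be shared across application instances):

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

// Sized for roughly 1,000,000 keys with a ~1% false-positive rate (assumed numbers)
BloomFilter<String> keyFilter =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

// Pre-load the filter with every key that actually exists in storage
void warmUp(Iterable<String> existingKeys) {
    existingKeys.forEach(keyFilter::put);
}

String get(String key) {
    // "Does not exist" is definite: reject the request before it touches cache or storage
    if (!keyFilter.mightContain(key)) {
        return null;
    }
    // "Exists" is only probable: fall through to the normal cache -> storage lookup,
    // e.g. the "cache empty objects" logic shown above
    return cacheOrStorageGet(key);
}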

Cache failure (breakdown)

  • If a large batch of cached keys expires at the same time, the corresponding requests all penetrate the cache and hit the database simultaneously, which can overload the database or even bring it down instantly. When writing a batch of entries to the cache, it is best to spread their expiration times across an interval instead of giving them all the same value.
  • Sample pseudocode:
String get(String key) {
    // Retrieve data from the cache
    String cacheValue = cache.get(key);
    // The cache is empty
    if (StringUtils.isBlank(cacheValue)) {
        // Fetch from storage
        String storageValue = storage.get(key);
        cache.set(key, storageValue);
        // Set a randomized expiration time (between 300 and 600 seconds)
        int expireTime = new Random().nextInt(300) + 300;
        if (storageValue == null) {
            cache.expire(key, expireTime);
        }
        return storageValue;
    } else {
        // The cache is not empty
        return cacheValue;
    }
}

Cache avalanche

  • A cache avalanche occurs when the cache layer fails or goes down and all the traffic stampedes into the back-end storage layer.
  • Because the cache layer carries a large number of requests, it effectively protects the storage layer. But if the cache layer cannot serve for some reason (for example, the load is too large for the cache layer to support, or poor cache design, such as a large number of requests hitting a bigkey, drastically reduces the concurrency the cache can handle), a large number of requests will reach the storage layer, its call volume will soar, and the storage layer may cascade into failure as well.
  • There are three ways to prevent and solve cache avalanche.
  • 1) Ensure high availability of cache layer services, such as using Redis Sentinel or Redis Cluster.
  • 2) Rely on isolation components for back-end rate limiting, circuit breaking, and degradation, for example rate-limiting/degradation components such as Sentinel or Hystrix. With service degradation, different data can be treated differently: when a business application accesses non-core data (such as e-commerce product attributes, user information, etc.), it temporarily stops querying the cache for that data and directly returns a predefined default degraded value, a null value, or an error message; when it accesses core data (such as e-commerce inventory), it is still allowed to query the cache and, on a miss, to continue reading from the database. A minimal degradation sketch follows this list.
  • 3) Rehearse in advance. Before the project goes live, rehearse the load on the application and the back end, as well as the problems that may arise once the cache layer breaks down, and prepare contingency plans on that basis.
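
A minimal sketch of the degradation idea above (this is not the Sentinel or Hystrix API; cache, storage, CacheUnavailableException, and DEFAULT_DEGRADED_VALUE are placeholders assumed for the example):

String getWithDegradation(String key, boolean isCoreData) {
    try {
        // Normal path: read from the cache first
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        // On a cache miss, core data may still fall back to the database;
        // non-core data returns a predefined degraded default instead
        return isCoreData ? storage.get(key) : DEFAULT_DEGRADED_VALUE;
    } catch (CacheUnavailableException e) {
        // Cache layer is down: degrade non-core reads, let core reads go to storage
        return isCoreData ? storage.get(key) : DEFAULT_DEGRADED_VALUE;
    }
}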

Hotspot cache key reconstruction optimization

  • Developers use a “cache + expiration time” strategy to speed up data reads and writes while ensuring that data is updated on a regular basis, which is basically enough to meet most requirements. But there are two problems that can be deadly if they happen together:
  • The current key is a hot key (for example, a hot entertainment news key) with a large number of concurrent requests.
  • Rebuilding the cache cannot be done in a short time; it may involve complex computation, such as complex SQL, multiple I/O operations, multiple dependencies, etc.
  • At the moment the cache expires, a large number of threads rebuild the cache at the same time, increasing the back-end load and possibly even crashing the application.
  • The main solution to this problem is to avoid a large number of threads rebuilding the cache simultaneously.
  • We can solve this problem by using a mutex, which allows only one thread to rebuild the cache, while the other threads wait for the rebuild thread to finish and retrieve data from the cache again.
  • Sample pseudocode:
String get(String key) {
    // Get data from Redis
    String value = redis.get(key);
    // If value is null, start rebuilding the cache
    if (value == null) {
        // Only one thread is allowed to rebuild the cache, using nx plus an expiration time ex
        String mutexKey = "mutex:key:" + key;
        if (redis.set(mutexKey, "1", "ex 180", "nx")) {
            // Get data from the data source
            value = db.get(key);
            // Write back to Redis and set the expiration time
            redis.setex(key, timeout, value);
            // Remove the mutex key
            redis.delete(mutexKey);
        } else {
            // Other threads sleep for 50 ms and then retry
            Thread.sleep(50);
            return get(key);
        }
    }
    return value;
}

Cache and database double-write inconsistency

  • Under high concurrency, operating the database and the cache at the same time can lead to data inconsistency problems.
  • 1. Double-write inconsistency

  • 2. Read/write concurrency inconsistency

Solutions:

  • 1. For data with a very low probability of concurrent access (such as per-user order data, user profile data, etc.), this problem hardly needs to be considered, since cache inconsistency rarely occurs. Simply add an expiration time to the cached data and trigger an active refresh on reads at intervals.
  • 2. Even under high concurrency, if the business can tolerate briefly inconsistent cache data (such as product names, product category menus, etc.), a cache with an expiration time still meets most caching requirements.
  • 3. If inconsistent cache data cannot be tolerated, read/write locks can be added so that concurrent read/write and write/write operations queue up in order, while read/read remains effectively lock-free (see the sketch after this list).
  • 4. Alibaba's open-source Canal can also be used to update the cache promptly by monitoring the database binlog, but it introduces new middleware and increases system complexity.
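
A minimal sketch of option 3 using Redisson's distributed read/write lock (the lock and key names and the cache/db helpers are assumptions for the example):

String readProduct(String productId) {
    RReadWriteLock rwLock = redisson.getReadWriteLock("lock:product:" + productId);
    rwLock.readLock().lock();     // read/read is shared; reads only wait for writers
    try {
        String value = cache.get("product:" + productId);
        if (value == null) {
            value = db.getProduct(productId);
            cache.set("product:" + productId, value);
        }
        return value;
    } finally {
        rwLock.readLock().unlock();
    }
}

void updateProduct(String productId, String newValue) {
    RReadWriteLock rwLock = redisson.getReadWriteLock("lock:product:" + productId);
    rwLock.writeLock().lock();    // writers queue up in order
    try {
        db.updateProduct(productId, newValue);
        cache.delete("product:" + productId);   // invalidate (or rewrite) the cached entry
    } finally {
        rwLock.writeLock().unlock();
    }
}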

  • Conclusion:
  • The discussion above adds a cache to improve performance in read-heavy, write-light scenarios. If writes dominate and inconsistent cache data cannot be tolerated, there is no need to add a cache at all; operate the database directly. Data placed in a cache should not carry strict real-time or consistency requirements. Do not over-design and over-engineer the system just to use caching while still guaranteeing absolute consistency.

Redis has three cleanup strategies for expired keys:

  1. Passive deletion: When an expired key is read or written, the lazy deletion policy is triggered and the expired key is deleted directly
  2. Active deletion: Because the lazy deletion policy cannot ensure that cold data is deleted in a timely manner, Redis periodically actively removes a batch of expired keys
  3. When the current used memory exceeds the maxmemory limit, the active cleanup (eviction) policy is triggered
  • Before Redis 4.0, there were 6 memory eviction policies in total; version 4.0 added 2 more, for a total of 8:

A) Handle the key with the expiration time set:

  1. volatile-ttl: among key-value pairs that have an expiration time set, delete the ones expiring soonest first.
  2. volatile-random: as the name suggests, randomly delete key-value pairs that have an expiration time set.
  3. volatile-lru: use the LRU algorithm to select and delete key-value pairs that have an expiration time set.
  4. volatile-lfu: use the LFU algorithm to select and delete key-value pairs that have an expiration time set.

B) Handle all keys:

  1. Allkeys-random: Randomly selects and deletes data from all key-value pairs.
  2. Allkeys-lru: Filters and deletes all data using the LRU algorithm.
  3. Allkeys-lfu: Filters and deletes all data using the LFU algorithm.

C) Not processed:

  1. noeviction: does not evict any data; rejects all write operations and returns the client error "(error) OOM command not allowed when used memory > 'maxmemory'". Redis then only serves read operations.

LRU algorithm (Least Recently Used)

  • Evicts data that has not been accessed for the longest time, using the most recent access time as the reference.

LFU algorithm (Least Frequently Used, Least Frequently Used)

  • Evicts the data accessed least often over the recent period, using the access count as the reference.
  • LRU works very well when there is hot data, but occasional or periodic batch operations can cause the LRU hit ratio to drop sharply and seriously pollute the cache. In such cases LFU may be the better choice.
  • Configure maxmemory-policy (default noeviction) according to the business type; volatile-lru is recommended (a sample configuration appears after this list). If no maximum memory is set, then once Redis memory exceeds the physical memory limit, data in memory starts swapping frequently with the disk and Redis performance drops sharply.
  • When Redis runs in master/slave mode, only the master node executes the expiration-deletion policy; it then synchronizes the delete operation "del key" to the slave nodes.
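
A minimal redis.conf sketch for the recommendation above (the 4gb limit is an assumed value; size it to the actual instance):

# Cap memory usage so eviction kicks in instead of OS swapping
maxmemory 4gb
# Evict the least recently used keys among those with an expiration time set
maxmemory-policy volatile-lru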