Hello everyone, I am the Breeze! I've shared *Redis High-Concurrency Scenario Design: Everything the Interviewer Asks Is Here!* and *A Java Engineer's Interview Guide at a Major Internet Company — Redis* before. Today let's talk through the frequently asked points (cache penetration, cache avalanche, and the rest)! If you find it helpful, please follow and like, thank you for your support!

Before we begin

A detailed, illustrated review guide is available on GitHub, covering the must-know Java back-end interview questions for 2020. GitHub address: Github.com/Java-Ling/J…

One, cache avalanche

1. What is a cache avalanche?

A cache avalanche happens when a large batch of keys is cached with the same expiration time, so they all become invalid at the same moment. Every request is then forwarded to the DB, and the DB collapses instantly under the load. Because the original cache entries are invalid, all the requests that should have hit the cache go to the database before the new cache entries arrive, which puts enormous pressure on the database's CPU and memory and, in serious cases, brings the database down.

2. How a cache avalanche unfolds

  1. Within a short period of time, a large number of keys in the cache expire together
  2. Requests for this expired data miss in Redis, so the application falls back to fetching it from the database
  3. The database receives a flood of simultaneous requests and cannot process them in time
  4. A large number of Redis requests back up, and timeouts begin to appear
  5. Database traffic surges and the database crashes
  6. After a restart, there is still no usable data in the cache
  7. Redis server resources are heavily occupied and the Redis server crashes
  8. The Redis cluster collapses
  9. The application server cannot get data in time to answer requests; as client requests pile up, the application server crashes
  10. The application server, Redis, and the database are all restarted, but the effect is still unsatisfactory

3. What are the solutions to prevent cache avalanches?

  1. Render more pages statically
  2. Build a multi-level cache architecture: Nginx cache + Redis cache + EhCache
  3. Detect operations that take too long and optimize them to eliminate database bottlenecks, such as timeout queries and time-consuming transactions
  4. Set up a disaster warning mechanism that monitors the Redis server's performance indicators:
    • CPU occupancy and CPU usage
    • Memory capacity
    • Average query response time
    • Number of threads
  5. Traffic limiting and degradation: for a short period, sacrifice some user experience, restrict part of the request traffic, and relieve the pressure on the application servers; release access gradually once services are running again
  6. Adjust the data validity period policy, switching between LRU and LFU eviction as appropriate:
    • Stagger expiration peaks according to the validity period of the business data, e.g. 90 minutes for class A, 80 minutes for class B, 70 minutes for class C
    • Set the expiration time as a fixed time plus a random value, to dilute the number of keys expiring together (see the sketch after this list)
  7. Use permanent keys for super-hot data
  8. Regular maintenance (automatic + manual): analyze the traffic of data about to expire to decide whether to postpone it, and extend hot data based on access statistics
  9. Locking (use sparingly; it serializes cache rebuilding)
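To make the "fixed time + random value" idea concrete, here is a minimal sketch using the Jedis client; the class name, base TTL, and jitter window are illustrative assumptions:

```java
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class JitteredCacheWriter {

    private static final int BASE_TTL_SECONDS = 30 * 60;   // fixed part of the validity period
    private static final int MAX_JITTER_SECONDS = 10 * 60; // random spread on top of it

    private final Jedis jedis;

    public JitteredCacheWriter(Jedis jedis) {
        this.jedis = jedis;
    }

    // Keys written together no longer expire together: each gets
    // "fixed time + random value" as its TTL.
    public void set(String key, String value) {
        int ttl = BASE_TTL_SECONDS + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS);
        jedis.setex(key, ttl, value);
    }
}
```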

4. To summarize

A cache avalanche is when the amount of data expiring at the same instant is so large that it puts the database server under strain. Avoiding concentrated expiration times can effectively prevent the avalanche from occurring (this reportedly covers roughly 40% of cases), and the policy can be used together with the other policies above. In addition, monitor the server's runtime metrics and adjust quickly based on the operating records.

Two, cache warm-up

1. What is cache warm-up

Cache warm-up means loading the relevant cache data into the cache system right after the system goes live. It avoids the problem of first querying the database and then caching the data when users start sending requests: users directly query cache data that has already been preheated.

Without preheating, Redis starts out empty, so in the early stage after launch the high concurrent traffic goes straight to the database, putting the database under traffic pressure.

2. Diagnosing the problem

  1. High number of requests
  2. The data throughput between the master and slave is high, and the data synchronization frequency is high

3. What’s the solution?

Preparatory work:

  1. Collect routine data access records and identify the hotspot data that is accessed frequently
  2. Build a data retention queue using the LRU deletion strategy (for example: Storm working together with Kafka)

Preparation:

  3. Classify the data in the statistics results; by level, Redis preferentially loads the higher-level hotspot data
  4. Use multiple distributed servers to read data concurrently to speed up the loading process
  5. Preheat the master and the slave of hot data at the same time

Implementation:

  6. Use scripts to trigger the data preheating process (a sketch follows below)
  7. If conditions permit, use a CDN (content delivery network) for even better results
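As a rough illustration of step 6, here is a minimal warm-up sketch in Java with the Jedis client; the HotDataDao interface, key set, and 30-minute TTL are illustrative assumptions:

```java
import java.util.Map;
import redis.clients.jedis.Jedis;

public class CacheWarmer {

    // Hypothetical DAO returning the hotspot data identified during the preparatory work
    interface HotDataDao {
        Map<String, String> loadHotData();
    }

    // Run once before (or right as) the system starts serving traffic
    public static void warmUp(Jedis jedis, HotDataDao dao) {
        Map<String, String> hotData = dao.loadHotData();
        hotData.forEach((key, value) -> jedis.setex(key, 30 * 60, value));
        System.out.println("Preheated " + hotData.size() + " hot keys");
    }
}
```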

4. To summarize

Cache warm-up means loading the relevant cache data directly into the cache system before the system starts serving traffic. It avoids the problem of querying the database first and then caching the data when the user requests it! Users directly query the cache data that has already been preheated.

Three, cache penetration

1. What is cache penetration?

Cache penetration happens when a user queries data that does not exist in the database, and therefore certainly not in the cache. The corresponding key is never found in the cache, so every query goes on to the database and comes back empty (two useless lookups each time). The requests effectively bypass the cache and land directly on the database.

2. What is the solution to prevent cache penetration?

1. Cache empty values

If a query returns empty (whether because the data does not exist or because the system failed), we still cache the empty result, but with a very short expiration time, no longer than five minutes. With this default value stored in the cache, the second lookup finds a value in the cache instead of going to the database again. A sketch of this idea follows.
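A minimal sketch of null-value caching with the Jedis client; the sentinel string, the TTL values, and the queryDatabase stand-in are illustrative assumptions:

```java
import redis.clients.jedis.Jedis;

public class NullCachingReader {

    private static final String NULL_SENTINEL = "__NULL__"; // marks "known to be absent"
    private static final int NULL_TTL_SECONDS = 300;        // no more than 5 minutes

    public static String get(Jedis jedis, String key) {
        String cached = jedis.get(key);
        if (NULL_SENTINEL.equals(cached)) {
            return null;   // known miss: do not bother the database again
        }
        if (cached != null) {
            return cached; // normal cache hit
        }
        String fromDb = queryDatabase(key);
        if (fromDb == null) {
            // Cache the empty result briefly so repeated misses stay off the database
            jedis.setex(key, NULL_TTL_SECONDS, NULL_SENTINEL);
            return null;
        }
        jedis.setex(key, 30 * 60, fromDb);
        return fromDb;
    }

    private static String queryDatabase(String key) {
        return null; // stand-in for a real DAO call
    }
}
```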

2. Use the BloomFilter

**Advantages:** small memory footprint (bit-level storage) and high performance; it uses hashes of the key to judge whether the key can exist at all (false positives are possible, false negatives are not).

All possibly existing data is hashed into a sufficiently large bitmap, so data that definitely does not exist is intercepted by the bitmap, which avoids query pressure on the underlying storage system.

A BloomFilter layer is added in front of the cache. During a query, the BloomFilter is consulted first to check whether the key can exist; if it cannot, the request returns directly.
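A minimal sketch using Guava's BloomFilter as one possible implementation; the expected capacity, false-positive rate, and the lookupCacheThenDb stand-in are assumptions:

```java
import java.nio.charset.StandardCharsets;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomGuardedCache {

    // Sized for an assumed 1,000,000 keys at a 1% false-positive rate
    private final BloomFilter<String> filter = BloomFilter.create(
            Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    // Call for every key that actually exists, e.g. while loading from the database
    public void register(String key) {
        filter.put(key);
    }

    public String get(String key) {
        if (!filter.mightContain(key)) {
            return null; // definitely absent: skip both the cache and the database
        }
        return lookupCacheThenDb(key); // the usual Redis -> DB path
    }

    private String lookupCacheThenDb(String key) {
        return null; // stand-in for the normal cache-then-database lookup
    }
}
```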

3. Summary

Cache penetration means accessing non-existent data, skipping the Redis caching stage that legitimate data goes through; every such access reaches the database and puts pressure on the database server. Usually the volume of this kind of data is low, so when the situation occurs it is necessary to raise an alarm promptly. The response strategy should lean toward temporary contingency measures: whether a blacklist or a whitelist, both put pressure on the whole system and should be removed as soon as possible after the alarm is cleared.

Four, cache degradation

Degradation means that when the cache becomes invalid or the cache service goes down, we do not fall back to the database: we either directly access a partial data cache held in memory or directly return default data.

For example:

The home page of an application usually gets a large number of visits and often displays information about recommended products, which is stored in the cache. To guard against abnormal situations in the cache, we also keep the hot product data in memory, along with some default product information. A sketch of that fallback order follows.
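A minimal degradation sketch, assuming the in-memory hot list is refreshed elsewhere; the class, data shapes, and the readFromRedis stand-in are illustrative:

```java
import java.util.List;

public class HomePageService {

    // In-memory copy of hot products, assumed to be refreshed periodically elsewhere
    private volatile List<String> hotProductsInMemory = List.of();

    // Hard-coded defaults that are always available
    private static final List<String> DEFAULT_PRODUCTS =
            List.of("default-product-1", "default-product-2");

    public List<String> recommendedProducts() {
        try {
            List<String> fromCache = readFromRedis(); // normal path
            if (fromCache != null && !fromCache.isEmpty()) {
                return fromCache;
            }
        } catch (RuntimeException cacheServiceDown) {
            // cache invalid or cache service down: fall through to degraded data
        }
        // Degraded path: in-memory hot data first, then the static defaults
        return hotProductsInMemory.isEmpty() ? DEFAULT_PRODUCTS : hotProductsInMemory;
    }

    private List<String> readFromRedis() {
        return null; // stand-in for a Redis read
    }
}
```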

Downgrading is usually a lossy operation, so minimize the impact of downgrading on the business.

Five, cache breakdown

1. What is a cache breakdown?

In a high-concurrency system, when a large number of requests query the same key at the very moment that key expires, they all fall through to the database at once. This phenomenon is called cache breakdown.

2. Troubleshooting

  1. A key in Redis expires while the number of visits to that key is huge
  2. Many data requests hit Redis directly, and none of them is served from the cache
  3. The application then fires a large number of queries for the same data at the database within a short period

3. How to solve it

1. Use a mutex key

The solution is simple: let one thread build the cache while the other threads wait for it to finish, then read the data from the cache. In a single-node application you can use synchronized or a Lock; in a distributed environment, use a distributed lock (memcached's add, Redis's SETNX, or creating a ZooKeeper node).
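A minimal mutex-key sketch with the Jedis client, using SET with the NX and EX options; the lock TTL, retry interval, and the loadFromDb stand-in are illustrative assumptions:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class MutexKeyLoader {

    // One thread rebuilds the cache; the others poll until the value appears.
    public static String get(Jedis jedis, String key) throws InterruptedException {
        String value = jedis.get(key);
        while (value == null) {
            String lockKey = "lock:" + key;
            // NX = set only if absent; EX = lock TTL, so a crashed holder cannot block forever
            String ok = jedis.set(lockKey, "1", SetParams.setParams().nx().ex(10));
            if ("OK".equals(ok)) {
                try {
                    value = loadFromDb(key);          // hypothetical database load
                    jedis.setex(key, 30 * 60, value); // rebuild the cache
                } finally {
                    jedis.del(lockKey);
                }
            } else {
                Thread.sleep(50); // brief wait, then re-check the cache
                value = jedis.get(key);
            }
        }
        return value;
    }

    private static String loadFromDb(String key) {
        return "value-for-" + key; // stand-in for the real query
    }
}
```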

2. Use mutex keys “in advance”

Store a logical timeout value (timeout1) inside the value, smaller than the actual Redis timeout (timeout2). When a read finds that timeout1 has passed, the reading thread immediately extends timeout1 and writes it back to the cache, then loads the data from the database and resets the cache.

3. “Never Expire”

  • Physically, set no expiration on the key in Redis, which guarantees the hot key never disappears.
  • Functionally, data that never expires would effectively become static, so we store an expiration time inside the value corresponding to the key. When a read finds it has expired, a background asynchronous thread rebuilds the cache; this is "logical" expiration.
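A minimal logical-expiration sketch covering both approach 2 and approach 3, assuming a value format of `<expireAtMillis>|<payload>`; the format, the single-thread rebuilder, and the loadFromDb stand-in are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class LogicalExpiryReader {

    private static final ExecutorService REBUILDER = Executors.newSingleThreadExecutor();

    // The key itself never expires; the expiration time lives inside the value.
    public static String get(JedisPool pool, String key) {
        String raw;
        try (Jedis jedis = pool.getResource()) {
            raw = jedis.get(key);
        }
        if (raw == null) {
            return null; // cold miss: handle with the mutex-key approach above
        }
        int sep = raw.indexOf('|');
        long expireAt = Long.parseLong(raw.substring(0, sep));
        String payload = raw.substring(sep + 1);
        if (System.currentTimeMillis() > expireAt) {
            // Logically expired: serve the stale value now, rebuild asynchronously
            REBUILDER.execute(() -> {
                String fresh = loadFromDb(key); // hypothetical database load
                long newExpireAt = System.currentTimeMillis() + 30 * 60 * 1000L;
                try (Jedis jedis = pool.getResource()) {
                    jedis.set(key, newExpireAt + "|" + fresh);
                }
            });
        }
        return payload;
    }

    private static String loadFromDb(String key) {
        return "value-for-" + key; // stand-in for the real query
    }
}
```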

4. Cache barriers

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class MyCache {

    private final ConcurrentHashMap<String, String> map;
    private final CountDownLatch countDownLatch;
    // 0 = cache empty, 1 = one thread holds the build token, 2 = cache initialized
    private final AtomicInteger atomicInteger;

    public MyCache(ConcurrentHashMap<String, String> map,
                   CountDownLatch countDownLatch,
                   AtomicInteger atomicInteger) {
        this.map = map;
        this.countDownLatch = countDownLatch;
        this.atomicInteger = atomicInteger;
    }

    public String get(String key) {
        String value = map.get(key);
        if (value != null) {
            System.out.println(Thread.currentThread().getName() + "\t thread got value=" + value);
            return value;
        }
        // No value yet: try to grab the token; the winner queries the DB and
        // initializes the cache, everyone else waits at the latch.
        if (atomicInteger.compareAndSet(0, 1)) {
            System.out.println(Thread.currentThread().getName() + "\t thread got the token");
            return null; // signals the caller to build the cache
        }
        // Other threads wait until the cache has been initialized
        try {
            System.out.println(Thread.currentThread().getName() + "\t thread has no token, waiting...");
            countDownLatch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // The latch has opened: the cache is initialized, read it again
        System.out.println(Thread.currentThread().getName() + "\t thread woke up, value=" + map.get(key));
        return map.get(key);
    }

    public void put(String key, String value) {
        try {
            Thread.sleep(2000); // simulate a slow database load
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        map.put(key, value);
        // Mark the cache as initialized
        atomicInteger.compareAndSet(1, 2);
        // Wake up the waiting threads
        countDownLatch.countDown();
        System.out.println(Thread.currentThread().getName()
                + "\t thread initialized the cache! value=" + map.get(key));
    }
}

class MyThread implements Runnable {

    private final MyCache myCache;

    public MyThread(MyCache myCache) {
        this.myCache = myCache;
    }

    @Override
    public void run() {
        String value = myCache.get("key");
        if (value == null) { // only the token holder sees null: it builds the cache
            myCache.put("key", "value");
        }
    }
}

public class CountDownLatchDemo {
    public static void main(String[] args) {
        MyCache myCache = new MyCache(new ConcurrentHashMap<>(),
                new CountDownLatch(1), new AtomicInteger(0));
        MyThread myThread = new MyThread(myCache);
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 5; i++) {
            executorService.execute(myThread);
        }
        executorService.shutdown();
    }
}
```

4. To summarize

Cache breakdown is the moment a single high-heat key expires while its access volume is huge: after the cache misses, a large number of database queries for the same data are launched, putting pressure on the database server. The coping strategy should focus on analyzing the business data in advance, combined with runtime monitoring and real-time adjustment. After all, the expiration of a single key is difficult to monitor, so handle it together with the avalanche strategies.

Six, summary

These are some of the problems you may encounter in real projects, and they are also knowledge points frequently asked about in interviews. In practice there are many, many more variations; the solutions in this article cannot cover every scenario and are only introductory answers to these problems. Real business scenarios are usually much more complex, and different application scenarios call for different methods and solutions. Because the schemes above are not fully comprehensive, they are not suitable for dropping straight into production, but they do serve as an entry point for understanding the concepts; the concrete solution should be determined by the actual situation!