1. Background

This article grew out of a technology salon I attended last week, where iQiyi shared the evolution of its Java caching architecture. Let me first briefly walk through the stages of that evolution.

  • Phase 1: Data synchronization plus Redis

In this phase, data is synchronized to Redis through a message queue, and the Java application reads directly from the cache. Because a distributed cache is used, data updates propagate quickly. The disadvantage is just as obvious: everything depends on the stability of Redis. Once Redis fails, the entire cache layer becomes unavailable, causing a cache avalanche in which all requests hit the database.

  • Phases 2 and 3: Java Map to Guava Cache

These phases introduce an in-process cache as the level 1 cache, with Redis as the level 2 cache. Advantage: the in-process cache is not affected by external systems; even if other systems go down, it keeps working. Disadvantages: an in-process cache cannot be updated in real time the way a distributed cache can, and because Java heap memory is limited, the cache must be bounded in size, so some entries get evicted and the hit ratio suffers.

  • Phase 4: Guava Cache refresh

To mitigate the problem above, Guava Cache can be configured to refresh entries a fixed time after they are written. This fixes the problem of entries never being updated, but it still does not provide true real-time refresh.

  • Phase 5: External cache asynchronous refresh

This phase extends Guava Cache to use Redis as a message-queue-style notification mechanism, so that other Java application instances can be told to refresh their caches.

That is a brief overview of the five stages of iQiyi's cache evolution; there were also other optimizations such as GC tuning, protection against cache penetration, cache overwrite issues, and so on. Interested readers can follow the public account and contact me to discuss.

Primitive society – Check the database

The above is iQiyi's evolution line, but in everyday development the first step is usually not Redis; it is querying the database directly.

When traffic is low, querying the database or reading files directly is the most convenient approach, and it fully meets the business requirements.

Ancient society – HashMap

When the application gets a certain amount of traffic, or queries to the database become particularly frequent, we can reach for Java's own HashMap or ConcurrentHashMap. We might write something like this:

public class CustomerService {
    private HashMap<String, String> hashMap = new HashMap<>();
    private CustomerMapper customerMapper;

    public String getCustomer(String name) {
        String customer = hashMap.get(name);
        if (customer == null) {
            customer = customerMapper.get(name);
            hashMap.put(name, customer);
        }
        return customer;
    }
}

The problem with this is that HashMap has no eviction mechanism, so its memory footprint grows without bound; that is why a bare HashMap was quickly abandoned as a cache. Of course, that does not make it completely useless; just as not everything from ancient times is obsolete, a HashMap can still serve as a cache in situations where eviction simply is not needed. For example, with reflection, looking up methods and fields on every call is expensive, so caching them in a HashMap gives a large performance improvement.
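
To make that concrete, here is a minimal sketch of caching reflective method lookups; the class name and key format are made up for illustration, and a ConcurrentHashMap is used so the cache can be shared safely across threads.

import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MethodCache {
    // Grows only as large as the number of distinct methods we look up, so no eviction is needed.
    private static final Map<String, Method> METHODS = new ConcurrentHashMap<>();

    public static Method getMethod(Class<?> type, String name) throws NoSuchMethodException {
        String key = type.getName() + "#" + name;
        Method cached = METHODS.get(key);
        if (cached == null) {
            cached = type.getMethod(name);   // the expensive reflective lookup
            METHODS.put(key, cached);
        }
        return cached;
    }
}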

Early modern society – LRUHashMap

The problem that held us back in the ancient era was that data could not be evicted, leading to unbounded memory growth, which is clearly unacceptable. Some might say: just evict some of the data. But evict how? Randomly? Certainly not. Imagine you have just loaded key A into the cache; the next time you are about to access it, it gets randomly evicted, so the request goes to the database anyway. Why cache at all in that case?

So clever people invented several eviction algorithms. Here are the three most common: FIFO, LRU, and LFU (there are also ARC, MRU, and others you can look up if interested):

  • FIFO: first in, first out. Whatever entered the cache first is evicted first. This is the simplest policy, but it leads to a low hit rate. Imagine a very frequently accessed piece of data that happens to be loaded first: data loaded later, even if rarely accessed, would push out that frequently accessed entry. (A minimal code sketch of a FIFO cache follows right after this list.)
  • LRU: least recently used. This algorithm avoids the problem above: every time a piece of data is accessed it is moved to the tail of the queue, and when eviction is needed only the head of the queue is removed. But there is still an issue: if a piece of data is accessed 10,000 times in the first 59 minutes of an hour (clearly hot data) and then not accessed at all in the final minute while other data is, that hot data will be evicted.
  • LFU: least frequently used. This algorithm improves on the above by using extra space to record how often each piece of data is accessed, and evicting the entry with the lowest frequency. This avoids LRU's inability to handle access patterns that vary across time periods.
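
To make the FIFO policy concrete, here is a minimal sketch built on LinkedHashMap in its default insertion-order mode (the LRU variant, using access order plus removeEldestEntry, is shown in the next section):

import java.util.LinkedHashMap;
import java.util.Map;

// Evicts the entry that was inserted earliest once `max` is exceeded.
class FIFOMap<K, V> extends LinkedHashMap<K, V> {
    private final int max;

    FIFOMap(int max) {
        super(16, 0.75f, false);   // false = iterate in insertion order, i.e. FIFO
        this.max = max;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > max;
    }
}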

Three eviction strategies are listed above. Each successive one is more expensive to implement than the last, but also achieves a better hit rate. We generally pick the middle ground: LRU, whose cost is not too high and whose hit rate is still good. How do we implement an LRU map? We can create a simple one by inheriting from LinkedHashMap and overriding the removeEldestEntry method.

class LRUMap extends LinkedHashMap {

    private final int max;
    private Object lock;

    public LRUMap(int max, Object lock) {
        // Initial capacity is max * 1.4 so the map never needs to resize.
        super((int) (max * 1.4f), 0.75f, true);
        this.max = max;
        this.lock = lock;
    }

    /**
     * Overrides LinkedHashMap's removeEldestEntry: when it returns true,
     * the eldest entry is removed.
     */
    @Override
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > max;
    }

    public Object getValue(Object key) {
        synchronized (lock) {
            return get(key);
        }
    }

    public void putValue(Object key, Object value) {
        synchronized (lock) {
            put(key, value);
        }
    }

    public boolean removeValue(Object key) {
        synchronized (lock) {
            return remove(key) != null;
        }
    }

    public boolean removeAll() {
        clear();
        return true;
    }
}

A LinkedHashMap maintains a linked list of entries (the objects that hold key and value). With accessOrder set to true, every get or put moves the inserted or queried entry to the tail of the list. Note that in the constructor we set the map's initial capacity to max * 1.4; combined with removeEldestEntry returning true whenever size > max, the map never needs to grow. With these few simple steps we have implemented our LRUMap.
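
A quick usage sketch of the LRUMap defined above (the values and the shared lock object are arbitrary):

Object lock = new Object();
LRUMap lruMap = new LRUMap(2, lock);

lruMap.putValue("a", "1");
lruMap.putValue("b", "2");
lruMap.getValue("a");          // touching "a" moves it to the tail of the access order
lruMap.putValue("c", "3");     // exceeds max=2, so the least recently used entry "b" is evicted

System.out.println(lruMap.getValue("b"));  // null
System.out.println(lruMap.getValue("a"));  // 1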

Modern society – Guava Cache

LRUMap was invented in the modern era to evict cached data, but it has several problems:

  • Lock contention is severe. As you can see in the code above, the lock is a global, method-level lock, so when call volume is high, performance inevitably suffers.
  • Expiration time is not supported
  • Automatic refresh is not supported

So the folks at Google could not help but invent Guava Cache, which you can use as easily as this:

public static void main(String[] args) throws ExecutionException {
    LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            // maximum cache size
            .maximumSize(100)
            // expire 30 ms after write
            .expireAfterWrite(30L, TimeUnit.MILLISECONDS)
            // expire 30 ms after the last access
            .expireAfterAccess(30L, TimeUnit.MILLISECONDS)
            // become eligible for refresh 20 ms after write
            .refreshAfterWrite(20L, TimeUnit.MILLISECONDS)
            // keys are held by weak references, so entries can be reclaimed by the GC
            .weakKeys()
            .build(createCacheLoader());
    System.out.println(cache.get("hello"));
    cache.put("hello1", "I am hello1");
    System.out.println(cache.get("hello1"));
    cache.put("hello1", "I am hello2");
    System.out.println(cache.get("hello1"));
}

public static com.google.common.cache.CacheLoader<String, String> createCacheLoader() {
    return new com.google.common.cache.CacheLoader<String, String>() {
        @Override
        public String load(String key) throws Exception {
            return key;
        }
    };
}

Next, starting from Guava Cache's internals, I will explain how it solves the problems of LRUMap.

Lock contention

Guava Cache borrows the idea used by ConcurrentHashMap: it locks at segment granularity, and each segment is responsible for its own eviction. Guava picks the number of segments by an algorithm: with too few segments, contention remains severe; with too many, eviction degenerates into something close to random selection. For example, with a cache size of 100 split into 100 segments, each piece of data effectively lives in its own segment, and since every segment handles its own eviction, the result is essentially random eviction. The following code shows how Guava Cache decides the segmentation.

    int segmentShift = 0;
    int segmentCount = 1;
    while (segmentCount < concurrencyLevel
            && (!evictsBySize() || segmentCount * 20 <= maxWeight)) {
        ++segmentShift;
        segmentCount <<= 1;
    }

The segmentCount above is the final number of segments; when a maximum size is set, the loop guarantees that each segment holds at least 10 entries. If concurrencyLevel is not set, the default is 4, so there are at most 4 segments: for example, with size 100 there will be 4 segments, each with a maximum size of 25. Guava Cache locks write operations directly. For reads, if the data has not expired and is already loaded, no lock is needed; if it has expired and must be reloaded, the lock is taken and a second, double-checked read is performed. (In my configuration above the loader simply returns the key; in real business code it would usually query the database.)

Expiration time

Guava Cache supports two kinds of expiration: expireAfterWrite (expire a fixed time after the entry is written) and expireAfterAccess (expire a fixed time after the entry is last read). Interestingly, expired entries in Guava Cache are not removed immediately; there is no background thread scanning for them. Instead, expiration is handled during read and write operations, which avoids the global locking a background scanning thread would need. Look at the following code:

public static void main(String[] args) throws ExecutionException, InterruptedException {
    Cache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(100)
            // expire 5 ms after write
            .expireAfterWrite(5, TimeUnit.MILLISECONDS)
            .concurrencyLevel(1)
            .build();
    cache.put("hello1", "I am hello1");
    cache.put("hello2", "I am hello2");
    cache.put("hello3", "I am hello3");
    cache.put("hello4", "I am hello4");
    Thread.sleep(5);
    System.out.println(cache.size());
    cache.put("hello5", "I am hello5");
    System.out.println(cache.size());
}

The output is:
4
1

From this result we can see that expired entries are not removed the moment they expire; here the cleanup happens when the next put occurs. Note that I set concurrencyLevel(1) so there is only a single segment; otherwise this effect would not be observable. As mentioned in the previous section, expiration is handled per segment, and each Segment maintains two queues:


    final Queue<ReferenceEntry<K, V>> writeQueue;

    final Queue<ReferenceEntry<K, V>> accessQueue;

writeQueue maintains write order: the head of the queue holds the data written earliest and the tail holds the data written most recently. accessQueue maintains access order and, like an LRU, is used to evict by access time: if the segment exceeds its maximum capacity (25 in our example), the element at the head of the accessQueue is evicted.

void expireEntries(long now) {
    drainRecencyQueue();

    ReferenceEntry<K, V> e;
    while ((e = writeQueue.peek()) != null && map.isExpired(e, now)) {
        if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
            throw new AssertionError();
        }
    }
    while ((e = accessQueue.peek()) != null && map.isExpired(e, now)) {
        if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
            throw new AssertionError();
        }
    }
}

The above is how Guava Cache processes expired entries: it peeks at both queues and removes any entries that have expired. This expiration processing is generally triggered around a put operation, or when reading data and finding it expired (after which the whole segment's expiration is processed), or during the second read inside lockedGetOrLoad.

void evictEntries(ReferenceEntry<K, V> newest) {
    // ... irrelevant code omitted
    while (totalWeight > maxSegmentWeight) {
        ReferenceEntry<K, V> e = getNextEvictable();
        if (!removeEntry(e, e.getHash(), RemovalCause.SIZE)) {
            throw new AssertionError();
        }
    }
}

/**
 * Returns the next entry to evict, taken from the accessQueue.
 */
ReferenceEntry<K, V> getNextEvictable() {
    for (ReferenceEntry<K, V> e : accessQueue) {
        int weight = e.getValueReference().getWeight();
        if (weight > 0) {
            return e;
        }
    }
    throw new AssertionError();
}

The above is the code that evicts entries; you can see it walks the accessQueue starting from its head and evicts from there. Eviction is usually triggered when elements in the segment change, for example on insert, update, or load.

Automatic refresh

Automatic refresh is relatively simple in Guava Cache: on each query, it checks whether the entry satisfies the refresh condition (refreshAfterWrite) and, if so, refreshes it.
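
As a hedged sketch of how refresh is typically configured (the loader body and thread pool here are illustrative assumptions, not iQiyi's code): by default the reload happens synchronously in the thread that triggers it, so CacheLoader.asyncReloading is often used to push the refresh onto a pool while the old value keeps being served.

LoadingCache<String, String> cache = CacheBuilder.newBuilder()
        .maximumSize(100)
        .refreshAfterWrite(10, TimeUnit.SECONDS)   // entries become eligible for refresh 10s after write
        .build(CacheLoader.asyncReloading(new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                return key;                        // in real code: query the database
            }
        }, Executors.newFixedThreadPool(2)));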

Other features

There are some other features in Guava Cache:

Weak and soft references

When weak or soft references are used for keys or values, each Segment in Guava Cache keeps two reference queues:

    final @Nullable ReferenceQueue<K> keyReferenceQueue;

    final @Nullable ReferenceQueue<V> valueReferenceQueue;

These two queues record references that have been reclaimed by the garbage collector. Each queued reference carries the hash of its entry, so after reclamation the stale entry can be located by that hash and removed from the cache.
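
A small hedged example of turning these reference modes on (the sizes are arbitrary, and the usual Guava imports are assumed):

Cache<String, byte[]> cache = CacheBuilder.newBuilder()
        .maximumSize(1_000)
        .weakKeys()      // keys held by weak references; note: key comparison switches to identity (==)
        .softValues()    // values held by soft references; reclaimed under memory pressure
        .build();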

Removal listener

When a piece of data is removed from Guava Cache, how do you know whether it expired, was evicted, or was reclaimed because of a weak/soft reference? You can register a listener with removalListener(RemovalListener listener) to be notified whenever data is removed, which is useful for logging, statistics, or other analysis of removals.

RemovalCause enumerates all the reasons an entry can be removed: explicitly deleted by the user, replaced by the user, expired, reclaimed by garbage collection, or evicted due to size.
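
A minimal hedged example (keys, values and the size limit are arbitrary; the usual Guava imports are assumed):

RemovalListener<String, String> listener = notification ->
        System.out.println(notification.getKey() + " removed, cause: " + notification.getCause());

Cache<String, String> cache = CacheBuilder.newBuilder()
        .maximumSize(2)
        .removalListener(listener)
        .build();

cache.put("a", "1");
cache.put("b", "2");
cache.put("c", "3");   // exceeds maximumSize, so an older entry is evicted with cause SIZE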

Summary of Guava Cache

Guava Cache is essentially an LRU map with a rich API and good performance. iQiyi's cache was also built on top of it: through secondary development of Guava Cache, they made it possible for caches to be updated across Java application instances.

Into the future – Caffeine

Guava Cache is powerful enough to meet most needs, but it is essentially still a layer of wrapping around LRU, so it falls short next to many newer eviction algorithms. Caffeine implements W-TinyLFU, a variant that combines LFU and LRU. Benchmarks comparing the hit ratios of the different algorithms show the following.

Optimal is the theoretical best hit ratio; among the real algorithms, LRU looks weak next to the others, while W-TinyLFU comes closest to the ideal. And it is not just the hit ratio that beats Guava Cache: read/write throughput is also better.

At this point you may be wondering why Caffeine is so good. Don't worry, I'll explain below.

W-TinyLFU

I have already described traditional LFU. As long as the probability distribution of the access pattern stays constant over time, LFU's hit ratio can be very high. Take iQiyi as an example: suppose a new show comes out and we cache it with LFU; it is accessed hundreds of millions of times over a few days, and that frequency is recorded. But the show will eventually go out of fashion: a month later its early episodes may hardly be watched, yet their recorded view counts are so high that newer shows cannot push them out of the cache. So plain LFU has limitations, and various variants appeared that decay counts by time period or only count frequency within a recent window. LFU also needs extra space to record the access frequency of every piece of data, even data not currently in the cache, so the bookkeeping space can become very large.

If you build that bookkeeping as a HashMap, every accessed data item gets an entry, and when the volume of data is very large, the HashMap itself becomes very large.

Back to LRU: it is not that bad either. It handles bursts of traffic very well, because it does not need to accumulate frequency data.

W-TinyLFU combines characteristics of LRU and LFU, along with ideas from other algorithms.

Frequency recording

First, frequency recording. The goal is to record access frequency over a time window using limited space. W-TinyLFU uses a Count-Min Sketch, a relative of the Bloom filter, to record access frequencies.

A quick comparison with the earlier approach: if a HashMap records frequencies and there are 100 distinct data items, the HashMap must hold 100 frequency counts even if the cache capacity is only 1, because the LFU rules require the frequency of everything ever accessed. With more data there are even more records.

In the Count-Min Sketch, going straight to Caffeine's implementation (FrequencySketch): if your cache size is 100, it allocates a long array whose length is the power of two nearest to 100, i.e. 128, and this array records the access frequencies. Caffeine caps each frequency at 15; 15 in binary is 1111, i.e. 4 bits, and a long is 64 bits, so one long could hold 16 counters. Caffeine does not quite do that: it uses 4 hash functions, and each long is divided into 4 segments, each segment holding 4 counters, one per hash function. The benefit is fewer hash collisions: instead of a hash table of size 128, you effectively get 128 x 4.

A long is therefore split into four 16-bit segments (call them A, B, C, D), each holding four 4-bit counters (s1 to s4). Suppose we record an access to a key whose hash is 50:

  1. First decide which segment the hash of 50 falls into: hash & 3 always yields a number less than 4. Suppose hash & 3 = 0, so we are in segment A.
  2. Hash the hash of 50 with the other hash functions to get positions in the long array: suppose hash function s1 gives 1, s2 gives 3, s3 gives 4, and s4 gives 0.
  3. Then increment the s1 counter in segment A of long[1] (written 1As1 for short), and likewise increment 3As2, 4As3, and 0As4.

At this point you might ask: isn't a maximum frequency of 15 too small? It does not matter. In this algorithm, with size = 100 for example, once the global count of increments reaches 10 times the size (1,000), all counters are halved as decay, and counting continues afterwards. The W-TinyLFU paper shows this adapts well to access frequencies that change over time.
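
To make the idea concrete, here is a toy count-min sketch with halving decay. It only illustrates the principle and is not Caffeine's FrequencySketch (the real one packs sixteen 4-bit counters into each long and uses stronger hash mixing); all names and constants are invented for this sketch.

class TinyCountMin {
    private static final int MAX = 15;                    // 4-bit counters saturate at 15
    private final int[] seeds = {0x1234567, 0x89ABCDE, 0xF00D, 0xBEEF}; // 4 hash "salts"
    private final int[][] counters;
    private final int sampleSize;                         // decay threshold (10 * capacity)
    private int additions;

    TinyCountMin(int capacity) {
        this.counters = new int[seeds.length][capacity];
        this.sampleSize = capacity * 10;
    }

    void increment(Object key) {
        for (int i = 0; i < seeds.length; i++) {
            int idx = index(key, i);
            if (counters[i][idx] < MAX) {
                counters[i][idx]++;
            }
        }
        if (++additions >= sampleSize) {
            halve();                                      // global decay so old hot keys can cool down
        }
    }

    int frequency(Object key) {
        int min = MAX;
        for (int i = 0; i < seeds.length; i++) {
            min = Math.min(min, counters[i][index(key, i)]);
        }
        return min;                                       // "count-MIN": take the smallest estimate
    }

    private int index(Object key, int i) {
        return Math.floorMod(key.hashCode() * seeds[i], counters[i].length);
    }

    private void halve() {
        additions /= 2;
        for (int[] row : counters) {
            for (int j = 0; j < row.length; j++) {
                row[j] >>>= 1;
            }
        }
    }
}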

Read and write performance

We said that in Guava Cache, read and write operations are entangled with expiration handling; that is, a put may also have to perform eviction work, so read/write performance suffers. Caffeine clearly blows Guava Cache away on read and write operations, mainly because in Caffeine these events are handled asynchronously: operations are submitted to a queue whose data structure is a RingBuffer (if that is unfamiliar, look up the high-performance lock-free queue Disruptor). A thread pool, either the default ForkJoinPool.commonPool() or one you configure yourself, drains the queue and then carries out the eviction and expiration work.

Reads and writes have separate queues. In Caffeine, cache reads are much more common than writes, so all threads share a single RingBuffer for write events.

Since reads are more frequent than writes, to further reduce contention each thread has its own RingBuffer for read events.
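
A small hedged example of supplying your own executor for this asynchronous maintenance work; the pool choice here is arbitrary, and by default Caffeine simply uses ForkJoinPool.commonPool():

Cache<String, String> cache = Caffeine.newBuilder()
        .maximumSize(10_000)
        .expireAfterWrite(10, TimeUnit.MINUTES)
        .executor(Executors.newSingleThreadExecutor())  // runs eviction/expiration maintenance tasks
        .build();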

Data elimination strategy

In Caffeine all data lives in a ConcurrentHashMap. This differs from Guava Cache, which implements its own structure resembling ConcurrentHashMap. Caffeine maintains three LRU queues of references to records:

  • Eden queue: limited in Caffeine to 1% of the cache capacity, so if size = 100 the effective size of this queue is 1. It holds newly written data and protects bursts of new traffic from being evicted just because they have no access history yet. For example, when a new show goes online it has no access frequency at first; placing it in this region prevents it from being pushed out immediately by other cached data. Eden is the cosiest region, where it is hard for other data to evict you.

  • Probation queue: data in this queue is relatively cold and on the verge of being evicted. Its effective size is size minus Eden minus Protected.

  • Protected queue: data in this queue is safe from eviction for the time being, but not forever: if the Probation queue runs out of data, or if Protected becomes full, entries can still be demoted again. To get into this queue, an entry must be accessed while in Probation, which promotes it to Protected. The effective size is (size minus Eden) x 80%; with size = 100 that is about 79.

The three queues are related as follows:

  1. All new data goes into Eden.
  2. When Eden is full, data is demoted into Probation.
  3. If a piece of data is accessed while in Probation, it is promoted to Protected.
  4. If Protected is full, entries are demoted back into Probation.

When data must be evicted, it is taken from the Probation queue. The entry at the head of that queue is called the victim: it entered the queue earliest, so by pure LRU it should simply be evicted, but here it is only a condemned candidate, on Probation awaiting execution. The entry at the tail of the queue is called the candidate, or attacker. The victim and the attacker then fight it out, and the outcome is decided using the frequencies recorded in the Count-Min Sketch (a small code sketch of this decision follows after the list below):

  • If the attacker's frequency is greater than the victim's, the victim is evicted.
  • If the attacker's frequency is 5 or less, the attacker is evicted. The comments in the source explain this logic:

    The author believes that setting a warm-up threshold like this leads to a higher overall hit rate.

  • Otherwise, one of the two is chosen at random.
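
Below is a simplified illustration of that decision, not Caffeine's actual code; in particular, the real implementation biases the final random step rather than flipping a fair coin.

import java.util.concurrent.ThreadLocalRandom;

final class AdmissionSketch {
    private static final int WARMUP_THRESHOLD = 5;

    /**
     * Returns true if the candidate (attacker) should replace the victim
     * at the head of the Probation queue.
     */
    static boolean admit(int candidateFreq, int victimFreq) {
        if (candidateFreq > victimFreq) {
            return true;                     // attacker wins, victim is evicted
        }
        if (candidateFreq <= WARMUP_THRESHOLD) {
            return false;                    // cold attacker is rejected (evicted)
        }
        return ThreadLocalRandom.current().nextBoolean();  // otherwise pick one at random
    }
}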

How to use

For those of you who know Guava well and are worried about the cost of switching: don't be. Caffeine's API copies Guava's API almost one for one.

public static void main(String[] args) {
        Cache<String, String> cache = Caffeine.newBuilder()
                .expireAfterWrite(1, TimeUnit.SECONDS)
                .expireAfterAccess(1,TimeUnit.SECONDS)
                .maximumSize(10)
                .build();
        cache.put("hello", "hello");
    }

By the way, more and more open source frameworks are abandoning Guava Cache; Spring 5, for example, switched to Caffeine. In my own work I compared Guava Cache with Caffeine, chose Caffeine, and it has worked well in production. So don't worry that Caffeine is immature or that nobody uses it.
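
As a hedged illustration, this is roughly what wiring Caffeine into Spring's cache abstraction can look like; the cache name, size and expiry below are arbitrary example values, not a recommended configuration.

import java.util.concurrent.TimeUnit;

import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import com.github.benmanes.caffeine.cache.Caffeine;

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        // CaffeineCacheManager comes from spring-context-support
        CaffeineCacheManager manager = new CaffeineCacheManager("customers");
        manager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(10_000)
                .expireAfterWrite(10, TimeUnit.MINUTES));
        return manager;
    }
}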

Finally

This article walked through iQiyi's caching journey and the evolution of local caching (from ancient times to the present and into the future), along with the basic principles behind each cache implementation. Of course, this alone is not enough to use caching well: for instance, how does a local cache get updated when the data changes elsewhere, and what about distributed caching and multi-level caching? I will write a dedicated piece later on how to use caching well. I also plan to write source-code analyses of Guava Cache and Caffeine; if you are interested, follow the public account to see new articles first.

Finally, this article is included in JGrowing, a comprehensive, community-built Java learning roadmap. If you want to help maintain an open source project, you can build it with us at github :github.com/javagrowing… A little star, please.

If you found this article helpful, you can follow my technical public account. Your follows and shares are the greatest support for me. O(∩_∩)O