
Contents

One, Foreword

Two, The Implementation of LRU

1. What is LRU

2. LRU implementation: array + timestamp

3. LRU implementation: linked list

4. LRU implementation: linked list + HashMap

Three, Efficient Implementation of LRU in Guava Cache

1. The idea behind the LRU implementation

2. The concrete LRU implementation

A. AccessQueue and WriteQueue

B. RecencyQueue

Four, Practice

One, Foreword

Whatever caching strategy we build, as soon as we think about how cache entries get evicted we run into the LRU algorithm: Least Recently Used (or accessed). Redis, for example, offers matching eviction policies when memory usage nears its threshold: allkeys-lru applies LRU eviction across all keys, while volatile-lru applies LRU eviction only to keys that have an expiration time set. And of course our protagonist, Guava Cache, also uses the LRU algorithm to evict cache entries so that the cache does not exceed its configured threshold.

So what is the LRU algorithm? How is it usually implemented? And how does Guava Cache implement it?

Two, The Implementation of LRU

1. What is LRU

Before we begin, let's briefly review what LRU is. LRU stands for Least Recently Used. That is, when we need to clear cache entries, we start with the entries that were least recently used (read or written).

2. LRU implementation: array + timestamp

This is a fairly common implementation: all elements are stored in an array, and each element carries a timestamp recording when it was last accessed. The least recently accessed element can then be found by comparing timestamps. The general structure is simply an array of (value, last-access-time) pairs.

This scheme is simple to implement, but both looking up an element and finding the least recently used element take O(n) time.
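To make the idea concrete, here is a minimal sketch of this scheme in Java (our own illustrative code, not from any library; updating an existing key is omitted for brevity):

import java.util.ArrayList;
import java.util.List;

// Minimal array + timestamp LRU sketch; every operation scans the array, so all are O(n).
class ArrayLruCache<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        long lastAccess;
        Entry(K key, V value) { this.key = key; this.value = value; touch(); }
        void touch() { lastAccess = System.nanoTime(); }
    }

    private final int capacity;
    private final List<Entry<K, V>> entries = new ArrayList<>();

    ArrayLruCache(int capacity) { this.capacity = capacity; }

    V get(K key) {                                   // O(n): linear scan for the key
        for (Entry<K, V> e : entries) {
            if (e.key.equals(key)) { e.touch(); return e.value; }
        }
        return null;
    }

    void put(K key, V value) {
        if (entries.size() >= capacity) evictLeastRecentlyUsed();
        entries.add(new Entry<>(key, value));
    }

    private void evictLeastRecentlyUsed() {          // O(n): scan for the oldest timestamp
        Entry<K, V> oldest = entries.get(0);
        for (Entry<K, V> e : entries) {
            if (e.lastAccess < oldest.lastAccess) oldest = e;
        }
        entries.remove(oldest);
    }
}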

3. LRU implementation: linked list

This implementation is also relatively straightforward; it only requires manipulating a linked list, though the pointer bookkeeping makes it slightly more involved. The idea is to store elements in a doubly linked list and move an element to the head of the list every time it is accessed. The least recently used element is then the one at the tail of the list. The general structure is a doubly linked list ordered from most recently used at the head to least recently used at the tail.

In this way, the time complexity of obtaining the least recently used element drops from O(n) to O(1). But locating an element still requires an O(n) traversal, just as in the previous scheme, so this approach is not ideal either.
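The heart of this scheme is the move-to-head operation. A hypothetical sketch (names and structure are ours):

// Core of the doubly-linked-list scheme: on access, unlink a node and relink it at the head.
class Node<V> {
    V value;
    Node<V> prev, next;
}

class DoublyLinkedLru<V> {
    private Node<V> head, tail;

    // O(1) once the node is in hand; finding it in the first place is still an O(n) scan.
    void moveToHead(Node<V> node) {
        if (node == head) return;
        // Unlink the node from its current position.
        if (node.prev != null) node.prev.next = node.next;
        if (node.next != null) node.next.prev = node.prev;
        if (node == tail) tail = node.prev;
        // Relink it at the head.
        node.prev = null;
        node.next = head;
        if (head != null) head.prev = node;
        head = node;
        if (tail == null) tail = node;
    }

    // The LRU victim is simply the tail: an O(1) lookup.
    Node<V> leastRecentlyUsed() { return tail; }
}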

4. LRU implementation: linked list + HashMap

This scheme looks complicated, but it is easy to describe: elements are stored in a HashMap while simultaneously being linked together into a doubly linked list. It combines the strengths of both structures into an efficient LRU implementation. Sounds simple, doesn't it? To save some effort, the author will skip the diagram here; interested readers can easily find one online.

In this implementation, the HashMap stores the elements, which solves the cost of reading and writing: as we all know, HashMap operations are O(1). The doubly linked list solves the problem of finding the least recently used element, also in O(1).

By this analysis, this is the best of the LRU implementations discussed here. It is exactly how LinkedHashMap in Java implements LRU (readers unfamiliar with using LinkedHashMap as an LRU cache can look it up). Guava Cache, the protagonist of this article, also implements its LRU algorithm this way, with further optimizations on top to achieve high performance. Next, let's explore how Guava Cache's LRU is implemented so efficiently.
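In fact, the JDK ships this idea ready-made: LinkedHashMap maintains exactly such a doubly linked list over its hash entries, and a tiny subclass turns it into an LRU cache:

import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache in a few lines using LinkedHashMap's access-order mode.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least- to most-recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once we exceed capacity.
        return size() > capacity;
    }
}

A new LruCache<>(100) keeps at most 100 entries, silently dropping the least recently accessed one on overflow.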

Three, Efficient Implementation of LRU in Guava Cache

1. The idea behind the LRU implementation

We already know that Guava Cache implements LRU using the HashMap + doubly linked list idea, and that it borrows ConcurrentHashMap's segmented design to support highly concurrent reads and writes. So we can conclude that the LRU algorithm in Guava Cache is implemented with a ConcurrentHashMap-style structure plus doubly linked lists.

2. The concrete LRU implementation

In Guava Cache's LRU implementation, the doubly linked list is not global (i.e., there is not just one list for the whole Guava Cache); instead, each Segment (a concept borrowed from ConcurrentHashMap) has its own. Three queues are involved: AccessQueue, WriteQueue, and RecencyQueue. AccessQueue and WriteQueue are doubly linked lists, while RecencyQueue is a real queue, a ConcurrentLinkedQueue. Next we will examine how Guava Cache implements LRU through these three queues.

A. AccessQueue and WriteQueue

These two queues are fairly simple doubly linked lists implemented by Guava Cache itself, deliberately designed to be non-thread-safe for performance. Therefore, they may only be operated on while the Segment lock is held.

The AccessQueue records read accesses to elements, while the WriteQueue records write accesses. The bookkeeping works just like the linked-list LRU we discussed above: the most recently accessed element is moved to the front of the list.

The question is why a WriteQueue is needed at all: the HashMap + linked list idea calls for just one list, so why a second one? The reason is that Guava Cache lets you specify how long after being written an element should be deleted (considered expired). The WriteQueue is therefore needed to keep elements ordered by write time (each node in the linked list records its element's write time).
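For context, the behaviors that drive these two queues are configured on CacheBuilder. A minimal usage example (the size and durations here are arbitrary):

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class GuavaCacheDemo {
    public static void main(String[] args) {
        Cache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(1000)                       // size-based LRU eviction
                .expireAfterAccess(10, TimeUnit.MINUTES) // expire by last read/write (AccessQueue ordering)
                .expireAfterWrite(30, TimeUnit.MINUTES)  // expire by write time (WriteQueue ordering)
                .build();

        cache.put("key", "value");
        System.out.println(cache.getIfPresent("key")); // prints "value"
    }
}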

B. RecencyQueue

With the AccessQueue we know the order in which elements were accessed, which makes implementing the LRU algorithm easy. So why do we also need the RecencyQueue, a ConcurrentLinkedQueue?

As mentioned above, the AccessQueue is not thread-safe and therefore may only be touched while the Segment's lock is held. What would we have to do on every read to ensure the element is moved to the front of the AccessQueue? Clearly, we would have to acquire the Segment lock on every single access before we could safely move the element. That would work, but acquiring the lock on every read breaks ConcurrentHashMap's lock-striping idea (in which get needs no lock, which is precisely what provides its efficient read performance), making reads slow and performance poor.

Therefore, to preserve performance, Guava Cache introduces the RecencyQueue. When an element is read, it is appended to the RecencyQueue, which supports concurrent insertion because it is a concurrent queue; this keeps reads fast. Then, whenever the lock is acquired in certain scenarios, the entries buffered in the RecencyQueue are drained into the AccessQueue (a sketch of this pattern appears at the end of this section).

So when is the lock acquired? There are two main situations:

1. Cache mutations (adds, updates, deletes) must acquire the lock.

2. Every get attempts a tryLock; when there is no contention it succeeds and performs the cleanup along this call chain:

Segment.get -> postReadCleanup -> cleanUp -> runLockedCleanup

In this way, Guava Cache combines the RecencyQueue and the AccessQueue to record element accesses, implementing the LRU algorithm while keeping get highly performant.
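To illustrate the pattern, here is a simplified sketch of the idea (not Guava's actual code; all names are ours):

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the RecencyQueue pattern: reads are recorded lock-free,
// and the buffered accesses are folded into the (non-thread-safe) access order
// only while the segment lock is held.
class SegmentSketch<K> {
    private final ConcurrentLinkedQueue<K> recencyQueue = new ConcurrentLinkedQueue<>();
    private final ReentrantLock lock = new ReentrantLock();

    // Called on every read: lock-free, so gets stay fast.
    void recordRead(K key) {
        recencyQueue.add(key);
    }

    // Called after a read (in the spirit of postReadCleanup): only drains if the lock is free.
    void postReadCleanup() {
        if (lock.tryLock()) {          // no contention -> piggyback the cleanup
            try {
                drainRecencyQueue();
            } finally {
                lock.unlock();
            }
        }                              // otherwise skip; a later lock holder will drain
    }

    // Must be called with the lock held; moves buffered reads into the access order.
    private void drainRecencyQueue() {
        K key;
        while ((key = recencyQueue.poll()) != null) {
            moveToAccessQueueHead(key);
        }
    }

    private void moveToAccessQueueHead(K key) {
        // In the real implementation this relinks the entry's node in the AccessQueue.
    }
}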

A question left for the reader: why doesn't the WriteQueue need a corresponding ConcurrentLinkedQueue? (Hint: recall from above that every write already acquires the Segment lock.)

Four, Practice

If you have any questions or comments about this article, feel free to leave a message and discuss.