Xiao Hei has a friend who recently went to an interview and was asked some questions about caching.

The interviewer asked him to design an LRU cache and explain how he would implement it.

My friend apparently wasn't well prepared for this topic and didn't know how to answer, so... he was sent home to wait for further notice.

Today Xiao Hei will walk you through the LRU algorithm and write an LRU cache by hand.

Cache eviction strategies

In everyday development we often use caches, for example for hot products or configuration data, to speed up access. However, cache capacity is usually limited: we cannot put all data into the cache, so we set a capacity. Once the cache is full and new data has to be added, existing entries must be evicted according to some policy. That policy is called the cache eviction policy.

There are many kinds of cache eviction strategies, such as FIFO (First In, First Out), LRU (Least Recently Used), and LFU (Least Frequently Used).

What is LRU?

LRU stands for Least Recently Used: when the cache is full, the data that has been used least recently is evicted.

For example, we have the following cache structure:

The cache starts out empty, with space for three elements. We put the three elements 5, 6, and 9 into the cache in turn, and then try to put element 3. At this point the cache capacity is used up, so one element must be evicted to free space. According to the LRU rule, the least recently used element in the cache is 5, so 5 is evicted and 3 is added.
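Before we hand-build our own cache, here is a minimal sketch that uses Java's built-in LinkedHashMap in access-order mode just to observe this eviction behavior. The capacity of 3 and the keys 5, 6, 9, 3 mirror the example above; this is not the implementation we will write later, only a quick way to see LRU in action.

import java.util.LinkedHashMap;
import java.util.Map;

public class LruDemo {
    public static void main(String[] args) {
        int capacity = 3;
        // accessOrder = true orders entries by most recent access
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                // Evict the eldest (least recently used) entry once we exceed the capacity
                return size() > capacity;
            }
        };

        cache.put(5, "a");
        cache.put(6, "b");
        cache.put(9, "c");
        cache.put(3, "d");                   // the cache is full, so 5 (least recently used) is evicted
        System.out.println(cache.keySet());  // prints [6, 9, 3]
    }
}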

Let's think about how to implement such an LRU cache structure. Before we start writing code, we need to clarify the conditions our cache must meet:

  • The cache has a limited capacity
  • When the cache is full and a new element is added, an existing element must be evicted according to the LRU rule
  • Adding and querying elements should be O(1)
  • Operations on the cache should support concurrency

Given the requirements above, let's first consider the following question.

How do you ensure that all operations have O(1) time complexity?

To find the answer to this question, we need to think a little more deeply about the characteristics of LRU caching.

First of all, an LRU cache behaves like a queue: if an element is accessed again, it should move back to the head of the queue.

Our queue also has a capacity limit: whenever a new element is added to the cache, it goes to the head of the queue, and when an element has to be evicted, it is removed from the tail of the queue.

This ensures that adding and evicting elements takes O(1) time. So what about querying elements from the cache?

Did it occur to you to walk through the queue and find the matching element? That does find the element, but it takes O(n) time, and when there is a lot of data in the cache the query time is not stable: it can be fast, or it can be very slow.

We need the time complexity of the query operation to be O(1).

How can the time complexity of a query be O(1)?

A queue can't do that, but a HashMap can.

But here's the thing: if you use only a HashMap, queries are O(1), but when it is time to evict, you have no way to find and remove the least recently used element in O(1).

We can combine a HashMap with a linked list to build the LRU cache structure.

We can use the cache key as the key of the HashMap to ensure that the data can be quickly retrieved when querying elements.

With a doubly linked list, adding new elements at the head and evicting elements from the tail are both O(1) operations.

The idea should be quite clear now!

Ok, now let’s write the implementation idea:

First, check whether the HashMap contains the key. If it does, it is a cache hit and the element is retrieved; if not, it is a miss.

If the cache hits:

  • Remove the element from the list and add it back to the head of the list;
  • Save the head node as the value in the HashMap.

If the cache misses:

  • Add the element to the head of the list;
  • Save the head node as the value in the HashMap.

Well, at this point we can write the code.

Code implementation

First, we define a Cache interface that declares the methods our cache supports:

import java.util.Optional;

/**
 * @author Hei says Java
 * @ClassName Cache
 * @Description
 * @date 2022/1/13
 */
public interface Cache<K, V> {

    boolean put(K key, V value);

    Optional<V> get(K key);

    int size();

    boolean isEmpty();

    void clear();
}

Next, we implement our LRUCache class, which implements the Cache interface:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LRUCache<K, V> implements Cache<K, V> {
    private int size;   // maximum capacity of the cache
    private Map<K, LinkedListNode<CacheElement<K, V>>> linkedListNodeMap;
    private DoublyLinkedList<CacheElement<K, V>> doublyLinkedList;

    public LRUCache(int size) {
        this.size = size;
        this.linkedListNodeMap = new ConcurrentHashMap<>(size);
        this.doublyLinkedList = new DoublyLinkedList<>();
    }
    // ... other methods
}

In LRUCache we define a Map plus our custom doubly linked list, DoublyLinkedList, both initialized in the constructor.
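LRUCache relies on three helper types that this article does not list in full: CacheElement, LinkedListNode, and DoublyLinkedList. Below is a minimal sketch of what they could look like; the field and method names are inferred from how the LRUCache code uses them, so treat it as an assumption rather than the original author's classes. The updateAndMoveToFront() and moveToFront() methods of DoublyLinkedList are shown later in this article.

// --- Minimal sketches of the helper classes used by LRUCache (assumptions) ---

// The key/value pair stored in each list node
class CacheElement<K, V> {
    private final K key;
    private final V value;

    CacheElement(K key, V value) {
        this.key = key;
        this.value = value;
    }

    K getKey()   { return key; }
    V getValue() { return value; }
}

// A node of the doubly linked list; the "dummy" node (isEmpty() == true)
// is returned whenever an operation cannot be carried out
class LinkedListNode<T> {
    T element;
    LinkedListNode<T> prev;
    LinkedListNode<T> next;
    DoublyLinkedList<T> list;   // the list this node belongs to

    LinkedListNode(T element, DoublyLinkedList<T> list) {
        this.element = element;
        this.list = list;
    }

    boolean isEmpty()                      { return element == null; }
    T getElement()                         { return element; }
    DoublyLinkedList<T> getListReference() { return list; }
}

// Doubly linked list: the most recently used node sits at the head,
// the least recently used node at the tail
class DoublyLinkedList<T> {
    LinkedListNode<T> dummyNode = new LinkedListNode<>(null, this);
    LinkedListNode<T> head;   // most recently used
    LinkedListNode<T> tail;   // least recently used

    // Create a node for the value, link it at the head, and return it
    LinkedListNode<T> add(T value) {
        LinkedListNode<T> node = new LinkedListNode<>(value, this);
        node.next = head;
        if (head != null) {
            head.prev = node;
        }
        head = node;
        if (tail == null) {
            tail = node;
        }
        return node;
    }

    // Unlink a node from the list
    void detach(LinkedListNode<T> node) {
        if (node.prev != null) { node.prev.next = node.next; } else { head = node.next; }
        if (node.next != null) { node.next.prev = node.prev; } else { tail = node.prev; }
        node.prev = null;
        node.next = null;
    }

    // Remove and return the tail (least recently used) node, or the dummy node if the list is empty
    LinkedListNode<T> removeTail() {
        if (tail == null) {
            return dummyNode;
        }
        LinkedListNode<T> last = tail;
        detach(last);
        return last;
    }

    // updateAndMoveToFront() and moveToFront() are shown later in this article
}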

Next, let's implement the concrete operations.

The put operation

public boolean put(K key, V value) {
    CacheElement<K, V> item = new CacheElement<K, V>(key, value);
    LinkedListNode<CacheElement<K, V>> newNode;
    // If the map already contains the key, it is a cache hit
    if (this.linkedListNodeMap.containsKey(key)) {
        // Fetch the existing node from the map
        LinkedListNode<CacheElement<K, V>> node = this.linkedListNodeMap.get(key);
        // Update the data and move it to the head of the list
        newNode = doublyLinkedList.updateAndMoveToFront(node, item);
    } else {
        // Cache miss. If the capacity is used up, evict an element first
        if (this.size() >= this.size) {
            this.evictElement();
        }
        // Create a new node and add it to the head of the list
        newNode = this.doublyLinkedList.add(item);
    }
    if (newNode.isEmpty()) {
        return false;
    }
    // Put the list node into the map
    this.linkedListNodeMap.put(key, newNode);
    return true;
}

If the key already exists (a cache hit), the entry is updated and moved to the head of the linked list. Otherwise it is a miss: a new node has to be added to the cache, and we first check whether an element needs to be evicted.

The node is moved to the front of the list in the updateAndMoveToFront() method, which looks like this:

public LinkedListNode<T> updateAndMoveToFront(LinkedListNode<T> node, T newValue) {
    // The node must not be empty and must belong to the current list
    if (node.isEmpty() || (this != node.getListReference())) {
        return dummyNode;
    }
    // Detach the original node from the list
    detach(node);
    // Add a node with the new value to the head of the list
    add(newValue);
    return head;
}

When an element needs to be evicted, evictElement() is executed:

private boolean evictElement() {
    // Remove the tail (least recently used) node
    LinkedListNode<CacheElement<K, V>> linkedListNode = doublyLinkedList.removeTail();
    if (linkedListNode.isEmpty()) {
        return false;
    }
    // Also remove its key from the map
    linkedListNodeMap.remove(linkedListNode.getElement().getKey());
    return true;
}

The get operation

public Optional<V> get(K key) {
    LinkedListNode<CacheElement<K, V>> linkedListNode = this.linkedListNodeMap.get(key);
    if (linkedListNode != null && !linkedListNode.isEmpty()) {
        // Move the hit node to the head of the list and put it back into the map
        linkedListNodeMap.put(key, this.doublyLinkedList.moveToFront(linkedListNode));
        return Optional.of(linkedListNode.getElement().getValue());
    }
    return Optional.empty();
}

Get is simple: check whether the node exists and is not empty. If it does, move it to the head of the list and put it back into the Map. One difference from the put operation is that it uses the moveToFront() method:

public LinkedListNode<T> moveToFront(LinkedListNode<T> node) {
    return node.isEmpty() ? dummyNode : updateAndMoveToFront(node, node.getElement());
}
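With put() and get() in place, here is a small usage sketch that replays the earlier example with a capacity of 3. It assumes size() is implemented to return the current number of cached entries (the article leaves size(), isEmpty(), and clear() unimplemented):

public class LruCacheDemo {
    public static void main(String[] args) {
        Cache<Integer, String> cache = new LRUCache<>(3);
        cache.put(5, "five");
        cache.put(6, "six");
        cache.put(9, "nine");

        cache.get(5);                      // touch 5, so 6 is now the least recently used
        cache.put(3, "three");             // the cache is full: 6 is evicted, not 5

        System.out.println(cache.get(6));  // Optional.empty
        System.out.println(cache.get(5));  // Optional[five]
    }
}

Because 5 was touched by get() before 3 was put, 6 (not 5) is the least recently used element and gets evicted.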

At this point, the basic function of our cache is complete. But is there a problem?

There is a problem: we have not considered concurrent scenarios. For our LRUCache to be thread safe, all operations need to be synchronized.

You can use synchronized or a Lock to achieve synchronization. In cache scenarios, concurrent reads do not conflict with each other; only read/write and write/write accesses need to be synchronized. Since synchronized does not support read/write locks, we use a ReentrantReadWriteLock.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LRUCache<K, V> implements Cache<K, V> {
    private int size;
    private final Map<K, LinkedListNode<CacheElement<K, V>>> linkedListNodeMap;
    private final DoublyLinkedList<CacheElement<K, V>> doublyLinkedList;
    // Define a read/write lock
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public LRUCache(int size) {
        this.size = size;
        this.linkedListNodeMap = new ConcurrentHashMap<>(size);
        this.doublyLinkedList = new DoublyLinkedList<>();
    }
    // ...
}

Write lock

In our LRUCache, the operations that need the write lock are put() and evictElement().

public boolean put(K key, V value) {
    // Acquire the write lock
    this.lock.writeLock().lock();
    try {
        CacheElement<K, V> item = new CacheElement<K, V>(key, value);
        LinkedListNode<CacheElement<K, V>> newNode;
        if (this.linkedListNodeMap.containsKey(key)) {
            LinkedListNode<CacheElement<K, V>> node = this.linkedListNodeMap.get(key);
            newNode = doublyLinkedList.updateAndMoveToFront(node, item);
        } else {
            if (this.size() >= this.size) {
                this.evictElement();
            }
            newNode = this.doublyLinkedList.add(item);
        }
        if (newNode.isEmpty()) {
            return false;
        }
        this.linkedListNodeMap.put(key, newNode);
        return true;
    } finally {
        // Release the write lock
        this.lock.writeLock().unlock();
    }
}

And evictElement():

private boolean evictElement() {
    // Acquire the write lock
    this.lock.writeLock().lock();
    try {
        LinkedListNode<CacheElement<K, V>> linkedListNode = doublyLinkedList.removeTail();
        if (linkedListNode.isEmpty()) {
            return false;
        }
        linkedListNodeMap.remove(linkedListNode.getElement().getKey());
        return true;
    } finally {
        // Release the write lock
        this.lock.writeLock().unlock();
    }
}

Read lock

When reading data, we also add read locks.

public Optional<V> get(K key) {
    // Acquire the read lock
    this.lock.readLock().lock();
    try {
        LinkedListNode<CacheElement<K, V>> linkedListNode = this.linkedListNodeMap.get(key);
        if (linkedListNode != null && !linkedListNode.isEmpty()) {
            linkedListNodeMap.put(key, this.doublyLinkedList.moveToFront(linkedListNode));
            return Optional.of(linkedListNode.getElement().getValue());
        }
        return Optional.empty();
    } finally {
        // Release the read lock
        this.lock.readLock().unlock();
    }
}

LRUCache now supports concurrent use.
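As a rough usage illustration only (not a proof of thread safety), here is a sketch of exercising the cache from several threads; it again assumes a size() implementation that returns the current number of entries:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentLruDemo {
    public static void main(String[] args) throws InterruptedException {
        Cache<Integer, Integer> cache = new LRUCache<>(100);
        ExecutorService pool = Executors.newFixedThreadPool(8);

        for (int t = 0; t < 8; t++) {
            pool.submit(() -> {
                for (int i = 0; i < 10_000; i++) {
                    int key = i % 200;      // more distinct keys than capacity, so evictions happen
                    cache.put(key, i);
                    cache.get(key);
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("entries in cache: " + cache.size());
    }
}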

Summary

The LRU algorithm is a commonly used eviction algorithm, and it performs very well when there is hot data. However, LRU also has a shortcoming: a burst of one-off accesses to cold data can flush the truly hot data out of the cache. This condition is known as cache pollution.

LRU's extended algorithm LRU-K, as well as the commonly used LFU algorithm, can help mitigate the cache pollution problem.
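To make the cache pollution problem concrete, here is a small sketch that reuses the LinkedHashMap-based cache from the beginning of the article: a one-off scan of cold keys pushes a hot key out of the cache even though the hot key will be needed again.

import java.util.LinkedHashMap;
import java.util.Map;

public class CachePollutionDemo {
    public static void main(String[] args) {
        int capacity = 3;
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > capacity;
            }
        };

        cache.put(1, "hot");               // a genuinely hot key
        cache.get(1);

        // A one-off batch scan of cold keys fills the cache...
        cache.put(100, "cold");
        cache.put(101, "cold");
        cache.put(102, "cold");

        // ...and the hot key has been evicted even though it will be needed again
        System.out.println(cache.containsKey(1));   // false
    }
}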

That's all for this episode. If it's been helpful, give Xiao Hei a thumbs up.

I am Xiao Hei, a programmer getting by in the Internet industry.

Water does not compete to be first; it simply keeps flowing.