Talk about cache obsolescence LRU and LFU algorithms

preface

The LRU algorithm and the LFU algorithm are one of the algorithms for page replacement, or more generally, a strategy for how caching is rendered obsolete.

We usually design a system, because the database reading speed is far less than the memory reading speed, so in order to speed up the reading speed, we will put part of the data in memory, called cache.

However, memory capacity is limited, when you want to cache more data than the capacity, you have to delete some data, which data deletion, which data retention, LRU algorithm and LFU algorithm to do.

What is the LRU algorithm

LRU algorithm: Least recently used, that is, the Least recently used. The idea of the LRU algorithm is that if the data has been accessed recently, there is a high probability that it will be accessed in the future.

According to this idea, we can imagine that the implementation of LRU algorithm must use a list/linked list, the purpose is to ensure order; Another uses hash tables, because caches often take the form of key-value pairs.

The simple method is to use list + hash table, but the time complexity of query and insert in this way is O(n). Another method is to use bidirectional linked list + hash table, the time complexity of query and insert is O(1), but it consumes more space resources.

List + hash table implementation

As shown in the figure above, assuming that we use the header interpolation method, that is, the most recently accessed element is placed first and the latest element is placed last, then the steps to implement the LRU algorithm are as follows:

1. Initialize an empty list, as shown in the figure above, whose container is 5.
2. When performing the find (GET) operation, it is necessary to traverse the whole list to find the element with the same key, and the time complexity is O(n).
3. When performing an insert operation, there are three cases:
3.1 Iterates through the list, updates the value of value if the element’s key exists, and places the element at the top of the list to ensure that the last element is the last accessed.
3.2 Iterate over the list. If the element’s key does not exist and the list is not full, place the element at the top of the list
3.3 Iterate over the list. If the key of an element does not exist and the list is full, delete the last element and place the new element at the top

For a source implementation of this approach, see Leetcode’s LRU source implementation

Bidirectional linked list + hash table implementation

This should be a common question in interviews. Interviewers may ask you, what if you implement an O(1) LRU cache?

The structure of this implementation is as follows:

Its structure is as follows (Golang) :

type LinkNode struct{
    key, value int
    pre, next *LinkNode
}

type LRUCache struct {
    m map[int]*LinkNode
    capacity int
    head, tail *LinkNode
}
Copy the code

M of the above LRUCache is actually the HashMap on the left of the figure above, whose purpose is to realize the time complexity of the search as O(1).

If you don’t have this m, then when you’re looking for an element, you need to traverse the bidirectional list in order n time.

The main reason for using bidirectional linked lists is to achieve O(1) time complexity of node insertion/deletion.

Can’t you use a single linked list?

As mentioned above, with a single linked list, new nodes can be quickly inserted in the head and old nodes can be quickly deleted in the tail, and the time complexity is O(1).

But for the intermediate node, such as the value of the node 1 I need update from 2 to 4, this time in addition to update the value, you also need to move it to the front, for the singly linked lists, it know only the next element, an element does not know, in order to get on an element, it must traverse a list just know, the time complexity is O (n), That’s why we use bidirectional lists.

For a source implementation of this approach, see the Leetcode LRU bidirectional list implementation

What is the LFU algorithm

The LFU algorithm is the Least frequently used algorithm. The idea of LFU algorithm is that the node that has been accessed least times in a certain period has the least chance to be accessed in the future.

As you can see, LFU emphasizes access times, while LRU emphasizes access times.

LFU can be implemented in two ways, one is hash table + balanced binary tree, and the other is double hash table. The following uses double hash table as an example to explain the specific steps of LFU:

Double hash table implementation

The implementation of a double hash table is shown as follows:

As above, a double hash table requires two hashes to be maintained and a count minFreq that is least frequently used.

The first hash table is freq_table, whose key is the frequency of access, and whose value is a bidirectional linked list. Each bidirectional linked list node contains three elements: key, value, and count.

The second hash table is key_table, whose key is the key stored in the bidirectional list, and whose value is the pointer to the corresponding node (thus the lookup time is O(1)).

By analogy with LRU, the steps of LFU are as follows:

1. Assume that the LFU cache capacity is 3, and three key-value pairs (1, 1), (2,2), and (3, 3) are initially inserted at the beginning. At this time, the frequency of each key-value pair is 1, so they are all on the same bidirectional linked list, as shown in figure 4.
2. Suppose the search key=1, because key_table stores Pointers to nodes, the complexity of O(1) can be obtained.
2.1 Notice that the frequency of node 1 changes from 1 to 2, so node 1 should be moved to the linked list of frequency 2, as shown in Figure 5
2.2 In addition, minFreq should also remember to synchronize updates, but this operation is not necessary.
3. Suppose a new key-value pair (4,4) is inserted at this time. Since its frequency is 1, it will be inserted at the beginning of linked list 1, and the last element in the same list must be inserted last because of this operation.
3.1 as newly added elements lead to capacity overflow, we need to delete the one with the least frequency and the latest insertion time, namely (3,3,1) in FIG. 5.

Note: If you still don’t understand the above steps, please leave a comment or check the official explanation of Leetcode LFU

Talk about cache obsolescence LRU and LFU algorithms

preface

What is the LRU algorithm

List + hash table implementation

Bidirectional linked list + hash table implementation

What is the LFU algorithm

Double hash table implementation

Related Posts

Python (18) : Basic file manipulation

Remember a Go NET library DNS troubleshooting

Springboot series (4) : Springboot stater introduction | super dry, suggested that collection