Abstract

Both caching algorithms have to solve the same core difficulty: how to quickly pick and delete the right entry when the cache is full.

The difference is:

  • LRU deletes the "least recently used" data.
  • LFU is more complex: it deletes the "least frequently used" data, and among entries with the same frequency, the least recently used one.

So although both are solved with the help of "hash table + doubly linked list", there are small differences between them.

LRU (Least Recently Used)

146. LRU Cache – LeetCode

Using your knowledge of data structures, design and implement an LRU (least recently used) caching mechanism. Implement the LRUCache class:

  • LRUCache(int capacity) initializes the LRU cache with a positive integer capacity.
  • int get(int key) returns the value of the key if it exists in the cache; otherwise returns -1.
  • void put(int key, int value) updates the value if the key already exists; otherwise inserts the key-value pair. When the cache reaches its capacity, it should evict the least recently used entry before inserting the new one.

Advanced: Can you do both in O(1) time?

So let's focus on this: how do we do both operations in O(1) time?

get is easy: a hash table is the natural choice for O(1) lookup.

The key question is how to find the "least recently used" entry and delete it in O(1) time when the cache is full.

The answer is a doubly linked list.
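Why a doubly linked list? Given a reference to a node, unlinking it only touches its two neighbors, so no traversal is needed. A minimal sketch (the `DNode` and `unlink` names here are my own illustration, not part of the solution below):

```swift
// Minimal sketch: with pre/nx pointers, removing a known node is O(1),
// because we only rewire its two neighbours -- no traversal needed.
final class DNode {
    var key = 0, value = 0
    var pre: DNode?
    var nx: DNode?
}

/// Unlink `node` from its list in O(1).
func unlink(_ node: DNode) {
    node.pre?.nx = node.nx
    node.nx?.pre = node.pre
    node.pre = nil
    node.nx = nil
}
```

A singly linked list cannot do this: deleting a node requires a reference to its predecessor, which costs O(n) to find.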

Answer key

List + hash table [Int:Node]

Key point: you need to ensure that the head of the List is always the most recently used entry.

Algorithm:

On get, move the accessed node to the head of the List.

When put:

  1. If the key already exists, update the value and move its node to the head of the List.
  2. If the cache is full, delete the tail of the List (the least recently used entry).
  3. Otherwise, insert the new node directly at the head of the List.


The text description alone is not very concrete, so look directly at the code:

class LRUCache {
    // Two-way linked list
    class Node {
        var key = 0, value = 0
        var pre: Node?
        var nx: Node?
    }

    class List {
        var head: Node?
        var tail: Node?

        func appendToHead(_ node: Node) {
            if let h = head {
                node.nx = h
                h.pre = node
                head = node
                return
            }

            // no head
            head = node
            tail = node
            node.pre = nil
            node.nx = nil
        }

        func moveToHead(_ node: Node) {
            guard let _ = head else {
                // There are no nodes in the linked list
                appendToHead(node)
                return
            }
            // delete the node from its current position
            delete(node)

            // Move to the head
            appendToHead(node)
        }

        /// delete a node
        func delete(_ node: Node) {
            guard let _ = head, let t = tail else {
                // no nodes
                return
            }

            if node === t { // Tail needs to be updated
                tail = t.pre
                tail?.nx = nil
            }

            // Use sentinel nodes to simplify code
            // Case 1 has only one node
            // Case 2 has more than one node

            let dummy = Node()
            dummy.nx = head
            head?.pre = dummy

            node.pre?.nx = node.nx
            node.nx?.pre = node.pre

            head = dummy.nx
            head?.pre = nil
        }

        /// clean up
        func removeAll() {
            head = nil
            tail = nil
        }
    }

    private var capacity = 0

    /// [key : node]
    private var kvT = [Int: Node]()

    private var list = List()

    // Set the maximum capacity during initialization
    init(_ capacity: Int) {
        self.capacity = capacity
    }

    func get(_ key: Int) -> Int {
        // If you want to ensure thread safety, you can lock

        // read
        let v = self._get(key)

        return v
    }

    private func _get(_ key: Int) -> Int {
        guard capacity > 0, let n = kvT[key] else {
            return -1
        }

        _handleGetNode(n)

        return n.value
    }


    /// move the node to the head of the List
    private func _handleGetNode(_ n: Node) {
        list.moveToHead(n)
    }


    func put(_ key: Int, _ value: Int) {
        // If you want to ensure thread safety, you can lock

        // write
        self._put(key, value)
    }

    /// 1. If the key already exists, update the value and move it to the head of the List
    /// 2. If the cache is full, delete the end of the List (least recently used)
    /// 3. Otherwise, insert it directly at the head of the List
    private func _put(_ key: Int, _ value: Int) {
        if let n = kvT[key] { // already exists
            n.value = value
            _handleGetNode(n)
            return
        }
        // The cache is full
        _handleFull()

        // create the new node
        let n = Node()
        n.key = key
        n.value = value

        // Insert into kvT
        kvT[key] = n

        // Insert into the List header
        list.appendToHead(n)
    }

    // The cache is full: delete the tail node
    private func _handleFull() {
        guard kvT.count == capacity else {
            return
        }

        guard let n = list.tail else {
            return
        }
        list.delete(n) // Delete the tail node
        kvT.removeValue(forKey: n.key) // Remove from kvT
    }
}
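To sanity-check the expected behaviour, here is a tiny self-contained O(n) reference version followed by the classic LeetCode trace. This simplified class and its names (`SimpleLRU`, `order`, `touch`) are my own illustration, not part of the solution above; it trades the O(1) linked list for a plain array to keep the logic obvious.

```swift
// A tiny O(n) reference LRU, used only to check the expected behaviour
// of the O(1) implementation above. Assumes a positive capacity.
final class SimpleLRU {
    private let capacity: Int
    private var values: [Int: Int] = [:]
    private var order: [Int] = []   // most recently used key is last

    init(_ capacity: Int) { self.capacity = capacity }

    func get(_ key: Int) -> Int {
        guard let v = values[key] else { return -1 }
        touch(key)
        return v
    }

    func put(_ key: Int, _ value: Int) {
        if values[key] != nil {
            values[key] = value
            touch(key)
            return
        }
        if values.count == capacity, let lru = order.first {
            values[lru] = nil      // evict the least recently used key
            order.removeFirst()
        }
        values[key] = value
        order.append(key)
    }

    /// Mark `key` as most recently used.
    private func touch(_ key: Int) {
        if let i = order.firstIndex(of: key) {
            order.remove(at: i)
            order.append(key)
        }
    }
}

// The classic LeetCode trace:
let cache = SimpleLRU(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))   // 1
cache.put(3, 3)       // full: evicts key 2 (least recently used)
print(cache.get(2))   // -1
```

The O(1) class above should produce exactly the same trace; the only difference is that `touch` there is a pointer rewire instead of an array scan.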

LFU (Least Frequently Used)

460. LFU cache

Design and implement a data structure for a least frequently used (LFU) cache.

Implement the LFUCache class:

  • LFUCache(int capacity) – initializes the object with the given capacity.
  • int get(int key) – returns the value of the key if it exists in the cache; otherwise returns -1.
  • void put(int key, int value) – updates the value if the key already exists; otherwise inserts the key-value pair. When the cache reaches its capacity, the least frequently used item should be evicted before the new item is inserted. In the case of a tie (two or more keys with the same use frequency), the least recently used key should be evicted.

Note that a key's use count is the number of get and put calls made on it since it was inserted. The count resets to 0 when the key is evicted.

To determine the least frequently used key, you can maintain a use counter for each key in the cache; the key with the smallest counter is the least frequently used.

When a key is first inserted, its use counter is set to 1 (because of the put operation). Each subsequent get or put on that key increments the counter.

Compared with LRU, there is one more condition to consider when deleting: frequency of use.

Answer key

Data structures: doubly linked lists + 2 hash tables

kvT: [Int: Node] and freqT: [Int: List].

The following points should be noted:

  1. Keep an extra minFreq variable that tracks the minimum use frequency.
  2. When adding a node to a List, always add it at the head.

Algorithm:

On get, update the node's frequency and its position in freqT.

When put:

  1. If the key already exists, update the value, increment its use frequency, and update its position in freqT.
  2. If the cache is full, delete the tail of the minFreq List.
  3. Otherwise, insert the new node at the head of the frequency-1 List.


The text description alone is not very concrete, so look directly at the code:

class LFUCache {
    class Node {
        var key = 0, freq = 0, value = 0
        var pre: Node?
        var nx: Node?
    }

    class List {
        var head: Node?
        var tail: Node?

        func appendToHead(_ node: Node) {
            if let h = head {
                node.nx = h
                h.pre = node
                head = node
                return
            }

            // no head
            head = node
            tail = node
            node.pre = nil
            node.nx = nil
        }

        func delete(_ node: Node) {
            guard let _ = head, let t = tail else {
                // no nodes
                return
            }

            if node === t { // Tail needs to be updated
                tail = t.pre
                tail?.nx = nil
            }

            // Use sentinel nodes to simplify code
            // Case 1 has only one node
            // Case 2 has more than one node

            let dummy = Node()
            dummy.nx = head
            head?.pre = dummy

            node.pre?.nx = node.nx
            node.nx?.pre = node.pre

            head = dummy.nx
            head?.pre = nil
        }
    }

    private var capacity = 0

    /// [key : node]
    private var kvT = [Int: Node]()

    /// [freq : List]
    private var freqT = [Int: List]()

    /// minimum frequency, to make eviction easy
    private var minFreq = 0

    init(_ capacity: Int) {
        self.capacity = capacity
    }

    func get(_ key: Int) -> Int {
        guard capacity > 0, let n = kvT[key] else {
            return -1
        }

        _handleGetNode(n)

        return n.value
    }

    private func _handleGetNode(_ n: Node) {
        // save new freq temporarily
        let nf = n.freq + 1

        // Delete from freq's list
        freqT[n.freq]?.delete(n)

        // Add to the list header of freq+1
        let l = freqT[nf] ?? List()
        l.appendToHead(n)
        freqT[nf] = l

        // Update minFreq, the latter condition is related to the List structure
        if minFreq == n.freq, freqT[minFreq]?.head == nil {
            minFreq = nf
        }

        // update freq
        n.freq = nf
    }

    func put(_ key: Int, _ value: Int) {
        if let n = kvT[key] { // already exists
            n.value = value
            _handleGetNode(n)
            return
        }
        // The cache is full
        _handleFull()

        // insert the new node
        let n = Node()
        n.key = key
        n.value = value
        n.freq = 1

        // Insert into kvT
        kvT[key] = n

        // Insert into freqT
        let list1 = freqT[1] ?? List()
        list1.appendToHead(n)
        freqT[1] = list1

        minFreq = 1
    }

    // The cache is full: delete the tail node of the minFreq list
    private func _handleFull() {
        guard kvT.count == capacity else {
            return
        }

        guard let n = freqT[minFreq]?.tail else {
            return
        }
        freqT[minFreq]?.delete(n) // Delete the tail node
        kvT.removeValue(forKey: n.key) // Remove from kvT
    }
}
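As with LRU, a tiny self-contained O(n) reference version makes the eviction rule easy to check. This class and its names (`SimpleLFU`, the `tick` clock, `entries`) are my own illustration, not the solution above; it stores (value, frequency, last-use time) per key and scans for the victim instead of maintaining freqT and minFreq.

```swift
// A tiny O(n) reference LFU for checking expected behaviour: evict the
// smallest frequency, breaking ties by the smallest last-use time.
final class SimpleLFU {
    private let capacity: Int
    private var tick = 0   // global "use clock" for recency tie-breaking
    private var entries: [Int: (value: Int, freq: Int, last: Int)] = [:]

    init(_ capacity: Int) { self.capacity = capacity }

    func get(_ key: Int) -> Int {
        guard capacity > 0, let e = entries[key] else { return -1 }
        tick += 1
        entries[key] = (e.value, e.freq + 1, tick)
        return e.value
    }

    func put(_ key: Int, _ value: Int) {
        guard capacity > 0 else { return }
        tick += 1
        if let e = entries[key] {
            entries[key] = (value, e.freq + 1, tick)
            return
        }
        if entries.count == capacity {
            // Victim: lowest freq, then least recently used (smallest last).
            let victim = entries.min { a, b in
                (a.value.freq, a.value.last) < (b.value.freq, b.value.last)
            }!
            entries[victim.key] = nil
        }
        entries[key] = (value, 1, tick)   // new keys start at frequency 1
    }
}

// LeetCode 460 trace:
let lfu = SimpleLFU(2)
lfu.put(1, 1)
lfu.put(2, 2)
print(lfu.get(1))   // 1  (key 1 now has freq 2)
lfu.put(3, 3)       // full: evicts key 2 (freq 1, lowest)
print(lfu.get(2))   // -1
print(lfu.get(3))   // 3
```

The freqT + minFreq structure above computes the same victim without the O(n) scan: the tail of the minFreq list is exactly the lowest-frequency, least-recently-used node.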

Summary

The core differences between the two algorithms and their solutions are as follows:

Caching mechanism | What to delete when the cache is full                      | Data structures
LRU               | The least recently used data                               | Doubly linked list + 1 hash table
LFU               | The least frequently used (then least recently used) data  | Doubly linked list + 2 hash tables

In my opinion, these two problems really demonstrate the power of the doubly linked list.

Its advantage is that insertion and deletion both run in O(1) time.

Combined with a hash table, lookup is O(1) as well.

Digression

When I worked on these two problems, I also looked at the source code of NSCache on iOS and found that the underlying implementation is much the same.

As a practical example, it should be easier to read than a dry algorithm problem.

Interested readers can look at the source code – iOS cache NSCache.