Dig deep into LinkedHashMap

preface

The first time I saw a LinkedHashMap, during the summer reading of the Collection chapter in Java Core Technology Volume I, it was stated that LinkedHashMap could iterate over elements in access order and implement an LRU strategy based on LinkedHashMap. The next time I saw it, I was working on this problem: 146. LRU caching mechanism, I used LinkedHashMap to do it directly. Today, we’ll take a look at the source code to see how LinkedHashMap can iterate over elements in access order, and how it can make an LRU.

LinkedHashMap Basic structure

public class LinkedHashMap<K.V>
    extends HashMap<K.V>
    implements Map<K.V>
{
    // LinkedHashMap node structure
    static class Entry<K.V> extends HashMap.Node<K.V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next); }}// The first and last nodes of the bidirectional list
    transient LinkedHashMap.Entry<K,V> head;
    transient LinkedHashMap.Entry<K,V> tail;
    // Whether to iterate in access order
    final boolean accessOrder;
}
Copy the code

First look at the basic structure and attributes of LinkedHashMap, it can be seen that it inherits the HashMap class, and also implements the Map interface, so it can use the methods and attributes of HashMap. The LinkedHashMap has its own Node structure: the inner class Entry, which, as you can see, inherits the Nodes of the HashMap, as well as the head and tail Pointers of the Entry type. Seeing this, we assume that the LinkedHashMap maintains a bidirectional list structure, but we’ll get to the details later. Finally, accessOrder is used to indicate whether the LinkedHashMap organizes the list structure in order of access.

LinkedHashMap initialization

public LinkedHashMap(int initialCapacity, float loadFactor) {
    super(initialCapacity, loadFactor);
    accessOrder = false;
}

public LinkedHashMap(int initialCapacity) {
    super(initialCapacity);
    accessOrder = false;
}

public LinkedHashMap() {
    super(a); accessOrder =false;
}

public LinkedHashMap(Map<? extends K, ? extends V> m) {
    super(a); accessOrder =false;
    putMapEntries(m, false);
}
AccessOrder can be specified
public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}
Copy the code

The initializer section, like most classes, has several versions of the constructor. Note that only the last function can specify accessOrder. The constructors are called from the parent HashMap constructor, which you can check out on the HashMap blog.

NewNode method

To talk about other methods of LinkedHashMap, we must start with newNode. Let’s go back to newNode in HashMap:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        // Create a node
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Create a node
                    p.next = newNode(hash, key, value, null);
        // A bunch of other code

/ / newNode method
Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) {
    return new Node<>(hash, key, value, next);
}
Copy the code

For example, in putVal of HashMap, newNode appears twice, once hashing the corresponding bucket to find that the bucket is empty, which must create a newNode, and once traversing the end of the current bucket, creating a newNode to the end of the bucket’s single linked list. The newNode is simply a new HashMap node. These are all HashMap operations, which are briefly mentioned here. LinkedHashMap overrides this function, so let’s see:

Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    LinkedHashMap.Entry<K,V> p =
        new LinkedHashMap.Entry<>(hash, key, value, e);
    // Insert the new node to the end of the list
    linkNodeLast(p);
    return p;
}  
// Insert the new node to the end of the list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    tail = p;
    if (last == null)
        head = p;
    else{ p.before = last; last.after = p; }}Copy the code

This Entry, as mentioned earlier, inherits the Node class of the HashMap. As we’ll see later, LinkedHashMap itself does not override the put related methods. So if LinkedHashMap calls putVal, newNode is called when a newNode is created. On the one hand, Entry is a Node subclass of HashMap and can be placed in the bucket + list structure of HashMap. On the other hand, This Entry node is then organized into the end of the unique LinkedHashMap two-way list structure.

In other words, LinkedHashMap actually maintains two structures: a bucket + list or red-black tree of the HashMap, and its own bidirectional list structure. This list structure is used to provide operations such as iteration order.

The get process

Put and get are the most common operations in the Map family. Let’s start with get. LinkedHashMap overwrites two get methods:

public V get(Object key) {
     Node<K,V> e;
     if ((e = getNode(hash(key), key)) == null)
         return null;
     if (accessOrder)
         afterNodeAccess(e);
     return e.value;
 }

 public V getOrDefault(Object key, V defaultValue) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return defaultValue;
    if (accessOrder)
        afterNodeAccess(e);
    return e.value;
}
Copy the code

As you can see, after both gets are complete, the following code will be run:

if (accessOrder)
    afterNodeAccess(e);
Copy the code

Take a look at what this code is doing:

void afterNodeAccess(Node<K,V> e) { // move node to last
    LinkedHashMap.Entry<K,V> last;
    AccessOrder is true and the node is not at the end of the list
    if(accessOrder && (last = tail) ! = e) { LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;// Delete node p (e) from the original list
        p.after = null;
        if (b == null)
            head = a;
        else
            b.after = a;
        if(a ! =null)
            a.before = b;
        else
            last = b;
        // insert p at the end of the list
        if (last == null)
            head = p;
        else{ p.before = last; last.after = p; } tail = p; ++modCount; }}Copy the code

This code looks very long, but it’s actually very simple. It just moves the node passed in the parameter to the end of the list. It’s removed and then inserted into the end of the list in two steps. LinkedList (LinkedList) – LinkedList (LinkedList)

So, when accessOrder is set to true, the get operation, and the value of the key is found, moves the node to the end of the bidirectional list maintained by the LinkedHashMap.

Put the process

public V put(K key, V value) {
    return putVal(hash(key), key, value, false.true);
}  
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    // A bunch of operations to put
    afterNodeInsertion(evict);
    return null;
}
Copy the code

LinkedHashMap does not override the PUT method itself, so it puts the HashMap put method instead. AfterNodeInsertion (EVICT); afterNodeInsertion(EVICT); What does it do? This is actually for LinkedHashMap. AfterNodeInsertion was rewritten in LinkedHashMap:

void afterNodeInsertion(boolean evict) { // possibly remove eldest
    LinkedHashMap.Entry<K,V> first;
    // Determine whether to delete it
    if(evict && (first = head) ! =null && removeEldestEntry(first)) {
        K key = first.key;
        removeNode(hash(key), key, null.false.true);
    }
}  
protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}
Copy the code

If (evict && (first = head)! = null && removeEldestEntry(first)). If true, the head of the list will be removed.

So if you look at the judgment condition, first look at EVICT.

public V putIfAbsent(K key, V value) {
    return putVal(hash(key), key, value, true.true);
}
public V put(K key, V value) {
    return putVal(hash(key), key, value, false.true);
}  
// Other functions that call putVal
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {}
Copy the code

This evict depends primarily on what the method calling putVal is passing. This is true in most cases except for initialization.

(first = head) ! = null indicates that the maintained list structure is not empty.

RemoveEldestEntry (first), this function returns false by default:

protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}
Copy the code

But it’s protected, and we can override it by inheritance.

Anyway, every time a new node is inserted, a judgment is made to see if the head of the list needs to be removed.

Problem solving

How do I iterate in access order?

Looking at the above functions, we can answer some questions, starting with how to iterate in order of access. AfterNodeAccess is called after each get. If accessOrder is set to true, afterNodeAccess will be inserted at the end of the list. This list is maintained in insertion order according to linkNodeLast in the source of the newNode method. Look at a simple use example:

public class someInterface {
    public static void main(String[] args){
        // Set accessOrder to true
        LinkedHashMap<String,Integer> testLink = new LinkedHashMap<>(10.0.75f,true);
        testLink.put("a".1);
        testLink.put("b".2);
        testLink.put("c".3);
        testLink.get("b");
        Iterator<Map.Entry<String,Integer>> iterator = testLink.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<String,Integer> next = iterator.next();
            System.out.println("key:" + next.getKey() + " value:"+ next.getValue()); }}}Copy the code

Here we initialize a LinkedHashMap object, set accessOrder to true, call GET after put, and look at the result:

It’s in the order that we would expect it to be accessed, and we end up accessing b, so it’s at the end of the list.

In addition to the setting of accessOrder and afterNodeAccess, the key to accessing in order is the unique iterator of LinkedHashMap:

public Set<Map.Entry<K,V>> entrySet() {
    Set<Map.Entry<K,V>> es;
    return (es = entrySet) == null ? (entrySet = new LinkedEntrySet()) : es;
}

final class LinkedEntrySet extends AbstractSet<Map.Entry<K.V>> {
    // Other functions
    public final Iterator<Map.Entry<K,V>> iterator() {
        return new LinkedEntryIterator();
    }
    // Other functions
}
final class LinkedEntryIterator extends LinkedHashIterator
    implements Iterator<Map.Entry<K.V>> {
    public final Map.Entry<K,V> next() { returnnextNode(); }}// The key to iteration
abstract class LinkedHashIterator {
    LinkedHashMap.Entry<K,V> next;
    LinkedHashMap.Entry<K,V> current;
    int expectedModCount;

    LinkedHashIterator() {
        // Initialize next
        next = head;
        expectedModCount = modCount;
        current = null;
    }

    public final boolean hasNext() {
        returnnext ! =null;
    }
    // Core code for iteration
    final LinkedHashMap.Entry<K,V> nextNode() {
        LinkedHashMap.Entry<K,V> e = next;
        if(modCount ! = expectedModCount)throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        current = e;
        // Next iteration
        next = e.after;
        returne; }}Copy the code

After a long series of calls, the key to the final iteration is the nextNode method of LinkedIterator. Next is initialized as head, and next = e.after; Set the value of the next next. This mechanism dictates that the LinkedHashMap iterator iterates over the self-maintained bidirectional list, quite unlike a traditional HashMap.

How to implement LRU?

public class someInterface {
    public static void main(String[] args){
        LinkedHashMap<String,Integer> testLink = new LinkedHashMap<>(10.0.75f,true) {/ / rewrite removeEldestEntry
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > 3; }}; testLink.put("a".1);
        testLink.put("b".2);
        testLink.put("c".3);
        testLink.get("a");
        testLink.put("d".4);
        Iterator<Map.Entry<String,Integer>> iterator = testLink.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<String,Integer> next = iterator.next();
            System.out.println("key:" + next.getKey() + " value:"+ next.getValue()); }}}Copy the code

If (evict && (first = head)! = null && removeEldestEntry(first)) returns false by default, so we need to override this function. In this example, we call return size() > 3; The maximum capacity is set to 3. If the capacity is greater than 3, the head of the list is removed when a new node is inserted, and according to the previous analysis, the tail of the list is the longest unused node. Let’s look at this code in action:

Testlink. get(“a”); , so the node of B was finally deleted, in line with the principle of LRU.

conclusion

LinkedHashMap inherits HashMap and has the functionality of HashMap
LinkedHashMap maintains two data structures, a HashMap structure and a bidirectional linked list for iteration
LinkedHashMap’s unique iterator design and some function rewrites cause iterators to iterate over bidirectional lists, and if not setaccessOrder, iterates in insert order; otherwise, iterates in access order
Through rewritingremoveEldestEntryCan realize LRU function