Java collection framework analysis (5)LinkedHashMap analysis

Introduction of LinkedHashMap

public class LinkedHashMap<K,V> extends HashMap<K,V> implements Map<K,V>
Copy the code

Inherited from HashMap, an ordered Map interface implementation where elements can be arranged in insertion order or access order; In contrast to HashMap, because LinkedHashMap is inherited from HashMap, LinkedHashMap is also implemented based on hash tables. Both Serializable and Cloneable interfaces are implemented, supporting serialization and cloning. And it’s also not thread-safe. The difference is that a bidirectional circular linked list is maintained internally, which is ordered and can be sorted by element insertion order or element most recently accessed order (LRU).

LinkedHashMap Data structure

LinkedHashMap not only carries out the storage mode of Entry array + next LinkedList based on hash table and single LinkedList like HashMap, but also combines the advantages of LinkedList to add precursors and successors for each Entry node. A header header node is added to construct a bidirectional circular linked list. That is, every time KV is put in, in addition to storing it in its place in the pair of hashing tables, it is inserted into the end of the bidirectional circular linked list.

The figure above shows the entire data structure of LinkedHashMap, including hash table and circular bidirectional linked list. Since there are too many lines in the circular bidirectional linked list, it is difficult to draw a simple node (circled in yellow). Note that the red arrow on the left refers to the next reference to the Entry node object (the single linked list in the hash table), and the green line refers to the before and after references to the Entry node object (the before and after references to the circular bidirectional list);

Above special circulation two-way linked list is extracted, intuitive, pay attention to the loop two-way chain table of the head store is one of the most long access nodes or the first insert node, the tail of a recent visit to or insert node recently, iterators iterate through the direction from the list of the head start to list the tail end, have an empty at the end of the list header node, This node does not store key-value content. It is the member attribute of the LinkedHashMap class and the entry of the circular bidirectional list.

LinkedHashMap source

Above is the LinkedHashMap source code analysis of the general knowledge, understand, to better analyze its source code, we will begin the formal analysis work.

Properties:

Private static Final Long serialVersionUID = 3801124242820219131L; Private TRANSIENT LinkedHashMapEntry<K,V> header; private transient LinkedHashMapEntry<K,V> header; Private final Boolean accessOrder; private final Boolean accessOrder;Copy the code

These properties are simple but important, and are not easy to understand because they are directly detailed at the beginning, but we will review what they mean when we finish analyzing the code. Let’s analyze its constructor.

Constructor analysis

Constructor that sets the initial capacity and load factor

/** * Public LinkedHashMap(int initialCapacity,float loadFactor) {
     super(initialCapacity, loadFactor);
     accessOrder = false;
 }
Copy the code

Constructor to set the initial capacity

/** * Sets the initial capacity constructor * @param initialCapacity The Initial capacity * @throws IllegalArgumentExceptionif the initial capacity is negative
  */
 public LinkedHashMap(int initialCapacity) {
     super(initialCapacity);
     accessOrder = false;
 }
Copy the code

The default constructor for empty arguments, default capacity 16 and load factor 0.75

With the default initial capacity (16) and load factor (0.75). */ publicLinkedHashMap() {
     super();
     accessOrder = false;
 }
Copy the code

Construct the LinkedHashMap using an existing Map

/** * Use an existing Map to construct LinkedHashMap * @param m the Map whose mappings are to be placedin this map
  * @throws NullPointerException if the specified map is null
  */
 public LinkedHashMap(Map<? extends K, ? extends V> m) {
     super(m);
     accessOrder = false;
 }
Copy the code

Constructor that sets the iteration order

/** * @param initialCapacity The initial capacity * @param loadFactor the load factor * @param accessOrder the ordering mode - <tt>true</tt> for
  *         access-order, <tt>false</tt> for insertion-order
  * @throws IllegalArgumentException if the initial capacity is negative
  *         or the load factor is nonpositive
  */
 public LinkedHashMap(int initialCapacity,
                      float loadFactor,
                      boolean accessOrder) {
     super(initialCapacity, loadFactor);
     this.accessOrder = accessOrder;
 }
Copy the code

These constructors are relatively simple, and as we mentioned briefly, if initialCapacity is not specified, the default is to use the initialCapacity of HashMap, which is 16. If loadFactor is not specified, the default value is 0.75. AccessOrder defaults to FASLse. The Boolean value is the flag bit for ordering elements in a bidirectional list.

If accessOrder is false, the list is sorted by insertion order when traversing the bidirectional list. AccessOrder, if true, indicates that the elements in the bidirectional list are sorted in order of access, with the least recently used element being traversed first.

As you can see from the constructor, the insertion order is used by default to maintain the order in which the key-value pairs are fetched. All constructors create objects by calling the constructor of the parent class.

And in the constructor of the parent class we can see that it calls init, init in the constructor of the Map class, so let’s go ahead and look at that

@Override
    void init() {
        header = new LinkedHashMapEntry<>(-1, null, null, null);
        header.before = header.after = header;
    }
Copy the code

This init method primarily initializes the header node to form a bidirectional linked list. After analyzing the constructor, let’s take a look at Entry, the most common attribute.

LinkedHashMapEntry analysis

// This Entry extends from HashMapEntry private static class LinkedHashMapEntry<K,V> extends HashMapEntry<K,V> fields comprise the doubly linked list usedforiteration. LinkedHashMapEntry<K,V> before, after; // constructor LinkedHashMapEntry(inthash, K key, V value, HashMapEntry<K,V> next) {
            super(hash, key, value, next); } // Remove a node private voidremove() { before.after = after; after.before = before; } private void addBefore(LinkedHashMapEntry<K,V> existingEntry) {after = existingEntry; before = existingEntry.before; before.after = this; after.before = this; } /* * This method is called in the put and get methods of the HashMap, which is empty in the HashMap. * In the LinkedHashMap, this method inserts the current node into the end of the list (the node before the head node) when sorting by access order. Void recordAccess(HashMap<K,V> m) {LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m; // When LinkedHashMap is sorted by accessif(lm.accessOrder) { lm.modCount++; Remove (); AddBefore (lm.header); } } void recordRemoval(HashMap<K,V> m) { remove(); }}Copy the code

Next, look at the most common method, put.

Put analysis

When we use LinkedHashMap’s PUT method, we find that it calls HashMap’s PUT method and does not copy it itself. In this case, we have two cases to discuss LinkedHashMap’s PUT operation.

The Key already exists

In the Put method of a HashMap, the recordAccess() method is called when the inserted key is found to already exist, in addition to doing the replacement, which is null in the HashMap. LinkedHashMap overwrites this method, which is also called when the LinkedHashMap overridden get method is called. LinkedHashMap does not overwrite the Put method in HashMap, RecordAccess () is implemented in LinkedHashMap as follows:

// If the currently specified accessOrder is TRUE, the current access Entry is placed at the end of the two-way circular list to indicate the most recent access, This is why there is an empty recordAccess(HashMap<K,V> m) method in hashmap.entry void recordAccess(HashMap<K,V> m) {LinkedHashMap<K,V> lm =  (LinkedHashMap<K,V>)m; //LRU algorithm to insert the visited node to the end of the listif(lm.accessOrder) { lm.modCount++; Remove (); AddBefore (lm.header); Private void addBefore(LinkedHashMapEntry<K,V> existingEntry) {after = existingEntry; before = existingEntry.before; before.after = this; after.before = this; }Copy the code

Key does not exist

During the process of putting a new Entry, if a key is found not to exist, in addition to putting the new Entry into the appropriate position in the hash table, the addEntry method is called, which calls the creatEntry method, which puts the newly inserted element at the end of the bidirectional linked list. This corresponds to both the order of insertion and the order of access.

// Create a node and insert it into the LinkedHashMap. This method overrides the HashMap addEntry method void addEntry(inthash, K key, V value, int bucketIndex) {// Note that the next node of the header is header.after, which is stored at the head of the list.  LinkedHashMapEntry<K,V> eldest = header.after; // Delete the least recently used node if necessaryif(eldest ! = header) { boolean removeEldest; size++; Try {// The implementation of the removeEldestEntry method, which Rimmer considersfalse
                removeEldest = removeEldestEntry(eldest);
            } finally {
                size--;
            }
            if(removeEldest) { removeEntryForKey(eldest.key); }} // Call the HashMap addEntry method super.addentry (hash, key, value, bucketIndex); } // Create a node and insert it into the end of the list void createEntry(inthash, K key, V value, int bucketIndex) {
        HashMapEntry<K,V> old = table[bucketIndex];
        LinkedHashMapEntry<K,V> e = new LinkedHashMapEntry<>(hash, key, value, old); table[bucketIndex] = e; // move it to the end of the bidirectional list e.ddbefore (header); size++; }Copy the code

There is a removeEldestEntry method in the addEntry method above. This method can be overridden, for example, to return true if the set memory is full, so that the least recently used node (after the header) can be removed.

Why does this method always return false?

Combined with the addEntry(int Hash,K key,V value,int bucketIndex) method above, this design makes the LinkedHashMap a normal Map without removing the “oldest” node. Why not just remove that logic from your code and design it like this? This is handy for developers, if they want to use the Map as a Cache and limit the size, they just inherit the LinkedHashMap and rewrite the removeEldestEntry(Entry<K,V> Younger) method, like this:

private static final int MAX_ENTRIES = 100;
protected boolean removeEldestEntry(Map.Entry eldest) {
      return size() > MAX_ENTRIES;
}
Copy the code

Any new element put in, regardless of the accessOrder flag bit, is placed at the end of the double-linked list, and the removeEldestEntry method can be overridden whenever the Lru algorithm needs to be implemented, eliminating the least recently used node.

The get analysis

// Override the get method in HashMap to get the Entry object through the getEntry method. // The recordAccess method does nothing if the list is sorted by insertion, and moves e to the end of the list if the list is sorted by access. public V get(Object key) { LinkedHashMapEntry<K,V> e = (LinkedHashMapEntry<K,V>)getEntry(key);if (e == null)
            return null;
        e.recordAccess(this);
        return e.value;
    }
    
void recordAccess(HashMap<K,V> m) {
            LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
            if(lm.accessOrder) { lm.modCount++; remove(); addBefore(lm.header); }}Copy the code

The get(Object key) method obtains the node through the getEntry(Object key) method of the HashMap and returns the value of the node. If the node is obtained, null is returned. Value is obtained by key, which differs from a HashMap in that when a LinkedHashMap is sorted by access order, the current node visited is moved to the end of the list (the node before the head node).

Here we summarize the details of how the accessOrder flag bit works.

1. AccessOrder does not work

For the PUT operation, we insert the node to the end of the list regardless of the accessOrder flag bit, but we can override the removeEldestEntry method whenever we need to implement the Lru algorithm, removing the least recently used node.

AccessOrder works

When we put, if the key is not null, we call the recordAccess method, where accessOrder takes effect. If accessOrder is fasle, nothing is done. That is, when we put in a key-value pair that already has a Key, its position in the double-linked list doesn’t change. When accessOrder is set to true, the PUT operation places the related element at the end of the double-linked list.

The other case is the get operation, the GET operation we’re also going to call the recordAccess method, and for this method, we need to determine the state of the accessOrder, and if accessOrder is fasle, we don’t do anything, That is, when we put in a key-value pair that already has a Key, its position in the double-linked list doesn’t change. When accessOrder is set to true, the PUT operation places the related element at the end of the double-linked list. From a cache point of view, this is so-called “dirty data”, that is, data that has been accessed recently, so that when memory needs to be cleaned up (when new elements are added), the node behind the head of the double-linked list (empty node) can be removed.

Uncommon method

At this point, basically the most important methods of LinkedHashMap have been analyzed, but there are still some less important methods that we will take a look at once.

//
@Override
void transfer(HashMapEntry[] newTable) {
    int newCapacity = newTable.length;
    for (LinkedHashMapEntry<K,V> e = header.after; e != header; e = e.after) {
        int index = indexFor(e.hash, newCapacity);
        e.next = newTable[index];
        newTable[index] = e;
    }
}
Copy the code

The Transfer (hashmap.entry [] newTable) method is also called in HashTable as is the init() method. The transfer(hashmap.entry [] newTable) method is called when the HashMap calls resize(int newCapacity). Compute e’s index in the new capacity table array based on the hash value of the list node E and insert E into the list referenced by the calculated index.

public boolean containsValue(Object value) {
        // Overridden to take advantage of faster iterator
        if (value==null) {
            for(LinkedHashMapEntry e = header.after; e ! = header; e = e.after)if (e.value==null)
                    return true;
        } else {
            for(LinkedHashMapEntry e = header.after; e ! = header; e = e.after)if (value.equals(e.value))
                    return true;
        }
        return false;
    }
Copy the code

Overwrite the parent class containsValue(Object Value) method, directly through the header through the linked list to determine whether there is value and value equal, using the characteristics of the bidirectional circular list for query, without the array outer for loop, without the query table array.

public void clear() {
       super.clear();
       header.before = header.after = header;
   }
Copy the code

The clear() method calls the clear() method of the parent class, and then refers the before and after references of the header node to the header itself, meaning that the header node is a bidirectional circular list. The remaining nodes in the original linked list cannot be accessed and will be reclaimed by GC. Emptying the HashMap restores the bidirectional list to an empty list with only headers.

That concludes the analysis of the main methods of the LinkedHashMap source code, and it’s time to summarize what HashMap and LinkedHashMap are all about.

conclusion

For LinkedHashMap, we summarize the following:

1. As LinkedHashMap inherits from HashMap, it not only implements the storage mode of Entry array + next LinkedList based on hash table and single LinkedList like HashMap, but also combines the advantages of LinkedList. A precursor and successor are added for each Entry node, and a header header node is added to construct a bidirectional circular linked list. (An additional two-way circular linked list with header as its head, that is, every time KV is put in, it is inserted at the end of the two-way circular linked list in addition to being stored at the corresponding location in the pair of hashed tables.)

2. LinkedHashMap has one more accessOrder attribute than HashMap. When it is false, it indicates that the elements in the bidirectional linked list are sorted in the order in which the Entry is inserted into the LinkedHashMap. That is, each Entry put into the LinkedHashMap is placed at the end of the bidirectional linked list. The order of Entry output is the same as the order of insertion, which is the default storage order of bidirectional linked lists. When this is true, it means that the elements in the bidirectional list are listed in the order they were accessed, and you can see that while entries are inserted into the list in the order they were put into the LinkedHashMap, However, both the PUT and GET methods call the recordAccess method. This method determines whether the accessOrder is true, and if so, Move the currently accessed Entry (put Entry or get Entry) to the end of the bidirectional linked list. When a new Entry is put, addEntry is called, which calls creatEntry. This method also places the newly inserted element at the end of the bidirectional list, in both the order of insertion and the order of access, since the Entry was also accessed.) Otherwise, nothing.

The constructor has a method to set accessOrder. If we need to implement the LRU algorithm, we need to set accessOrder to TRUE.

In a HashMap, if the key is not null and the hash table already exists, the recordAccess method will be called when iterating through the linked list in table[I]. In a HashMap, the recordAccess method is null. This method is implemented in LinkedHashMap, which determines whether accessOrder is true, and if true, it moves the currently accessed Entry (in this case, the Entry put in) to the end of the two-way circular list, In this way, the elements in the bidirectional linked list can be sorted according to the access order (the most recently accessed Entry is placed at the end of the list, so that the previous elements are not recently accessed. When the LRU algorithm is implemented, when the number of nodes in the bidirectional linked list reaches the maximum, the previous elements can be deleted. Because the preceding element is the least recently used), otherwise do nothing.

About the author

Focus on Android development for many years, like to write blog record summary learning experience, blog synchronous update in my public number, welcome everyone to pay attention to, exchange learning ~