How are Java LinkedHashMap and HashMap different, and how are they related? Why does LinkedHashMap iterate faster? How is LinkedHashSet intrinsically related to LinkedHashMap? This article answers these questions one by one from the perspective of data structures and algorithms, aided by illustrations.

Github address for this article

General introduction

If you’ve already looked at HashSet and HashMap, and at TreeSet and TreeMap, you can probably guess that LinkedHashSet and LinkedHashMap follow the same pattern. LinkedHashSet and LinkedHashMap share the same implementation in Java: the former is just a wrapper around the latter, that is, a LinkedHashSet contains a LinkedHashMap internally (adapter pattern). Therefore, this article focuses on LinkedHashMap.

LinkedHashMap implements the Map interface and allows entries with a null key and null values to be inserted. As the name suggests, this container is a hybrid of a linked list and a HashMap, meaning that it combines some of the features of both. Think of LinkedHashMap as a HashMap enhanced with a linked list.
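For example, here is a minimal usage sketch (the class and variable names are made up for illustration) showing that one null key and null values are accepted:

import java.util.LinkedHashMap;
import java.util.Map;

public class NullKeyValueDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("alice", 30);
        map.put("bob", null);   // a null value is allowed
        map.put(null, 25);      // a single null key is allowed

        System.out.println(map.get("alice")); // 30
        System.out.println(map.get(null));    // 25
        System.out.println(map.get("carol")); // null: the key is absent
    }
}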

In fact, LinkedHashMap is a direct subclass of HashMap. The only difference between the two is that LinkedHashMap additionally connects all entries into a doubly-linked list on top of the HashMap structure, which guarantees that elements are iterated in the same order as they were inserted. The figure above shows the structure of LinkedHashMap. The main body of LinkedHashMap is exactly the same as HashMap, with an additional header reference pointing to the head of the doubly-linked list (a dummy element). The iteration order of the doubly-linked list is the insertion order of the entries.

In addition to preserving the iteration order, this structure has another advantage: iterating over a LinkedHashMap does not need to traverse the entire table as a HashMap does; it only needs to traverse the doubly-linked list referenced by header. In other words, the iteration time of a LinkedHashMap depends only on the number of entries, not on the size of the table.
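As a quick illustration, the following minimal sketch (IterationOrderDemo is a made-up name) prints the entries in insertion order, no matter how large the underlying table is or which buckets the keys hash into:

import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrderDemo {
    public static void main(String[] args) {
        // A deliberately large, mostly empty table
        Map<String, Integer> map = new LinkedHashMap<>(64);
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);

        // Iteration walks the doubly-linked list, so the output is
        // banana=2, apple=1, cherry=3 regardless of the table size.
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}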

Two parameters affect LinkedHashMap performance: the initial capacity and the load factor. The initial capacity specifies the initial table size, and the load factor specifies the threshold for automatic expansion. When the number of entries exceeds capacity * load_factor, the container automatically expands and rehashes. For scenarios with many inserts, setting a large initial capacity can reduce the number of rehashes.
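For example, a minimal sketch (the numbers here are illustrative assumptions, not recommendations) of pre-sizing the table so that roughly 10,000 inserts should not trigger a rehash:

// With capacity 16384 and load factor 0.75f the resize threshold is
// 16384 * 0.75 = 12288, which is above the expected 10,000 entries,
// so no rehash is expected while inserting them.
Map<String, Integer> preSized = new LinkedHashMap<>(16384, 0.75f);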

Two methods deserve special attention when putting objects into a LinkedHashMap or LinkedHashSet: hashCode() and equals(). The hashCode() method determines which bucket an object is placed in, and when multiple objects have conflicting hash values, equals() determines whether they are “the same object.” So, if you want to put a custom object into a LinkedHashMap or LinkedHashSet, you need to @Override both the hashCode() and equals() methods.
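The sketch below (Point and CustomKeyDemo are hypothetical names used only for illustration) shows a custom key class that overrides both methods, so that logically equal instances are treated as the same element:

import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // equal objects must produce equal hash codes
    }
}

public class CustomKeyDemo {
    public static void main(String[] args) {
        Set<Point> points = new LinkedHashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(1, 2)); // recognized as a duplicate, not added again
        System.out.println(points.size()); // 1
    }
}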

To get a LinkedHashMap that iterates in the same order as a source Map:

void foo(Map m) {
    Map copy = new LinkedHashMap(m);
    ...
}

For performance reasons, LinkedHashMap is not synchronized. If it is used in a multithreaded environment, the programmer must synchronize access manually, or wrap it in a synchronized wrapper as follows:

Map m = Collections.synchronizedMap(new LinkedHashMap(...));

Method analysis

get()

The get(Object key) method returns the value associated with the specified key. Its flow is almost identical to that of HashMap.get(), which you can refer to for details.
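For reference, here is a simplified sketch of that shared lookup flow, written in the style of the JDK 7 source but not copied from it (the method name getEntrySketch is made up): compute the hash, locate the bucket, then walk the conflict list comparing hash values and equals().

// Simplified sketch of the lookup logic shared with HashMap.get()
final Entry<K,V> getEntrySketch(Object key) {
    int hash = (key == null) ? 0 : hash(key);
    for (Entry<K,V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
            return e; // get() then returns e.value, or null if no entry was found
    }
    return null;
}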

put()

The put(K key, V value) method adds the specified key-value pair to the map. This method first performs a lookup, similar to the get() method, to see whether the map already contains an entry for the key; if it does, the value is replaced and the old value is returned. If no entry is found, a new entry is inserted via the addEntry(int hash, K key, V value, int bucketIndex) method.

Note that the insertion here has two meanings:

  1. From the table's point of view, the new entry needs to be inserted into the corresponding bucket; when there is a hash conflict, the new entry is inserted at the head of the conflict list (head insertion).
  2. From the header's point of view, the new entry needs to be appended to the tail of the doubly-linked list.

The addEntry() code looks like this:

// LinkedHashMap.addEntry()
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length); // Automatically expand and rehash
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = hash & (table.length - 1); // hash % table.length
    }
    // 1. Insert the new entry at the head of the conflict list
    HashMap.Entry<K,V> old = table[bucketIndex];
    Entry<K,V> e = new Entry<>(hash, key, value, old);
    table[bucketIndex] = e;
    // 2. Insert the new entry at the tail of the doubly-linked list
    e.addBefore(header);
    size++;
}

The code above uses the addBefore() method to insert the new entry e in front of the header reference, so that e becomes the last element of the doubly-linked list. The code for addBefore() looks like this:

// LinkedHashMap.Entry.addBefore(): insert this entry in front of existingEntry
private void addBefore(Entry<K,V> existingEntry) {
    after  = existingEntry;
    before = existingEntry.before;
    before.after = this;
    after.before = this;
}

The code above simply updates the references of the relevant entries.

remove()

The remove(Object key) method deletes the entry corresponding to the key. The actual logic is implemented in removeEntryForKey(Object key), which first finds the entry corresponding to the key and then deletes it by modifying the relevant linked-list references. The search process is similar to the get() method.

Note that deletion here also has two meanings:

  1. From the table's point of view, the entry needs to be removed from the corresponding bucket; if the corresponding conflict list is not empty, the references within the conflict list need to be adjusted.
  2. From the header's point of view, the entry needs to be removed from the doubly-linked list, and the references of its preceding and following elements need to be adjusted.

The code for removeEntryForKey() is as follows:

// LinkedHashMap.removeEntryForKey(): remove the entry corresponding to the key
final Entry<K,V> removeEntryForKey(Object key) {
    ...
    int hash = (key == null) ? 0 : hash(key);
    int i = indexFor(hash, table.length); // hash & (table.length - 1)
    Entry<K,V> prev = table[i]; // Get the conflict list
    Entry<K,V> e = prev;
    while (e != null) { // Iterate over the conflict list
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) { // Found the entry to delete
            modCount++;
            size--;
            // 1. Remove e from the conflict list of the corresponding bucket
            if (prev == e) table[i] = next;
            else prev.next = next;
            // 2. Remove e from the doubly-linked list
            e.before.after = e.after;
            e.after.before = e.before;
            return e;
        }
        prev = e;
        e = next;
    }
    return e;
}

LinkedHashSet

As mentioned above, LinkedHashSet is a simple wrapper around LinkedHashMap. Calls to LinkedHashSet methods are delegated to the corresponding LinkedHashMap methods, so the implementation of LinkedHashSet is very simple and is not described further here.

public class LinkedHashSet<E>
    extends HashSet<E>
    implements Set<E>, Cloneable, java.io.Serializable {
    ...
    // LinkedHashSet contains a LinkedHashMap
    public LinkedHashSet(int initialCapacity, float loadFactor) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }
    ...
    public boolean add(E e) { // Simple method delegation
        return map.put(e, PRESENT) == null;
    }
    ...
}