[TOC]

Introduction to Map

A Map is a collection of key-value pairs; each element contains a key object and a value object. Duplicate keys are not allowed.

The Map interface is separate from the Collection interface. Its two main implementation classes are HashMap and TreeMap: HashMap locates key-value pairs via a hash algorithm, while TreeMap keeps key-value pairs sorted by key.

graph LR
Map-->HashMap
Map-->Hashtable
Map-->TreeMap
Map-->IdentityHashMap
Map-->WeakHashMap

graph LR
Collection-->List
List-->ArrayList
List-->LinkedList
List-->Vector
Collection-->Set
Collection-->Queue
Collection-->SortedSet
Method signatures and descriptions

  • V get(Object key): Returns the value mapped to the specified key in the Map, or null if the key is not present
  • V put(K key, V value): Adds a key-value pair to the Map; returns the previous value associated with the key, or null if there was none
  • V remove(Object key): Removes the key-value pair for the given key from the Map; returns the value that was associated with the key, or null if there was none
  • Set<Map.Entry<K,V>> entrySet(): Returns a Set of all key-value pairs in the Map; the element type is the inner interface Map.Entry
  • Set<K> keySet(): Returns a Set of all key objects in the Map
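
A minimal usage sketch of these methods (the class name and values here are just for illustration):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    public class MapApiDemo {
        public static void main(String[] args) {
            Map<String, Integer> map = new HashMap<>();

            map.put("Lin", 1);                   // returns null: no previous value for "Lin"
            Integer old = map.put("Lin", 2);     // returns 1: the previous value is replaced

            Integer v = map.get("Lin");          // 2
            Integer removed = map.remove("Lin"); // 2; the mapping is gone afterwards

            map.put("Pony", 777);
            Set<String> keys = map.keySet();                          // all keys
            Set<Map.Entry<String, Integer>> entries = map.entrySet(); // all key-value pairs
            for (Map.Entry<String, Integer> e : entries) {
                System.out.println(e.getKey() + " -> " + e.getValue());
            }
            System.out.println(old + " " + v + " " + removed + " " + keys);
        }
    }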

Since Java 8, HashMap stores data in an array + linked list + red-black tree structure, and key-value pairs are placed by hash mapping:

  • A modulo-style operation on the key's hashCode gives the array index

  • Each array slot stores a singly linked list

  • Keys that map to the same index are stored in the same linked list

  • The elements are arranged out of order

  • Lists that exceed a certain length (TREEIFY_THRESHOLD=8) are converted to red-black trees

  • A red-black tree is converted back into a linked list if certain conditions are met
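
A rough sketch of how the bucket index is derived, mirroring the JDK 8 approach of spreading the high bits of hashCode before masking (the class and method names here are illustrative):

    public class BucketIndexDemo {
        // Spread the higher bits of hashCode into the lower bits, as JDK 8's HashMap.hash() does
        static int hash(Object key) {
            int h;
            return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        }

        public static void main(String[] args) {
            int n = 16;                        // table length, always a power of two
            int index = (n - 1) & hash("Lin"); // equivalent to hash % n when n is a power of two
            System.out.println("bucket index of \"Lin\" in a table of length 16: " + index);
        }
    }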

HashMap

HashMap is the most frequently used implementation class of the Map interface. On initialization it is given a default capacity of 16 and a load factor of 0.75 for expansion. When the number of nodes in the instance exceeds 16 * 0.75 = 12, the table is expanded, and each expansion doubles the capacity (newCap = oldCap << 1).

HashMap must not be used in multi-threaded scenarios; ConcurrentHashMap is recommended for multi-threaded scenarios!

The reason is expansion, the resize mechanism! It will be dissected in detail below.

So let's use some code to take a closer look at the data structure of the HashMap.

    Map<String, Object> hashMap = new HashMap<>(3);
    hashMap.put("Lin", "Awesome!");
    hashMap.put("Pony", 777);

At this point the data structure in our HashMap should hold only a single Node in each occupied slot of the Node[] array.

The index into the Node[] array is derived from the hash value of the key object; the source code uses the bit operation (length - 1) & hash (equivalent to taking the hash modulo the length when the length is a power of two) and places the node at Node[(length - 1) & hash].

Of course, if Lin and Pony happened to have the same hash value (a hash collision), a new node would be appended after Lin's node, forming a linked list.

When a node is appended and a linked list forms, there are two ways to insert: head insertion and tail insertion.

Head insertion

First we declare a singly linked list node:


    class ListNode {
        // The data stored on the node
        int val;
        // Pointer to the next node in the singly linked list
        ListNode next;

        ListNode() {
        }

        ListNode(int val) {
            this.val = val;
        }

        ListNode(int val, ListNode next) {
            this.val = val;
            this.next = next;
        }

        public void setVal(int val) {
            this.val = val;
        }

        public ListNode getNext() {
            return next;
        }

        public void setNext(ListNode next) {
            this.next = next;
        }
    }

Next we use head insertion to store data:

    class List {
        // Head node (a dummy header)
        private ListNode headNode = new ListNode();

        /**
         * Head insertion
         */
        public boolean headInsert(int val) {
            // Instantiate the new node, i.e. the node to be inserted
            ListNode node = new ListNode();
            // Set the value first
            node.setVal(val);
            // The next pointer of the node to be inserted points to whatever the header currently points to
            node.setNext(headNode.getNext());
            // The header pointer points to the node to be inserted
            headNode.setNext(node);
            // Check whether the link succeeded
            if (headNode.getNext() == node) {
                return true;
            }
            return false;
        }
    }

The insertion action of head insertion into a singly linked list is shown in the figure below.
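
As a quick usage sketch (assuming the List and ListNode classes above), note that head insertion reverses the order in which values were inserted:

    List list = new List();
    list.headInsert(1);
    list.headInsert(2);
    list.headInsert(3);
    // The most recently inserted value sits right after the header,
    // so traversing from the head would yield 3 -> 2 -> 1, the reverse of insertion order.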

Tail insertion

Building on the same node data structure used for head insertion above, here is a simple expression of tail insertion:

    class List {
        // Tail node (initially a dummy node)
        private ListNode tailNode = new ListNode();

        /**
         * Tail insertion
         */
        public boolean tailInsert(int val) {
            // Instantiate the new node, i.e. the node to be inserted
            ListNode node = new ListNode();
            // Set the value first
            node.setVal(val);
            // The current tail points to the node to be inserted
            tailNode.setNext(node);
            // The tail reference now becomes the node just inserted
            tailNode = node;
            // Check whether the new tail has no successor
            if (tailNode.getNext() == null) {
                return true;
            }
            return false;
        }
    }

The tail insertion action is shown below

Why does Java 8 use tail insertion?

Java 7 used head insertion for HashMap's linked lists, while Java 8 switched to tail insertion, for two reasons:

  • Head insertion can produce a dead loop: during a concurrent resize it reverses node order while migrating a list, which can create a cycle in the linked list
  • Java 7 assumed that recently inserted ("hot") data would be used sooner, but when a linked list is migrated, head insertion still disrupts the original insertion order

Important internal attributes

    /**
     * Default initial capacity, 16; must be a power of 2
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * Default load factor.
     * Resize when size > capacity * DEFAULT_LOAD_FACTOR
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The list length at which a linked list is turned into a red-black tree.
     * The conversion only happens if the table length has reached MIN_TREEIFY_CAPACITY (64);
     * otherwise the table is resized instead.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The node count at which a red-black tree is converted back into a linked list
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * Minimum table size for treeifying a list: if the table is smaller than this threshold,
     * a long list triggers a resize instead of treeification
     */
    static final int MIN_TREEIFY_CAPACITY = 64;

Different types of node structures

Both LinkedHashMap and HashMap rely internally on node objects, which are the most basic unit of data storage. In Java 7 the node type is called Entry, and in Java 8 it is called Node. Both are essentially implementations of Map.Entry, but their internal structures differ: LinkedHashMap's Entry carries pointers for a doubly linked list, and the TreeNode used after treeification is a red-black tree node.

    /**
     * Linked list node; implements the Map.Entry interface
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        // ...
    }

    /**
     * The node type of the elements in LinkedHashMap
     */
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

    /**
     * The red-black tree node extends LinkedHashMap.Entry and indirectly extends HashMap.Node,
     * so it also has linked list properties.
     * This node type can be used either as a red-black tree node or as a doubly linked list node.
     */
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

        // ...
    }

Important operations in the HashMap

In HashMap we most often use operations such as get and put, but there are also operations like converting a linked list to a red-black tree, converting a red-black tree back to a linked list, and a series of resize and iteration operations, all of which affect the internal structure of the HashMap. This is also why HashMap is not thread-safe yet offers high performance.

HashMap instantiation

    /**
     * You can specify the initial capacity of the new Map through the constructor arguments
     * @param initialCapacity the initial capacity
     * @param loadFactor the load factor, which affects when the table is resized
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }
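
The first constructor stores tableSizeFor(initialCapacity) into threshold. Roughly speaking, tableSizeFor rounds the requested capacity up to the next power of two; the JDK 8 version looks approximately like this:

    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * Returns a power-of-two size for the given target capacity
     * (e.g. 3 -> 4, 13 -> 16, 16 -> 16).
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }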

Put: add a new node

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods
     *
     * @param hash the hash value of the key object
     * @param key the key object
     * @param value the value object to store for this key
     * @param onlyIfAbsent if true, do not replace an existing value for the same key
     * @param evict if false, the table is in creation mode
     * @return the old value if an existing mapping was replaced, otherwise null
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
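
The onlyIfAbsent flag is what separates put from putIfAbsent. A brief illustration using the public java.util.Map API rather than the internals:

    Map<String, Integer> map = new HashMap<>();

    map.put("Lin", 1);
    map.put("Lin", 2);           // onlyIfAbsent = false: the existing value is replaced
    System.out.println(map.get("Lin"));   // 2

    map.putIfAbsent("Lin", 3);   // onlyIfAbsent = true: the existing non-null value is kept
    System.out.println(map.get("Lin"));   // 2

    map.putIfAbsent("Pony", 7);  // the key is absent, so the value is inserted
    System.out.println(map.get("Pony"));  // 7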

treeifyBin: treeify a linked list


    /**
     * Replaces all linked list nodes in the bucket with tree nodes
     * (unless the table is too small, in which case it resizes instead)
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }
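
A hedged sketch of how treeification might be triggered in practice, using a hypothetical key class whose hashCode is constant so every entry lands in the same bucket (the conversion itself is internal and not directly observable from outside):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical key type used only to force hash collisions
    final class CollidingKey {
        private final int id;
        CollidingKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }  // every instance collides
        @Override public boolean equals(Object o) {
            return o instanceof CollidingKey && ((CollidingKey) o).id == id;
        }
    }

    public class TreeifyDemo {
        public static void main(String[] args) {
            // Initial capacity 64, so the table length is already >= MIN_TREEIFY_CAPACITY
            Map<CollidingKey, Integer> map = new HashMap<>(64);
            for (int i = 1; i <= 9; i++) {
                // When the 9th node is appended to the bucket, binCount reaches
                // TREEIFY_THRESHOLD - 1 and treeifyBin converts the list into a red-black tree
                map.put(new CollidingKey(i), i);
            }
            System.out.println(map.size()); // 9, all entries are preserved after treeification
        }
    }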

resize: capacity expansion

The Node array starts out with a fixed size. When the number of inserted entries reaches a certain point, the array is expanded by resize().

So when do you resize? There are two factors

  • Capacity: indicates the current Capacity of the HashMap
  • LoadFactor: the load factor, 0.75 by default

When the number of entries in the Map exceeds capacity * load factor (for example, with a capacity of 16 and the default load factor, the threshold is 16 * 0.75 = 12), the capacity must be expanded. How is the capacity expanded?

  1. Check whether the old array capacity has already reached the maximum capacity 1<<30; if not, double both the new array capacity and the threshold

        Node<K,V>[] oldTab = table;
        // Determine the size of the current array; if it is null the capacity is 0
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        // Check whether the current array already has capacity; the maximum capacity is 1<<30
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // If there is capacity and doubling it does not exceed the maximum,
            // double both threshold and capacity, i.e. the capacity is always a power of 2
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // If the threshold is 0, use the default capacity 16 and threshold 16 * 0.75
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
    2. Create a new array based on the new capacity and move the old elements into it

        // Create the new array based on the new capacity
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        // The table reference now points to the new array
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        // Only one node at index j: recompute its slot in the new array
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // Split the red-black tree into two lists; a resulting list that is still
                        // long enough is turned back into a red-black tree, otherwise it stays a
                        // linked list. The result is placed at the corresponding index of the new table.
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // Split the original list into (at most) two lists while keeping
                        // the relative order of the elements
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            // The "lo" list stays at the same index in the new table
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            // The "hi" list moves to index j + oldCap
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
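
The key trick above is (e.hash & oldCap): because the capacity is a power of two, that single bit decides whether a node stays at index j or moves to index j + oldCap. A small illustrative check (values chosen arbitrarily):

    int oldCap = 16;
    int newCap = oldCap << 1;              // 32

    int hash = 21;                         // binary 10101
    int oldIndex = hash & (oldCap - 1);    // 21 & 15 = 5
    int newIndex = hash & (newCap - 1);    // 21 & 31 = 21

    // (hash & oldCap) != 0, so this node moves from index 5 to index 5 + 16 = 21
    System.out.println(oldIndex + " -> " + newIndex + ", moved: " + ((hash & oldCap) != 0));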

Get: get the value object

Returns the value associated with the key; if the key does not exist, null is returned.

  1. Compute the key's hash value and use it to locate the bucket in the table array
     • If that bucket is not null, compare the first node's key and hash with the target
       • If they are equal, return that node
       • Otherwise:
         • If the node is a red-black tree node, search the red-black tree for the key
         • Otherwise, search the linked list for the key
     • Otherwise, return null
  2. Return the value of the node found in step 1, or null if no node was found

    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    /**
     * Implements the Map interface get method
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            // The bucket at index (n - 1) & hash holds at least one node
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                // If the head of the list is not the target element, check whether it is a tree node
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // Iterate through the list until a node with the same key and hash is found
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
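
One practical note on get: a null return does not always mean the key is absent, because HashMap allows null values; containsKey distinguishes the two cases. A quick illustration:

    Map<String, Integer> map = new HashMap<>();
    map.put("Lin", null);

    System.out.println(map.get("Lin"));          // null, but the key exists
    System.out.println(map.get("Pony"));         // null, and the key does not exist
    System.out.println(map.containsKey("Lin"));  // true
    System.out.println(map.containsKey("Pony")); // false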

Remove: delete by key object

Delete the key and return the value associated with the key. If the key does not exist, null is returned

  1. Find the node corresponding to the key

  2. Delete the node from the linked list or red-black tree

    public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }

    /**
     * Implements the Map interface remove method
     */
    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    tab[index] = node.next;
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }
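
The matchValue parameter above is what backs the two-argument Map.remove(key, value) overload, which removes the entry only when the current value matches. A short illustration:

    Map<String, Integer> map = new HashMap<>();
    map.put("Lin", 1);

    System.out.println(map.remove("Lin", 2)); // false: the value does not match, nothing is removed
    System.out.println(map.remove("Lin", 1)); // true: the matching entry is removed
    System.out.println(map.remove("Lin"));    // null: the key is no longer present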

When will the capacity be expanded?

  • The size field exceeds array length * load factor

  • The length of a linked list formed by hash conflicts reaches TREEIFY_THRESHOLD (8), but the table array length is still smaller than MIN_TREEIFY_CAPACITY (64)

When is the red-black tree operation performed?

  • The length of a linked list formed by hash conflicts reaches TREEIFY_THRESHOLD (8) and the table array length is at least MIN_TREEIFY_CAPACITY (64)

  • During a resize, when a red-black tree bin is split, the resulting sub-list is treeified again if it is still long enough; if it has shrunk to UNTREEIFY_THRESHOLD (6) nodes or fewer, it is converted back into a linked list instead

Why is it not thread-safe?

HashMap trades safety for performance. In multithreaded environments, use ConcurrentHashMap.

  • What happens if multiple threads write to the same HashMap instance?
  1. Multithreaded put (ignoring resize)

    If a hash conflict occurs, a later thread can overwrite the key-value pair written by an earlier thread. For example, if n threads insert n colliding key-value pairs at the same position concurrently, up to n-1 of them may be lost; that is, only one thread's key-value pair actually ends up linked into the list or red-black tree in the table (a runnable sketch of this lost-update problem is given at the end of this section).

  2. Multithreaded resize

    The execution sequence of threads A and B is as follows

    Thread A

    1. oldTab = table; // take a reference to the old array
    2. Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // create the expanded array
    3. table = newTab; // point the table at the new (still empty) array
    4. Move all the elements of oldTab into newTab // thread A has not executed this step yet

    Thread B

    1. oldTab = table; // thread B reads the table that thread A just published, which contains no elements
    2. Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // create the expanded array
    3. table = newTab; // point the table at the new array, overwriting the newTab created by thread A
    4. Move all the elements of oldTab into newTab // oldTab (thread A's empty newTab) has no elements, so newTab, and therefore table, ends up empty: all elements are lost

    If thread B starts executing this logic while thread A is in the middle of step 4, then instead of losing everything, the elements thread A has not yet moved into its newTab may be lost.

  3. Multithreaded remove
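
As referenced above, here is a hedged sketch of the lost-update problem with concurrent puts. The outcome is non-deterministic: the plain HashMap usually ends up with fewer entries than expected (and in the worst case its internal structure can be corrupted), while ConcurrentHashMap keeps them all:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class ConcurrentPutDemo {
        static void fill(Map<Integer, Integer> map) throws InterruptedException {
            Thread[] threads = new Thread[4];
            for (int t = 0; t < threads.length; t++) {
                final int offset = t * 10_000;
                threads[t] = new Thread(() -> {
                    for (int i = 0; i < 10_000; i++) {
                        map.put(offset + i, i);   // distinct keys per thread
                    }
                });
                threads[t].start();
            }
            for (Thread th : threads) {
                th.join();
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Map<Integer, Integer> unsafe = new HashMap<>();
            fill(unsafe);
            System.out.println("HashMap size (expected 40000): " + unsafe.size());

            Map<Integer, Integer> safe = new ConcurrentHashMap<>();
            fill(safe);
            System.out.println("ConcurrentHashMap size: " + safe.size());
        }
    }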