
Big Data into the Road of God series:

GitHub github.com/wangzhiwubi…

Java Advanced Features Enhancements – Collection

Java Advanced Features Enhancements – Multithreading

Java Advanced Features Enhancements – Synchronized

Java Advanced Features Enhancements – Volatile

Java Advanced Features Enhancements – Concurrent Collections Framework

Java Advanced Features Enhancements – Distributed

Java Advanced Features Enhancements – ZooKeeper

Java Advanced Features Enhancements – JVM

Java Advanced Features Enhancements – NIO

The official account:

  • The only official account dedicated to helping Java developers move from 0 into the big data domain ~


Java Advanced Features Enhancements – Collections Framework (HashMap)

There are plenty of resources on the web covering this topic, and this write-up draws on them; thanks to the earlier authors for their effort. A reference list appears at the end of each section. For reading the source code, JDK 1.8 or later is recommended. Tags: multithreading, collections framework, Java concurrent containers.


Collections framework

The collections framework in Java includes ArrayList/Vector, LinkedList, HashMap, HashSet, LinkedHashMap, and more. This chapter mainly draws on material from the Internet, where there are many high-quality resources; the author has organized them as follows:

HashMap (based on JDK1.8)

An introduction to HashMap

HashMap is mainly used to store key-value pairs. It is a hash-table-based implementation of the Map interface and one of the most commonly used Java collections. In JDK 1.8, a HashMap is built from an array, which is the body of the HashMap, plus linked lists, which resolve hash collisions; when a bucket's list grows long enough, it is transformed into a red-black tree to reduce search time.

Analysis of underlying data structures

Before JDK 1.8, the underlying HashMap was a combination of an array and linked lists, i.e., separate chaining. HashMap passes the key's hashCode through its hash function to get the hash value, then uses (n - 1) & hash to determine where the element is stored (where n is the array length). If an element already occupies that position, HashMap compares that element's hash and key with those of the element being inserted: if both match, the value is overwritten directly; if not, the collision is resolved by chaining. The "perturbation function" refers to HashMap's hash method. It exists to guard against poorly implemented hashCode() methods, in other words to reduce collisions, by XOR-ing the high 16 bits of the hash code into the low bits. The JDK 1.8 hash method is simpler than the JDK 1.7 one, but the principle is the same.

static final int hash(Object key) {
    int h;
    // key.hashCode(): returns the key's hash code
    // ^  : bitwise XOR
    // >>>: unsigned right shift; the sign bit is ignored and the
    //      vacated high bits are filled with 0
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
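To see what the high-bits XOR buys, consider two hash codes that differ only in their high bits: without the perturbation they collide in a 16-slot table, with it they do not. A small sketch (the class name HashSpreadDemo and the sample hash values are illustrative, not from the JDK):

```java
public class HashSpreadDemo {
    public static void main(String[] args) {
        int n = 16;           // table length (always a power of two)
        int h1 = 0x10001;     // two hash codes that differ only in the high bits
        int h2 = 0x20001;

        // Without perturbation, (n - 1) & h only looks at the low 4 bits,
        // so both keys land in bucket 1
        System.out.println((n - 1) & h1);                  // 1
        System.out.println((n - 1) & h2);                  // 1

        // After XOR-ing in the high 16 bits, the buckets differ
        System.out.println((n - 1) & (h1 ^ (h1 >>> 16)));  // 0
        System.out.println((n - 1) & (h2 ^ (h2 >>> 16)));  // 3
    }
}
```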

Compare this with the hash method source code from JDK 1.7's HashMap:

static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).

    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

The JDK 1.7 hash method performs slightly worse than JDK 1.8's because it applies the perturbation four times, whereas JDK 1.8 perturbs only once.

The so-called "zipper method" (separate chaining) combines linked lists and arrays: the table is an array of linked lists, each cell holding one list. When a hash collision occurs, the colliding value is appended to the list at that index.
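The idea can be sketched with a minimal chained table. This is only an illustration of separate chaining under the assumption of a power-of-two capacity, not the real HashMap implementation; ChainedTable and Entry are made-up names:

```java
import java.util.LinkedList;

public class ChainedTable<K, V> {
    static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainedTable(int capacity) {  // capacity must be a power of two
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    private int index(K key) {
        // the same trick as HashMap: mask the hash with (length - 1)
        return (buckets.length - 1) & key.hashCode();
    }

    public void put(K key, V value) {
        for (Entry<K, V> e : buckets[index(key)]) {
            if (e.key.equals(key)) { e.value = value; return; }  // same key: overwrite
        }
        buckets[index(key)].add(new Entry<>(key, value));        // collision: chain
    }

    public V get(K key) {
        for (Entry<K, V> e : buckets[index(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```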

Class attributes:

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {
    private static final long serialVersionUID = 362498820763181265L;
    // Default initial capacity: 16 (always a power of 2)
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    // Maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;
    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // A bucket's list is converted into a red-black tree when its node
    // count reaches this threshold
    static final int TREEIFY_THRESHOLD = 8;
    // A tree is converted back into a list when its node count drops
    // to this threshold
    static final int UNTREEIFY_THRESHOLD = 6;
    // Minimum table capacity for treeification; below this the table is
    // resized instead
    static final int MIN_TREEIFY_CAPACITY = 64;
    // The array that stores the elements; its length is always a power of 2
    transient Node<K,V>[] table;
    // Cached entry set
    transient Set<Map.Entry<K,V>> entrySet;
    // Number of key-value pairs stored; note this is NOT the array length
    transient int size;
    // Structural modification count (for fail-fast iteration)
    transient int modCount;
    // Resize threshold: when size exceeds (capacity * load factor),
    // the table is expanded
    int threshold;
    // Load factor
    final float loadFactor;
}

The load factor (loadFactor)

The loadFactor controls how densely the array is populated. The closer loadFactor is to 1, the more entries the array holds before resizing, so storage becomes denser and the linked lists grow longer; the closer it is to 0, the fewer entries the array holds and the sparser the stored data.

Too large a loadFactor makes element lookup inefficient; too small a value leads to low array utilization and scattered data. The default value of 0.75f is officially a good trade-off.

Take the default capacity of 16 with a load factor of 0.75. As data is stored into the Map, once the entry count reaches 16 x 0.75 = 12, the current capacity of 16 must be expanded. Expansion involves rehashing and copying data, which is expensive.

threshold

threshold = capacity * loadFactor. When size >= threshold, the array may need to be expanded.
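A quick sketch of the arithmetic, and of presizing a map to avoid resizes (the class name ThresholdDemo is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class ThresholdDemo {
    public static void main(String[] args) {
        // Default capacity 16 with load factor 0.75:
        // threshold = 16 * 0.75 = 12, so the 13th insertion triggers a resize
        int threshold = (int) (16 * 0.75f);
        System.out.println(threshold);    // 12

        // To hold n entries without ever resizing, presize to n / 0.75 + 1
        int expected = 100;
        Map<String, Integer> map = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) {
            map.put("k" + i, i);
        }
        System.out.println(map.size());   // 100
    }
}
```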

Node class source code:

// Node implements Map.Entry<K,V>
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;   // hash value; used to compare this element with others in the map
    final K key;      // key
    V value;          // value
    Node<K,V> next;   // next node in the chain

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    // Overrides hashCode()
    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

Tree node class source code:

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // parent node
    TreeNode<K,V> left;    // left child
    TreeNode<K,V> right;   // right child
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;           // node color (red/black)

    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    // Returns the root of the tree containing this node
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
    // ... (tree insertion, lookup and balancing methods omitted)
}
HashMap source code analysis

Constructors

// Default constructor
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

// Constructor that copies another Map
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

// Constructor with a specified initial capacity
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

// Constructor with a specified initial capacity and load factor
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}
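Note that the last constructor stores tableSizeFor(initialCapacity) into threshold; tableSizeFor returns the smallest power of two greater than or equal to the requested capacity. A sketch that re-implements the same bit trick (the class name is made up; the body mirrors the private HashMap.tableSizeFor of JDK 1.8):

```java
public class TableSizeForDemo {
    // Returns the smallest power of two >= cap (capped at 2^30)
    static int tableSizeFor(int cap) {
        int n = cap - 1;          // so that an exact power of two is kept as-is
        n |= n >>> 1;             // smear the highest set bit downward...
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;            // ...until all lower bits are 1
        return (n < 0) ? 1 : (n >= (1 << 30)) ? (1 << 30) : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```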

The putMapEntries method:

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        // Check whether the table has been initialized
        if (table == null) { // pre-size
            // Not yet initialized; s is the actual number of elements in m
            float ft = ((float)s / loadFactor) + 1.0F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ? (int)ft : MAXIMUM_CAPACITY);
            // If t is greater than the current threshold, recompute the threshold
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        // Already initialized, and m has more elements than the threshold: resize
        else if (s > threshold)
            resize();
        // Add every element of m to this HashMap
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}

The put method. HashMap only exposes put for adding elements; putVal is merely a helper called by put and is not provided to users.

The putVal method adds elements as follows:

(1) If the array slot is empty, insert directly. (2) If the slot is occupied, compare the occupant's key with the key being inserted; if they are equal, overwrite the value directly. If they differ, check whether p is a tree node: if so, call e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value) to add the element; if not, traverse the linked list and insert at the tail.

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // The table is not initialized or has length 0: create it via resize
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // (n - 1) & hash determines the bucket; if the bucket is empty, the
    // new node goes straight into the array
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // The bucket already contains elements
    else {
        Node<K,V> e; K k;
        // The first node in the bucket has an equal hash and an equal key:
        // record it in e
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // Keys are not equal and the first node is a red-black tree node
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // The first node heads a linked list
        else {
            // Insert the new node at the tail of the list
            for (int binCount = 0; ; ++binCount) {
                // Reached the tail of the list
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // The chain length reached the threshold: convert the
                    // list into a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // A node in the list already has a key equal to the
                // inserted key: stop searching
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // A node with an equal hash and key was found in the bucket
        if (e != null) {
            V oldValue = e.value;
            // Overwrite unless onlyIfAbsent is set and the old value is non-null
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    // Structural modification
    ++modCount;
    // If the actual size exceeds the threshold, resize
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
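The onlyIfAbsent flag is how putIfAbsent differs from put: both funnel into putVal, but putIfAbsent passes onlyIfAbsent = true, so an existing non-null value is kept. A short sketch (the class name is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class OnlyIfAbsentDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);

        // put reaches putVal with onlyIfAbsent = false: it overwrites
        // and returns the previous value
        System.out.println(map.put("a", 2));          // 1

        // putIfAbsent reaches putVal with onlyIfAbsent = true: the
        // existing non-null value is kept
        System.out.println(map.putIfAbsent("a", 3));  // 2
        System.out.println(map.get("a"));             // 2
    }
}
```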

For comparison, here is the JDK 1.7 put method.

The put method is analyzed as follows:

(1) If the array slot is empty, insert directly. (2) If the slot is occupied, traverse the linked list headed by that element, comparing each key with the inserted key in turn; if an equal key is found, its value is overwritten directly, otherwise addEntry inserts a new entry.

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    // First traverse the chain looking for an equal key
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    // No equal key found: insert a new entry
    addEntry(hash, key, value, i);
    return null;
}

The get method

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // Check the first node in the bucket
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // The bucket holds more than one node
        if ((e = first.next) != null) {
            // Look up in the tree
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Look up in the list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

The resize method expands the table and rehashes every element, traversing the whole hash table, which is time-consuming. When writing programs, try to avoid triggering resize, for example by presizing the map.

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // If oldCap is already at the maximum, do not expand any further
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Otherwise double the capacity
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // Compute the new resize threshold
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Move every bucket into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // Stays at the original index
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        // Moves to original index + oldCap
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // Chain that stays at the original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // Chain that moves to original index + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
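The (e.hash & oldCap) == 0 test is what lets resize split a chain without recomputing indices: since newCap = oldCap << 1, each element either keeps its old index or moves exactly oldCap slots up, depending on one extra hash bit. A sketch with concrete numbers (the class name and sample values are illustrative):

```java
public class ResizeSplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = oldCap << 1;   // 16 -> 32
        int h1 = 5;    // extra bit (h & oldCap) is 0: index is unchanged
        int h2 = 21;   // 21 = 16 + 5, extra bit is 1: index moves by oldCap

        // Both hashes share bucket 5 in the old table
        System.out.println((oldCap - 1) & h1);   // 5
        System.out.println((oldCap - 1) & h2);   // 5

        // In the new table only one extra hash bit matters
        System.out.println((newCap - 1) & h1);   // 5            (stays)
        System.out.println((newCap - 1) & h2);   // 21 = 5 + 16  (moves by oldCap)
    }
}
```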
Common HashMap method tests
import java.util.Collection;
import java.util.HashMap;
import java.util.Set;

public class HashMapDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<String, String>();
        // Keys cannot repeat; values can
        map.put("san", "Zhang San");
        map.put("si", "Li Si");
        map.put("wu", "Wang Wu");
        map.put("wang", "Lao Wang");
        map.put("wang", "Lao Wang 2"); // the old "Lao Wang" is overwritten
        map.put("lao", "Wang");

        System.out.println("------- print the HashMap directly: -------");
        System.out.println(map);

        System.out.println("------- foreach over all keys in the map: -------");
        Set<String> keys = map.keySet();
        for (String key : keys) {
            System.out.print(key + " ");
        }
        System.out.println();

        System.out.println("------- foreach over all values in the map: -------");
        Collection<String> values = map.values();
        for (String value : values) {
            System.out.print(value + " ");
        }
        System.out.println();

        System.out.println("------- each key and its corresponding value: -------");
        Set<String> keys2 = map.keySet();
        for (String key : keys2) {
            System.out.print(key + ":" + map.get(key) + " ");
        }
        System.out.println();

        /*
         * put(key, value) wraps the key and value into a static inner
         * class object (Node, which implements Map.Entry) and stores it
         * in the array, so to obtain all key-value pairs we only need
         * to iterate over the entry set.
         */
        Set<java.util.Map.Entry<String, String>> entrys = map.entrySet();
        for (java.util.Map.Entry<String, String> entry : entrys) {
            System.out.println(entry.getKey() + "--" + entry.getValue());
        }

        // Other commonly used methods
        System.out.println("map.size(): " + map.size());
        System.out.println("map.isEmpty(): " + map.isEmpty());
        System.out.println(map.remove("san"));
        System.out.println("after map.remove(\"san\"): " + map);
        System.out.println("map.get(\"si\"): " + map.get("si"));
        System.out.println("map.containsKey(\"si\"): " + map.containsKey("si"));
        System.out.println("map.containsValue(\"Li Si\"): " + map.containsValue("Li Si"));
        System.out.println(map.replace("si", "Li Si 2"));
        System.out.println("after map.replace(\"si\", \"Li Si 2\"): " + map);
    }
}

Reference articles and books: Effective Java. Thanks to the following authors: www.cnblogs.com/skywang1234… crossoverjie.top/JCSprout/#/… github.com/Snailclimb/… blog.csdn.net/qq_34337272… www.jianshu.com/p/a5f99f253… www.jianshu.com/p/506c1e38a…

Original repository: https://github.com/wangzhiwubigdata/God-Of-BigData. Follow the official account for pushes, interview material, and resource downloads, with a focus on big data technology. The Road of God for Big Data series is expected to reach 500+ articles; 50+ have been published so far.