If you have any questions or suggestions, please leave a comment below. Last updated: 2018-09-21

preface

The implementation principle of HashMap element insertion in JDK1.8 has been analyzed in detail. After the article was posted, a friend asked the question: “Header insertion in JDK1.7, why tail insertion in JDK1.8?” . Some people say that this is the Java god of random, no special use. At that time because I did not carefully read the 1.7 source code, so it is difficult to answer. Now I hereby write this article, to carry out a detailed analysis of the problem.

Static constants

Source:
 1/ * *

2* The default initial size, 16, must be a power of 2

3* /


4static final int DEFAULT_INITIAL_CAPACITY = 1 << 4// aka 16

5

6/ * *

7* Maximum capacity, must not be greater than 2^30

8* /


9static final int MAXIMUM_CAPACITY = 1 << 30;

10

11/ * *

12* The default load factor is 0.75

13* /


14static final float DEFAULT_LOAD_FACTOR = 0.75f;

15

16/ * *

17* Empty array of HashMap

18* /


19static final Entry<?.? >[] EMPTY_TABLE = {};

20

21/ * *

22* Optional default hash threshold

23* /


24static final int ALTERNATIVE_HASHING_THRESHOLD_DEFAULT = Integer.MAX_VALUE;

Copy the code

Note: the default HashMap in jdk1.7 uses array + single linked list to store elements. When a hash conflict occurs, elements are stored in the single linked list at that location. This is different from 1.8, except for arrays and single-linked lists. When the single-linked list has more than 8 elements, it is converted to red-black tree storage, which cleverly reduces the time complexity of traversing elements from O(n) to O(logn)).

The constructor

1, no arguments constructor:

1public HashMap(a) {

2    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);

3}

Copy the code

2, parameter constructor, specify the initial capacity:

1public HashMap(int initialCapacity) {

2    this(initialCapacity, DEFAULT_LOAD_FACTOR);

3}

Copy the code

3, the parameter constructor, specify the initial capacity and load factor:

 1public HashMap(int initialCapacity, float loadFactor) {

2    if (initialCapacity < 0)

3        throw new IllegalArgumentException("Illegal initial capacity: " +

4                                           initialCapacity);

5    if (initialCapacity > MAXIMUM_CAPACITY)

6        initialCapacity = MAXIMUM_CAPACITY;

7    if (loadFactor <= 0 || Float.isNaN(loadFactor))

8        throw new IllegalArgumentException("Illegal load factor: " +

9                                           loadFactor);

10

11    this.loadFactor = loadFactor;

12    threshold = initialCapacity;// Unlike jdK8, the initial threshold is the initial capacity and is not raised to the second power

13    init();

14}

Copy the code
Constructor (); Map set ();
 1public void putAll(Map<? extends K, ? extends V> m) {

2        int numKeysToBeAdded = m.size();

3        if (numKeysToBeAdded == 0)

4            return;

5

6        if (table == EMPTY_TABLE) {

7            inflateTable((int) Math.max(numKeysToBeAdded * loadFactor, threshold));

8        }

9

10        if (numKeysToBeAdded > threshold) {

11            int targetCapacity = (int)(numKeysToBeAdded / loadFactor + 1);

12            if (targetCapacity > MAXIMUM_CAPACITY)

13                targetCapacity = MAXIMUM_CAPACITY;

14            int newCapacity = table.length;

15            while (newCapacity < targetCapacity)

16                newCapacity <<= 1;

17            if (newCapacity > table.length)

18                resize(newCapacity);

19        }

20

21        for (Map.Entry<? extends K, ? extends V> e : m.entrySet())

22            put(e.getKey(), e.getValue());

23    }

Copy the code

Note: When the constructor is executed, the array that holds the elements is not initialized. Instead, it is initialized when the elements are first put in. When creating a HashMap object, only the initial capacity and the new threshold are calculated.

Add elements

1, source code:

 1public V put(K key, V value) {

2    if (table == EMPTY_TABLE) {

3        inflateTable(threshold);// Initialize the array

4    }

5    if (key == null)// Add key as null

6        return putForNullKey(value);

7    int hash = hash(key);// Compute the hash of the key value

8    int i = indexFor(hash, table.length);// Get the index position in the array based on the hash value

9    for(Entry<K,V> e = table[i]; e ! =null; e = e.next) {// Traverses the singly linked list of index positions to determine whether the specified key exists

10        Object k;

11        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {// Update the value if the key already exists

12            V oldValue = e.value;

13            e.value = value;

14            e.recordAccess(this);

15            return oldValue;

16        }

17    }

18

19    modCount++;

20    addEntry(hash, key, value, i);// If key does not exist, insert element

21    return null;

22}

23

24private V putForNullKey(V value) {

25    for (Entry<K,V> e = table[0]; e ! =null; e = e.next) {

26        if (e.key == null) {// If key is null, update value

27            V oldValue = e.value;

28            e.value = value;

29            e.recordAccess(this);

30            return oldValue;

31        }

32    }

33    modCount++;

34    addEntry(0.null, value, 0);// If the key is null, the hash value is 0

35    return null;

36}

37

38void addEntry(int hash, K key, V value, int bucketIndex) {

39    if ((size >= threshold) && (null! = table[bucketIndex])) {// There are elements at the insertion position and the number of elements is greater than or equal to the new threshold

40        resize(2 * table.length);// Perform double capacity expansion

41        hash = (null! = key) ? hash(key) :0;// The value of the hash seed may be adjusted during expansion, so the hash value is recalculated

42        bucketIndex = indexFor(hash, table.length);// Recalculate the position in the expanded array

43    }

44

45    createEntry(hash, key, value, bucketIndex);// Add elements

46}

47

48// Compute the object hash value

49final int hash(Object k) {

50    int h = hashSeed;

51    if (0! = h && kinstanceof String) {//String takes a separate algorithm

52        return sun.misc.Hashing.stringHash32((String) k);

53    }

54

55    h ^= k.hashCode();// Use the hash seed xor hash value to increase randomness for optimization

56

57    h ^= (h >>> 20) ^ (h >>> 12);

58    return h ^ (h >>> 7) ^ (h >>> 4);// The shift xor operation here belongs to the perturb function, which is to increase the randomness of the hash value and reduce the probability of hash conflict

59}

60

61void createEntry(int hash, K key, V value, int bucketIndex) {

62    Entry<K,V> e = table[bucketIndex];

63    table[bucketIndex] = new Entry<>(hash, key, value, e);// Insert the new element into the index position of the array. The original element is the successor node

64    size++;

65}

Copy the code

2. Flow chart:

Figure note: Add element flowchart

3. Examples:

Figure note: Initial state


Figure note: Add 10


Figure note: Add 18


Figure Note: Capacity expansion


Figure Note: Added after capacity expansion

Initializing an array

1, source code:

 1// Initializes the array according to the specified size

2private void inflateTable(int toSize) {

3    // Find a power of 2 >= toSize

4    int capacity = roundUpToPowerOf2(toSize);

5

6    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);// Calculate the threshold based on the capacity and load factor, the maximum is 2^30+1

7    table = new Entry[capacity];// Create an array of the specified size

8    initHashSeedAsNeeded(capacity);

9}

10

11// Gets the lowest power greater than the specified value, up to 2^30

12private static int roundUpToPowerOf2(int number) {

13    // assert number >= 0 : "number must be non-negative";

14    return number >= MAXIMUM_CAPACITY

15            ? MAXIMUM_CAPACITY

16            : (number > 1)? Integer.highestOneBit((number -1) < <1) : 1;

17}

Copy the code

2. Description:

In terms of hash seeds, the purpose is to optimize the hash function to make its values more random, thus reducing the probability of hash conflicts. Using HashMap private static class Holder, the JVM startup, specify – Djdk. Map. Althashing. Threshold = value, to set up the optional hash threshold, so as to decide whether to need to adjust the hash in initHashSeedAsNeeded seeds.

 1private static class Holder {

2

3    / * *

4     * Table capacity above which to switch to use alternative hashing.

5* /


6    static final int ALTERNATIVE_HASHING_THRESHOLD;

7

8    static {

9        String altThreshold = java.security.AccessController.doPrivileged(

10            new sun.security.action.GetPropertyAction(

11                "jdk.map.althashing.threshold"));/ / by - Djdk. Map. Althashing. Threshold = value specified optional hash threshold

12

13        int threshold;

14        try {

15            threshold = (null! = altThreshold)

16                    ? Integer.parseInt(altThreshold)

17                    : ALTERNATIVE_HASHING_THRESHOLD_DEFAULT;// Defaults to integer.max_value

18

19            // disable alternative hashing if -1

20            if (threshold == -1) {

21                threshold = Integer.MAX_VALUE;

22            }

23

24            if (threshold < 0) {

25                throw new IllegalArgumentException("value must be positive integer.");

26            }

27        } catch(IllegalArgumentException failed) {

28            throw new Error("Illegal value for 'jdk.map.althashing.threshold'", failed);

29        }

30

31        ALTERNATIVE_HASHING_THRESHOLD = threshold;// specify an optional hash threshold in initHashSeedAsNeeded as a criterion for initializing hashed seeds

32    }

33}

34

35// Decide whether to initialize the hash seed according to the capacity

36final boolean initHashSeedAsNeeded(int capacity) {

37    booleancurrentAltHashing = hashSeed ! =0;// Hash seed defaults to 0

38    boolean useAltHashing = sun.misc.VM.isBooted() &&

39            (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);// If the capacity is greater than the optional hash threshold, the hash seed needs to be initialized

40    boolean switching = currentAltHashing ^ useAltHashing;

41    if (switching) {

42        hashSeed = useAltHashing

43            ? sun.misc.Hashing.randomHashSeed(this)// Generate a random hash seed

44            : 0;

45    }

46    return switching;

47}

Copy the code

capacity

1, source code:

 1// Expand the array according to the specified capacity

2void resize(int newCapacity) {

3    Entry[] oldTable = table;

4    int oldCapacity = oldTable.length;

5    if (oldCapacity == MAXIMUM_CAPACITY) {// If the original capacity reaches the maximum value, the capacity will not be expanded

6        threshold = Integer.MAX_VALUE;

7        return;

8    }

9

10    Entry[] newTable = new Entry[newCapacity];

11    transfer(newTable, initHashSeedAsNeeded(newCapacity));

12    table = newTable;

13    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);// Recalculate the threshold based on the expanded capacity

14}

15

16// Reassign the elements to the new array

17void transfer(Entry[] newTable, boolean rehash) {

18    int newCapacity = newTable.length;

19    for (Entry<K,V> e : table) {// Iterate over the array

20        while(null! = e) {

21            Entry<K,V> next = e.next;

22            if (rehash) {// The array needs to be recalculated after expansion

23                e.hash = null == e.key ? 0 : hash(e.key);

24            }

25            int i = indexFor(e.hash, newCapacity);// Calculates the position in the new array

26            e.next = newTable[i];// Add to the new array using a header insertion method

27            newTable[i] = e;

28            e = next;

29        }

30    }

31}

Copy the code

2. Questions:

When the above expansion code is executed concurrently, it will cause the problem of loop formation of linked list. The following examples are used to analyze the problem: 2.1 initial state:

Figure note: Initial state


Thread 1 inserts 18, thread 2 inserts 26. Thread 1 finds that the size is 6 and expands the capacity. Thread 2 finds that the size is 6 and expands the capacity.



2.2. Thread 1 execution:


Thread 1 first acquires CPU execution and executes the code in Transfer () :

 1for (Entry<K,V> e : table) {

2    while(null! = e) {

3        Entry<K,V> next = e.next;Thread 1 executes to this line of code, e = 10, next = 2. The CPU dispatches thread 2 to execute.

4        if (rehash) {

5            e.hash = null == e.key ? 0 : hash(e.key);

6        }

7        int i = indexFor(e.hash, newCapacity);

8        e.next = newTable[i];

9        newTable[i] = e;

10        e = next;

11    }

12}

Copy the code

2.3 Execution by Thread 2: Thread 2 now obtains CPU execution right and executes the code in Transfer () :

 1for (Entry<K,V> e : table) {

2    while(null! = e) {

3        Entry<K,V> next = e.next;

4        if (rehash) {

5            e.hash = null == e.key ? 0 : hash(e.key);

6        }

7        int i = indexFor(e.hash, newCapacity);

8        e.next = newTable[i];

9        newTable[i] = e;

10        e = next;

11    }

12}

Copy the code

First traversal: e is 10, next is 2, rehash is false, I is 2, newTable[2] is null, 10. Next is null, newTable[2] is 10, e is 2. Second pass: e is 2, next is null, rehash is false, I is 2, newTable[2] is 10, 2. Next is 10, newTable[2] is 2, e is null. Third pass: e is null, exit the loop. Notice that the next of element 2 in the original table now points to 10.

Figure note: Result of capacity expansion performed by thread 2



2.4. Thread 1 execution:

 1for (Entry<K,V> e : table) {

2    while(null! = e) {

3        Entry<K,V> next = e.next;Thread 1 executes to this line of code, e = 10, next = 2. The CPU schedules thread 1 to continue execution.

4        if (rehash) {

5            e.hash = null == e.key ? 0 : hash(e.key);

6        }

7        int i = indexFor(e.hash, newCapacity);

8        e.next = newTable[i];

9        newTable[i] = e;

10        e = next;

11    }

12}

Copy the code

Current: e is 10, next is 2, rehash is false, I is 2, newTable[2] is null, modify: 10. Next is null, newTable[2] is 10, e is 2. Second pass: current: e is 2, next is 10 [result of thread 2], rehash is false, I is 2, newTable[2] is 10, modify: 2. Next is 10, newTable[2] is 2, e is 10. Third pass: current: e is 10, next is null, rehash is false, I is 2, newTable[2] is 2, modify: 10. Next is 2, newTable[2] is 10, e is null, exit the loop. At this point, the linked list into a loop, if the search, will fall into an endless loop!!

Figure note: Result of thread 1 performing capacity expansion

3. Description:

As can be seen from the above example, HashMap adopts header insertion method in JDK1.7, which will change the original order of the elements in the linked list during expansion, resulting in the problem of the linked list being looped in concurrent scenarios. In jdk1.8, the tail insertion method will keep the original order of linked list elements during expansion, so there will not be a problem of linked list ring.

conclusion

Based on the above analysis, the changes of HashMap between 1.7 and 1.8 are summarized here:

  • 1.7 Uses array + single linked list, 1.8 changes to red black tree storage when single linked list exceeds a certain length
  • 1.7 Recalculate the hash value and index position during capacity expansion. 1.8 Does not recalculate the hash value. Skillfully calculates the new index position using & operation after capacity expansion.
  • 1.7 Insert elements into a single linked list using the head insert method, 1.8 uses the tail insert method.

Through the study of HashMap source code in JDk1.7 and 1.8, I deeply understand a truth: all design has its reasons behind. As learners, we need to constantly ask ourselves why and what are the benefits of this design. With such a learning attitude, I think one day soon, you will become him. At the end of this article, thank you for your support. Please scan the qr code below and follow us. If you have any questions, please leave a message.








Photo note: Childhood collection of posters


Photo note: Grow up and idol together