preface

HashMap is the most important part of interviews. Go to 10 companies and eight of them will ask, why do people use HashMap to start a conversation?

How is HashMap implemented?

  1. Jdk1.7’s HashMap is implemented using arrays + linked lists

  2. Jdk1.8 HashMap is implemented using arrays + linked lists + red-black trees

Insert a picture description here

Attached is the Java interview + core technology learning guide that I summarized for three months, which is my summary of these years and spring recruitment. At present, I have got the offer from big factory, thank you!

access

1. Follow + forward this article

2. You can reply “PDF” in the background and get it for free

The trunk of a HashMap is an array. Suppose we have three key-value pairs DNF :1, cf:2, lol:3, The key.hash % table.length (the object’s hashcode does something to mod the length of the array) is used to determine which bit of the array the pair should be placed in

1 = indexFor(DNF), we place the key-value pair at index 1 in the array

Write the picture description here

3 = indexFor(cf)  

Write the picture description here

1 = indexFor(lol), 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1, 1 = 1

Jdk1.8 is the tail plug method

Write the picture description here

1=hash(DNF); DNF is not equal to lol, and is compared with the next element, equality returns. The process of set and get is that simple. Locate the slot position (that is, the position in the array) and iterate through the list to find the same element.

As can be seen from the figure above, HashMap uses the chain address method when hash conflicts occur. This method is not the only one to resolve hash conflicts. There are four common methods as follows

  1. Open addressing

  2. Chain address method

  3. Then the hash method

  4. Public overflow area method.

JDK1.7 source

Several important properties

Static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; Static final int MAXIMUM_CAPACITY = 1 << 30; // Load factor static finalfloatDEFAULT_LOAD_FACTOR = 0.75 f; static final Entry<? ,? >[] EMPTY_TABLE = {}; Transient Entry<K,V>[] TABLE = (Entry<K,V>[]) EMPTY_TABLE; // Put key-value pair number transient int size; // Capacity x load factor int threshold; // Capacity x load factor int threshold; // Load factor finalfloat loadFactor;
Copy the code

If there are 16 elements in a hashmap and the load factor is 0.75, the load factor is 0.75. The capacity expansion threshold is 12 (16 x 0.75)

  1. The smaller the load factor, the easier capacity expansion, waste space, but high search efficiency

  2. The larger the load factor is, it is not easy to expand, the space is more fully utilized, and the search efficiency is low (the linked list is elongated)

Static inner class for storing data, array + linked list, by which I mean Entry array

static class Entry<K,V> implements Map.Entry<K,V> { final K key; V value; Entry<K,V> next; // Stores a reference to the next Entry, a singly linked list inthash; // to the hashcode value of the keyhashThe calculated value is stored in Entry to avoid double counting Entry(int h, K K,V V, Entry<K,V> n) {value = V; next = n; key = k;hash = h;    }}
Copy the code

The constructor

Everything else is an extension on this basis, mainly setting the initial capacity and load factor, these two parameters were introduced before.

public HashMap(int initialCapacity, float loadFactor) {    if (initialCapacity < 0)        throw new IllegalArgumentException("Illegal initial capacity: " +                initialCapacity);    if (initialCapacity > MAXIMUM_CAPACITY)        initialCapacity = MAXIMUM_CAPACITY;    if (loadFactor <= 0 || Float.isNaN(loadFactor))        throw new IllegalArgumentException("Illegal load factor: "+ loadFactor); this.loadFactor = loadFactor; threshold = initialCapacity; init(); }Copy the code

The most important knowledge point, the process to see the source code is better to understand

The execution process of the PUT method

  1. If the key is null, place it at table[0]. Hash the key hashCode() to calculate the index.

  2. If the node already exists, replace the old value(to ensure the uniqueness of the key) and return the old value

  3. Resize is required if the capacity expansion threshold (exceeds capacity * load factor) and a collision occurs

  4. Put the element at the top of the bucket, headers

Public V put(K key, V value) {// The array of hashMap is emptyif (table == EMPTY_TABLE) {        inflateTable(threshold);    }    if (key == null)        returnputForNullKey(value); / / to gethashValue of the inthash = hash(key); Int I = indexFor(int I = indexFor(hash, table.length); // Iterate through the linked list at table[I] to find the same key, replace oldValue with a new value, and return oldValuefor(Entry<K,V> e = table[i]; e ! = null; e = e.next) { Object k; // If the key already exists, set value to new and return the old valueif (e.hash == hash && ((k = e.key) == key || key.equals(k))) {            V oldValue = e.value;            e.value = value;            e.recordAccess(this);            returnoldValue; } } modCount++; // Place the element in the table[I]. The new element is always the first element in the table[I].hash, key, value, i);    returnnull; }Copy the code

When null, the HashMap has not yet created the array, either using the default initial value of 16, or using a custom length. In this case, you need to change the array length to a minimum multiple of 2 that is greater than or equal to the initial size

Private void inflateTable(int toSize) {// Returns a number greater than or equal to the nearest power of 2 int Capacity = roundUpToPowerOf2(toSize); threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1); table = new Entry[capacity]; initHashSeedAsNeeded(capacity); }Copy the code

If the key is null, place the value on the table[0] chain

private V putForNullKey(V value) {    for(Entry<K,V> e = table[0]; e ! = null; e = e.next) {if (e.key == null) {            V oldValue = e.value;            e.value = value;            e.recordAccess(this);            return oldValue;        }    }    modCount++;    addEntry(0, null, value, 0);    returnnull; }Copy the code

Find where to put it in the array, h & (Length-1) which you can think of as a hash mod to the length of the array, which we’ll talk about later

static int indexFor(int h, int length) {    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";    returnh & (length-1); }Copy the code

Add elements

void addEntry(int hash, K key, V value, int bucketIndex) {// Expand the capacity when the capacity exceeds the threshold and a collision occursif((size >= threshold) && (null ! = table[bucketIndex])) {resize(2 * table.length); // recalculatehashThe value, if you don't do anything special, and what we calculated beforehashHave the same valuehash= (null ! = key) ?hash(key) : 0;        bucketIndex = indexFor(hash, table.length);    }    createEntry(hash, key, value, bucketIndex); }Copy the code

Place the newly added element at the top of the table and follow the other elements after the first element

void createEntry(int hash, K key, V value, int bucketIndex) {    Entry<K,V> e = table[bucketIndex];    table[bucketIndex] = new Entry<>(hash, key, value, e); size++; }Copy the code

The capacity exceeds the threshold and a collision occurs. Procedure

void resize(int newCapacity) { Entry[] oldTable = table; int oldCapacity = oldTable.length; // The capacity has reached the maximumif (oldCapacity == MAXIMUM_CAPACITY) {        threshold = Integer.MAX_VALUE;        return; } Entry[] newTable = new Entry[newCapacity]; transfer(newTable, initHashSeedAsNeeded(newCapacity)); table = newTable; threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1); }Copy the code

The initHashSeedAsNeeded function always returns false by default, meaning that rehash is false by default

void transfer(Entry[] newTable, boolean rehash) { int newCapacity = newTable.length; // go through the number groupfor(Entry<K,V> e: table) {// Iterate through the linked listwhile(null ! = e) { Entry<K,V> next = e.next;if (rehash) { e.hash = null == e.key ? Zero:hash(e.key); } int i = indexFor(e.hash, newCapacity); e.next = newTable[i]; newTable[i] = e; e = next; }}}Copy the code

This transfer function is quite interesting. If you carefully understand its replication process, you will find the following two particularly interesting points

  1. OldTable [I] will be placed in newTable[I] or newTable[I + oldtable.length]

  2. Linked lists are reversed as they are transferred

These two points need to be noted and I will mention them again in JDK1.8

The execution of the GET method

  1. If the key is null, fetch the index directly from table[0], hash the key hashCode(), calculate index;

  2. Check key.equals(k) for the corresponding Entry and return value

public V get(Object key) {    if (key == null)        return getForNullKey();    Entry<K,V> entry = getEntry(key);    returnnull == entry ? null : entry.getValue(); }Copy the code

Obtain a null key from table[0]

private V getForNullKey() {    if (size == 0) {        return null;    }    for(Entry<K,V> e = table[0]; e ! = null; e = e.next) {if (e.key == null)            return e.value;    }    returnnull; }Copy the code

If the key is not null

final Entry<K,V> getEntry(Object key) {    if (size == 0) {        return null;    }    int hash= (key == null) ? Zero:hash(key);    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e ! = null; e = e.next) { Object k;if (e.hash == hash&& ((k = e.key) == key || (key ! = null && key.equals(k))))return e;    }    returnnull; }Copy the code

JDK1.8 source

Jdk1.8 accesses data with a null key without special judgment, but places it at table[0] by returning the hash value to 0

Put Execution Process

  1. Perform xOR on the top 16 and bottom 16 bits of the key’s hashcode() to find the hash value

  2. If the table array is not initialized, the length of the initialized table array is 16

  3. Compute the index based on the hash value and put it directly into a bucket if there is no collision (bucket can be a linked list or red-black tree)

  4. If a collision causes the list to be too long, turn the list into a red-black tree

  5. If the key already exists, replace old value with new value and return old value

  6. If the capacity expansion exceeds the threshold, expand the capacity. Threshold = Capacity x load factor

The algorithm used in jdk1.8 and JDk1.7 to retrieve the location of an element value in a new array has changed.

  1. Jdk1.7, hash & (length-1)

  2. Jdk1.8, check whether the hash bit to be added is 0 or 1. NewTable [I]; otherwise newTable[I + oldtable.length]

Get Execution Process

  1. Perform xOR on the top 16 and bottom 16 bits of the key’s hashcode() to find the hash value

  2. Returns if the first node in the bucket is hit directly

  3. If there is a conflict, find the corresponding Node through key.equals(k) and return value. The efficiency of looking up in a tree is O(logn), the efficiency of looking up in a linked list is O(n).

Frequently seen exam

Difference between HashMap, HashTable, ConcurrentHashMap

object Whether the key and value can be null Thread safe or not
HashMap Both key and value can be NULL no
HashTable Neither key nor value can be null is
ConcurrentHashMap Neither key nor value can be null is

Under what conditions does HashMap expand

jdk1.7

  1. The capacity expansion threshold is exceeded. Procedure

  2. A collision

jdk1.8

  1. The capacity expansion threshold is exceeded. Procedure

Why is the size of HashMap 2 to the n

To hash the location of the element in the array using the hash value, we need to hash%length. At that time, the % operation is time-consuming, so we optimize it to hash & (length-1).

When length is 2 to the n, hash & (length-1) =hash % length

Let’s assume the array is 15 and 16 in length and the hash code is 8 and 9

h & (length – 1) h length index
8 & (15-1) 0100 1110 0100
9 & (15-1) 0101 1110 0100
8 & (16-1) 0100 1111 0100
9 & (16-1) 0101 1111 0101

When the array length is 15, the elements with hash codes 8 and 9 are put into the same position in the array to form a linked list. The key reduces the query efficiency. When the hahs code and 15-1(1110) are put &, the last bit is always 0, so 0001,0011,0101,1001,1011,0111. 1101 these positions will never be placed elements, which would result

  1. Waste of space

  2. Increases the probability of collisions and slows down the efficiency of queries

When the array length is long, all the bits of the array are 1, for example, 8-1=7, that is, 111. Then the value of the low-order & operation is always the same as the original hash value, reducing the probability of collision

What has changed in JDK1.8?

  1. Change from array + list to array + list + red-black tree, when the list length exceeds 8, the list becomes red-black tree. Why did you change it? We know that the search efficiency of a linked list is O(n), and the search efficiency of a red-black tree is O(logn), so the search efficiency becomes higher. Why not just use red black trees? Because red-black trees are more efficient at finding things, but less efficient at inserting things, it’s not a good idea to start with a red-black tree. The appropriate threshold from a probabilistic point of view is 8

  2. Optimized the hash algorithm

  3. The algorithm for calculating the position of an element in a new array has been changed. The new algorithm uses new bits to determine whether oldTable[I] should be placed in newTable[I] or newTable[I + oldtable.length].

  4. Head plug method is changed to tail plug method, the linked list does not inverted during expansion (avoid forming an endless loop)

What happens to HashMap with high concurrency?

  1. Multithreading capacity expansion, linked list will form a loop, resulting in an infinite loop

  2. Multithreaded PUT can cause elements to be lost

How do I avoid the problem of HashMap with high concurrency?

  1. The use of ConcurrentHashMap

  2. With the Collections. SynchronizedMap (hashMap) packaged into synchronous collection

Finally, attached is the Java interview + core technology learning guide that I summarized for three months, which is my summary of these years and spring recruitment. At present, I have got the offer from big factory, thank you!

Free collection method

1. Follow + forward this article

2. You can reply “PDF” in the background and get it for free