Introduction to HashMap

(1) the concept of hash

A Hash is a Hash algorithm that transforms an input of any length (also known as a pre-image, or pre-image) into a fixed-length output (usually an integer), which is the Hash value. This transformation is a compression mapping, that is, the space of the hash value is usually much smaller than the space of the input. Different inputs may be hashed into the same output, making it impossible to determine the input value uniquely from the hash value. Simply put, a hash function is a function that compresses a message of any length into a fixed length.

(2) the JDK1.7 HashMap

JDK1.7’s HashMap is implemented based on hash tables, where each element is a key-value pair. The problem of hash value conflicts can be solved by storing elements with the same key in a single linked list. When the capacity is insufficient (beyond the threshold), the array grows automatically.

HashMap is non-thread-safe and is only used in a single-threaded environment, where a concurrentHashMap can be used concurrently and issued in a packet.

HashMap implements the Serializable interface, so it supports serialization, implements the Cloneable interface, and can be cloned.

public class HashMap<K.V>  extends AbstractMap<K.V>  
                            implements Map<K.V>, Cloneable.Serializable  {... }Copy the code

(3) JDK1.8 HashMap

In JDK1.8, HashMap has been greatly optimized. The underlying implementation of HashMap has been changed from “array + list” to “array + list + red-black tree”. When the length of the linked list is too long (that is, when there are many elements with the same hash value, the search efficiency will be reduced), the linked list will be converted into a red-black tree, which greatly improves the search efficiency.

(4) Explain ideas

  • 1.HashMapHow do I locate the index position of an array of inserted elements
  • (2) analysisput.getmethods
  • 3.

Common methods of HashMap

Under the jdk1.7

(1) Construction method

A few important member variables

// Initialize the bucket size, which is the default size of the array since the underlying is the array
static final int DEFAULT_INITIAL_CAPACITY = 16;

// Bucket Max. 2^30
static final int MAXIMUM_CAPACITY = 1 << 30; 

// Default load factor (0.75)
static final float DEFAULT_LOAD_FACTOR = 0.75 f;

// The real array of data
transient Entry<K,V>[] table;

// The size of the Map store
transient int size;

// Bucket size, which can be specified explicitly at initialization
int threshold;

// Load factor, which can be specified explicitly at initialization
final float loadFactor;
Copy the code

2. Parametric construction method

  • The given default capacity is16, the load factor is0.75. whenMapThe number of elements in16 * 0.75 = 12, expansion is required, and the expansion process involvesrehashAnd data replication. Therefore, capacity expansion consumes performance
  • Therefore, it is recommended to estimate beforehandHashMapThe initial capacity required to minimize the performance cost of expansion
public HashMap(int initialCapacity, float loadFactor) {
    // The initial capacity should be >0
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    // Cannot exceed the maximum capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // Load factor is an integer and cannot be less than 0
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);

    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init(); // The init method of HashMap is empty and implements LinkedHashMap
}
Copy the code

threshold = initialCapacity; Here the initial capacity is assigned to the threshold. Why? The following will

(2)put()

As you can see, the HashMap initializes the array only when the PUT operation is performed, and initializes the array with a threshold (the initial capacity) as a value to calculate the final capacity.

public V put(K key, V value) {
    // 1. When an array is empty, the array is initialized first
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    // 2. Calculate the hash value based on key
    int hash = hash(key);
    // Calculate the data subscript based on the hash value and the data length
    int i = indexFor(hash, table.length);
    for(Entry<K,V> e = table[i]; e ! =null; e = e.next) {
        Object k;
        // 3. If the hash value is the same, then compare whether the key is the same. If the hash value is the same, replace the value
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            returnoldValue; }}// fast-fail, quick response failure during iteration, modCount++ before adding elements, leaving many hidden dangers for the future
    modCount++; 
    // Add elements. The last argument I is the index of the table array
    addEntry(hash, key, value, i);  
    return null;
}
Copy the code

1. inflateTable(int toSize)

When the array is empty, the following methods are first called to initialize the array

Capacity =16(2^4); toSize=13; capacity=16(2^4); If toSize=16, capacity=16;

private void inflateTable(int toSize) {
    // Returns the nearest power of 2 less than (tosize-1) * 2
    int capacity = roundUpToPowerOf2(toSize); // Use threshold calculation
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}
Copy the code

That is, when you set the initial capacity of a HashMap (initCapacity * loadFactor), you do not expand the capacity of the HashMap when it reaches the specified initial capacity (initCapacity * loadFactor). The capacity is expanded only when roundUpToPowerOf2(initCapacity) * loadFactor is used.

If toSize is 1, capacity after calculation is 1. Therefore, if initCapacity is set to 1, the capacity will not be expanded during the first PUT

Capacity must be the smallest power of 2 (toSize). –> See this blog

2. hashCode(Object K)

int hash = hash(key); Then compute the hash value based on the key.

3. indexFor(int hash, int length)

int i = indexFor(hash, table.length); Calculate the array location for the current key based on the hash value and array size.

static int indexFor(int h, int length) {
  Assert integer.bitcount (length)=1: "Length must be a nonzero power of 2";
  return h & (length-1);
}
Copy the code

So this code is going to look a little bit better, and when length is equal to 16, if h is equal to 1010, 1010, then the index I that you compute is going to be

0000 1111&1010 1010 ——————————— 0000 1010 h 1111&1010 ——————————— 1010 1010Copy the code

So what happens if the length of the array is not to the power of 2, so let’s try it out, let’s say length is equal to 17

0001 0000&1010 1010 ——————————— 0000 0000Copy the code

Table. Length – 1 = 15; table. Length – 1 = 15; table. Length – 1 = 15 So the & between hashCode and “table.length-1” depends only on the lower 4 bits of hashCode

In this case, since the hash result only depends on the lower four bits of the hashCode, the probability of hash collisions is also increased. Therefore, in JDK 1.8, the high level is also involved in the calculation, in order to reduce the probability of hash collisions.

4. The for loop

So what the for loop does is, it iterates through the list of numbers with subscript I

if (e.hash == hash && ((k = e.key) == key || key.equals(k)))

  • If true, it coverskeyThe correspondingvalueAnd returns the original valueoldValue
  • Otherwise, use header interpolation to insert elements into the list, as discussed below.

5. addEntry()

addEntry(hash, key, value, i); After entering this method, first determine whether the current HashMap needs to be expanded. The conditions for expansion are as follows:

  • size >= threshold.HashMapWhether the total number of elements exceeds the thresholdThreshold (capacity * 0.75).
  • null ! = table[bucketIndex], whether the array element at the current subscript is empty
void addEntry(int hash, K key, V value, int bucketIndex) {
    // Determine whether the Entry needs to be expanded when writing the Entry. If so, double the expansion and rehash the current key
    if ((size >= threshold) && (null! = table[bucketIndex])) { resize(2 * table.length);
        hash = (null! = key) ? hash(key) :0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}
Copy the code

5.1 the resize ()

Take a quick look at the resize() method

void resize(int newCapacity) {  
    // Reference the Entry array before expansion
    Entry[] oldTable = table; 
    int oldCapacity = oldTable.length;  
    if (oldCapacity == MAXIMUM_CAPACITY) {  
        // The size of the array before the expansion has reached the maximum (2^30)
        threshold = Integer.MAX_VALUE; 
        // Change the threshold to the maximum value of int (2^31-1) so that the capacity will not be expanded later
        return;  
    }  
    // Initialize a new Entry array with a size of 2*capacity
    Entry[] newTable = new Entry[newCapacity];    
    // Move the data to the new Entry array
    transfer(newTable);     
    // The table attribute of the HashMap references the new Entry array
    table = newTable;                             
    threshold = (int) (newCapacity * loadFactor);// Change the threshold
}  
Copy the code

5.1.1 transfer ()

Transfer () moves the linked list elements from the original array to the new array

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    // Iterate over the old Entry array
    for (Entry<K,V> e : table) {
        // Iterate over the list at each position on the array
        while (null! = e) {// Access the element in the next Entry chain
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            // Recalculates the position of each element in the array
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e; // Put the element on the arraye = next; }}}Copy the code

If the list structure of an element on the array is [1]->[2]->[3], then

  • Entry<K,V> next = e.next;It’s like pointing to an element[2]
  • e.next = newTable[i]; ` ` [1]-> points to a new array with subscriptiThe first element of
  • newTable[i] = e;The new array subscript isiThe first element of the[1]
  • e = next;E points to enextThe function of point 1 is to record firste.nextThe location of avoideGet lost in point 2
  • Repeat the above logic, but eventually the corresponding list structure will be reversed[3] - > [2] - > [1]

The algorithm for recalculating the position of the element in the new array is as follows:

If the size of the array capacity is 16, the value of an elementhashValue: 1011 1010 0000 1111&1011 1010 ——————————— 0000 1010 ==10 After capacity expansion, Capacity =32 0001 1111&1011 1010 ——————————— 0001 1010 ==26 (10+ the original array capacityCopy the code

Next look at the createEntry() method

// Pass the bucket in the current position into the new bucket. If the current bucket has a value, it will form a linked list of positions (using the header method).
void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];  / / get
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
Copy the code

Take a look at the Entry constructor to see how header interpolation works here.

The logic here is: get the linked list head pointer E corresponding to the array subscript bucketIndex, change the head pointer to new Entry, where the next pointer of the new node points to the original head node E, and finally increase the total size of HashMap elements by 1.

Entry(int h, K k, V v, Entry<K,V> n) {
    value = v;
    next = n;
    key = k;
    hash = h;
}
Copy the code

(3)get()

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);
    return null == entry ? null : entry.getValue();
}

final Entry<K,V> getEntry(Object key) {
    // Calculate the Hashcode based on the key and locate it in the bucket
    int hash = (key == null)?0 : hash(key);
    // Check whether the position is in a list. If it is not in a list, return the value based on whether the hashcode of the key is equal.
    // If the key and hashcode are equal, the list is iterated until the value is returned
    for(Entry<K,V> e = table[indexFor(hash, table.length)]; e ! =null; e = e.next) {
        Object k;
        if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
            return e;
    }
    return null;
}
Copy the code

jdk1.8

// constructor 1
public HashMap(int initialCapacity, float loadFactor) {
    // The specified initial capacity is non-negative
    if (initialCapacity < 0)
        throw new IllegalArgumentException(Illegal initial capacity:  +
                                           initialCapacity);
    // If the specified initial capacity is greater than the maximum capacity, set it to the maximum capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // The fill ratio is positive
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException(Illegal load factor:  +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);// New capacity expansion threshold
}
 
// constructor 2
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
 
// constructor 3
public HashMap(a) {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
 
Constructor 4: initializes the hash map with the elements of m
public HashMap(Map<!--? extends K, ? extends V--> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
Copy the code