This article is an introduction to the Java Collections framework, but first we will introduce a few other apis that we often use in development. There are a few things that need to be noticed in use, and we will share them with you here.

An Object class

The Object class is the parent of all classes in Java, and all classes inherit from the Object class. Subclasses can use all methods of the Object class. The following three important methods of Object class are introduced one by one. They are also commonly used methods

The equals (Object obj) method

Equals (Object obj) equals(Object obj) equals(Object obj) equals(Object obj) equals(Object obj) equals(Object obj) equals(Object obj) equals(Object obj) In fact, at the bottom of it is the “==” judgment, because using “==” requires comparing both values and addresses.

public boolean equals(Object obj) {
    return (this == obj);
}
Copy the code

HashCode () method

Object calling the hashCode() method returns the hash value (the decimal heap address value) of the object, which in hexadecimal is exactly the same as the address value.

Object obj = new Object();
int addr = obj.hashCode();
Integer.toHexString(addr);// Convert decimal to hexadecimal
Copy the code

The toString () method

The toString() method returns a string representation of an Object. For our class, which inherits the Object class by default, we can override the toString() method to display some information inside the Object when it is output. For example:

public class Person{
    // Member variables
    private String name;
    private int age;
    // constructor
    public Person(String name , int age) {
        this.name = name;
        this.age = age;
    }
    // Override the toString() method
    public String toString(a){
        return ("Name:"+name+",年龄:"+age+"Gender,"+sex+", height:"+height);
    }
}
main(String[] args) {
    Person p = new Person("Xiao Ming" , 18 , "Male" , 180);
    System.out.println(p); // Call the p.tostring () method by default, equivalent to system.out.println (p.tostring ());
}
Copy the code

2. The String class

String, a String type, is immutable, and immutable means that once it’s initialized, it can’t be modified. But in our common use, it seems that String variables can be reassigned, and we can reassign without any problems. Why is this? Take this example:

String str = "abc";
s = "def";
Copy the code

The above code shows that it looks as if s has been changed, but in fact if we use the output s.ashcode () we can see that their heap address value has been changed by assigning the variable S to the new address value. So what happens to the original “ABC” string? Let’s take a look at the source of the String class

public final class String
    implements java.io.Serializable.Comparable<String>, CharSequence {
    // ...
    private final char[] value;
    // ...
}
Copy the code

As you can see, the String class is modified with the final keyword to indicate that it can no longer be inherited. In addition, the Char[] array is declared inside the String, which holds the value of the String. You can see that it is also decorated with the final keyword, meaning that once the string is assigned, its address is immutable and can never be reassigned.

A common String method

  1. Equals (String STR) and equalsIgnoreCase(String STR). The String class overrides the Object’s equals(String STR) method to compare String values, not addresses. The equalsIgnoreCase(String STR) method ignores case differences and only compares values.
  2. Constructor: When the String class uses a constructor, it removes space from the heap, so whenever new is used and then == is used, the heap address is different and only false is returned.
String str1 = "abc";
String str2 = "abc";
// str1 and str2 are both addresses to strings of 'a', 'b', and 'c' in the constant pool
System.out.println(str1==str2); // true
System.out.println(str1.equals(str2)); // true

String mes1 = new String("abc");
String mes2 = new String("abc");
// The heap memory of MES1 and MES2 stores the addresses of 'a' 'b' 'c' strings (in the constant pool), and assigns the heap memory addresses to mes1 and MES2
System.out.println(mes1==mes2); // false
System.out.println(mes1.equals(mes2)); // true
Copy the code
  1. CharAt (int index) and codePointAt(int Index) methods, charAt(int index) gets the character of the specified index in the string, CodePointAt (int index) method obtains the ASCII code value of the corresponding subscript character.
  2. Concat (String STR) method that concatenates two strings into one, the same as “+”;
  3. The endsWith(String STR) and startsWith(String STR) methods, endsWith(String STR) determines whether to end with a String and startsWith(String STR) determines whether to start with a String;
  4. ToUpperCase () converts all strings toUpperCase, provided that the string consists entirely of letters; Similarly, toLowerCase() converts all strings toLowerCase;
  5. The getBytes(” encoding set “) method converts a string to an array of bytes based on the specified encoding set;
  6. IndexOf (String STR, [int fromIndex]) finds the first occurrence of the specified String from left to right at the specified position. If no index is passed, the index defaults to the first occurrence from the left, or -1 if no index exists. The lastIndexOf(String STR, [int fromIndex]) method finds the first occurrence of the specified String from right to left from the specified position. If no specified index is passed, the search defaults to the last position from the right, or -1 if none exists.
  7. The isEmpty() method determines whether a string isEmpty, i.e. “”;
  8. The length() method gets the length of the string;
  9. The replace(String old, String new) method replaces the old String in a String with a new String;
  10. The split(String STR) method splits the original String according to the specified String and returns the split String as an array of characters.
  11. Substring (int begin, [ing end]), which extracts a character from the specified start position to the end position and returns it. If no end position is passed, the string is directly truncated to the end of the string.
  12. Trim () method that removes whitespace from the left and right sides of the string and returns the result.

Resolve string garbled characters

When we use strings for binary conversion operations in our code, we often run into problems when testing locally, but when we put them on machines in other environments, we may have problems with garbled characters. The reason for this is that the encoding format used for binary conversion is not mandatory, or the encoding format used does not fully support the current scenario, such as the following scenario:

String str  ="my name is 驖䦂";
// The string is converted to a byte array
byte[] bytes = str.getBytes("ISO-8859-1");
// Byte arrays are converted to strings
String s2 = new String(bytes);
log.info(s2);
// The result is printed as:
my name is ??
Copy the code

You can see that for Chinese, the printed result is?? New String(bytes, “ISO-8859-1”); new String(bytes, “ISo-8859-1 “); Because isO-8859-1 encoding support for Chinese is limited, so for some Chinese, is still garble, this time we should use utF-8 format encoding to deal with, after using UTF-8, the printed result is normal.

3. Packaging

Java provides corresponding wrapper types for our eight basic data types to use. At this point, one might ask, why give us a wrapper type when we have a basic data type? Basic data type doesn’t it smell? In fact, in some cases, basic data types are not enough for our needs.

byte	Byte
short	Short
int	Integer
long	Long
double	Double
float	Float
boolean	Boolean
char	Character
Copy the code

Let’s start with the concept of basic data types. Since Java is an object-oriented language, when we create an object, we store it in the heap and then refer to it through the stack. However, for some of the most commonly used types such as int, long, char, etc., these types are relatively simple and consume very little memory, so it is not very efficient to store them in the heap and then use them again. At this point Java decides to use them not to create them through New, but to store them directly on the stack, create them at method execution time, and destroy them after execution, for greater efficiency.

So why do we need a wrapper class? This is because Java itself is an object-oriented language, but the basic data types are not object-oriented, which can be inconvenient in some places in practical use. So Java provides wrapper classes for primitive data types, “wrapping” them with object-like properties and adding properties and methods to enrich the operations of primitive data types. For some scenarios where basic types are not applicable, for example, when a collection is used and its data type can only be Object type, it is not applicable to use basic data types, but wrapper classes can meet the requirements.

Automatic packing and unpacking

Automatic packing and unpacking basically means that Java automatically wraps the basic data type into the corresponding wrapper type or converts the wrapper type to the corresponding basic data type when using the basic data type and wrapper type. As follows:

Integer a = 127; // Automatically wrap 127 into an Integer type
int b = a; // Automatically unpack a and assign it to variable b
Copy the code

In the Integer type, there is a buffer pool. The buffer pool is between -128 and 127. If an Integer variable is created within this range, the value is taken directly from the buffer pool; otherwise, a new space is created in the heap to store the value. Therefore, when performing an Integer type comparison, pay attention to the comparison of address values, as follows:

Integer a = 127;
Integer b = 127;
System.out.println(a==b);
Integer x = 128;ValueOf (int I) is called by default to wrap I as an object
Integer y = 128;
System.out.println(x==y);
// The result is true,false
/* if a and b are equal because they are between -128 and 127, it will be stored in the buffer with the same address and the same value, and return true.*/ If x and y are not equal because they are between -128 and 127, it will return false
Copy the code

Set framework

When it comes to data structures, Java collections are inescapable, and they will be used in all of our work, even as long as we develop in Java. Using them allows us to easily process, transfer, and store data in memory. So, how do these functions do it, and how do they process the data at the bottom of it? Let’s take a look at it from the source level. Let’s start with the architecture of the Java Collections framework:The figure above shows some of the apis commonly used in the Java Collection framework. In fact, there are more than these in the collection framework. I have omitted many of them, but only listed some of the more common classes and interfaces, and their relationships.

List

ArrayList

ArrayList is the one we use most in our daily work, almost all of our code development. Let’s take a look at its underlying source code exactly how to achieve. Before we do that, we need to clarify the characteristics of ArrayList itself, and then take a look at its source code with these characteristics:

  • ArrayList has subscripts and the underlying data structure is an array. The efficiency of adding and deleting is low, and the efficiency of modifying and querying is high.

The overall structure of an ArrayList is relatively simple, just an array structure, as shown below:ElementData is an array of length 9 (which is not the length of the ArrayList). It is an array of data stored inside the ArrayList, and index is the index of the array. Let’s take a look at the source code for ArrayList.

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess.Cloneable.java.io.Serializable {

    private static final int DEFAULT_CAPACITY = 10;

    private static final Object[] EMPTY_ELEMENTDATA = {};

    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    transient Object[] elementData;

    private intsize; . }Copy the code

You can see that five properties are declared inside the ArrayList:

  • DEFAULT_CAPACITY: initial size of array;
  • EMPTY_ELEMENTDATA: empty array, used during instantiation;
  • DEFAULTCAPACITY_EMPTY_ELEMENTDATA: empty array, similar to EMPTY_ELEMENTDATA, used at initialization;
  • ElementData: An array that holds data inside an ArrayList;
  • Size: Describes the number of elements stored inside the ArrayList, which is the length of the ArrayList.
Initialize the

Since ArrayList declares two properties for initialization, how do these two properties work? Here’s how its constructor is designed:

// select * from ()
public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); }}// set the following parameters
public ArrayList(a) {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
// select * from ()
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if((size = elementData.length) ! =0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if(elementData.getClass() ! = Object[].class) elementData = Arrays.copyOf(elementData, size, Object[].class); }else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA; }}Copy the code

You can see that ArrayList provides three constructors:

  • If the number passed in is greater than 0, initialize an array of the corresponding size and assign it to elementData. If it is equal to 0, assign EMPTY_ELEMENTDATA directly to elementData. If less than 0, throw an exception.
  • The second method is to directly call the no-parameter construction and assign DEFAULTCAPACITY_EMPTY_ELEMENTDATA to elementData.
  • Method three calls the collection by passing in an existing collectiontoArray()Method, convert to an array and assign to elementData, then determine whether the length of the assigned elementData is equal to zero. If not, further determine whether elementData is an object array type. If not, call elementDataArrays.copyOf()Method does a conversion. If the length of elementData is determined to be equal to 0, assign EMPTY_ELEMENTDATA directly to elementData.
capacity

Capacity expansion is one of the most complex and important features of ArrayList design. Capacity expansion this function is mainly reflected in the add() method, add() method has a lot of overload, but its capacity expansion principle is basically similar, we take the most simple and clear add(E E) method to look at its source code is how to design:

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
Copy the code

You can see that when the add(E E) method is called, a total of four methods are called to ensure that elementData has enough capacity to hold the new data. Let’s take a look at each of the four methods:

  1. Add (E E): This method is called firstensureCapacityInternal(size + 1)Method to ensure sufficient capacity to store new data; And then theelementData[size++] = eThe obvious thing here is to store the new data in the array and make size+1, the length of the ArrayList +1; Return true for success;
  2. EnsureCapacityInternal (int minCapacity): The method mentioned above is used to ensure that the capacity is sufficient, and when this method is called, the passed in issize + 1This parameter is because a new data is added in the method of add(E E), so the overall size after the addition needs to be at least +1 to ensure sufficient capacity. DEFAULTCAPACITY_EMPTY_ELEMENTDATA = DEFAULTCAPACITY_EMPTY_ELEMENTDATA = DEFAULTCAPACITY_EMPTY_ELEMENTDATA You need to compare minCapacity (size+1) with DEFAULT_CAPACITY and use the larger value as the target capacity for capacity expansion.
  3. ensureExplicitCapacity(int minCapacity): This method will first modCount++, then compare the minimum memory required to the current length of elementData, if elementData is less than the minimum length, call grow(minCapacity) to expand;
  4. grow(int minCapacity): This method first assigns the length of the original elementData to the variable oldCapacity, then defines the new length newCapacity as 1.5 times of the original length, and compares the obtained new length with the minimum length minCapacity. If the new length is not enough, Take the minimum length you need as the new length; Finally, newCapacity is compared to MAX_ARRAY_SIZE, and if it is greater, the maximum value of Integer is assigned to newCapacity. Finally, after the new array length is determined, the arrays.copyof () method is called to transfer all the original data to the expanded array to complete the expansion. When the ArrayList is expanded, a new array is created to hold the data, and the new array is referenced to elemenData. Once an array is initialized, its size cannot be changed, so you have to create a new, larger array to hold the new data.

ArrayList expansion operation, its time complexity is O(n), need to copy all data into the new array, that is, proportional to the length of the array. If you understand the process of adding (E, E), you can easily understand other methods that have multiple overloads. Whether you add one or more, the principle is similar, but the size of the expansion is different.

The query

Get (int index) is used to query ArrayList. This method is simple to implement, which is to retrieve the corresponding data directly through the array subscript and return, time complexity is O(1), only need to access the array once, its source code is as follows:

public E get(int index) {
    // Prevent the index from exceeding the length of the array
    rangeCheck(index);
    // Return the index of the array directly
    return elementData(index);
}
Copy the code
conclusion

For example, the remove(int index) method involves the operation of removing intermediate data. This operation is O(N) complexity. The amount of data to move is proportional to the length of the array. And iterator is also designed more clever, including forward and reverse traversal, as well as concurrent modification, but because of space reasons, here will not focus on the introduction, interested friends can also point to open source to read.

LinkedList

LinkedList is also a commonly used data structure. Its underlying structure is linked by chain nodes one by one. Let’s take a look at its characteristics:

  • The underlying data structure of LinkedList is a bidirectional LinkedList, which is orderly but without subscript. The efficiency of adding and deleting data is high, but the efficiency of querying data is low.

Take a look at the structure of LinkedList:It is not difficult to see from the above figure that LinkedList has the following characteristics:

  • A linked list is made up of nodes one by one. Each Node has two Pointers, next to the next Node and prev to the previous Node.
  • The prev pointer to first is null, and the next pointer to next is null.

With that in mind, let’s take a look at the source code for LinkedList. First, let’s take a look at the key attributes that make up LinkedList:

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable.java.io.Serializable {
    
    transient int size = 0;

    transient Node<E> first;

    transient Node<E> last;
    
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev; }}... }Copy the code

It is easy to see from the above code that LinkedList has three important attributes, one is size, which represents the length of LinkedList, the second is first, which is the head node, and the third is last. Node is an inner class of LinkedList with prev and Next attributes. These attributes are also of Node type. These are obviously Pointers to the previous node and the next node; There’s also an item property, which is what holds the data; The Node constructor takes three parameters: the front Node, the current storage element, and the back Node.

Initialize the

Initializing a LinkedList is easy because it is not a fixed space like an array, its size is arbitrarily variable, and its length is not fixed, as long as the machine has enough memory, there is no limit to its size.

// constructor
public LinkedList(a) {}

public LinkedList(Collection<? extends E> c) {
    this(a); addAll(c); }Copy the code
Insert data

LinkedList inserts data quickly, with a time complexity of O(1). Because it is designed as a LinkedList structure, it does not move half of the average data like an array. Instead, it only needs to change the pointer before and after the node where it is inserted (the most typical add(int index, Insert the next pointer to the next node and the prev pointer to the next node to the new node, and the new node’s pointer to the front and back nodes to complete the insert. Here’s a look at the corresponding source code:

public void add(int index, E element) {
    / / check whether the insertion position index is beyond the scope of the length of the list, if you exceed the thrown IndexOutOfBoundsException exception
    checkPositionIndex(index);
    // Determine whether the insertion position is exactly at the end
    if (index == size)
        // If so, call the end of the new node method new
        linkLast(element);
    else
        // If not, call the intermediate node insertion method to insert data. The purpose of the node(index) method is to return the old node at the current insertion position of the list
        linkBefore(element, node(index));
}

// The method of adding nodes at the end
void linkLast(E e) {
    // Use an intermediate variable l to hold the current tail node
    final Node<E> l = last;
    // Use the new element to construct a new node, the new node is the tail node, the previous node is L, the next node is null
    final Node<E> newNode = new Node<>(l, e, null);
    // Point the last tail node to the new node
    last = newNode;
    // If the original tail node is empty
    if (l == null)
        // If the list is empty, the new node will also be a header
        first = newNode;
    else
        // If it is not null, the next pointer to the last node points to the new node
        l.next = newNode;
    size++;
    modCount++;
}

// Insert a new node in the middle of the list
void linkBefore(E e, Node<E> succ) { // succ is the old node at the insertion position
    // First use pred to save the previous node of the insertion position
    final Node<E> pred = succ.prev;
    // Construct a new Node Node with the new element
    final Node<E> newNode = new Node<>(pred, e, succ);
    // Point the prev pointer of the old node to the new node
    succ.prev = newNode;
    // Check whether the node before the insertion position is empty
    if (pred == null)
        // If it is empty, the list is empty, and the head node also points to the new node
        first = newNode;
    else
        // If it is not null, the next pointer of the previous node points to the new node
        pred.next = newNode;
    size++;
    modCount++;
}
Copy the code

The above is the source of the new node, from the above source can be seen, the list of new nodes is very simple, as long as the change before and after the pointer to point to, and the corresponding deletion of data is also very fast. LinkedList = node(int index); LinkedList = node(int index);

Node<E> node(int index) {
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        returnx; }}Copy the code

As you can see, in order to find the node at the current insertion position, the LinkedList needs to traverse back to index through the head node or forward to index through the tail node. This takes more time, but because you don’t need to copy data like an array, the overall insertion efficiency is higher.

Query data

LinkedList = node(int index); LinkedList = node(int index);

public E get(int index) {
    // check if the subscript is out of bounds
    checkElementIndex(index);
    // Iterate to find the target node and fetch the element value
    return node(index).item;
}
Copy the code
conclusion

This is the first introduction to LinkedList. In fact, LinkedList is relatively easy to understand in general. For LinkedList related data structure, its transformation is always the transformation of its pointer pointing to the front and back.

Map

HashMap

As an excellent collection, HashMap has very high efficiency in data insertion, deletion and query, and its time complexity is O(1). Let’s look at the features of HashMap:

  • The underlying data structure isArray + linked list + red-black treeTo store datakey-valueThe value of key and value can be null. The key value cannot be repeated, but the value can be repeated.

With the above features in mind, let’s first take a look at the overall structure of HashMap through the following figure:As shown in the figure above, the array structure of the HashMap is at the top. Each Node uses the hash value of the key to drop into the corresponding position in the array. If multiple nodes’ keys are hash to the same position in the array, they are stored as a linked list or red-black tree. On the left is the linked list structure and on the right is the red-black tree structure.

Let’s look at the details of the HashMap data store in the source code. First, let’s look at the properties of the HashMap.

public class HashMap<K.V> extends AbstractMap<K.V>
    implements Map<K.V>, Cloneable.Serializable {
    // The initial capacity is 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    // Maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;
    // The default load factor is 0.75. The condition for array expansion is the current array size * the load factor < the required capacity
    static final float DEFAULT_LOAD_FACTOR = 0.75 f;
    // The length of the list converted to a red-black tree is 8
    static final int TREEIFY_THRESHOLD = 8;
    // When a red-black tree is converted to a linked list, the length of the list is 6
    static final int UNTREEIFY_THRESHOLD = 6;
    // Another condition for converting a list to a red-black tree is that the array length is greater than 64
    static final int MIN_TREEIFY_CAPACITY = 64;
    // Array of HashMap
    transient Node<K,V>[] table;
    // The actual size of the HashMap
    transient int size;
    // Threshold for expansion = current array capacity * load factor > required array size
    int threshold;
    
    transient int modCount;

    transient Set<Map.Entry<K,V>> entrySet;

    final floatloadFactor; . }Copy the code
Initialize the

Let’s first look at the constructor of the HashMap

// Manually configure the initial capacity and load factor values
public HashMap(int initialCapacity, float loadFactor) {
    // If the initial capacity is less than 0, an exception is thrown
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    // If the initial capacity is greater than the maximum capacity, it equals the maximum capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // If the load factor is less than or equal to 0, or non-numeric, an exception is thrown
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // Calculate the value of threshold based on the initial capacity, which is a power of 2 greater than the initial capacity
    this.threshold = tableSizeFor(initialCapacity);
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap(a) {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    // 
    putMapEntries(m, false);
}
Copy the code

It can be found that all constructors are used to determine the values of two attributes, loadFactor and threshold, which will be used in subsequent HashMap expansion.

Insert data

HashMap inserts data quickly, but the process is quite complicated. First, it determines how to save the same data, then it uses a linked list to save different data but hash it into the same position in the array, and it needs to convert the chain expression to a red-black tree when it reaches a certain length. The above steps are quite troublesome, and the code processing details are as follows:

public V put(K key, V value) {
    // Call putVal() to save the data and hash() to get the hash value;
    return putVal(hash(key), key, value, false.true);
}
// onlyIfAbsent If the value is true, the value is not replaced if the key is the same. If the value is false, the value is replaced.
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
    Node<K,V>[] tab;
    Node<K,V> p;
    int n, i;
    // Check whether the table is initialized;
    if ((tab = table) == null || (n = tab.length) == 0) {
        // If there is no initialization, call the resize() method to initialize, and assign the initialized length to n;
        n = (tab = resize()).length;
    }
    // Check whether the array at the current position p is empty;
    if ((p = tab[i = (n - 1) & hash]) == null) {
        // If it is empty, a new node is generated directly to the current location;
        tab[i] = newNode(hash, key, value, null);
    } else { // If not empty
        Node<K,V> e; // Temporary variable, used to save the p node when the key of node P is the same as that of the new node, used to determine whether to replace the value later;
        K k;
        // Determine the hash value, whether the key of the new data is the same as the key of p;
        if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k)))) {
            // set p to e;
            e = p;
        } else if (p instanceof TreeNode) { // If the key is not equal, the current p is a red-black tree node.
            // If yes, use the red-black tree method to add data, and if the same key is found, then assign p to e;
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        } else { // If it is not a red-black tree, it is a linked list;
            // Spin linked list;
            for (int binCount = 0; ; ++binCount) {
                // p.ext == null, which indicates the last node in the current list;
                if ((e = p.next) == null) {
                    // Link the new node to the end of the list;
                    p.next = newNode(hash, key, value, null);
                    // Check whether the current list length is 7;
                    if (binCount >= TREEIFY_THRESHOLD - 1) {
                        // If the tree reaches 7, the length of the new node will be 8.
                        treeifyBin(tab, hash);
                    }
                    // The new node successfully completes the loop
                    break;
                }
                // Check whether the value of e is consistent with that of the new node.
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k)))) {
                        // If yes, end the spin;
                        break;
                    }
                // assign e to p, that is, p to the next position in the list.p = e; }}// If e is not null, the same key is found in the current position.
        if(e ! =null) {
            V oldValue = e.value;
            // Replace the value if onlyIfAbsent is false or oldValue is null.
            if(! onlyIfAbsent || oldValue ==null)
                e.value = value;
            // Post processing of e, which can be overridden by inheriting classes;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold) {
        // Check whether the new size is greater than threshold. If so, expand the capacity.
        resize();
    }
    // post-processing, for inherited classes to override;
    afterNodeInsertion(evict);
    return null;
}
Copy the code

PutTreeVal (), and treeifyBin(), which I didn’t expand here for space reasons. Interested friends can also take a look at the source code.

Query data

Query the get(key) method of the HashMap corresponding to the data. It is relatively easier to find the data than to add it. First, it is to find the corresponding position through the hash of the key, and then determine whether the key is equal. If so, determine whether the current node has a next value. If so, determine whether it is a linked list type or a red-black tree type, and then search for the corresponding data according to their respective methods. If not, return null. The corresponding source details are as follows:

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab;
    Node<K,V> first, e;
    int n;
    K k;
    // Check whether the current position is empty;
    if((tab = table) ! =null && (n = tab.length) > 0 && 
        (first = tab[(n - 1) & hash]) ! =null) {
        // If the value is not null, the key value is equal.
        if(first.hash == hash && ((k = first.key) == key || (key ! =null && key.equals(k)))) {
                // if the value is equal, return it directly;
                return first;
            }
        // Determine if there is a next node
        if((e = first.next) ! =null) {
            // If the next node is a red-black tree node, the search is based on the red-black tree and returns
            if (first instanceof TreeNode) {
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            }
            // This is a list node
            do {
                // If a node with the same key is found, return it directly
                if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k)))) {
                        returne; }}while((e = e.next) ! =null); }}return null;
}
Copy the code
conclusion

To understand HashMap, first of all, we need to know what its overall structure is: array + linked list + red-black tree. Secondly, we need to know how HashMap determines whether to add data by red-black tree or linked list. What are the conditions of conversion between the two, and how to process the data with repeated keys; Finally, it’s easy to read the source code of HashMap to understand how the HashMap expansion works and how loadFactor and Threshold affect the expansion.

Set

HashSet

HashSet is an implementation class that belongs to the Set interface. It is also a commonly used collection in our development, especially when there is no order requirement for data and no repetition is allowed. HashSet has the following characteristics:

  • The underlying layer is a HashMap, so there is no guarantee of element order when inserting data; And because the bottom layer is HashMap, its operation on data is very fast, and the time complexity is O(1). Elements inserted into a HashSet cannot be repeated.

With this in mind, let’s take a look at the source details of a HashSet and how to use a HashMap as the underlying data store. First, let’s look at the properties of a HashSet:

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable.java.io.Serializable {

    private transient HashMap<E,Object> map;

    private static final Object PRESENT = newObject(); .Copy the code

PRESENT is the final variable PRESENT. Map is the underlying HashSet that stores the data. So what’s the PRESENT? This will be solved below.

Initialize the

Here’s how to initialize a HashSet:

public HashSet(a) {
    map = new HashMap<>();
}

public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1.16));
    addAll(c);
}

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}
Copy the code

Since the underlying data store is a HashMap, you can see that the main purpose of the initialization of the HashSet is to initialize the underlying data store HashMap and create the HashMap object when the constructor executes.

Insert data

A HashSet is a key-value map that stores a single value, but a key-value map is a key-value map. The final value PRESENT is used, and the source details of the HashSet insert are as follows:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
Copy the code

We call the put(key, value) method of the map variable, and use the data that needs to be stored in the HashSet as the map key. The final quantity PRESENT is then stored into the map as value, and the data is inserted. This has completed the conversion from HashSet to the underlying HashMap storage, and can also understand why the inserted data is unordered, the operation time of the HashSet data is O(1), and why the HashSet storage data cannot be repeated. Because HashMap keys are not allowed to be duplicated.

conclusion

This is all for the introduction of HashSet. In fact, we can see that HashSet itself is very simple. We mainly understand HashMap, and we can see the composition and characteristics of HashSet at a glance.

5. Afterword.

The collection framework is almost finished. It took more than two weeks to write this article in pieces, which took a bit long indeed. Side because they are lazy, often write two times went aside, probably because I just start writing, on the other side sometimes often fall into the plight of speaking like a book, don’t know how to write a lot of things, and then also pretty tangled, source this thing is really difficult to chew, sometimes a code may need to spend a long time to understand. In fact, at the beginning of this article, I intended not only to introduce these apis, but also to introduce TreeMap in Map, LinkedHashMap, TreeSet in Set, etc. These are likely used in the work, but because of the current problems, the level of this a few collection source also didn’t go to understand, so don’t do first, subsequent wait you have time to learn about them up, then this one also may be split into many, also looks not so tired. In the end, generally speaking, or very gratified, after all, still finished writing, not abandoned pit. Again, these things will be more like my own notes, occasionally forget to come back to skim, deepen the impression. Some friends have seen it and feel that there is a need to improve, you can also talk about me privately. It is a process of mutual growth. Thank you for your guidance.