Java container is a very important part of the Java language, daily code will use a lot of containers. Java container is a huge system, daily use and learning are inevitably entangled in details, so it is difficult to fully grasp the Java container. This article gives you an overview of Java’s container architecture and then dives into the details.

Java containers are divided into two parts: Collection and Map. Collection is primarily used to store a single element. Maps, on the other hand, store key-value pairs.

This article is based on JDK1.8

Collection

In the figure above, circles represent interfaces and rectangles represent classes, both abstract and ordinary. Green means thread-safe, yellow means not thread-safe. The class diagram above only includes classes under java.util. Concurrent. The container classes under java.util. Concurrent are not that different from each other functionally, but the classes under this package are thread-safe.

The class diagram shows that the Collection inherits the Iterator interface, indicating that all collections can be accessed through iterators.

The Collection interface has three subinterfaces, List, Set, and Queue. A List stores elements in the order in which they were inserted; none of the elements in a Set can be repeated. Some common methods are defined in the Collection, which are basic utility methods, such as getting the size of the container, determining when the container is empty, emptying the container, iterating over the elements of the container, and so on. After JDK1.8, the Collection interface added the default method, the stream() and parallelStream() methods for java8-enabled functional programming, and the odd spliterator() method, This method can be used to segment the elements of the container so that multiple threads can process the elements of different segments without colliding, thus improving the efficiency of processing the elements of the container.

interface Collection<E> extends 可迭代<E> {
    
    int size(a);
    boolean isEmpty(a);
    boolean contains(Object o);
    Iterator<E> iterator(a);
    Object[] toArray();
    <T> T[] toArray(T[] a);
    default <T> T[] toArray(IntFunction<T[]> generator) {
        return toArray(generator.apply(0));
    }
    boolean add(E e);
    boolean remove(Object o);
    boolean containsAll(java.util.Collection
        c);
    boolean addAll(java.util.Collection<? extends E> c);
    boolean removeAll(java.util.Collection
        c);
    default boolean removeIf(Predicate<? super E> filter) {
        Objects.requireNonNull(filter);
        boolean removed = false;
        final Iterator<E> each = iterator();
        while (each.hasNext()) {
            if (filter.test(each.next())) {
                each.remove();
                removed = true; }}return removed;
    }
    boolean retainAll(java.util.Collection
        c);
    void clear(a);
    boolean equals(Object o);
    int hashCode(a);
    @Override
    default Spliterator<E> spliterator(a) {
        return Spliterators.spliterator(this.0);
    }
    default Stream<E> stream(a) {
        return StreamSupport.stream(spliterator(), false);
    }
    default Stream<E> parallelStream(a) {
        return StreamSupport.stream(spliterator(), true); }}Copy the code

List

List interface under the ArrayList everyday write code to use a lot. Part of the code for ArrayList is as follows. As you can see from the code, the underlying data structure of ArrayList is an array, and ArrayList implements RandomAccess to support RandomAccess.

class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess.Cloneable.java.io.Serializable {

    transient Object[] elementData;
}
Copy the code

An ArrayList is much like an array, but provides more convenience. Vector has the same functionality as ArrayList, but is thread-safe. Stack, a subclass of Vector, is thread-safe, too, but these classes are no longer recommended. If you want to use thread-safe classes, CopyOnWriteArrayList in java.util.Concurrent is a better choice.

LinkedList and ArrayList have similar functions. The biggest difference between them is that ArrayList supports random access while LinkedList does not. The code for LinkedList is as follows, and you can see that the underlying LinkedList uses a two-way LinkedList data structure. It also implements a Deque interface, so it can be used as a queue or a double-endian queue in addition to a list container.

class LinkedList<E> extends AbstractSequentialList<E>
        implements List<E>, Deque<E>, Cloneable.java.io.Serializable {
    
    transient int size = 0;

    transient Node<E> first;
    
    transient Node<E> last;

}
Copy the code

LinkedList also provides LinkedBlockingQueue and LinkedBlockingDeque in java.util.Concurrent for the same purpose, except that it has an advantage over LinkedList in a multi-threaded environment. There is little difference in function.

Set

What all sets have in common is that the elements of a Set are not duplicated. This feature is useful in some cases. HashSet is the most commonly used Set class. A HashSet is created using a HashMap. All values are stored in the Key of the HashMap. Value is placed in the fixed object PRESENT.

class HashSet<E>
        extends AbstractSet<E>
        implements Set<E>, Cloneable.java.io.Serializable {

    private transient HashMap<E, Object> map;

    private static final Object PRESENT = new Object();

    public HashSet(a) {
        map = newHashMap<>(); }}Copy the code

The elements in a HashSet are unordered, and if you want to keep the elements in a set in order, you can use a LinkedHashSet, where the insertion order is recorded by constructing a two-way linked list. TreeSet uses the underlying red-black tree structure to provide sequential access, depending on specific requirements. There is also a thread-safe version of Set, CopyOnWriteArraySet.

Queue

Queue/Deque is a Queue interface provided in Java. An ArrayQueue is a concrete queue class that can be used as a normal queue or a two-endian queue. However, queues are more commonly used in concurrent situations, and using LinkedBlockingQueue or LinkedBlockingDeque is a better choice. Sometimes, in addition to sequential queues, there may be a need to schedule queues by priority. PriorityQueue was created for this purpose, and in the case of concurrency, its counterpart is PriorityBlockingQueue.

Map

The Map class diagram structure is relatively simple. All Map classes inherit the Map interface. HashMap is the most commonly used Map class. HashMap is also unordered, similar to Set. LinkedHashMap and TreeMap also ensure order in different ways. LinkedHashMap records insertion order through a two-way linked list. TreeMap sorts its elements and can be accessed in the order in which they are sorted.

As a typical Map implementation, HashMap code structure is much more complex, and HashMap claims to haveThe access speed is only approximate and may degenerate into in extreme cases). The key to this speed is the implementation of the hash function, and a good implementation of the hash function helps to evenly distribute the key-value pairs so that there areThe following is the implementation of HashMap hash function, and HashMap expansion and hash collision processing and other problems are also very complex.

static final int hash(Object key) {
        int h;
        return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
    }
Copy the code

Similar to the structure in Collection, HashTable is similar to the HashMap functionality, but HashTable is thread-safe. Again, HashTable is not recommended because the way it is implemented is not as good as the performance provided in java.util.concurrent. ConcurrentHashMap is recommended in concurrent cases. ConcurrentHashMap performs better in concurrent cases through the mechanism of segmental locking. Use ConcurrentNavigableMap if you also need to ensure the order of maps in concurrent situations.

The Collections utility class

Under the java.util package is the Collections class, which is a utility class in which all the methods are static and the class cannot be instantiated. It provides methods for manipulating various container objects more efficiently.

For example, sort a List:

ArrayList<Integer> list = new ArrayList();
list.add(1);
list.add(4);
list.add(6);
list.add(2);
list.add(8);
Collections.sort(list);
Copy the code

You can also customize the sorting rules by implementing a Comparator and passing it in as a parameter.

Collections.sort(list, new Comparator<Integer>() { 
    @Override   
    public int compare(Integer o1, Integer o2) {
        return o1 > o2 ? 1 : 0; }});Copy the code

There are binary search algorithms out of the box:

Collections.binarySearch(list, 2);
Copy the code

You can also reverse the list directly:

Collections.reverse(list);
Copy the code

You can also shuffle the list using the shuffle algorithm:

Collections.shuffle(list);
Copy the code

These are just a few of the methods. There are also ways to swap elements in a list to find the smallest, the largest, and so on.

Since most containers under the java.util package are not thread-safe, Collections has a class of methods that can turn ordinary container objects into thread-safe objects:

Collections.synchronizedList(list);
Copy the code

There are similar utility methods for maps and sets.

In concurrent environments, it is also possible to convert a normal container object into an immutable container object, which is thread-safe in concurrent environments:

Collections.unmodifiableList(list);
Copy the code

(after)

The original

Follow the wechat official account and chat about other things