Set an overview

Collections in Java are divided into two classes, Map and Collection, from the perspective of the upper interface. Map is an upper layer interface of Collection and has no inheritance relationship.

Common collections in Java can be summarized as follows.

  • The Map and Collection interfaces are the parent interfaces of all collections frameworks
  • The subinterfaces of the Collection interface include Set interface and List interface
  • The Map interface implementation classes include HashMap, TreeMap, HashTable ELINkedhashMap, ConcurrentHashMap, and Properties
  • The main implementation classes of Set interface are: HashSet, TreeSet, LinkedHashSet and so on
  • List interface implementation classes are: ArrayList, LinkedList, Stack and Vector, etc

What are the differences between HashMap and Hashtable?

  • HashMap does not take synchronization into account and is thread unsafe; Hashtable uses the synchronized keyword and is thread-safe;
  • HashMap allows NULL keys; A Hashtable does not allow null keys, and a Hashtable value cannot be considered null

HashMap is thread unsafe, right? Can you give me an example?

Forget the quick fail mechanic

  • The HashMap thread is not safe mainly because of the possibility of HashMap infinite loop during capacity expansion in multi-threaded environment
  • Hashtable is thread-safe because its internal implementation uses synchronized on methods like PUT and remove, so the use of a single method is thread-safe. However, when multiple methods are combined, thread-safety cannot be guaranteed. For example, if a thread is doing a get and then a PUT update, that’s two compound operations. In between the two operations, another thread may have changed the key, so your subsequent PUT operation may not be as expected.

Fast-fail mechanism

Fail-fast is an error detection mechanism for Java collections. When multiple threads make structural changes to a collection, fail-Fast may occur.

For example,

Let’s say there are two threads (thread 1 and thread 2). Thread 1 Iterator iterates through the elements of set A, and thread 2 at some point changes the structure of set A (the structure changes, rather than simply changing the contents of the elements). This is when the program throws an exception that creates a fast failure mechanism.

why

Iterators access the contents of the collection directly during traversal and use a modCount variable during traversal. If the contents of the collection change during traversal, the value of modCount is changed. Whenever the iterator iterates over the next element using hashNext () /next (), it checks whether the modCount variable is the expectedmodCount value and returns the traversal if it is; Otherwise, an exception is thrown and the traversal is terminated.

The solution

  • In traversal, add synchronized wherever it is worth changing modCount.
  • Use thread-safe collections

hashmap

Hashmap recommends another article of mine:

A hashmap project

Here’s a quick addition to the consistent Hash algorithm

Consistent Hash:

Client sharding: hash + clockwise (optimized remainder) Node scaling: only adjacent nodes are affected, but there is still data migration scaling: minimum migration data and load balancing are guaranteedCopy the code

Consistent Hash is a good solution to stability problems. You can arrange all storage nodes on the Hash ring that is connected to the end. After calculating the Hash, each key will find the first group of storage nodes clockwise and store them. When a node joins or exits, only the subsequent nodes clockwise from the node on the Hash ring can receive or give data from the node. However, this has the problem of uniformity. Even if storage nodes can be evenly spaced, data will be uneven when the number of storage nodes changes. This kind of non-uniformity, which may be multiplied, is unacceptable in practical engineering.

【 To be continued 】

ConcurrentHashMap and Hashtable?

ConcurrentHashMap combines the advantages of HashMap and Hashtable. HashMap does not consider synchronization, Hashtable does. But the Hashtable locks the entire structure each time a synchronization is executed.

ConcurrentHashMap Locks the hash table into 16 buckets (the default value). Common operations such as GET, PUT, and remove lock only the buckets that are currently used.

Implementation of ConcurrentHashMap (segmented locking)

This class contains two static inner classes, MapEntry, which encapsulates the key-value pairs of the mapping table, and Segment, which acts as a lock.

segment

A Segment is a ReentrantLock. Each Segment guards an element in a HashEntry array. When modifying a HashEntry array, you must first obtain the corresponding Segment lock.

The thread-safe approach to ConcurrentHashMap prior to JDK1.7 was relatively simple:

  • It internally divides data into several “segments”, the number of which is related to the concurrency level, specifically “greater than or equal to the lowest power of 2 of the concurrency level”.
  • Use a separate ReentrantLock for each segment.
  • If the operation involves different segments, it can be executed concurrently. If it is the same segment, lock contention and wait are performed.
  • This design is more efficient than synchronized.

However, after JDK8, ConcurrentHashMap abandoned ReentrantLock in favor of synchronized. The reasons are as follows:

  • Adding multiple segmented locks wastes memory space.
  • In the production environment, the probability of a map competing for the same lock is very small. In fact, pieced locking causes a long wait for update operations.
  • To improve the efficiency of GC

The synchronized keyword +CAS operation in the new ConcurrentHashMap ensures thread safety.

What are the features of TreeMap?

The underlying TreeMap is implemented using red-black trees, and the key-value pairs stored in TreeMap are sorted by key.

  • If the Key is stored as a string, it will be sorted in the dictionary default order
  • If a custom reference type, such as User, is passed in, the object must implement the Comparable interface and override its compareTo method; Or when creating a TreeMap, we must specify the comparator to use.

As follows:

// Method 1: Class User implements Comparable{@override public int compareTo(Object O) {return 0; class User implements Comparable{@override public int compareTo(Object O) {return 0; }} public static void main(String[] args) { When you create TreeMap, New TreeMap<User, Integer>(new Comparator<User>() {@override public int compare(User o1, User o2) {return 0; }}); }Copy the code

What is the difference between the Comparable and Comparator interfaces?

  • The Comparable implementation is relatively simple, but when the comparison rules need to be redefined, the source code must be modified, namely the compareTo method in the User class
  • The Comparator interface does not need to modify the source code. It simply repasses a Comparator with the specified rule when creating a TreeMap.

Companion: The difference between Comparable and comparator?

Comparable comes from the java.lang package, which has a compareTo(Object obj) method to sort the Comparator interface from the java.util package, It has a compare(Object obj1 Object obj2) method for sorting

What are the differences between ArrayList and LinkedList?

  • ArrayList uses a dynamic array implementation underneath, which is essentially a dynamic array
  • LinkedList uses a bidirectional LinkedList implementation underneath and can be used as a stack, queue, or double-ended queue
  • ArrayList is more efficient at random access than LinkedList
  • LinkedList is more efficient than ArrayList in adding and deleting nodes
  • The ArrayList must reserve certain space. If the space is insufficient, the ArrayList will be expanded
  • The cost of LinkedList is that it must store information about the node as well as pointer information about the node

There is also a collection Vector, a thread-safe ArrayList, which is deprecated and not recommended. In a multithreaded environment, we can use CopyOnWriteArrayList instead of ArrayList to ensure thread safety.

What are the differences between HashSet and TreeSet?

  • The underlying HashSet uses a Hash table implementation.
    • Element uniqueness principle: Determine whether elements’ hashCode values are the same. If they are the same, then we’re going to check whether or not the equals method of the element is true
  • The underlying TreeSet is implemented using red-black trees.
    • Ensure that elements are unique through the Comparable or Comparator interface

HashSet and HashMap

In fact, the underlying implementation of a HashSet is still a HashMap, but it uses only the keys in it, as shown below:

  • Key = e, value=PRESENT is used to construct key-value pairs. When e exists in the key of a HashMap, the value will overwrite the original value, but the key will remain unchanged. So if you add an existing e element to a HashSet, the new element is not saved to the HashMap, so this satisfies the property that elements in a HashSet are not duplicated.
  • The contains method of a HashSet uses the containsKey method of a HashMap

How about LinkedHashMap and LinkedHashSet?

The internal Entry of the LinkedHashMap is inherited from hashMap. Node. Both classes implement map. Entry<K,V> The Entry of the LinkedHashMap has not only value, next, but also before and after attributes. Constructor public LinkedHashMap(int initialCapacity,float loadFactor, Boolean accessOrder) AccessOrder true implements the LRU cache algorithm (accessOrder). The underlying LinkedHashSet is implemented using LinkedHashMap, which is similar to the relationship between HashMap and HashSet.

What is the LRU algorithm? How does LinkedHashMap implement the LRU algorithm?

LRU (Least recently used) algorithm filters out data based on historical access records. Its core idea is that “if data has been accessed recently, it has a higher chance of being accessed in the future”. Ideas as follows

  1. Insert new data into the linked list header;
  2. Whenever a cache hit (that is, cached data is accessed), the data is moved to the head of the linked list;
  3. When the list is full, discard the data at the end of the list.

About [Hit Ratio]

The efficiency of LRU is good when there is hot data, but occasional and periodic batch operations will lead to a sharp drop in LRU hit ratio and serious cache contamination.

import java.util.LinkedHashMap; import java.util.Map; public class LRUTest { private static int size = 5; Public static void main(String[] args) {Map<String, String> Map = new LinkedHashMap<String, String>(size, 0.75f, true) { @Override protected boolean removeEldestEntry(Map.Entry<String, String> eldest) { return size() > size; }}; map.put("1", "1"); map.put("2", "2"); map.put("3", "3"); map.put("4", "4"); map.put("5", "5"); System.out.println(map.toString()); map.put("6", "6"); System.out.println(map.toString()); map.get("3"); System.out.println(map.toString()); map.put("7", "7"); System.out.println(map.toString()); map.get("5"); System.out.println(map.toString()); }}Copy the code

What’s the difference between a List and a Set?

  • The List is ordered and the elements can be repeated
  • Sets are unordered (except linkedhashsets), and elements cannot be repeated

Note: The order and disorder here refers to the order and take out the order to keep the same

What is the difference between Iterator and ListIterator?

  • Iterator can iterate over lists and sets; ListIterator can only be used to iterate over list collections
  • Iterator can only traverse collections forward; ListIterator can traverse collections forward and backward
  • ListIterator actually implements the former and adds some new features.

Iterato reference code:

ArrayList<String> list = new ArrayList<>(); list.add("zhangsan"); list.add("lisi"); list.add("yangwenqiang"); Iterator<String> Iterator = list.iterator(); while(iterator.hasNext()){ System.out.println(iterator.next()); }Copy the code

What is the relationship between Collections and Collections?

The relationship between Collection and Collections: Collection is a top-level Collection interface whose sub-interfaces include List and Set. Collections is a collection utility class that can manipulate Collections, such as sorting, binary lookup, copying Collections, finding Max and min values, and so on. All in all: the “S” are mostly tools.

Said in the last

Attach the relationship between the collection interface and the implementation class

If you are interested, you can cooperate with my Java set interview questions.