Performance optimization: Why use SparseArray and ArrayMap instead of HashMap?

background

Performance optimization is a very important module in Android development, where the performance optimization of data structures is quite important. For commonly used HashMap, SparseArray and ArrayMap are officially recommended to replace it.

Java defines an interface java.util.map for mapping in data structures. This interface mainly has four commonly used implementation classes, namely HashMap, Hashtable, LinkedHashMap and TreeMap. The class inheritance relationship is shown in the figure:

First, let’s take a look at HashMap, see its pros and cons, and then compare other data structures and why it should be replaced.

HashMap

A HashMap is made up of an array plus a one-way list, with an initial size of 16 (2 ^ 4),When you put it for the first time, you actually initialize it.

When the list length is greater than 8, it becomes a red-black tree, and when the list length is less than 6, it becomes a linked list. One of the most important fields in the HashMap class is Node[] table, which is an array of hash buckets

static class Node<K.V> implements Map.Entry<K.V> {
        final int hash;    // Used to locate the array index position
        final K key;
        V value;
        Node<K,V> next;   // Next node in the list

        Node(int hash, K key, V value, Node<K,V> next) { ... }
        public final K getKey(a){... }public final V getValue(a) {... }public final String toString(a) {... }public final int hashCode(a) {... }public final V setValue(V newValue) {... }public final boolean equals(Object o) {... }Copy the code

Why red black trees?

In JDK 1.8, the search time of a HashMap is O(logN). The search time of a HashMap is O(logN). The search time is O(logN). The red-black tree is a self-balancing binary search trees, is not an absolute balanced binary tree, it gave up the pursuit of absolute balance, pursuit is roughly balanced, near the time complexity of balanced binary tree and small cases, ensure each insert at most, only need three rotation can achieve balance, to implement more simple also. Thus obtaining higher search performance.
Why is it that the initial value is 2 to the NTH power, and the result of each subsequent expansion is also 2 to the NTH power?

The initial value of a HashMap is 16, which is 2 to the fourth power, and each subsequent expansion is twice as large. Why is that? I think there are several points:
- If we want to store data in a Hashmap, we first need to make sure that it is distributed as evenly as possible. To make sure that it is distributed evenly, we might want to use the modulo method. If the traditional ‘%’ method is not efficient, when the size (length) is always 2 ^ N, H &(Length-1) operation is equivalent to modulo length, that is, h%lenth, but & is more efficient than % and reduces hash collisions.
- And h&(Length-1) related, when the capacity size (n) is 2 to the power of n, the last few bits of n-1 binary are all 1, in the case of random number h, and (n-1) and operation, the distribution will be more uniform, think about it, if the binary number of n-1 is 1110, when the mantissa is 0, and out of the mantissa value is always 0, Then 0001,1001,1101 and other positions with mantissa 1 can never be occupied by entry, resulting in space waste.
In the hashmap source, the put method calls the indexFor method, which finds the entry’s position in the table based on the hash value of the key, and returns h&(Length-1).

It should also be noted that all numbers in the program are stored in binary form in computer memory. Bitwise operations operate directly on the binary bits of an integer in memory. In Java, an operator has a lot of, for example and (&), not (~), or (|), or (^), shift (< < and > >), such as Java operations will eventually be converted into a operations.
Associations between HashMap and other data structures
- The relationship with Hashtable is thread-safe. Many hashtables, such as PUT and get, use synchronized modification. Compared with ConcurrentHashMap, the efficiency of Hashtable decreases dramatically when the size of Hashtable increases to a certain level. Because iteration is locked for a long time (whereas ConcurrentHashMap introduces splitting, locking only a portion of the map), it is not very efficient. Look again at the put method of Hashtable
```
public synchronized V put(K key, V value) {
   // Make sure the value is not null
   if (value == null) {
       throw new NullPointerException();
  }

   // Makes sure the key is not already in the hashtable.HashtableEntry<? ,? > tab[] = table;int hash = key.hashCode();
   int index = (hash & 0x7FFFFFFF) % tab.length;
   @SuppressWarnings("unchecked")
   HashtableEntry<K,V> entry = (HashtableEntry<K,V>)tab[index];
   for(; entry ! =null ; entry = entry.next) {
       if ((entry.hash == hash) && entry.key.equals(key)) {
           V old = entry.value;
           entry.value = value;
           return old;
      }
  }

   addEntry(hash, key, value, index);
   return null;
}
Copy the code
```
Hashtable HashMap Hashtable HashMap Hashtable
- In relation to HashSet, let’s first look at the Add method of HashSet
```
public boolean add(E e) {
   return map.put(e, PRESENT)==null;
}
Copy the code
```
Here the map is a HashMap,e is the element put in, and value has a uniform value:

private static final Object PRESENT = new Object();

A HashSet is a limited HashMap. If you know HashMap, you know HashSet. Although the Set does not place duplicate elements, the underlying Map simply replaces the original values. A HashSet does not provide a get method because, like a HashMap, the Set is unordered inside and can only be retrieved iteratively.
- LinkedHashMap is an array + bidirectional linked list. Unlike HashMap, LinkedHashMap can guarantee the iteration order of elements, which can be either insertion order or access order (LRU principle).
- Relationship to LinkedHashSet, which is inherited from HashSet and the underlying implementation is LinkedHashMap.
- The elements in TreeSet are sorted (not in insertion order, but by keyword size) and cannot be repeated.
HashMap changes in Java7 and Java8

For example, whether to rehash and introduce red-black trees during expansion

Does a list longer than 8 always turn into a red-black tree?

If the list length is greater than 8, it will be converted to a red black tree

private class Key implements Comparable<Key> {

   private final int value;

   Key(int value) {
       this.value = value;
  }

   @Override
   public int compareTo(Key o) {
       return Integer.compare(this.value, o.value);
  }

   @Override
   public boolean equals(Object o) {
       if (this == o) return true;
       if (o == null|| getClass() ! = o.getClass())return false;
       Key key = (Key) o;
       return value == key.value;
  }

   @Override
   public int hashCode(a) {
       return 1; }}Copy the code

As you can see, I overwrote the hashCode method. All keys have hashcode of 1. Here is my test method

public void test(a){
   Map<Key, String> map = new HashMap<>();
   map.put(new Key(1), "13");
   map.put(new Key(2), "23");
   map.put(new Key(3), "33");
   map.put(new Key(4), "43");
   map.put(new Key(5), "53");
   map.put(new Key(6), "63");
   map.put(new Key(7), "73");
   map.put(new Key(8), "83");
   map.put(new Key(9), "93");
   map.put(new Key(10),"103");
   map.put(new Key(11),"104");
   map.put(new Key(12),"123");
}
Copy the code

The first eight numbers are all on the same list, and then more than eight become red-black trees. There’s a key parameter here

static final int MIN_TREEIFY_CAPACITY = 64

Then we look at the linked list into red black tree source, can be drawn

Even list the length more than eight, enter this into the red-black tree method, look at line 756, or to determine the size of the current capacity is less than 64, if less than this value, or rather to expansion into the red-black tree, then you should be able to see the above that the result of the test () method, after the put value 9, expansion is 32, After the 10th value is put, it expands to 64, by which time the list is already over 8 (because the hashcode method is dead), and it doesn’t start converting to a red-black tree until the 11th value is put. Below is a screenshot of me debugging this method for your reference

HashMap concurrency issues

HashMap is not thread-safe, and in Java7, HashMap manipulation by multiple threads can result in a circular linked list, causing an infinite loop. This can be done with ConcurrentHashMap, which in Java7 is implemented as a piecewise lock, or second-level hash table. The implementation in Java8 is efficient concurrency via Unsafe classes such as CAS spin assignment +synchronized +LockSupport blocking, with less readable code. The biggest difference is that JDK8 has a finer lock granularity and ideally the talbe array element size is the maximum number of concurrent elements it can support.
The disadvantage of HashMap is that the expansion factor is 0.75 (the data paper indicates that 0.6~0.75 is the best performance), and the default expansion is twice for each expansion, which is a bit of a waste of space (if only more than one). In fact, it is space for time.

SparseArray

In SparseArray, the Key is an int (avoiding boxing and unboxing), and the Value is an Object. All the keys and values correspond to an array respectively. The iInt values of the Key array are arranged in order. And the position of the Value array is the same as the position of the Key array.

Add will shift, remove will not necessarily shift, mark a value as DELETE, if the next time there is a matching value put directly at that location, there will be no shift. However, there is a disadvantage: Key can only be an int. Finally, SparseLongArray has been extended to Android.

ArrayMap

Like HashMap, keys and values of An ArrayMap can store multiple types. The data storage format of an ArrayMap object is as follows

MHashes is an array of hashcode values of all keys, sorted from smallest to largest. The binary search method is used to find the key from the mHashes array whose value is equal to the hash
MArray is an array of key-value pairs, twice the size of mHashes.

ArrayMap has two very important static member variables mBaseCache and mTwiceBaseCacheSize, which are used for the global caching function of the process where ArrayMap resides:

MBaseCache: an ArrayMap with a cache size of 4. MBaseCacheSize records the number of cached objects. If the number exceeds 10, it will not be cached.
MTwiceBaseCacheSize: An ArrayMap with a cache size of 8. MTwiceBaseCacheSize records the number of cached objects. If the number exceeds 10, it will not be cached.

Many scenarios may start with very little data. To reduce frequent Map creation and collection, ArrayMap uses two cache queues of size 10 to hold Map objects of size 4 and 8, respectively. In order to save memory, there are more conservative memory expansion and memory shrinkage strategies.

FreeArrays () trigger time:

When removeAt() is executed to remove the last element
Clear the situation when clear() is executed
When ensureCapacity() executes allocArrays first and then freeArrays when the current capacity is less than the expected capacity
When put() is full, allocArrays is executed first and then freeArrays is executed

AllocArrays trigger timing:

When ArrayMap’s constructor is executed
When removeAt() is performed, the capacity tightening mechanism is met
When ensureCapacity() executes allocArrays first and then freeArrays when the current capacity is less than the expected capacity
When put() is full, allocArrays is executed first and then freeArrays is executed

ArrayMap resolves Hash conflicts by appending. For example, if two keys have the same Hash (int) value, move all the data one bit later to resolve Hash conflicts by appending.

The expansion ArrayMap

When mSize is greater than or equal to the mHashes array length, the capacity is expanded. After expansion, the old array needs to be copied to the newly allocated array and the old memory needs to be released.

If the number of maps meets the condition osize<4, the size after expansion is 4.
If the number of maps meets the condition 4<= Osize < 8, the size after expansion is 8.
If the number of maps meets the condition osize>=8, the map size after expansion is 1.5 times of the original one.

If the size of the array memory is greater than 8 and the number of stored data (mSize) is less than one third of the array space, you need to tighten the data capacity and allocate new arrays. The old memory is automatically reclaimed by the VM.

If mSize<=8, set the new size to 8;
If mSize> 8, set the new size to 1.5 times mSize.

That is, in the case of large data, the memory array is tightened by 50% when the memory usage is less than a third.

ArrayMap is quoted from this blog.

LinkedHashMap

LinkedHashMap is made up of HashMap+LinkedList, where the LinkedList is used to store the data order. The LinkedHashMap can be sorted by insertion/access order, which can be determined when the constructor is invoked. LinkedHashMap inherits from HashMap and overrides the PUT and get methods.

The data storage of LinkedHashMap is the same as the structure of HashMap in the form of (array + one-way linked list), but before and after variables used to maintain the order are added in each node Entry to maintain a two-way linked list to store the storage order of LinkedHashMap. Instead of using HashMap iterators when calling iterators, you write your own iterators to traverse the bidirectional list.

The Entry is shown below

/** * HashMap.Node subclass for normal LinkedHashMap entries. */
static class LinkedHashMapEntry<K.V> extends HashMap.Node<K.V> {
    LinkedHashMapEntry<K,V> before, after;
    LinkedHashMapEntry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next); }}Copy the code

The internal logical comparison between HashMap and LinkedHashMap is shown in the figure

So if you maintain this bidirectional list when you put and when you get, the variables are sorted.

conclusion

This article summarizes some common data structures in Android development, SparseArray and ArrayMap these two data structures are to be considered first, in the case of saving memory performance and HashMap not too much, at the same time also analyzed the source code of HashMap and LinkedHashmap, Hope to help you, like a thumbs-up and attention!

Here are some of my articles:

Have you mastered the basic, intermediate, and advanced methods of Activity?

Do you understand Handler’s basic, intermediate, and advanced questions?

Reference article:

Juejin. Cn/post / 684490…

Gityuan.com/2019/01/13/…

Performance optimization: Why use SparseArray and ArrayMap instead of HashMap?

background

HashMap

SparseArray

ArrayMap

LinkedHashMap

conclusion

Related Posts

How threads are created in Java, how thread pools work.

An article to get you started on FFmpeg programming

Why use viewLifecycleOwner instead of this in your Fragment