So far, we’ve seen two types of key-value data structures in the Java world: Hash and TreeMap, both of which have their own advantages and disadvantages.

  1. Hash table: O(1) is the fastest to insert and search. If the use of linked list implementation can be achieved without locking; Data ordering requires explicit sorting operations.
  2. Red-black tree: The insertion and search are O(logn), but the constant term is smaller; The complexity of lock-free implementation is very high, and locks are generally required. Data is naturally ordered. This time, however, I introduce a third data structure that implements key-value: SkipList. SkipList is no less efficient than red-black trees, but its principles and implementation complexity are much simpler.

SkipList

What is SkipList? Skip List, known as Skip List, is a data structure that can replace the balanced tree, and its data elements are naturally ordered in ascending order according to the key value by default. The Skip list let sorted data distribution in multilayer list, 1-0 random number decided to climb a data or not, through a “space for time,” algorithm, increased the pointer forward in each node, can be ignored when insert, delete, find some can’t be involved nodes, thus improve the efficiency.

Let’s start with a simple linked list, as follows:

If we need to query 9, 21, 30, we need to compare The Times of 3 + 6 + 8 = 17, then is there an optimization plan? There are! We have distilled some of the elements in this list as a comparison “index”, as follows:

We first compare these indexes to determine whether the next element will go right or down, and because of the “index”, the number of comparisons during retrieval is greatly reduced. Of course, there are not many elements, which is difficult to advantage, but when there are enough elements, this kind of index structure will come into play.

The characteristics of SkipList

SkipList has the following features:

  1. Composed of many layers, the level is randomly generated by a certain probability
  2. Each layer is an ordered linked list, ascending by default, or sorted by the Comparator provided when the map was created, depending on the constructor used
  3. The lowest linked list (Level 1) contains all elements
  4. If an element appears in a Level I list, it also appears in all lists below Level I
  5. Each node contains two Pointers, one to the next element in the same list and one to the element at the next level down

With a few more extensions to the figure above, we can turn it into a typical SkipList structure

SkipList lookup

SkipListd’s search algorithm is relatively simple. For the above example, we want to find element 21, which goes like this:

  1. If I compare 3, if I’m bigger than 3, I’m going to look at (9),
  2. If it’s larger than 9, you go further (25), but if it’s smaller than 25, you go from 9 to 16.
  3. If the node after 16 is still 25, continue to look at the layer below 16
  4. Find the 21st

The red dotted line represents the path.

SkipList insertion

SkipList inserts include:

  1. Find the right location. One thing to be clear here is that when confirming the level K to be occupied by the new node, flip a coin, completely randomly. If the occupied level K is greater than the linked list level, apply for a new level, otherwise insert the specified level
  2. Applying for a New Node
  3. Adjust the pointer

Let’s say the element we want to insert is 23, and a lookup confirms that it is before 25, after 9, 16, and 21. Of course you need to consider the level OF application K.

If level K is greater than 3

Need to apply for new Level (Level 4)

If the level K is equal to 2

Insert directly at Level 2

Here will be involved in an algorithm: by dropping a coin to determine the level of K, the algorithm we through the ConcurrentSkipListMap source code to analyze. Another important point to note is that after inserting elements at layer K, you need to ensure that all layers smaller than layer K should have new nodes.

SkipList delete

The idea of deleting a node is basically the same as that of inserting a node: find the node, delete the node, adjust the pointer.

For example, delete node 9 as follows:

ConcurrentSkipListMap

SkipList’s space-for-time algorithm, O(logn) efficiency of inserts and lookups is no less efficient than red-black trees, but the principle and implementation complexity are much simpler than red-black trees. In general, if you can manipulate linked lists, there is no pressure on SkipList.

ConcurrentSkipListMap is internally implemented using the SkipLis data structure. To implement SkipList, ConcurrentSkipListMap provides three inner classes to build such a linked list structure: Node, Index, and HeadIndex. Node represents the ordered Node of the lowest single linked list, Index represents the Index layer based on Node, and HeadIndex is used to maintain the Index layer. ConcurrentSkipListMap maintains a hierarchy of indexes through HeadIndex, looking from the top to the bottom through Index, narrowing down the query step by step, and by the time you get to the bottom Node, you only need to compare a small amount of data. The relationship in the JDK looks like this:

Node

static final class Node<K.V> {
    final K key;
    volatile Object value;
    volatile ConcurrentSkipListMap.Node<K, V> next;

    /** omit some code */
}
Copy the code

Nodes are structured like a single linked list, with key-value and a next pointing to the next Node.

Index

static class Index<K.V> {
    final ConcurrentSkipListMap.Node<K,V> node;
    final ConcurrentSkipListMap.Index<K,V> down;
    volatile ConcurrentSkipListMap.Index<K,V> right;

    /** omit some code */
}
Copy the code

Index provides a Node based Index Node, one pointing to the next Index right and one pointing to the lower down Node.

HeadIndex

static final class HeadIndex<K.V> extends Index<K.V> {
    final int level;  // Index layer, starting at 1, Node single chain layer is 0
    HeadIndex(Node<K,V> node, Index<K,V> down, Index<K,V> right, int level) {
        super(node, down, right);
        this.level = level; }}Copy the code

HeadIndex has a level inside to define the hierarchy.

ConcurrentSkipListMap provides four constructors, each of which calls the initialize() method.

final void initialize(a) {
    keySet = null;
    entrySet = null;
    values = null;
    descendingMap = null;
    randomSeed = seedGenerator.nextInt() | 0x0100; // ensure nonzero
    head = new ConcurrentSkipListMap.HeadIndex<K,V>(new ConcurrentSkipListMap.Node<K,V>(null, BASE_HEADER, null),
            null.null.1);
}
Copy the code

Note that the initialize() method is not only called in constructors; it is called in clone, clear, and readObject for initialization steps. Note the initialization of randomSeed.

private transient int randomSeed;
randomSeed = seedGenerator.nextInt() | 0x0100; // ensure nonzero
Copy the code

RandomSeed A simple random number generator (described below).

The put operation

CoucurrentSkipListMap provides the PUT () method to associate the specified value with the specified key in this map. The source code is as follows:

public V put(K key, V value) {
    if (value == null)
        throw new NullPointerException();
    return doPut(key, value, false);
}
Copy the code

If value is null, throw a NullPointerException, otherwise call doPut. If value is null, throw a NullPointerException. You then do the actual operation by calling the do**() method.

The doPut() method is quite informative, so let’s analyze it step by step.

private V doPut(K key, V value, boolean onlyIfAbsent) {
    Node<K,V> z;             // added node
    if (key == null)
        throw new NullPointerException();
    / / the comparator
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        for(Node<K, V> b = findPredecessor(key, cmp), n = b.next; ;) {/** omit code */
Copy the code

The doPut() method takes three arguments, including key and value, and a Boolean onlyIfAbsent. This parameter determines what to do if the current key is present. If onlyIfAbsent is false, replace value; if true, return value. Interpreted in code as:

 if(! map.containsKey(key))return map.put(key, value);
else
     return map.get(key);
Copy the code

ConcurrentSkipList does not support null keys or values. ConcurrentSkipList does not support null keys or values. The findFossil () method is then called, passing in the key to confirm the location. The findPredecessor() method is simply a confirmation of the location where the key is to be inserted.

private Node<K,V> findPredecessor(Object key, Comparator<? super K> cmp) {
     if (key == null)
         throw new NullPointerException(); // don't postpone errors
     for (;;) {
         // Start with head, head is the highest level headIndex
         for (Index<K,V> q = head, r = q.right, d;;) {

             // r ! = null: there are other nodes on the right of the node to be compared
             if(r ! =null) {
                 Node<K,V> n = r.node;
                 K k = n.key;
                 // value == null, indicating that the node has been deleted
                 // Filter this node through unlink()
                 if (n.value == null) {
                     // delete the r node
                     if(! q.unlink(r))break;           // restart
                     r = q.right;         // reread r
                     continue;
                 }

                 // value ! = null, the node exists
                 // If the key is greater than the key of r, proceed further
                 if (cpr(cmp, key, k) > 0) {
                     q = r;
                     r = r.right;
                     continue; }}If dowm == null, the pointer has reached the lowest level
             if ((d = q.down) == null)
                 returnq.node; q = d; r = d.right; }}}Copy the code

The findPredecessor() method is clear: Find predecessors. Start at the top headIndex and go right until right is null or the key of the Node on the right is greater than the current key. Then go down and repeat the process until down is null. So the insertion location should be the lowest Node list.

ConcurrentSkipListMap gives the method an additional function by determining whether the value of a node is null, which indicates that the node has been deleted, and by calling unlink().

final boolean unlink(Index<K,V> succ) {
    returnnode.value ! =null && casRight(succ, succ.right);
}
Copy the code

Deleting a node is as simple as changing the right pointer.

What do you do after finding predecessor nodes with findPredecessor()? See the following:

for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
       // Next! = null
       if(n ! =null) {
           Object v; int c;
           Node<K,V> f = n.next;

           // The main reason for inconsistent read is concurrency
           if(n ! = b.next)// inconsistent read
               break;

           // n.value == null, the node has been deleted
           if ((v = n.value) == null) {   // n is deleted
               n.helpDelete(b, f);
               break;
           }

           // Node B has been deleted
           if (b.value == null || v == n) // b is deleted
               break;

           // The node is greater than, move forward
           if ((c = cpr(cmp, key, n.key)) > 0) {
               b = n;
               n = f;
               continue;
           }

           // c == 0 indicates that a node with the same key is found. Determine the node based on the onlyIfAbsent parameter
           // onlyIfAbsent ==false, replace value with casValue
           // onlyIfAbsent == true, return the value
           if (c == 0) {
               if (onlyIfAbsent || n.casValue(v, value)) {
                   @SuppressWarnings("unchecked") V vv = (V)v;
                   return vv;
               }
               break; // restart if lost race to replace value
           }
           // else c < 0; fall through
       }

       // Wrap key-value as a node and insert
       z = new Node<K,V>(key, value, n);
       if(! b.casNext(n, z))break;         // restart if lost race to append to b
       break outer;
   }
Copy the code

Once the appropriate location is found, the node is inserted at that location. Inserting a Node is a simple process: wrap key-value as a Node and add it to the list using casNext(). Of course, there is a lot of checking before insertion.

After inserting nodes at the lowest level, what is the next step? Create an index. As mentioned by the previous blog, when inserting nodes, a coin flip is used to determine the level of inserting new nodes. Due to the possibility of concurrency, ConcurrentSkipListMap uses ThreadLocalRandom to generate random numbers. As follows:

int rnd = ThreadLocalRandom.nextSecondarySeed();
Copy the code

The idea of flipping a coin to determine a level is very simple. By flipping a coin, level + 1 if the coin is heads, otherwise stop, as follows:

// Toss a coin to determine the level
while (((rnd >>>= 1) & 1) != 0)
    ++level;
Copy the code

If the SkipList level is greater than the maximum level, a new level is added. If the SkipList level is greater than the maximum level, a new level is added. If the SkipList level is lower than the maximum level, a new level is added.

level <= headIndex.level

// If the level determined is lower than the highest level head.level, the highest level index is generated directly
// Since each level of down needs to be identified, it needs to be generated from the lowest level up
if (level <= (max = h.level)) {
    for (int i = 1; i <= level; ++i)
        idx = new ConcurrentSkipListMap.Index<K,V>(z, idx, null);
}
Copy the code

Starting from the bottom layer, each layer smaller than level initializes an index. Each node points to the newly added node, down points to the item of the next layer, and next is null to the right. The whole process is simple: initialize an index for each level less than level and add it to the original index chain.

level > headIndex.level

// leve > head.level adds a layer
 else { // try to grow by one level
     // Add a new layer
     level = max + 1;

     // Initialize the level item object
     @SuppressWarnings("unchecked")
     ConcurrentSkipListMap.Index<K,V>[] idxs =
             (ConcurrentSkipListMap.Index<K,V>[])newConcurrentSkipListMap.Index<? ,? >[level+1];
     for (int i = 1; i <= level; ++i)
         idxs[i] = idx = new ConcurrentSkipListMap.Index<K,V>(z, idx, null);

     //
     for (;;) {
         h = head;
         int oldLevel = h.level;
         // The hierarchy has expanded and needs to be restarted (with new thread nodes added)
         if (level <= oldLevel) // lost race to add level
             break;
         // New header HeadIndex
         ConcurrentSkipListMap.HeadIndex<K,V> newh = h;
         ConcurrentSkipListMap.Node<K,V> oldbase = h.node;
         // Generate a new HeadIndex node that points to the new level
         for (int j = oldLevel+1; j <= level; ++j)
             newh = new ConcurrentSkipListMap.HeadIndex<K,V>(oldbase, newh, idxs[j], j);

         // HeadIndex CAS replacement
         if (casHead(h, newh)) {
             h = newh;
             idx = idxs[level = oldLevel];
             break; }}Copy the code

If the level determined by a coin toss is greater than the maximum level, a new level needs to be added. The processing logic is as follows:

  1. Initialize an array of corresponding indexes with the size of level + 1, and create an index for each unit: Node = new Z, down = next-level index, and right = null
  2. Expansion is performed through the for loop. Add a HeadIndex from the highest level, where parameters: Node, down are the highest level Node and HeadIndex, right is the index of the created level, and level is the level of the corresponding level. Finally, CAS replaces the current head with the head of the newly added layer. We have found the previous node, inserted node, determined the hierarchy and generated the corresponding Index, but did not insert the Index into the corresponding layer, so the following code is to insert the Index into the corresponding layer.
// Start at the level of the insertion
  splice: for (int insertionLevel = level;;) {
      int j = h.level;
      // Start with headIndex
      for (ConcurrentSkipListMap.Index<K,V> q = h, r = q.right, t = idx;;) {
          if (q == null || t == null)
              break splice;

          // r ! = null; This is to find the insertion node position of the corresponding level, note that this is only found horizontally
          if(r ! =null) {
              ConcurrentSkipListMap.Node<K,V> n = r.node;

              int c = cpr(cmp, key, n.key);

              // n.value == null, r is moved right
              if (n.value == null) {
                  if(! q.unlink(r))break;
                  r = q.right;
                  continue;
              }

              // key > n.key
              if (c > 0) {
                  q = r;
                  r = r.right;
                  continue; }}// Insert the node where it is located
          // The current layer is the top layer
          if (j == insertionLevel) {
              // Establish a connection
              if(! q.link(r, t))break; // restart
              if (t.node.value == null) {
                  findNode(key);
                  break splice;
              }
              // The insertion layer of the flag --, if == 0, indicating that the end of the insertion, exit the loop
              if (--insertionLevel == 0)
                  break splice;
          }

          // Insert the next node
          if(--j >= insertionLevel && j < level) t = t.down; q = q.down; r = q.right; }}Copy the code

The code is viewed in two parts: one is to find where the node is inserted at the corresponding level, and the second is to insert at that location and then move down.

At this point, the ConcurrentSkipListMap PUT operation ends. A bit too much code, so here’s a summary:

  1. The predecessor Node is first found by the findPredecessor() method
  2. Create a Node based on the returned elder Node and key-value, and set next through CAS
  3. Set Node Node and index Node. The level is determined by flipping a coin. If the level is larger than the existing maximum level, a new level is added, and then a new Item list is created.
  4. Finally, insert the newly created Item list into the SkipList structure.

Get operation

The get operation is much simpler than the PUT operation, and is really just the first step of the PUT operation:

private V doGet(Object key) {
      if (key == null)
          throw new NullPointerException();
      Comparator<? super K> cmp = comparator;
      outer: for (;;) {
          for (ConcurrentSkipListMap.Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
              Object v; int c;
              if (n == null)
                  break outer;
              ConcurrentSkipListMap.Node<K,V> f = n.next;
              if(n ! = b.next)// inconsistent read
                  break;
              if ((v = n.value) == null) {    // n is deleted
                  n.helpDelete(b, f);
                  break;
              }
              if (b.value == null || v == n)  // b is deleted
                  break;
              if ((c = cpr(cmp, key, n.key)) == 0) {
                  @SuppressWarnings("unchecked") V vv = (V)v;
                  return vv;
              }
              if (c < 0)
                  breakouter; b = n; n = f; }}return null;
  }
Copy the code

Similar to the first step of put operation, the findPredecessor() method is first called to find the predecessor node, and then the predecessor node is searched along the right. Meanwhile, the node whose value is NULL is deleted is also assumed the responsibility in this process.

The remove operation

The remove operation deletes the specified key node as follows:

public V remove(Object key) {
    return doRemove(key, null);
}
Copy the code

Call doRemove() directly, where remove has two arguments, one is key and the other is value, so doRemove provides both remove key and key-value.

final V doRemove(Object key, Object value) {
       if (key == null)
           throw new NullPointerException();
       Comparator<? super K> cmp = comparator;
       outer: for (;;) {
           for (ConcurrentSkipListMap.Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
               Object v; int c;
               if (n == null)
                   break outer;
               ConcurrentSkipListMap.Node<K,V> f = n.next;

               // Inconsistent reading, restart
               if(n ! = b.next)// inconsistent read
                   break;

               // Node n has been deleted
               if ((v = n.value) == null) {        // n is deleted
                   n.helpDelete(b, f);
                   break;
               }

               // Node B has been deleted
               if (b.value == null || v == n)      // b is deleted
                   break;

               if ((c = cpr(cmp, key, n.key)) < 0)
                   break outer;

               / / moves to the right
               if (c > 0) {
                   b = n;
                   n = f;
                   continue;
               }

               /* * Find the node */

               // value ! = null Indicates that key-value verification is required at the same time
               if(value ! =null && !value.equals(v))
                   break outer;

               // CAS replaces value
               if(! n.casValue(v,null))
                   break;
               if(! n.appendMarker(f) || ! b.casNext(n, f)) findNode(key);// retry via findNode
               else {
                   // Clear the node
                   findPredecessor(key, cmp);      // clean index

                   // head.right == null indicates that there are no nodes in this layer, delete this layer
                   if (head.right == null)
                       tryReduceLevel();
               }
               @SuppressWarnings("unchecked") V vv = (V)v;
               returnvv; }}return null;
   }
Copy the code

FindPredecessor () was called to find the predecessor node, and then moved to the right, and then compared. After finding it, CAS was used to replace the value with NULL, and then determine whether the node was the only index in this layer. If so, the tryReduceLevel() method was called to remove this layer and complete deletion.

The remove method only sets the value of the Node to null, but does not actually delete the Node. In fact, from the above operation, we can see that they will determine whether the value of the Node is null when looking for the Node. Call unLink() to cancel the association, as follows:

if (n.value == null) {
    if(! q.unlink(r))break;           // restart
    r = q.right;         // reread r
    continue;
}
Copy the code

The size operation

ConcurrentSkipListMap’s size() operation, unlike ConcurrentHashMap, does not maintain a global variable to count the number of elements, so it needs to be iterated every time it is called.

public int size(a) {
    long count = 0;
    for(Node<K,V> n = findFirst(); n ! =null; n = n.next) {
        if(n.getValidValue() ! =null)
            ++count;
    }
    return (count >= Integer.MAX_VALUE) ? Integer.MAX_VALUE : (int) count;
}
Copy the code

Call the findFirst() method to find the first Node and use Node next to count. Statistics are returned, at most integer.max_value. Note that this is safe for concurrent threads.