Let’s look at a problem: if we want to store a set of ordered ints, how can we do that?

  1. An array of

Perhaps most students first think of data implementation, storing ordered data sets in the data, you can use dichotomy to search, relatively high efficiency, but it is not friendly to add and delete operations, because these operations need to move the elements behind the position.

  1. The list

Are there data structures that make queries, additions, and deletions faster? So that’s the skip list that we’re going to introduce today, so let’s look at a couple of pictures, it’s easy to understand.

Jump table

Skip list – Underlying data link

As shown in the figure above, for an ordered linked list, if you want to find a node with the value of 50, you need to traverse from the first node to find the node with the value of 50.

Let’s add an index to the list, as shown below:

Skip list – Level 1 index

If we query by the first level index (the orange query route), we can see that we can traverse at least half of the nodes.

A little slow? , then add another layer, and another layer, as shown in the picture:

Skip table – Secondary index
Skip table – Tertiary index

Is it faster to find the nodes we need? Of course, there are not enough nodes here. If there are too many nodes, the search efficiency will be improved more obviously.

If you need to find a node in the middle, say 42, the process might look something like this:

Skip table – Tertiary index – Find 42

Insert the node

If you understand the data structure of the skip list, it is easy to understand the node insertion operation. Basically, you can achieve two steps: insert data into the lowest linked list, and then adjust the index;

Each layer of index list whether to need to add new nodes, actually there is no standard answer, we can try to be the index of average distribution, commonly used is to decide whether to need to add or adjust the random judgment 】 【 index, when a new node insert, by probability algorithm to judge the node needs to be inserted into a few levels of nodes.

Such as:

  • The underlying data link list has N elements, randomly select N/2 elements as level 1 index, and randomly select N/4 elements as level 2 index… All the way to the top index;
  • If a new data node is inserted, there is a 1/2 probability that no level 1 index is inserted, a 1/4 probability that level 1 index needs to be inserted, and a 1/8 probability that level 2 index needs to be inserted, and so on.
  • Note that when you insert a level 2 index, you also insert a level 1 index; That is, when an index of level N is inserted, it must also be inserted at level 1 to (n-1).
Skip table – Tertiary index – Insert node

Remove nodes

Skip table node deletion is even easier, deleting data nodes and removing index nodes (if any) at each level.

To sum up:

  1. Skip lists can be thought of as multiple ordered linked lists (lowest data linked list + multi-level index linked list);

  2. In the process of search, search from the longest layer, find the corresponding interval and then search to the next layer;

  3. Each node has two Pointers to the right and below;

  4. When inserting a new node, determine randomly whether the node needs to be inserted into an index and the maximum number of index levels.

  5. When an index of level N is inserted, the index of level 1 to (n-1) must also be inserted.

  6. Skip lists are easier to implement than balanced trees such as red-black trees and do not need to maintain balance;

  7. One implementation of ZSet in Redis is a combination of skipList and hashTable;

  8. Google’s LevelDB and Facebook’s RocksDB both use skip tables as data structures.

Uncle will point code | article “original”


Please pay attention to the uncle who can code