From container selection to step-by-step algorithm design, this article builds up the idea of a "non-blocking thread-safe queue" from scratch. (There is a plot twist~)

Queue container design

If an array is used as the queue's container, it must be locked, because an array occupies contiguous memory, and in multi-threaded scenarios reads and writes of the same memory address must be mutually exclusive.

Linked structure

A linked structure does not have this worry. Each node of the list lives at a different memory address, so in multi-threaded scenarios taking the head node and inserting at the tail node do not collide. (At least, the probability of concurrency problems is reduced.)

A generic queue should hold any type of element.

We declare a generic node class:

// Node
private static class Node<E> {
    E item;       // the node's data field
    Node<E> next; // pointer to the next node
}

Volatile fields

Can multiple threads manipulate item and next fields on the same Node?

Of course they can! But for every thread to see the latest item and next, these fields must be "visible":

private static class Node<E> {
    volatile E item;
    volatile Node<E> next;
}

volatile literally means "unstable, changeable", and it describes a shared variable whose thread-local copies should not be trusted:

To improve running efficiency, each thread has its own independent memory space (local memory), into which shared variables are copied from main memory (a1 and a2 in the figure are copies of a); subsequent operations on a shared variable happen in local memory.

Suppose a starts at 0, and threads 1 and 2 start, each receiving its own copy of a (a1 and a2). Both threads then increment a concurrently.

Thread 1 performs a1++, so a1 = 1. Thread 2 knows nothing about this and performs a2++, so a2 = 1. The correct value of the shared variable should be 2.

This error is caused by the fact that the thread memory space is independent of each other, so that the thread cannot sense the operation of the other thread on the shared variable. That is, “the operation of the current thread on the shared variable is not visible to other threads.”

The solution, of course, is to make it “visible”.

Java's volatile keyword tells the compiler that a thread's local copy of the shared variable is "unstable" and should not be trusted, so:

  1. A thread should always synchronize the latest value to main memory as soon as it writes a shared variable.
  2. When a thread reads a shared variable, it should always go to main memory to get the latest value.

By reading and writing to main memory, threads establish a communication mechanism that makes their operations on shared variables “visible.”
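The effect of the two rules above can be sketched with a tiny demo (the class and field names here are made up for illustration): a worker thread spins on a volatile flag until it observes the main thread's write. Without volatile, the worker could in principle loop forever on a stale local copy.

```java
// Hypothetical demo of volatile visibility: the worker spins until the
// main thread's write to `stop` becomes visible to it.
class VolatileVisibilityDemo {
    static volatile boolean stop = false;

    static boolean run() {
        Thread worker = new Thread(() -> {
            while (!stop) { /* spin until the volatile write is observed */ }
        });
        worker.start();
        try { Thread.sleep(10); } catch (InterruptedException e) { }
        stop = true; // volatile write: published to main memory
        try { worker.join(2000); } catch (InterruptedException e) { return false; }
        return !worker.isAlive(); // true if the worker saw the write and exited
    }
}
```

If `stop` were a plain boolean, the JIT would be free to hoist the read out of the loop and the worker might never terminate.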

To support dequeuing at the head and enqueuing at the tail, we need two pointers, one to the head and one to the tail.

public class MyQueue<E> {
    volatile Node<E> head; // current head of the list
    volatile Node<E> tail; // current tail of the list
}

Similarly, head and tail are shared variables in multi-threaded scenarios and must be volatile to keep them visible.

Stage summary:

  • Each thread has its own separate memory space (local memory)
  • Variables that can be accessed by multiple threads at the same time are called “shared variables” and are stored in main memory. For efficiency purposes, threads copy shared variables from main memory to local memory, where subsequent operations on shared variables occur.
  • Thread-local memory is independent of each other, so they are not aware of other people’s operations on shared variables, that is, operations on shared variables by the current thread are not visible to other threads.
  • The volatile keyword is Java’s solution to the visibility problem of shared variables.
  • Declaring a variable volatile tells the compiler that its copy in thread local memory is “volatile” and that threads should not trust it. A thread should always synchronize the latest value to main memory as soon as it writes a shared variable. When a thread reads a shared variable, it should always go to main memory to get the latest value.
  • In multi-threaded scenarios, the head and tail pointers of the queue, as well as the data and pointer fields of each node, need to be visible.

Dequeue

Basic version

The dequeue algorithm for a singly linked list is very simple when head is known:

// Dequeue version 1
public E poll() {
    Node<E> p = head;
    E item = p.item;  // temporarily hold the head's data
    head = head.next; // update the head pointer
    p.item = null;    // empty the old node's data
    p.next = null;    // null the old node's next pointer
    return item;
}

That’s fine in a single thread, but multithreading can be a problem.

Non-blocking thread-safe version

One thread may be reading the item field of the head node while another thread has already nulled it.

You could of course wrap the operations on head with synchronized, but that is "locked" and "blocking": only one thread can hold the lock at a time, and the rest of the competitors are suspended, waiting for the lock to be released and then woken up. Suspend-and-wake is expensive.
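For contrast, here is a minimal sketch of that locked alternative (the class name and delegation to ArrayDeque are my own illustration, not the article's code): every call serializes on the object's monitor, so losing threads block.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Blocking alternative: correctness via mutual exclusion, at the cost of
// suspending every thread that loses the race for the lock.
class LockedQueue<E> {
    private final Queue<E> inner = new ArrayDeque<>();

    public synchronized boolean offer(E e) { return inner.offer(e); }
    public synchronized E poll()           { return inner.poll(); }
}
```

The rest of the article builds the lock-free version that avoids this suspend-and-wake cost.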

The better-performing "non-blocking" approach is CAS, short for compare-and-swap. To update a variable you supply three things: 1. the current value, 2. the expected value, 3. the new value. The current value is replaced with the new value only if it equals the expected value.

Here, the expected value is the value the thread stashed before it was preempted. When the thread resumes, it compares the current value against that stash; if they are equal, no other thread modified the variable in the meantime, so it is safe to update it.
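This three-argument contract can be observed directly on java.util.concurrent.atomic.AtomicInteger, whose compareAndSet(expected, new) is built on the same primitive (the demo class name is my own):

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasContractDemo {
    static boolean[] demo() {
        AtomicInteger a = new AtomicInteger(0);
        boolean first  = a.compareAndSet(0, 1); // current == expected: swap succeeds
        boolean second = a.compareAndSet(0, 2); // current is now 1, not 0: swap fails
        return new boolean[] { first, second, a.get() == 1 };
    }
}
```

The failed swap leaves the variable untouched, which is exactly what lets a losing thread detect interference and retry.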

The CAS operation, encapsulated in sun.misc.Unsafe, is described as follows:

private static final sun.misc.Unsafe U = sun.misc.Unsafe.getUnsafe();
static <E> boolean casItem(Node<E> node, E cmp, E val) {
    // Set the node's item field to val, comparing against the expected value cmp.
    return U.compareAndSwapObject(node, ITEM, cmp, val);
}

casItem() safely assigns the new value val to Node.item under multithreading by comparing against the expected value cmp.

In addition, we need a CAS operation to update the header:

// Unlink the old head and update the head pointer
final boolean updateHead(Node<E> h, Node<E> p) {
    if (h != p && casHead(h, p)) { // CAS contention to update the head pointer to p
        h.next = null;             // null the old head's next
        return true;
    }
    return false;
}

private boolean casHead(Node<E> cmp, Node<E> val) {
    // Safely assign the new value val by comparing the head pointer against the expected value cmp
    return U.compareAndSwapObject(this, HEAD, cmp, val);
}

CAS is usually paired with a retry loop, because when multiple threads modify a shared variable at the same time, only one succeeds (returns true) and the others fail (return false). A failed thread retries the CAS until it succeeds.
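The retry-until-success pattern can be sketched with AtomicInteger (the class and method names here are my own illustration): several threads increment one counter, and each failed CAS simply re-reads and tries again.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasSpinDemo {
    static int run(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                int cur;
                do {
                    cur = counter.get(); // stash the current value
                } while (!counter.compareAndSet(cur, cur + 1)); // retry until the CAS wins
            }
        };
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) (ts[i] = new Thread(task)).start();
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return counter.get(); // no increment is ever lost
    }
}
```

Every lost race costs one extra loop iteration instead of a thread suspension, which is the whole point of the non-blocking approach.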

Now let's write a thread-safe algorithm for dequeuing the head node:

// Dequeue version 2
public E poll() {
    // spin
    for (;;) {
        E item = head.item;       // temporarily hold the item
        Node<E> next = head.next; // temporarily hold the second node
        // CAS contention to null the head's item field
        if (casItem(head, item, null)) {
            for (;;) {
                // CAS contention to point the head at the second node
                if (updateHead(head, next)) return item;
            }
        }
    }
}

Infinite loop revision

This makes modifying the head's item field thread-safe, but the algorithm is a bit sloppy. If one thread loses the race while another successfully nulls the head's item field, the loser keeps retrying because of the spin; but the current value of head.item is now null rather than the stashed item, so its CAS will fail every time and the program falls into an infinite loop.

To solve the dead-loop, modify the logic as follows:

// Dequeue version 3
public E poll() {
    // spin
    for (Node<E> h = head, p = h, q;;) {
        E item = p.item; // temporarily hold the item
        // Skip the CAS race when the item is already null
        if (item != null && casItem(p, item, null)) {
            for (;;) {
                // p is the dequeued node; its successor becomes the new head
                // (if p is the last node, the head pointer stays at p)
                Node<E> newHead = ((q = p.next) != null) ? q : p;
                // CAS contention to update the head pointer
                if (updateHead(head, newHead)) return item;
            }
        }
        // q is p's successor
        else if ((q = p.next) == null) {
            // The list has been emptied; move the head to the last non-null node and return null
            updateHead(h, p);
            return null;
        }
        // There is a successor
        else {
            p = q; // p moves back looking for the first node whose item is non-null
        }
    }
}

Three working pointers have been added:

  1. p: points at the first node whose item field is non-null.
  2. q: p's successor, used for traversing backwards.
  3. h: temporarily holds the head pointer on entering the loop.

When one thread loses the race to null the item because another thread succeeded, the loser keeps trying due to the spin. On the next iteration, the condition item != null stops it from contending, because the item of the node p currently points at has already been nulled by the other thread. With the help of q, p then traverses backwards looking for the first node whose item is non-null. If one is found, contention resumes; if the end of the list is reached without finding one, the list has been emptied and null is returned directly.

Unlink version 1: head.item = null; head.next = null.

(It is called unlink version 1 because it is not the final form; it causes a design problem, detailed in the next section.)

The clever sentinel unlink version

Offer () is non-blocking, and it can be interrupted during any statement execution, which is bad.

Consider the following scenario: "Thread A is dequeuing and has just entered the for loop, i.e., it has just stashed the current head pointer into h and p, when it is preempted. Another dequeuing thread gets a chance to run and dequeues the head node first. Then thread A finally resumes..." On the left side of the figure, thread A is about to start its dequeue but is interrupted; the other thread unlinks the head node Node1 and nulls its next. When thread A resumes, it faces the situation shown on the right of the figure: the world has changed under its feet.

According to version 3's algorithm, the item field of the stashed head node p has already been nulled, so p traverses backwards looking for the first node whose item is non-null. Here lies the awkwardness: unlinking the head node also nulled its next. When p moves back one node, it hits a null pointer, and version 3's algorithm concludes that the list has been emptied and returns null. But Node2 is still on the list...

Therefore, the tests for "unlinked node" and "end-of-list node" must be distinguishable. Modify the unlink method:

final boolean updateHead(Node<E> h, Node<E> p) {
    // If the CAS race succeeds, self-link the old head (its next field points at itself)
    if (h != p && casHead(h, p)) {
        lazySetNext(h, h);
        return true;
    }
    return false;
}

// point head.next at the node itself
static <E> void lazySetNext(Node<E> node, Node<E> val) {
    U.putOrderedObject(node, NEXT, val);
}

Unlinking the head node is split into two steps:

  1. Safely point the head pointer at the new head node.
  2. Self-link the old head node so that its next pointer points at itself.

The effect of these two steps is shown below:

After separating the two conditions, the unlinked node acts as a "sentinel":

  • An unlinked node satisfies p == p.next
  • The end of the list satisfies p.next == null
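The two conditions above can be sketched with a toy node class (simplified from the article's Node, names my own):

```java
class SentinelDemo {
    static class Node { Node next; }

    static boolean check() {
        Node tail = new Node();   // real end of the list: next == null
        Node unlinked = new Node();
        unlinked.next = unlinked; // self-linked sentinel: next points at itself
        return tail.next == null            // end-of-list test
            && unlinked.next == unlinked    // unlinked-node test
            && unlinked.next != null;       // the two tests never overlap
    }
}
```

Because a self-linked next is never null and a null next is never self-linked, a traversal can tell "I stepped onto a dead node" apart from "I reached the end".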

The dequeue algorithm, modified accordingly:

// Dequeue version 4
public E poll() {
    restartFromHead:
    // outer retry loop
    for (;;) {
        // inner retry loop
        for (Node<E> h = head, p = h, q;;) {
            E item = p.item;
            // Find the real head whose item is non-null
            if (item != null && casItem(p, item, null)) {
                for (;;) {
                    Node<E> newHead = ((q = p.next) != null) ? q : p;
                    if (updateHead(head, newHead)) return item;
                }
            }
            // End of the list
            else if ((q = p.next) == null) {
                updateHead(h, p);
                return null;
            }
            // Sentinel: node p has been unlinked
            else if (p == q)
                continue restartFromHead;
            // There is a successor
            else {
                p = q;
            }
        }
    }
}

The new branch else if (p == q) means the node currently being traversed has been unlinked. Unlinked means the head has already moved to a new position, so the task of "traversing from the front to find the first node whose item is non-null" must restart from the new head, i.e., p must be re-assigned from head. To achieve this we need a goto-like jump, implemented in Java with a labeled continue.

Final performance-optimized version

Version 4 already works correctly in multi-threaded scenarios, but there is one area that can be optimized.

Although CAS races are non-blocking, they still consume CPU. So contention should be avoided wherever possible, and in the current algorithm "updating the head pointer" is contention that can be deferred.

// Dequeue version 5
public E poll() {
    restartFromHead:
    for (;;) {
        for (Node<E> h = head, p = h, q;;) {
            E item = p.item;
            // Find the real node whose item is non-null
            if (item != null && casItem(p, item, null)) {
                // Update the head pointer only when it lags (p != h)
                if (p != h) {
                    Node<E> newHead = ((q = p.next) != null) ? q : p;
                    updateHead(head, newHead);
                }
                return item;
            }
            else if ((q = p.next) == null) {
                updateHead(h, p);
                return null;
            }
            else if (p == q)
                continue restartFromHead;
            else {
                p = q;
            }
        }
    }
}

The condition p != h was added for updating the head pointer: the head pointer is updated only when it lags.

When head lags, p and h must point at different nodes. p's mission is to start from head and move backwards to the first node whose item is non-null; if head does not lag, p never needs to traverse, and so p must equal h.

Does deferring head-pointer updates break the algorithm? No. The algorithm never really trusts head; it arranges for the working pointer p to traverse and find the real head. Deferring the update roughly halves the number of CAS races, improving performance.

Stage summary:

The dequeue algorithm of the non-blocking thread-safe queue can be summarized as follows:

  • Always start from the head pointer and traverse backwards to find the real head (the first node whose item is non-null). If found, null its item field (CAS keeps it thread-safe) to mark the node dequeued; if not found, return null.
  • Unlinking is done by "self-linking" the node's pointer field, i.e., pointing its next field at itself (CAS keeps it thread-safe). The self-link acts as a sentinel, letting a backwards traversal for the real head recognize an unlinked node.
  • Lagging head pointer: "dequeue", "unlink", and "update head pointer" are not paired one-to-one. In theory every dequeue should unlink and update the head pointer; for performance, two dequeues correspond to one unlink + head update, so the head node lags intermittently.
  • Because it is non-blocking, a dequeue can be interrupted by other threads. Interruption by an enqueue is mostly harmless, since the head and tail are different memory addresses and there is no thread-safety issue. Interruption by another dequeuing thread may mean the target node is dequeued by someone else; the new head is located by detecting the self-link and restarting from the top.

Enqueue

In the same vein, let's build the enqueue algorithm step by step.

Basic version

Enqueuing at the tail of a singly linked list is also simple when the tail pointer is known:

// Enqueue version 1
public boolean offer(E e) {
    Node<E> newNode = createNode(e); // build a new node
    tail.next = newNode;             // append the new node at the end of the list
    tail = newNode;                  // update the tail pointer
    return true;
}

Non-blocking thread-safe version

But multithreading complicates things: a new node may be being appended to the tail node at the same moment another thread is removing it.

The variable tail is shared across threads and needs thread-safety consideration. Use CAS, as in the dequeue operation.

For tail insertion, the expected value of the tail node's next pointer is null; anything else means the tail has been modified by another thread, for example when another writer pre-empted us and appended a new tail node. The new value for next is, of course, newNode.

CAS encapsulation is as follows:

// java.util.concurrent.ConcurrentLinkedQueue
private static final sun.misc.Unsafe U = sun.misc.Unsafe.getUnsafe();
static <E> boolean casNext(Node<E> node, Node<E> cmp, Node<E> val) {
    return U.compareAndSwapObject(node, NEXT, cmp, val);
}

casNext() is a CAS implementation that safely assigns a new value to a Node's next pointer under multithreading.

In addition, you need a casTail() :

private boolean casTail(Node<E> cmp, Node<E> val) {
    return U.compareAndSwapObject(this, TAIL, cmp, val);
}

It safely assigns new values to the tail pointer in multithreading.

Write a thread-safe enqueue algorithm using this idea:

// Enqueue version 2
public boolean offer(E e) {
    Node<E> newNode = createNode(e);
    // spin
    for (;;) {
        // CAS contention to append at the tail
        if (casNext(tail, null, newNode)) {
            // On success, CAS contention to update the tail pointer
            for (;;) {
                if (casTail(tail, newNode)) return true;
            }
        }
    }
}

Infinite loop revision

This is thread-safe, but the algorithm has the same flaw as before. Suppose one thread loses the CAS race while another wins it and successfully appends a new node. Because of the spin, the loser keeps trying, but every subsequent attempt fails because the expected value is no longer null, and the program is stuck in an infinite loop.

To solve the dead-loop, modify the logic as follows:

// Enqueue version 3
public boolean offer(E e) {
    Node<E> newNode = createNode(e);
    // Add the working pointer p, starting at the tail
    for (Node<E> p = tail;;) {
        // The working pointer q is p's successor
        Node<E> q = p.next;
        // Found the real tail; CAS contention to append
        if (q == null) {
            if (casNext(p, null, newNode)) {
                for (;;) {
                    if (casTail(tail, newNode)) return true;
                }
            }
        } else {
            p = q; // p moves back one node
        }
    }
}

Two working pointers, p and q, are added; together they traverse the queue backwards to find the real end of the list.

When one thread loses the CAS contention because another thread successfully appended a new node at the end, the loser spins back to if (q == null); but q is no longer null (the other thread's node is there), so it takes the else branch and p moves back one node. Once q == null holds again, the real tail has been found and the CAS race is attempted once more.

Loser’s Dilemma

Version 3 of the enqueue algorithm looks much better, but there is still room for optimization.

Consider the loser’s dilemma: if one thread loses the CAS race, the other thread inserts 10 nodes at the end of the chain. Unfortunately, the failed thread was never scheduled to execute, while other threads frantically inserted 90 nodes at the end of the chain. Finally, the failed thread resumes execution. Will it have to traverse 100 nodes before it can compete with CAS again? What if, in the course of 100 iterations, another thread inserted a new node first? Don’t you have to go through it endlessly? It feels like the failed thread will never get a chance to insert a new node…

The solution is to take a shortcut. After losing the race, the working pointer p traverses backwards to "find the real tail node"; but since another thread has successfully appended a new node, tail itself must have been updated, so p can jump straight to it.

// Enqueue version 3
public boolean offer(E e) {
    Node<E> newNode = createNode(e);
    // Add the working pointer t at the start of the loop to hold a copy of tail
    for (Node<E> t = tail, p = t;;) {
        Node<E> q = p.next;
        if (q == null) {
            if (casNext(p, null, newNode)) {
                for (;;) {
                    if (casTail(tail, newNode)) return true;
                }
            }
        } else {
            // Shortcut: if other threads have appended nodes, jump straight to the latest tail
            p = t != (t = tail) ? t : q;
        }
    }
}

The working pointer t is a copy of the tail pointer in the current thread's memory. Since tail is a shared variable, comparing t against tail tells us whether another thread has moved the tail.

This comparison appears in the program as t != (t = tail). When the "loser's dilemma" occurs, tail and t must differ, and the working pointer p can then jump directly to the latest tail t.
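The t != (t = tail) idiom relies on Java's left-to-right evaluation: the left operand reads the stale t first, then the assignment on the right refreshes t from the shared field. A minimal sketch (class and field names my own; the "other thread" is simulated inline):

```java
class EvalOrderDemo {
    static Object tail = null; // stands in for the shared tail pointer

    static boolean tailMoved() {
        Object t = tail;       // snapshot, like the t taken on loop entry
        tail = new Object();   // another "thread" moves the tail
        // Left operand reads the stale t; then (t = tail) refreshes t and
        // yields the new value, so the comparison is old-vs-new.
        boolean moved = t != (t = tail);
        return moved && t == tail; // t now holds the latest tail
    }
}
```

One expression thus both detects the move and repositions the local copy, which is why the shortcut branch can simply assign its result to p.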

Performance-optimized version

The dequeue algorithm used a "lagging head node" to reduce CAS contention; the enqueue algorithm adopts the same "lagging tail node":

// Enqueue version 4
public boolean offer(E e) {
    Node<E> newNode = createNode(e);
    for (Node<E> t = tail, p = t;;) {
        Node<E> q = p.next;
        // p is the tail; CAS contention to append
        if (q == null) {
            if (casNext(p, null, newNode)) {
                // Only update tail when it lags behind the real tail
                if (p != t) casTail(t, newNode);
                return true;
            }
        } else {
            p = t != (t = tail) ? t : q;
        }
    }
}

The tail pointer is not updated when p and t are equal. When are they equal? Only when tail pointed at the real end of the list on entering the loop, so that p never had to move. The condition can therefore be read as: "update the tail pointer when a new node has been appended and the tail pointer is found to lag behind the real tail."

The tail update process is shown below:

The queue currently has only one node, which is both head and tail, and the tail pointer points at it. On entering the loop, the working pointer p is assigned tail. Since tail already points at the real tail node, p need not traverse and directly appends the new node. In this case p still equals t, and the tail pointer is not updated.

What happens if you continue to insert new nodes in this scenario?

On entering the loop this time, the tail node already lags (state 1), so p has to traverse backwards to find the real tail node Node2 (state 2) and append the new node there. Since p has moved, its value must now differ from t's.

In this way, the tail pointer is updated once for every two insertions.

Will the algorithm fail if the tail pointer is not updated promptly? No. The algorithm never really trusts tail; it arranges for the working pointer p to traverse and find the real tail.

Head-tail inversion

In scenarios where enqueues and dequeues run concurrently, the head pointer may cross over the tail pointer. For example: thread A is about to append a new node at the end of the list, but before it proceeds another thread dequeues two nodes; then thread A resumes execution...

In this case, how does thread A find the correct insertion location?

Here is the scene when thread A enters the offer() loop and is preempted by another thread before it can proceed:

The left side of the figure above is the initial state seen by the dequeuing thread. After Node1 is successfully dequeued, the state becomes the one on the right: node1.item is null and the head pointer has not been updated.

What happens if the dequeuing thread goes on to dequeue Node2?

By now the head node lags (state 1), so the working pointer p traverses to find Node2, the first node with a non-null item (state 2), nulls Node2.item, and then self-links Node1 as a sentinel node (state 3). When thread A resumes execution, it faces the following situation:

Because poll() never updates the tail pointer, the head pointer has now crossed over the tail pointer.

Simulate the version 4 algorithm in this scenario: q is p's successor, and awkwardly p.next is p itself, so q is never null and the program spins forever.

The solution is to add a conditional branch where p == q:

// Enqueue version 5
public boolean offer(E e) {
    Node<E> newNode = createNode(e);
    for (Node<E> t = tail, p = t;;) {
        Node<E> q = p.next;
        // Found the tail; CAS contention to append
        if (q == null) {
            if (casNext(p, null, newNode)) {
                if (p != t) casTail(t, newNode);
                return true;
            }
        }
        // Self-linked node: the head has crossed over the tail
        else if (p == q) {
            // Reset p to the correct position
            p = (t != (t = tail)) ? t : head;
        }
        // All is well, except the real tail has not been found yet
        else {
            p = t != (t = tail) ? t : q;
        }
    }
}

When the self-link is detected, reset p to the correct position. There are two possibilities. If no new node has been appended, i.e., tail has not changed (t == tail), reset p to head. That is the scenario illustrated above.

When p notices the self-link (state 1), it points itself at head (state 2). p's successor is null, i.e., the real tail has been found, so Node3 is inserted there; because p != t, the tail pointer is lagging, so tail is updated to point at the newest tail node Node3 (state 3).

There is another post-self-link scenario, in which you should start from the latest tail instead.

Rewriting the head-tail inversion script: "The list currently has two nodes and the tail pointer lags by one node. Thread A is about to append a new node, but another thread beats it: it dequeues two nodes and then performs one enqueue. Now thread A resumes execution..." The resumed thread A faces the following scene:

The tail pointer has changed, so pointing p directly at tail is more efficient than pointing it at head.

Stage summary:

The enqueuing algorithm for non-blocking thread safe queues is summarized as follows:

  • Always start from the tail pointer and traverse backwards to find the real tail (the node whose next is null), which can be found in every case, and CAS its next field to the new node (CAS keeps it thread-safe).
  • Lagging tail pointer: in theory the tail pointer should be updated on every enqueue; for performance a deferred-update strategy is used, i.e., two enqueues correspond to one tail-pointer update.
  • Because it is non-blocking, an enqueue may be interrupted by other threads:
  1. If interrupted by another enqueuing thread, the tail may have moved back several nodes; the "shortcut" strategy quickly locates the new tail.
  2. If interrupted by a dequeuing thread, head-tail inversion may occur, where the head pointer crosses over the tail pointer. Then restart from the latest head or tail pointer and traverse backwards to find the real tail before appending.

Confession

If you've been paying attention, you may have noticed that this is actually the source code of ConcurrentLinkedQueue; this has secretly been a source-code walkthrough all along~
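Now that the secret is out, the real class can be exercised directly. A quick single-threaded FIFO check (demo class name my own):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

class ClqDemo {
    static String drain() {
        ConcurrentLinkedQueue<String> q = new ConcurrentLinkedQueue<>();
        q.offer("a");
        q.offer("b");
        q.offer("c");
        StringBuilder sb = new StringBuilder();
        for (String s; (s = q.poll()) != null; ) sb.append(s); // poll until empty
        return sb.toString(); // elements come out in FIFO order
    }
}
```

Internally, these offer() and poll() calls are exactly the version 5 algorithms built up above.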

I tried to explain this algorithm by "building it up from scratch, step by step". Did that lower the barrier to understanding? Feel free to leave a comment~~

Conclusion

  • Each thread has its own separate memory space (local memory)
  • Variables that can be accessed by multiple threads at the same time are called “shared variables” and are stored in main memory. For efficiency purposes, threads copy shared variables from main memory to local memory, where subsequent operations on shared variables occur.
  • Thread-local memory is independent of each other, so they are not aware of other people’s operations on shared variables, that is, operations on shared variables by the current thread are not visible to other threads.
  • The volatile keyword is Java’s solution to the visibility problem of shared variables.
  • Declaring a variable volatile tells the compiler that its copy in thread local memory is “volatile” and that threads should not trust it. A thread should always synchronize the latest value to main memory as soon as it writes a shared variable. When a thread reads a shared variable, it should always go to main memory to get the latest value.
  • In multi-threaded scenarios, the head and tail pointers of the queue, as well as the data and pointer fields of each node, need to be visible.
  • CAS is a way to keep shared variables thread-safe. It is non-blocking: it never makes losing threads block and wait to be woken. It is short for "compare and swap". To update a variable you supply three things: 1. the current value, 2. the expected value, 3. the new value; the current value is replaced with the new value only if it equals the expected value. The expected value is the value the thread stashed before being preempted; when the thread resumes, equality between the current value and the stash means no other thread modified the variable in the meantime, so it is safe to update.
  • CAS is usually paired with a retry loop. When multiple threads race to modify a shared variable, only one wins (returns true) and the rest fail (return false); a failing thread typically retries until it succeeds.

The dequeue algorithm of the non-blocking thread-safe queue can be summarized as follows:

  • Always start from the head pointer and look back for the real header (item is not empty). If found, set the header item field to NULL (CAS ensures thread safety) to indicate that the node is out of queue. If not found, return NULL.
  • Decoupling is achieved by “spinning” the node pointer field away from the chain, i.e. the next field points to itself (CAS is thread-safe). The spin acts as a sentry, causing the node to disengage as you traverse backwards to find the real head node.
  • Lagging header pointer: “out of queue”, “unchain”, “update header pointer” are not paired. Theoretically, each out-queue should be dislinked and updated header pointer. In order to improve performance, two out-queues correspond to one dislinked + updated header pointer, resulting in intermittent lag of header nodes.
  • Since it is non-blocking, the queue exit operation can be interrupted by other threads. If it is interrupted by queue entry, it is basically harmless because the queue head and queue tail are two different memory addresses, and there is no thread safety issue. If it is interrupted by another queuing thread, it may occur that the desired node is queued by another thread. Locate the new head node by detecting spin + starting from scratch.
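The dequeue steps above can be sketched as follows. This is a simplified single-file illustration, not the real source of any JDK class: the names SimpleQueue, offer and poll are invented for the example, node fields use AtomicReference for CAS, and the lagging-head optimization is reduced to "advance the head only when the claimed node is not the current head".

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified sketch of a non-blocking queue with a "spinning" (self-linked)
// sentinel for dequeued nodes and a lagging head pointer.
class SimpleQueue<E> {
    static final class Node<E> {
        final AtomicReference<E> item;                                   // data field
        final AtomicReference<Node<E>> next = new AtomicReference<>(null); // pointer field
        Node(E item) { this.item = new AtomicReference<>(item); }
    }

    // head may lag behind the real head; tail may lag behind the real tail.
    private final AtomicReference<Node<E>> head;
    private final AtomicReference<Node<E>> tail;

    SimpleQueue() {
        Node<E> dummy = new Node<>(null);       // sentinel node with a null item
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void offer(E e) {                           // minimal enqueue so poll() has input
        Node<E> newNode = new Node<>(e);
        while (true) {
            Node<E> t = tail.get();
            Node<E> next = t.next.get();
            if (next == null) {
                if (t.next.compareAndSet(null, newNode)) {
                    tail.compareAndSet(t, newNode);  // a lagging tail is acceptable
                    return;
                }
            } else {
                tail.compareAndSet(t, next);         // help advance a stale tail
            }
        }
    }

    E poll() {
        restart:
        while (true) {
            Node<E> h = head.get();
            // Traverse from the head pointer looking for the real head (item != null).
            for (Node<E> p = h; ; ) {
                E item = p.item.get();
                // CAS the item to null to claim ("dequeue") this node.
                if (item != null && p.item.compareAndSet(item, null)) {
                    if (p != h && head.compareAndSet(h, p)) {
                        h.next.set(h);          // unlink the old head: self-link sentinel
                    }
                    return item;
                }
                Node<E> next = p.next.get();
                if (next == null) return null;  // no real head found: queue is empty
                if (next == p) continue restart; // hit a self-linked node: start over
                p = next;
            }
        }
    }
}
```

A single-threaded run behaves like a plain FIFO queue: after offer(1) and offer(2), poll() returns 1, then 2, then null.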

The enqueuing algorithm of the non-blocking thread-safe queue is summarized as follows:

  • Always start from the tail pointer and traverse along the chain to find the real tail node (the one whose next is null), which can be found in any case, and make its next field point to the new node (CAS keeps this thread-safe).
  • Lagging tail pointer: in theory the tail pointer should be updated on every enqueue. To improve performance, a delayed-update strategy is adopted: two enqueues correspond to one update of the tail pointer.
  • Since the algorithm is non-blocking, the enqueuing operation may be interleaved with other threads:
  1. If it is interleaved with another enqueuing thread, the real tail may have moved back several nodes. In this case a "shortcut" strategy is adopted: jump straight to the latest tail pointer instead of walking node by node.
  2. If it is interleaved with a dequeuing thread, head/tail inversion may occur, where the head pointer crosses the tail pointer. In this case you need to restart from the latest head or tail pointer and traverse the chain again to find the real tail before linking the new node.
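The enqueue steps above can be sketched as follows. Again this is a simplified illustration, not real JDK source: EnqueueSketch, toList and the field names are invented, and since this sketch has no dequeue, the self-link branch is shown for completeness but never taken here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Focused sketch of the non-blocking enqueue: find the real tail, CAS-link
// the new node, and update the tail pointer lazily.
class EnqueueSketch<E> {
    static final class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next = new AtomicReference<>(null);
        Node(E item) { this.item = item; }
    }

    private final Node<E> dummy = new Node<>(null);          // sentinel head
    private final AtomicReference<Node<E>> tail = new AtomicReference<>(dummy);

    void offer(E e) {
        Node<E> newNode = new Node<>(e);
        Node<E> t = tail.get(), p = t;
        while (true) {
            Node<E> q = p.next.get();
            if (q == null) {
                // p is the real tail (next == null): try to link the new node.
                if (p.next.compareAndSet(null, newNode)) {
                    // Lagging tail: only update the tail pointer when it has
                    // fallen behind, i.e. roughly every other enqueue.
                    if (p != t) tail.compareAndSet(t, newNode);
                    return;
                }
                // CAS lost to another enqueuer; loop and re-read p.next.
            } else if (p == q) {
                // Self-linked node: a dequeuer unlinked it and the head has
                // crossed the tail. Restart from the freshest pointer we see.
                p = t = tail.get();
            } else {
                // "Shortcut": if the tail pointer moved while we traversed,
                // jump straight to the new tail instead of walking node by node.
                Node<E> newTail = tail.get();
                p = (p != t && t != newTail) ? (t = newTail) : q;
            }
        }
    }

    // Demo helper only (not part of the algorithm): snapshot of the contents.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = dummy.next.get(); n != null && n != n.next.get(); n = n.next.get()) {
            if (n.item != null) out.add(n.item);
        }
        return out;
    }
}
```

Note the design choice in the first branch: linking the new node (the CAS on next) is the only step that must succeed exactly once; updating the tail pointer is just a hint, so it is safe to skip it and let a later thread take the shortcut.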

Recommended reading

The source-code analysis series of articles is as follows:

Why is Kotlin coroutines' CoroutineContext designed as an indexed set? (1) – Juejin (juejin.cn)

Kotlin source code | a magic weapon for reducing code complexity – Juejin (juejin.cn)

Grow knowledge by reading source code | a View's click area can be expanded – Juejin (juejin.cn)

How does RecyclerView scroll? (2) | Fling – Juejin (juejin.cn)

How does RecyclerView scroll? (1) | a new posture for reading source code – Juejin (juejin.cn)

RecyclerView performance optimization | what is cached in the cache mechanism? – Juejin (juejin.cn)

RecyclerView interview question | which items in the list are recycled to the cache pool? – Juejin (juejin.cn)

RecyclerView interview question | when the list scrolls, are items being filled or recycled? – Juejin (juejin.cn)

RecyclerView animation principle | how are animation property values stored and used? – Juejin (juejin.cn)

RecyclerView animation principle | the relationship between pre-layout, post-layout and scrap cache – Juejin (juejin.cn)

RecyclerView animation principle | a new posture for reading source code (pre-layout) – Juejin (juejin.cn)

Grow knowledge by reading source code | like speech, code should also leave room!? – Juejin (juejin.cn)

Grow knowledge by reading source code | is Android jank really caused by "frames"? – Juejin (juejin.cn)

Grow knowledge by reading source code | a better RecyclerView item click listener – Juejin (juejin.cn)

Android Touch event distribution: "pass" and "return" (1) – Juejin (juejin.cn)

Android Touch event distribution: "pass" and "return" (2) – Juejin (juejin.cn)

Memory optimization | SparseArray – Juejin (juejin.cn)

Android custom controls | View drawing principle (drawing?) – Juejin (juejin.cn)

Android custom controls | View drawing principle (pictures?) – Juejin (juejin.cn)

Android custom controls | View drawing principle (pictures?) – Juejin (juejin.cn)

RecyclerView caching mechanism | where are views recycled to? – Juejin (juejin.cn)

RecyclerView caching mechanism | what is recycled? – Juejin (juejin.cn)

RecyclerView caching mechanism | how are list items reused? – Juejin (juejin.cn)