Preface

Before I knew it, I had been writing articles for nearly half a year. At first I wrote only to summarize knowledge for myself, but more and more friends have gradually started following along.

Now when I write an article I have to consider not only whether I can understand it myself, but also whether readers can understand it, and the overall quality of what I publish.

Each article now feels like a piece of craftwork, carefully carved and polished every time: the logic, the readability, and the layout of the article all have to be considered carefully.

“There is no savior, if there is one, it is yourself. There is no miracle, if there is one, it is just another name for hard work.”

Looking back at the road I have traveled in the almost a year since graduation, and at where I am now, it has all been worth it. I will keep working hard and keep growing stronger.

Without further ado, let’s get straight to the substance. Today we will take an in-depth look at CAS and AQS. The article analyzes them layer by layer, with diagrams, so that readers can build a deep understanding.

AQS overview

AQS (AbstractQueuedSynchronizer) stands for “abstract queued synchronizer”. Put simply, AQS is an abstract class: AbstractQueuedSynchronizer implements no interfaces and only defines methods for acquiring and releasing synchronization state.

It provides a FIFO queue in which threads that lose the competition wait while multiple threads contend for a resource, and it defines a synchronization framework for multi-threaded access to shared resources.

There are two lock modes in AQS: exclusive and shared.

An exclusive lock allows only one thread to run (hold the lock) at a time; ReentrantLock is an example. I have previously written a detailed source-code article on ReentrantLock for readers who want to take a look.

A shared lock can be held by multiple threads at the same time; examples include Semaphore, CountDownLatch, and ReentrantReadWriteLock.

AQS source code analysis

In the AQS source you can see that the shared variable state is modified with the volatile keyword, which guarantees visibility (readers unfamiliar with volatile can refer to my earlier article on it). You can also see that two methods are provided for modifying state: setState and compareAndSetState. So why provide two ways to change state?
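For reference, the relevant part of AbstractQueuedSynchronizer in JDK 8 looks roughly like this (abridged):

    // Simplified excerpt from java.util.concurrent.locks.AbstractQueuedSynchronizer (JDK 8)
    private volatile int state;

    protected final int getState() {
        return state;
    }

    protected final void setState(int newState) {
        state = newState;
    }

    protected final boolean compareAndSetState(int expect, int update) {
        // Atomically sets state to update only if it currently equals expect
        return unsafe.compareAndSwapInt(this, stateOffset, expect, update);
    }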

Because compareAndSetState is usually used before the lock is acquired, when the current thread is not yet the lock holder, concurrent modifications to state are possible, so the change to state must be made as an atomic operation.

setState, by contrast, is usually used by the thread that currently holds the lock to modify the shared variable state. Since there is no competition at that point, a plain write is thread-safe and no CAS operation is needed.

Having analyzed the AQS source code, let’s now look at the principle behind its implementation. The source and theory at the AQS level are relatively simple, because no concrete implementation class is involved yet.

AQS implementation principle

As mentioned above, AQS maintains a FIFO queue implemented as a doubly linked list. Each element in the list is a Node, and the Node class is an inner class of AbstractQueuedSynchronizer.

Let’s take a look at the source of the Node inner class, which will make the internal implementation of AQS clearer:

    static final class Node {
        static final Node SHARED = new Node();
        static final Node EXCLUSIVE = null;

        static final int CANCELLED =  1;
        static final int SIGNAL    = -1;
        static final int CONDITION = -2;
        static final int PROPAGATE = -3;

        volatile int waitStatus;
        volatile Node prev;
        volatile Node next;
        volatile Thread thread;
        Node nextWaiter;

        final boolean isShared() {
            return nextWaiter == SHARED;
        }

        final Node predecessor() throws NullPointerException {
            Node p = prev;
            if (p == null)
                throw new NullPointerException();
            else
                return p;
        }

        Node() {    // Used to establish initial head or SHARED marker
        }

        Node(Thread thread, Node mode) {     // Used by addWaiter
            this.nextWaiter = mode;
            this.thread = thread;
        }

        Node(Thread thread, int waitStatus) { // Used by Condition
            this.waitStatus = waitStatus;
            this.thread = thread;
        }
    }


As you can see, the Node class is relatively simple: it mainly holds the fields of each node. The most important of these are:

    volatile Node prev;
    volatile Node next;
    volatile Thread thread;

Based on the source-code analysis and the way threads compete for a shared resource, I have drawn a diagram of how AQS works. In the FIFO queue, the head node owns the lock, that is, the head node is the lock holder; the tail pointer points to the last waiting thread node in the queue; and apart from the head and tail nodes, each node has both a predecessor (prev) pointer and a successor (next) pointer.

AQS maintains a shared variable state to indicate whether the resource is currently held by a thread. When multiple threads compete, each checks whether state is 0 and, if so, tries to change it to 1.
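In ReentrantLock’s non-fair lock, which we will look at next, this check looks roughly like this (JDK 8, slightly abridged):

    // ReentrantLock.NonfairSync (JDK 8)
    final void lock() {
        // Fast path: if state is 0, try to CAS it to 1 and take ownership
        if (compareAndSetState(0, 1))
            setExclusiveOwnerThread(Thread.currentThread());
        else
            // Otherwise fall back to the AQS acquire logic
            acquire(1);
    }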

We have now analyzed the AQS source and its principle, but AQS itself does not provide a concrete synchronization implementation. To understand how the framework is actually used, we need to look at a concrete implementation class of AQS; here we take ReentrantLock as an example.

Implementation principles of ReentrantLock

When multiple threads compete for the shared resource, the threads that lose the competition are added to the end of the FIFO queue.

For the concrete implementation of ReentrantLock, we take its non-fair lock as the example, since the fair-lock implementation was already analyzed in a previous article.

Let’s look at the logic that enqueues a newly added node. When a thread fails to grab the lock resource, it goes straight into the acquire(1) method. From the source of acquire(1) we can see that it mainly consists of the following three steps (tied together in the sketch after this list):

  1. tryAcquire(arg): attempts to acquire the lock once more.
  2. addWaiter(Node.EXCLUSIVE): if the lock cannot be acquired, the current thread is wrapped in a Node and enqueued.
  3. acquireQueued(addWaiter(Node.EXCLUSIVE), arg): acquireQueued takes the node returned by addWaiter as an argument; internally it spins to acquire the lock and decides whether the thread should be suspended.
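For reference, acquire in AbstractQueuedSynchronizer (JDK 8) ties these three steps together roughly like this:

    // AbstractQueuedSynchronizer.acquire (JDK 8)
    public final void acquire(int arg) {
        if (!tryAcquire(arg) &&
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            selfInterrupt();
    }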

Let’s take a look at the tryAcquire(arg) source. As seen above, the value of arg is 1.
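In JDK 8, ReentrantLock’s non-fair tryAcquire delegates to Sync.nonfairTryAcquire, which looks roughly like this:

    // ReentrantLock.Sync.nonfairTryAcquire (JDK 8)
    final boolean nonfairTryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();
        if (c == 0) {
            // The lock has been released: try to grab it with a CAS on state
            if (compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        }
        else if (current == getExclusiveOwnerThread()) {
            // Reentrant acquisition by the thread that already holds the lock
            int nextc = c + acquires;
            if (nextc < 0) // overflow
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }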

From the source we can see that tryAcquire(arg) first checks whether the value of state shows the lock has been released. If it has been released, the current thread performs a CAS operation to set state to 1; if it has not, the current thread checks whether it already holds the lock and can re-enter it.

Having analyzed tryAcquire(arg), let’s look at addWaiter, the enqueue operation. From its source we can see that the new thread’s node is appended to the doubly linked list using tail insertion, as illustrated in the diagram below. When a thread is enqueued, the main steps are: the prev pointer of the newly added node is pointed at the old tail node, and the next pointer of the old tail node is pointed at the newly added node. This is the usual tail-insertion operation on a doubly linked list.

Finally, tail is pointed at the newly added node, and the new node is enqueued. With that picture in mind, let’s look at the source code.
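In JDK 8 this enqueue fast path lives in AbstractQueuedSynchronizer.addWaiter, roughly:

    // AbstractQueuedSynchronizer.addWaiter (JDK 8)
    private Node addWaiter(Node mode) {
        Node node = new Node(Thread.currentThread(), mode);
        // Fast path: try to append to the existing tail with a CAS
        Node pred = tail;
        if (pred != null) {
            node.prev = pred;
            if (compareAndSetTail(pred, node)) {
                pred.next = node;
                return node;
            }
        }
        // Queue is empty or the CAS lost a race: fall back to enq
        enq(node);
        return node;
    }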

The premise here, of course, is that the queue is not empty. If it is empty, the code does not take the path above but instead calls enq(node) to initialize the queue and enqueue the node. After the enqueue operation completes, the acquireQueued method runs. From its source we can see that it involves dequeuing the old head node and promoting the current thread’s node to be the new head.
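For completeness, enq and acquireQueued look roughly like this in JDK 8:

    // AbstractQueuedSynchronizer.enq (JDK 8): spin until the node is enqueued,
    // initializing the queue with a dummy head node if necessary
    private Node enq(final Node node) {
        for (;;) {
            Node t = tail;
            if (t == null) { // queue is empty: must initialize
                if (compareAndSetHead(new Node()))
                    tail = head;
            } else {
                node.prev = t;
                if (compareAndSetTail(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }

    // AbstractQueuedSynchronizer.acquireQueued (JDK 8): spin; when our predecessor
    // is the head and tryAcquire succeeds, promote this node to head
    final boolean acquireQueued(final Node node, int arg) {
        boolean failed = true;
        try {
            boolean interrupted = false;
            for (;;) {
                final Node p = node.predecessor();
                if (p == head && tryAcquire(arg)) {
                    setHead(node);
                    p.next = null; // help GC: the old head drops out of the queue
                    failed = false;
                    return interrupted;
                }
                if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    interrupted = true;
            }
        } finally {
            if (failed)
                cancelAcquire(node);
        }
    }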

Because only the head node is the lock holder, the head node is dequeued as the lock changes hands, so what head points to changes over time. I have drawn a schematic diagram of this: the original head node is dequeued, that is, the next pointer of the original head node is set to null and the prev pointer of the original second node is set to null.

Finally, the head pointer points to the second node, and the shared variable state is updated as well, completing the release of the lock and its hand-off to thread2.
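The release side, which resets state and wakes the next waiting thread, looks roughly like this in JDK 8:

    // AbstractQueuedSynchronizer.release (JDK 8)
    public final boolean release(int arg) {
        if (tryRelease(arg)) {
            Node h = head;
            if (h != null && h.waitStatus != 0)
                unparkSuccessor(h); // wake the next waiting thread
            return true;
        }
        return false;
    }

    // ReentrantLock.Sync.tryRelease (JDK 8)
    protected final boolean tryRelease(int releases) {
        int c = getState() - releases;
        if (Thread.currentThread() != getExclusiveOwnerThread())
            throw new IllegalMonitorStateException();
        boolean free = false;
        if (c == 0) { // fully released: reentrant count back to zero
            free = true;
            setExclusiveOwnerThread(null);
        }
        setState(c);
        return free;
    }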

When the lock cannot be acquired, shouldParkAfterFailedAcquire(p, node) && parkAndCheckInterrupt() is executed to determine whether the current thread should be suspended. Looking at the source of shouldParkAfterFailedAcquire, when the predecessor node’s waitStatus is SIGNAL, the current thread is parked.
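In JDK 8 these two helpers look roughly like this:

    // AbstractQueuedSynchronizer.shouldParkAfterFailedAcquire (JDK 8)
    private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
        int ws = pred.waitStatus;
        if (ws == Node.SIGNAL)
            // The predecessor has already promised to signal us, so parking is safe
            return true;
        if (ws > 0) {
            // The predecessor was cancelled: skip over cancelled nodes
            do {
                node.prev = pred = pred.prev;
            } while (pred.waitStatus > 0);
            pred.next = node;
        } else {
            // Mark the predecessor as SIGNAL so it will wake us on release
            compareAndSetWaitStatus(pred, ws, Node.SIGNAL);
        }
        return false;
    }

    // AbstractQueuedSynchronizer.parkAndCheckInterrupt (JDK 8)
    private final boolean parkAndCheckInterrupt() {
        LockSupport.park(this);
        return Thread.interrupted();
    }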

Through the above analysis we now have a clear picture of how AQS is implemented, examined in depth through its implementation class ReentrantLock, specifically the non-fair, exclusive lock. At its core, AQS maintains a FIFO queue: when multiple threads compete for the shared resource, the threads that fail are added to the queue, and in the non-fair lock implementation a newly added thread node spins to try to acquire the lock.

Having analyzed AQS, let’s turn to CAS. So what exactly is CAS?

Introduction to CAS

When analyzing the source of ReentrantLock’s concrete implementation, we saw that every operation that sets a contended shared variable goes through a CAS operation to guarantee atomicity.

CAS (Compare And Swap) literally means compare and exchange; CAS is an implementation of optimistic locking.

The CAS algorithm involves three values: the variable to be updated, the expected old value, and the new value. When modifying the shared variable, its current value is compared with the expected old value; only if they are equal is it changed to the new value.

Under this algorithm, data safety can be guaranteed even without locking: if the data has already been updated by another thread, the comparison no longer matches and the write operation simply fails.
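As a minimal, self-contained sketch of this compare-then-set behaviour (the class name CasDemo is just for illustration), java.util.concurrent.atomic.AtomicInteger exposes it directly:

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasDemo {
        public static void main(String[] args) {
            AtomicInteger state = new AtomicInteger(0);

            // Expected old value 0, new value 1: succeeds because state is still 0
            boolean first = state.compareAndSet(0, 1);

            // Same expectation again: state is now 1, the expected value no longer
            // matches, so nothing is written and the call returns false
            boolean second = state.compareAndSet(0, 1);

            System.out.println(first + " " + second + " state=" + state.get()); // true false state=1
        }
    }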

But CAS can also lead to the ABA problem. What is the ABA problem? Let me explain it in detail.

ABA problem

The ABA problem arises like this: suppose two threads read a shared variable state = 1 at the same time, so each thread has a copy of state in its working memory.

Thread one changes state to state + 1 and writes it back to main memory, and then changes it back with state = state - 1 and writes that to main memory as well. The value in main memory has changed twice, but it has ended up back at its original value.

When thread two then modifies state with a CAS, it succeeds even though the value was changed in between. This is the ABA problem. The solution is to add a version number and compare the version number on every comparison as well.

Because the version number only ever increases (for example, a timestamp-based version, which differs from moment to moment), the ABA problem can be avoided.
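The JDK provides AtomicStampedReference for exactly this purpose. A minimal sketch (the class name AbaDemo is just for illustration) of how a stamp, acting as the version, defeats an ABA change:

    import java.util.concurrent.atomic.AtomicStampedReference;

    public class AbaDemo {
        public static void main(String[] args) {
            // Value 1 with initial stamp (version) 0; small Integers are cached,
            // so reference comparison works for this illustration
            AtomicStampedReference<Integer> ref = new AtomicStampedReference<>(1, 0);

            int stamp = ref.getStamp(); // version observed by "thread two"

            // "Thread one": 1 -> 2 -> 1, bumping the stamp on every change
            ref.compareAndSet(1, 2, stamp, stamp + 1);
            ref.compareAndSet(2, 1, stamp + 1, stamp + 2);

            // "Thread two": the value is back to 1, but the stamp has moved on,
            // so the CAS fails and the intermediate change is detected
            boolean success = ref.compareAndSet(1, 3, stamp, stamp + 1);
            System.out.println(success); // false
        }
    }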

CAS performance analysis

Compared with the blocking approach of synchronized, CAS uses the non-blocking algorithm of optimistic locking. A thread context switch generally costs the CPU far more time than executing a handful of CPU instructions, so CAS operations can bring a large performance improvement.

However, no algorithm is perfect. In a CAS operation, a thread whose update does not succeed will spin and retry, which also consumes CPU resources and is not CPU-friendly.