preface

In concurrent programming, locking is a performance-consuming operation, and only one thread can enter the block at a time to change the value of a variable, as shown in the following code


synchronized void function(int b){a = a + b; }Copy the code

If synchronized is not added, multi-thread modification of a value will lead to incorrect results and thread-safety problems. But locking is a performance-consuming operation. Whether it’s holding the lock, unlocking it, waiting for the lock, or blocking, it’s very performance expensive. So can we leave it unlocked?

You can.

What does that mean? Let’s look at the code above, which is divided into several steps:

  1. Read a
  2. Add a and B
  3. Assign the calculated value to A.

As we know, this is not an atomic operation, and there is a problem with multithreading: when two threads access A at the same time, both get the value of A, and tell a to increment a by 1, and then assign the calculated value to A at the same time, it causes a to increment only by 1, when in fact we want to increment by 2.

What’s the problem? Step 3, for the assignment of A, if there is a judgment that A has been modified by another thread, you need to recalculate. Like this:


void function(int b) {
   int backup = a;
   int c = a + b;
   compareAndSwap(a, backup, c);
}

void compareAndSwap(int backup ,int c ){
       if(a == backup) { a = c; }}Copy the code

If the value of a is the same as the value of the backup, it means that A has not been changed by another thread, and then it can be changed.

Here’s the problem: How to keep the compareAndSwap method thread-safe because it has multiple steps, is not atomic, and does not use locks. In fact, this is just pseudo-code. What is CAS (compareAndSwap)?

1. What is CAS

CAS (compareAndSwap), Chinese called comparison swap, a locking atom algorithm. It takes three parameters, CAS (V, E, N), where V represents the value of the variable to be updated, E represents the expected value, and N represents the new value. V is set to N only if V is equal to E. If V and E are different, another thread has done two updates, and the current thread does nothing. Finally, CAS returns the true value of the current V. CAS operates with an optimistic attitude, always thinking it can successfully complete the operation.

When multiple threads operate on a variable using CAS at the same time, only one will win and update successfully, while the rest will fail. The thread that failed is not suspended, but is simply notified of the failure and allowed to try again, and of course the implementing thread is allowed to abort the operation. Based on this principle, CAS operations can detect interference from other threads to the current thread even if there is no lock.

Using CAS makes the program seem a bit more complex than locking, but because it is non-blocking, it is inherently immune to deadlock problems, and there is very little interaction between threads. More importantly, there is no overhead associated with lock contention, and no overhead associated with frequent scheduling between threads, so it has better performance than the lock-based approach.

In short, CAS requires you to give an additional expected value, which is what you think the variable should look like right now. If the variable is not what you think it is, it has been changed by someone else. You just have to read it back and try again.

So how does this CAS work? That is, how can comparison and interchange, which are really two operations, become one atomic operation?

2. Underlying CAS principles

Thanks to the development of the hardware instruction set, we could actually use synchronization to make these two operations atomic, but it wouldn’t make sense to do so. So we have to rely on hardware to make sure that a semantically multiple operation can be done with a single processor instruction. Common examples of this type of instruction are:

  1. Test and Set up (tetst-and-set)
  2. Fetch and Increment
  3. Swap
  4. Compare and Swap
  5. Load Linked/ store-conditional

The first three are from the 20th century, when most processors already have them, and the last two are new additions to modern processors. Moreover, the purpose and function of the two instructions are similar. In IA64 and x86 instruction sets, CMPXCHG instruction completes CAS function, and in SPARC-TSO, CASA instruction is also implemented. In ARM and PowerPC architecture, A pair of LDREx/Strex instructions are required to perform LL/SC functions.

The CPU implements atomic instructions in two ways:

  1. Atomicity is guaranteed by bus locking. When a processor outputs a lock on the bus, requests from other processors will be blocked, so that the processor can exclusively use the shared memory. But this method is too costly. Hence the following approach.

  2. Atomicity is guaranteed by cache locking. The so-called cache lock Refers to the memory area if the cached processor cache line, and is locked during the Lock operation, so when he perform Lock operation is written back to memory, the processor do not claim to LOCK# signals on the bus, and modify the internal memory address, and allowed his cache consistency mechanism to ensure the atomicity of operation, Because the cache consistency mechanism prevents changes in memory areas cached by more than two processors at the same time (the same visibility principle as volatile), it invalidates locked rows when other processors write back to them.

Note: There are two situations in which the processor does not use cache locking.

  1. The processor invokes bus locking when the data of the operation cannot be cached within the processor or when the data of the operation spans more than one cache line.
  2. Some processors do not support cache locking. For Intel 486 and Pentium processors, even locked memory areas will be called bus locking in the processor’s cache line.

3. How does Java implement atomic operations

Java in version 1.5 provides a Java. Util. Concurrent. Atomic package, the package of all the classes are atomic operations:

How do you use it? Look at the code

  public static void main(String[] args) throws InterruptedException {
    AtomicInteger integer = new AtomicInteger();
    System.out.println(integer.get());


    Thread[] threads = new Thread[1000];

    for (int j = 0; j < 1000; j++) {
      threads[j] = new Thread(() ->
          integer.incrementAndGet()
      );
      threads[j].start();
    }

    for (int j = 0; j < 1000; j++) { threads[j].join(); } System.out.println(integer.get()); }}Copy the code

In the code above, we started 1000 threads to increment the AtomicInteger variable. The result is our expected 1000, indicating that no synchronization problems have occurred.

Let’s look at his internal implementation, and we find the compareAndSet method for this class, which is compareAndSet. Let’s look at the method implementation:

This method calls the Unsafe class’s compareAndSwapInt method, which takes several parameters, including the memory address of the variable, the expected value, the updated value, and the object itself. It completely conforms to our previous definition of CAS. So, what is unsafe?

This class is in the rt.jar package, but not under the familiar Java package, but under the Sun.misc package. And all are class files, no comments, as his name suggests: unsafe.

Can we construct him? No, except for reflection.

Let’s look at his source code:

The getUnsafe method checks the class that called the getUnsafe method. If the ClassLoader of the class is not null, the getUnsafe method will be null. When the class loader is a Bootstrap loader, the Bootstrap loader has no objects, which means that the loaded class is most likely from rt.jar.

In the latest Version of Java 9, this class has been hidden. Because this class uses Pointers. But the downside of Pointers is that they are insecure.

4. Disadvantages of CAS

CAS seem to be very bad, however, it still has disadvantages, most notably the ABA problem. If A variable A is changed to B and then changed to A, the CAS mechanism is not detectable, but has actually been modified. That’s fine if it’s a primitive type, but what about a reference type? There are multiple variables in this object, how do I know if it’s changed? Smart you must have thought of, add a version number ah. Check the version number every time you modify it. If the version number is changed, it indicates that it is changed. Even if you are still A, it does not work.

In Java. Util. Concurrent. Atomic package, have the AtomicReference to guarantee the atomicity of reference, but I feel a little less and less use of synchronization and mutex, could be more efficient.

conclusion

Today we understand the principle of CAS from a variety of perspectives, and the algorithm is particularly important, as evidenced by the CPU’s special design of an instruction to implement it. The unSafe CAS algorithm is everywhere in the JDK source code. Without CAS, there would be no concurrent 1.5 container. Ok, that’s all for today.

Good luck!!