Do you really understand the CAS mechanism? (Deep into operating system analysis)

Learning Java concurrent programming, CAS mechanism is a must to master the knowledge point. This article mainly carries on an analysis from the emergence reason to the principle. I hope it helps.

1. Why is CAS required?

Why do you need CAS? Let’s start with an error. We often use the volatile keyword to denote that a variable is globally shared and has both visibility and order. But there’s no atomicity. For example, a++ is a common operation. This operation can be broken down into three steps:

(1) Read a from memory

(2) Add 1 to a

(3) Write the value of A back into memory

This works fine in a single thread, but in multiple threads there are all kinds of problems. Because maybe one thread increments a by one, and before it can write to memory, another thread reads the old value. This causes thread insecurity. How to solve this problem? The most common way to do this is to use AtomicInteger to decorate A. We can look at the code:

public class Test3 {

 // Define a with AtomicInteger

 static AtomicInteger a = new AtomicInteger();

 public static void main(String[] args) {

  Test3 test = new Test3();

  Thread[] threads = new Thread[5];

  for (int i = 0; i < 5; i++) {

   threads[i] = new Thread(() -> {

    try {

     for (int j = 0; j < 10; j++) {

      // Use getAndIncrement to increment

      System.out.println(a.incrementAndGet());  

      Thread.sleep(500);

     }

    } catch (Exception e) {

     e.printStackTrace();

    }

   });

   threads[i].start();

  }

 }

}

Copy the code

Now we use the AtomicInteger class and call the incrementAndGet method to incrementA. How is this incrementAndGet implemented? We can take a look at the source code of AtomicInteger.

/ * *

* Atomically increments by one the current value.

* @return the updated value

* /

public final int incrementAndGet(a) {

    return unsafe.getAndAddInt(this, valueOffset, 1) + 1;

}

Copy the code

We can see at this point that usafe is actually calling getAndAddInt, but we can’t see that right now, so let’s dive into the source code and see how it’s implemented,

public final int getAndAddInt(Object var1, long var2, int var4) {   

 int var5;     

 do {          

  var5 = this.getIntVolatile(var1, var2);   

 } while(!this.compareAndSwapInt(var1, var2, var5, var5 + var4));    

 return var5;   

}

Copy the code

At this point, it becomes a little clearer that the underlying method is called compareAndSwapInt, which is essentially the CAS mechanism. So if we want to understand how AtomicInteger’s atomic operations are implemented, we have to understand the CAS mechanism, which is why we need to understand the CAS mechanism.

2. CAS analysis

1. Basic meaning

CAS full spell is also called compareAndSwap, from the name of the meaning we know is the meaning of comparison exchange. What do you compare and exchange?

It takes three parameters, CAS (V, E, N), where V represents the value of the variable to be updated, E represents the expected value, and N represents the new value. V is set to N only if V is equal to E. If V and E are different, another thread has done two updates, and the current thread does nothing. Finally, CAS returns the true value of the current V.

To illustrate this process, let’s use an example I gave earlier:

Like getting engaged to your son. Your son is the memory location. You thought your son was with Yang Guifei, but when you got engaged, you found xi Shi beside your son. What to do at this point? You don't do anything in anger. If your son's side is expected to be Yang Guifei, you look very happy to give them engaged, also known as the implementation of the operation. Now you get the idea.

CAS operates with an optimistic attitude, always thinking it can successfully complete the operation. So CAS is also called optimistic lock, but what is a pessimistic lock? Pessimistic lock is what we used to know as synchronized. The pessimistic lock idea can be understood as a thread that wants to acquire the lock but cannot, and must be released by someone else.

2. Underlying principles

To understand the underlying principle, it is best to dive into the source code, which we have seen above is actually Usafe method to complete, in this method using the compareAndSwapInt CAS mechanism. So now it’s worth taking a closer look:

public final class Unsafe {

    // compareAndSwapInt is a native method

    public final native boolean compareAndSwapInt(

     Object o, 

     long offset,

        int expected,

        int x

    );

    // There are still many ways to do this

}

Copy the code

The first parameter is the object we operate on, the second parameter is the address offset of object A, the third parameter is what we expect the value of a to be, and the fourth parameter is the actual value of a.

But here we will see that compareAndSwapInt is a native method, which means that further down is C code, and if we are curious, we can go further.

UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, 

           jobject obj, jlong offset, jint e, jint x))

  UnsafeWrapper("Unsafe_CompareAndSwapInt");

  oop p = JNIHandles::resolve(obj);

  // Calculate the address of value based on the offset valueOffset

  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);

  // Call the function CMPXCHG in Atomic to do the comparison exchange

  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;

UNSAFE_END

Copy the code

First, we use Jint to calculate the address of the value. Then, we use Atomic CMPXCHG to compare and exchange values based on the address. Now the problem is thrown to this CMPXCHG, which actually implements this function. Look a little deeper and the truth is not far off.

unsigned Atomic::cmpxchg(unsigned int exchange_value,

                         volatile unsigned int* dest, 

                         unsigned int compare_value) {

    assert(sizeof(unsigned int) == sizeof(jint), "more work to do");

  / *

* Call overloaded functions on different platforms based on the operating system type,

During precompilation, the compiler decides which platform overloaded functions to call

* /

    return (unsigned int)Atomic::cmpxchg((jint)exchange_value, 

                     (volatile jint*)dest, (jint)compare_value);

}

Copy the code

Now different CMPXCHG overloaded functions will be called on different operating systems. I’m using Windows 10 now, so let’s take a look at the implementation of this platform, don’t worry about it further:

inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, 

       jint compare_value) {

  int mp = os::is_MP();

  __asm {

    mov edx, dest

    mov ecx, exchange_value

    mov eax, compare_value

    LOCK_IF_MP(mp)

    cmpxchg dword ptr [edx], ecx

  }

}

Copy the code

This part of the code is a little bit related to the assembly instruction code, which is completely close to the truth. The first three move instructions represent the following values to the previous register. Then LOCK_IF_MP is called and the following CMPXCHG assembly instruction is swapped. Now we don’t know how the LOCK_IF_MP and CMPXCHG are swapped, but let’s dig a little deeper at the end.

The truth came. He came. He really came.

inline jint Atomic::cmpxchg (jint exchange_value, 

                             volatile jint* dest, jint compare_value) {

  //1. Check whether the CPU is multi-core

  int mp = os::is_MP();

  __asm {

    //2. Put the parameter values into the register

    mov edx, dest   

    mov ecx, exchange_value

    mov eax, compare_value 

    //3. LOCK_IF_MP

    cmp mp, 0

    // if mp = 0, the thread is running on a single CPU. At this point, JE will jump to the L0 marker and directly execute CMPXCHG instruction

    je L0

    _emit 0xF0

//5

L0:

    / *

* Compare and swap. A brief explanation of the following instruction, familiar with assembly friends can skip the following explanation:

* CMPXCHG: Compare and swap instruction

* Dword: double word, double word, double word

* PTR: pointer; used in conjunction with the preceding dword to indicate that the accessed memory unit is a double-word unit

* This instruction means:

Compare the value in the EAX register (compare_value) with the value in the [EDX] two-word memory cell,

If they are the same, exchange_value in the ECX register is stored in the [EDX] memory cell.

* /

    cmpxchg dword ptr [edx], ecx

  }

}

Copy the code

At this point, I believe you understand the mechanism that CAS really implements, which is ultimately done by the operating system’s assembly instructions.

3. Advantages and disadvantages of CAS mechanism

(1) Advantages

As we mentioned at the beginning of the article, CAS is an optimistic lock, and it is a non-blocking lightweight optimistic lock. What is non-blocking? It’s basically a thread that wants to acquire a lock, and the other thread will give a response indicating whether the lock can be acquired. In the case of low resource competition, synchronized has high performance. Compared with synchronized weight lock, synchronized performs more complex locking, unlocking, and awakening operations.

(2) Disadvantages

Disadvantages are also a very important topic because of a well-known problem called ABA. Suppose A variable, A, is changed to B and then changed to A, the CAS mechanism is undetectable, but has actually been changed. That’s the ABA problem,

ABA problems can cause a lot of problems, such as data inconsistency and so on. We can give an example to illustrate.

You have a bottle of water on the table, and someone else drinks it up and pours it back up. When you drink it again, the water is the same as before, so you mistake it for the same glass of water. If you knew the truth, would you use it again if someone else did? To take a more yellow example,Your girlfriend came back after someone else slept with you, the same girlfriend?

ABA can be addressed in a number of ways that have been given in other articles.

Do you really understand the CAS mechanism? (Deep into operating system analysis)

1. Why is CAS required?

2. CAS analysis

1. Basic meaning

2. Underlying principles

3. Advantages and disadvantages of CAS mechanism

Related Posts

Python Enumerated types Enum Usage description

Flink on Zeppelin Best practices for flow computation processing

What are the fundamentals and principles of concurrent programming?