First, a look at optimistic locking and pessimistic locking:

Pessimistic locking: always assume the worst case. Every time you go to read the data, you assume someone else will modify it, so you lock it on every access; any other thread that tries to access the data then blocks until it can acquire the lock. Traditional relational databases use such locking mechanisms heavily: row locks, table locks, read locks, write locks, and so on all acquire the lock before the operation. The implementation of the synchronized keyword in Java is another example.

Optimistic locking: as the name implies, it is optimistic. Every time you read the data, you assume others will not modify it, so you do not lock; only when updating do you check whether someone else changed the data in the meantime, for example with a version-number mechanism. Optimistic locking is suitable for read-heavy applications and improves throughput. Databases provide mechanisms similar to write_condition for it. In Java, the atomic variable classes in the java.util.concurrent.atomic package are implemented using CAS, one way of realizing optimistic locking.

Second: an implementation of optimistic locking, CAS (Compare and Swap):

Problems with locks:

Prior to JDK 1.5, Java relied on the synchronized keyword, which coordinates access to shared state through a consistent locking protocol: whichever thread holds the lock on a shared variable has exclusive access to it. An exclusive lock is a pessimistic lock, so synchronized is a pessimistic lock.

Pessimistic locking has the following problems:

1. Under multi-threaded contention, acquiring and releasing locks causes a lot of context switching and scheduling delay, which leads to performance problems.

2. A lock held by one thread causes all other threads requiring the lock to suspend.

3. If a thread with a higher priority waits for a thread with a lower priority to release the lock, priority inversion may occur, causing performance risks.

In contrast to these problems of pessimistic locking, a more efficient alternative is optimistic locking. Optimistic locking means: perform each operation without taking a lock, assuming there is no concurrent conflict; if the operation fails because of a conflict, retry until it succeeds.

Optimistic locking:

As mentioned above, optimistic locking is really a way of thinking. Compared with pessimistic locking, it assumes that in the common case the data will not be modified concurrently, so it only checks for a concurrent conflict when the update is actually submitted. If a conflict is detected, an error is returned to the user, who can then decide what to do.

The description above already reveals the implementation details: there are two main steps, conflict detection and data update. One typical implementation is Compare and Swap (CAS).
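These two steps can be sketched with a version number. The class and method names below are hypothetical, for illustration only: the expensive computation happens outside any lock (the optimistic part), and only the brief conflict check plus update at commit time is guarded.

```java
// Hypothetical sketch of version-number optimistic locking; the names are
// illustrative, not a real API.
class VersionedRecord {
    private int value;
    private int version;

    // Snapshot {value, version} before the caller computes its update.
    synchronized int[] read() {
        return new int[] { value, version };
    }

    // Conflict detection: commit succeeds only if nobody bumped the version
    // since the snapshot; otherwise the caller re-reads and retries.
    synchronized boolean commit(int expectedVersion, int newValue) {
        if (version != expectedVersion) {
            return false; // concurrent conflict detected
        }
        value = newValue; // data update
        version++;
        return true;
    }

    synchronized int get() {
        return value;
    }
}
```

A writer reads a snapshot, computes the new value without holding the lock, then calls commit; on false it retries: exactly the detect-then-update loop described above.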

CAS:

CAS is an optimistic locking technique. When multiple threads try to update the same variable using CAS, only one thread succeeds in updating the value; all the other threads fail (and typically retry).

A CAS operation involves three operands: the memory location to be read and written (V), the expected original value to compare against (A), and the new value to be written (B). If the value at memory location V matches the expected value A, the processor atomically updates that location to the new value B; otherwise it does nothing. In either case, it returns the value that was at the location before the CAS instruction. (Some variants of CAS only report whether the CAS succeeded, without returning the current value.) CAS effectively says: "I think location V should contain the value A; if it does, put B there; otherwise, do not change the location, just tell me its current value." This is exactly the optimistic-lock principle of conflict detection plus data update.
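This contract can be observed directly with Java's AtomicInteger (covered in more detail below), whose compareAndSet reports whether the swap happened:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasContract {
    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(10);  // location V currently holds A = 10
        // "I think V holds 10; if so, put 20 there."
        boolean first = v.compareAndSet(10, 20);  // succeeds: 10 -> 20
        // The stale expectation 10 no longer matches, so this CAS fails
        // and leaves the value untouched.
        boolean second = v.compareAndSet(10, 30);
        System.out.println(first + " " + second + " " + v.get()); // true false 20
    }
}
```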

Again, optimistic locking is an idea; CAS is one implementation of that idea.

Java support for CAS:

The java.util.concurrent (J.U.C) package added in JDK 1.5 is built on top of CAS. Compared with synchronized, CAS is the common way to implement non-blocking algorithms, which gives J.U.C a big performance boost.

Take AtomicInteger in java.util.concurrent as an example of how thread safety can be ensured without locking. Its getAndIncrement method is equivalent to the ++i operation.

public class AtomicInteger extends Number implements java.io.Serializable {
    private volatile int value;

    public final int get() {
        return value;
    }

    public final int getAndIncrement() {
        for (;;) {
            int current = get();
            int next = current + 1;
            if (compareAndSet(current, next))
                return current;
        }
    }

    public final boolean compareAndSet(int expect, int update) {
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }
}

In the absence of locking, the value field is declared volatile to ensure that the data is visible across threads, so get() can read it directly. Now let's see how ++i is implemented.

getAndIncrement uses a CAS operation: each iteration reads the current value from memory, then attempts a CAS with that value plus one. If the CAS succeeds, the old value is returned; otherwise the loop retries.
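As an illustrative check (the thread and iteration counts here are arbitrary), the CAS retry loop inside getAndIncrement loses no updates even under contention:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockFreeCounter {
    // Runs `threads` threads, each doing `perThread` lock-free increments.
    static int count(int threads, int perThread) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.getAndIncrement(); // CAS retry loop inside, no lock
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(4, 10_000)); // no increments are lost: 40000
    }
}
```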

compareAndSet uses JNI (Java Native Interface) to invoke CPU instructions:

public final boolean compareAndSet(int expect, int update) {
    return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
}

unsafe.compareAndSwapInt(this, valueOffset, expect, update) has logic similar to the following:

if (this == expect) {
    this = update;
    return true;
} else {
    return false;
}

That is: compare this == expect, and if they match, replace with this = update; compareAndSwapInt makes these two steps atomic. See the CAS principle below.

CAS principle:

CAS is implemented through code that calls into JNI, and compareAndSwapInt uses C/C++ to invoke the underlying CPU instructions.

The following describes how CAS is implemented, taking a common CPU (Intel x86) as an example.

The following is the source code for the compareAndSwapInt() method of the sun.misc.Unsafe class:

public final native boolean compareAndSwapInt(Object o, long offset,
                                              int expected,
                                              int x);

You can see that this is a native method call. The C++ code that this native method eventually reaches in the JDK is:

#define LOCK_IF_MP(mp) __asm cmp mp, 0 \
                       __asm je L0      \
                       __asm _emit 0xF0 \
                       __asm L0:

inline jint     Atomic::cmpxchg    (jint     exchange_value, volatile jint*     dest, jint     compare_value) {
  // alternative for InterlockedCompareExchange
  int mp = os::is_MP();
  __asm {
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    LOCK_IF_MP(mp)
    cmpxchg dword ptr [edx], ecx
  }
}

As the source above shows, the program decides whether to add the lock prefix to the cmpxchg instruction based on the current processor type. If the program runs on a multiprocessor, cmpxchg gets the lock prefix (lock cmpxchg); conversely, on a uniprocessor the lock prefix is omitted, since a single processor maintains sequential consistency by itself and does not need the memory-barrier effect that the lock prefix provides.

Drawbacks of CAS:

1. ABA problems:

For example, thread 1 fetches A from location V. Meanwhile thread 2 also fetches A from V, does some work that changes the value at V to B, and then changes it back to A. When thread 1 later performs its CAS, it finds that memory still holds A, so its CAS succeeds. Although thread 1's CAS operation succeeded, a problem may be lurking, as follows:

Thread T1 already knows that A.next is B, and wants to use CAS to replace the top of the stack A with B:

head.compareAndSet(A, B);

Before T1 executes this instruction, thread T2 intervenes: it pops A and B off the stack, then pushes D, C, and A. The top of the stack is now A again (the stack is A → C → D), and object B is left in a free, unreferenced state.

Now it is thread T1's turn to perform its CAS. It finds that the top of the stack is still A, so the CAS succeeds and the top of the stack becomes B. But B.next is actually null, so the stack now contains only the single element B, and the linked list of C and D no longer exists in the stack; C and D have simply been lost.

Since Java 1.5, the JDK atomic package has provided the class AtomicStampedReference to address the ABA problem. Its compareAndSet method first checks whether the current reference equals the expected reference and whether the current stamp equals the expected stamp; only if both match does it atomically set the reference and the stamp to the given new values.

public boolean compareAndSet(V   expectedReference, // expected reference
                             V   newReference,      // new reference
                             int expectedStamp,     // expected stamp
                             int newStamp)          // new stamp

Actual application code:

private static AtomicStampedReference<Integer> atomicStampedRef =
        new AtomicStampedReference<Integer>(100, 0);

...

atomicStampedRef.compareAndSet(100, 101, stamp, stamp + 1);
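Putting the pieces together, here is a self-contained sketch of defeating ABA with the stamp. (Note that compareAndSet compares the reference with ==; that works in this sketch only because small Integer values are cached by the JVM.)

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    public static void main(String[] args) {
        AtomicStampedReference<Integer> ref =
                new AtomicStampedReference<>(100, 0);
        int stamp = ref.getStamp();   // snapshot: value 100, stamp 0

        // Another thread changes 100 -> 50 -> 100: the value is "A" again,
        // but each change advances the stamp.
        ref.compareAndSet(100, 50, ref.getStamp(), ref.getStamp() + 1);
        ref.compareAndSet(50, 100, ref.getStamp(), ref.getStamp() + 1);

        // A plain CAS on the value alone would succeed here;
        // the stale stamp exposes the hidden ABA change.
        boolean updated = ref.compareAndSet(100, 101, stamp, stamp + 1);
        System.out.println(updated + " " + ref.getReference()); // false 100
    }
}
```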
2. Long spin time and high overhead:

Spin CAS (looping until success) can be very expensive for the CPU if it keeps failing for a long time. The JVM can improve performance on processors that provide a pause instruction. The pause instruction has two effects: first, it delays the pipelined execution of subsequent instructions so that the CPU does not consume too many execution resources; second, it avoids the CPU pipeline flush caused by a memory-order violation when exiting the loop, which improves CPU efficiency.
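The JDK exposes this hint as Thread.onSpinWait() (since Java 9), which on x86 typically compiles to the pause instruction. A minimal busy-wait sketch (awaitFlag is an illustrative name):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinWait {
    // Spin until the flag becomes true, using the JDK's spin-loop hint.
    static void awaitFlag(AtomicBoolean flag) {
        while (!flag.get()) {
            Thread.onSpinWait(); // hint to the CPU that we are busy-waiting
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean flag = new AtomicBoolean(false);
        Thread waiter = new Thread(() -> awaitFlag(flag));
        waiter.start();
        flag.set(true);   // release the spinner
        waiter.join();    // returns promptly once the spin loop observes the flag
        System.out.println("done");
    }
}
```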

    

3. Atomicity can be guaranteed for only one shared variable:

When operating on a single shared variable, a CAS loop can guarantee atomicity, but for multiple shared variables a CAS loop cannot make the combined operation atomic. In that case you can use a lock, or use the trick of combining the multiple shared variables into one. For example, with two shared variables i = 2 and j = a, combine them into ij = 2a, and then operate on ij with CAS. Since Java 1.5 the JDK has provided the AtomicReference class to guarantee atomicity when updating a reference, so you can put multiple variables into one object and perform CAS on that object.
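A sketch of that AtomicReference trick: wrap the variables in an immutable object so that a single CAS updates both together (the Pair class here is illustrative, not a JDK type):

```java
import java.util.concurrent.atomic.AtomicReference;

public class PairCas {
    // Immutable pair: a single CAS on the reference swaps both fields at once.
    static final class Pair {
        final int i;
        final String j;
        Pair(int i, String j) { this.i = i; this.j = j; }
    }

    public static void main(String[] args) {
        AtomicReference<Pair> ref = new AtomicReference<>(new Pair(2, "a"));
        Pair snapshot = ref.get();
        // Both fields change together or not at all.
        boolean ok = ref.compareAndSet(snapshot, new Pair(3, "b"));
        System.out.println(ok + " " + ref.get().i + ref.get().j); // true 3b
    }
}
```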

When to use CAS and when to use synchronized:

1. When resource contention is light (few thread conflicts), synchronized wastes extra CPU on blocking and waking threads and on switching between user mode and kernel mode; CAS, on the other hand, is hardware-based, needs no kernel entry and no thread switching, and with only a few spins achieves higher performance.

2. When resource contention is heavy (severe thread conflicts), CAS spins with high probability, wasting even more CPU, and its efficiency falls below that of synchronized.

synchronized has been improved since JDK 1.6; its underlying implementation relies mainly on a lock-free queue, and its basic idea is to spin first and then block: it keeps competing for the lock after a contention switch, sacrificing a little fairness for higher throughput. With few thread conflicts it achieves performance similar to CAS, and with severe thread conflicts its performance is much higher than that of CAS.

Implementation of the concurrent package:

Because a Java CAS has the memory semantics of both a volatile read and a volatile write, Java threads can now communicate in the following four ways:

1. Thread A writes A volatile variable, and thread B reads it.

2. Thread A writes the volatile variable, and thread B updates the volatile variable with CAS.

3. Thread A updates A volatile variable with CAS, and thread B updates the volatile variable with CAS.

4. Thread A updates A volatile variable with CAS, and thread B reads the volatile variable.

Java CAS uses the efficient machine-level atomic instructions provided by modern processors, which atomically perform read-modify-write operations on memory; this is the key to synchronization on multiprocessors. (Essentially, a machine that supports atomic read-modify-write instructions is, as an asynchronous machine, computationally equivalent to a sequential Turing machine, so any modern multiprocessor supports some atomic instruction that performs an atomic read-modify-write on memory.) Meanwhile, volatile reads/writes and CAS enable communication between threads. Taken together, these features form the building blocks for the implementation of the concurrent package. Looking closely at the source code of the concurrent package, we find a common implementation pattern:

1. First, declare the shared variable volatile;

2. Then, the atomic conditional update of CAS is used to achieve synchronization between threads;

3. At the same time, the thread communication is realized with volatile read/write and CAS memory semantics.

AQS, the non-blocking data structures, and the atomic variable classes (the classes in the java.util.concurrent.atomic package) are the base classes in the concurrent package implemented with this pattern, and the higher-level classes in the package are in turn built on these base classes.
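As an illustration of this pattern, here is a minimal Treiber stack, the classic non-blocking structure built from a volatile head pointer plus a CAS retry loop. It is a sketch only; a production version would also have to consider the ABA problem discussed above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal non-blocking stack: volatile shared state + CAS conditional update.
public class TreiberStack<E> {
    private static final class Node<E> {
        final E item;
        final Node<E> next;
        Node(E item, Node<E> next) { this.item = item; this.next = next; }
    }

    // AtomicReference gives volatile read/write plus CAS on the head pointer.
    private final AtomicReference<Node<E>> head = new AtomicReference<>();

    public void push(E item) {
        Node<E> oldHead;
        Node<E> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(item, oldHead);
        } while (!head.compareAndSet(oldHead, newHead)); // retry on conflict
    }

    public E pop() {
        Node<E> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) {
                return null; // empty stack
            }
        } while (!head.compareAndSet(oldHead, oldHead.next)); // retry on conflict
        return oldHead.item;
    }
}
```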

CAS in the JVM (allocation of objects in the heap):

Calling new Object() in Java creates an object that is allocated on the JVM's heap. How exactly does the object find its place in the heap?

First, the amount of space a new Object() needs is already determined by the time it executes, because Java data types have fixed sizes. The next job is to find space in the heap to hold the object. In the single-threaded case, there are generally two allocation strategies:

1. Pointer collision: this generally applies when memory is perfectly tidy (which depends on the reclamation policy); allocating space just means moving the pointer toward the free side of memory by the size of the object.

2. Free list: this applies when memory is fragmented; the JVM maintains a list recording which memory areas are free and how large each one is. When allocating space for an object, it looks up a suitable area in the free list and allocates it.

But the JVM cannot always run in a single-threaded state; that would be far too inefficient. Since allocating memory for an object is not an atomic operation, the steps of looking up the free list, allocating the memory, and modifying the free list are not thread-safe by themselves. There are likewise two strategies for making allocation safe under concurrency:

1. CAS: the virtual machine uses CAS with retry to make the update operation atomic; the principle is the same as described above.

2. TLAB: using CAS alone still costs performance, so the JVM offers a further optimization: each thread preallocates a small chunk of memory in the Java heap, called a Thread Local Allocation Buffer (TLAB). When a thread needs to allocate memory, it allocates directly from its own TLAB, avoiding thread conflicts; a CAS operation is needed only when the buffer is exhausted and a new TLAB must be carved out. Whether the VM uses TLABs can be set with the -XX:+/-UseTLAB parameter (JDK 5 and later enable TLAB by default).
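A toy model (not real JVM code) of how CAS-with-retry makes pointer-collision allocation safe: the free pointer is bumped atomically, so two threads can never be handed the same region.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy bump-pointer allocator guarded by CAS; purely illustrative.
public class BumpAllocator {
    private final AtomicLong top = new AtomicLong(0); // current "free pointer"
    private final long limit;                          // end of the heap region

    BumpAllocator(long limit) {
        this.limit = limit;
    }

    /** Returns the start offset of the allocated block, or -1 if out of space. */
    long allocate(long size) {
        while (true) {
            long old = top.get();
            long next = old + size;
            if (next > limit) {
                return -1;                        // heap region exhausted
            }
            if (top.compareAndSet(old, next)) {   // CAS + retry, as the JVM does
                return old;
            }
            // CAS failed: another thread bumped the pointer first; retry.
        }
    }
}
```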

Reprinted from: www.cnblogs.com/qjjazry/p/6…