JVM_05 Memory model

Final notes for the JVM foundations chapter. The JVM interview series should be updated accordingly. This section deserves a lot of emphasis, especially the parts related to lock optimization such as CAS, lightweight locks, and biased locks.

1. Java memory model

Many people confuse the Java memory structure with the Java Memory Model; JMM is short for Java Memory Model.

In simple terms, the JMM defines a set of rules and guarantees for the visibility, ordering, and atomicity of shared data (member variables, arrays) read and written by multiple threads

1.1 Atomicity

We talked about atomicity in threads, but here’s a quick reminder:

So the question is: if two threads increment and decrement a static variable with an initial value of 0, 5000 times each, is the result guaranteed to be 0?
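
A minimal sketch of the experiment, without any synchronization (it mirrors the structure of the solution shown later in 1.3):

static int i = 0;

public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) { i++; } // not atomic
    });
    Thread t2 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) { i--; } // not atomic
    });
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(i); // often a non-zero value
}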

1.2 Problem Analysis

The result can be positive, negative, or zero. Why? Because in Java, incrementing and decrementing a static variable is not an atomic operation.

For example, i++ (where i is a static variable) actually produces the following JVM bytecode instructions:

getstatic i   // get the value of static variable i
iconst_1      // prepare the constant 1
iadd          // add
putstatic i   // store the result back into static variable i

The corresponding i-- is similar:

getstatic i   // get the value of static variable i
iconst_1      // prepare the constant 1
isub          // subtract
putstatic i   // store the result back into static variable i

In the Java memory model, completing the increment or decrement of a static variable requires exchanging data between main memory and the thread's working memory:

If these 8 instructions execute sequentially (without interleaving), there is no problem:

// Assume i starts at 0
getstatic   i   // thread 1 - get the value of static variable i
iconst_1        // thread 1 - prepare the constant 1
iadd            // thread 1 - add; in-thread i = 1
putstatic   i   // thread 1 - store the result into static variable i; static i = 1
getstatic   i   // thread 1 - get the value of static variable i
iconst_1        // thread 1 - prepare the constant 1
isub            // thread 1 - subtract; in-thread i = 0
putstatic   i   // thread 1 - store the result into static variable i; static i = 0

But in a multithreaded environment these eight instructions can interleave (why? think about it). When the result is negative:

// Assume i starts at 0
getstatic   i   // thread 1 - get the value of static variable i
getstatic   i   // thread 2 - get the value of static variable i
iconst_1        // thread 1 - prepare the constant 1
iadd            // thread 1 - add; in-thread i = 1
putstatic   i   // thread 1 - store the result into static variable i; static i = 1
iconst_1        // thread 2 - prepare the constant 1
isub            // thread 2 - subtract; in-thread i = -1
putstatic   i   // thread 2 - store the result into static variable i; static i = -1

When a positive number occurs:

// Assume i starts at 0
getstatic   i   // thread 1 - get the value of static variable i
getstatic   i   // thread 2 - get the value of static variable i
iconst_1        // thread 1 - prepare the constant 1
iadd            // thread 1 - add; in-thread i = 1
iconst_1        // thread 2 - prepare the constant 1
isub            // thread 2 - subtract; in-thread i = -1
putstatic   i   // thread 2 - store the result into static variable i; static i = -1
putstatic   i   // thread 1 - store the result into static variable i; static i = 1

1.3 Solutions

The solution is synchronized. Syntax:

synchronized( object ) {
    // code that must execute as an atomic unit
}

Using synchronized to solve the concurrency problem:

static int i = 0;
static Object obj = new Object();

public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) {
            synchronized (obj) { i++; }
        }
    });
    Thread t2 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) {
            synchronized (obj) { i--; }
        }
    });
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(i);
}

How to think about it: you can think of obj as a room, and threads t1 and t2 as two people.

When thread t1 reaches synchronized(obj), it is as if t1 entered the room and locked the door behind it, executing the i++ code inside.

When t2 also reaches synchronized(obj), it finds the door locked and can only wait outside.

Only when t1 finishes executing its synchronized{} block does it unlock the door and leave the obj room. The t2 thread can then enter the obj room, lock the door, and execute its i-- code.

Note: in the example above, the t1 and t2 threads must use synchronized to lock the same obj object. If t1 locks object m1 and t2 locks object m2, it is like two people entering two different rooms, and no synchronization is achieved.
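
For contrast, a sketch of the broken variant (m1 and m2 are illustrative names): each thread locks a different object, so there is no mutual exclusion and i++ / i-- can still interleave.

static int i = 0;
static Object m1 = new Object();
static Object m2 = new Object();

// t1 locks m1, t2 locks m2: two different "rooms", so no synchronization effect
Thread t1 = new Thread(() -> {
    for (int j = 0; j < 5000; j++) {
        synchronized (m1) { i++; }
    }
});
Thread t2 = new Thread(() -> {
    for (int j = 0; j < 5000; j++) {
        synchronized (m2) { i--; }
    }
});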

2. Visibility

2.1 A loop that cannot exit

The main thread's change to the run variable is not visible to thread t, so thread t cannot stop:

static boolean run = true;

public static void main(String[] args) throws InterruptedException {
    Thread t = new Thread(() -> {
        while (run) {
            // ...
        }
    });
    t.start();
    Thread.sleep(1000);
    run = false; // thread t does not stop as expected
}

Why is that? Analysis:

  1. In the initial state, thread t has just read the value of run from main memory into its working memory.

  2. Because thread t reads the value of run frequently, the JIT compiler caches the value of run in the thread's own working-memory cache, reducing accesses to run in main memory and improving efficiency.

  3. After 1 second, the main thread changes the value of run and writes it back to main memory, but t keeps reading the value from the cache in its own working memory, so it always sees the old value.

2.2 Solutions

volatile (the volatile keyword)

It can modify member variables and static member variables. It prevents a thread from reading the variable's value from its own working cache: the thread must fetch the value from main memory.
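
A minimal sketch of the fix for the earlier loop example; only the declaration of run changes:

volatile static boolean run = true; // reads of run now go to main memory

public static void main(String[] args) throws InterruptedException {
    Thread t = new Thread(() -> {
        while (run) {
            // ...
        }
    });
    t.start();
    Thread.sleep(1000);
    run = false; // the change is now visible to thread t, which exits the loop
}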

2.3 Visibility

volatile ensures that one thread's change to the variable is visible to other threads. It does not guarantee atomicity; it is only suitable for one writer thread and multiple reader threads. The instruction stream looks like this:

getstatic run    // Thread T gets run true
getstatic run    // Thread T gets run true
getstatic run    // Thread T gets run true
getstatic run    // Thread T gets run true
putstatic run    // Thread main changes run to false, only once
getstatic run    // Thread T gets run false

Compare this with the earlier example: with two threads, one doing i++ and the other i--, volatile only guarantees that each thread sees the latest value; it cannot prevent the instructions from interleaving:

// Assume i starts at 0
getstatic   i   // thread 1 - get the value of static variable i
getstatic   i   // thread 2 - get the value of static variable i
iconst_1        // thread 1 - prepare the constant 1
iadd            // thread 1 - add; in-thread i = 1
putstatic   i   // thread 1 - store the result into static variable i; static i = 1
iconst_1        // thread 2 - prepare the constant 1
isub            // thread 2 - subtract; in-thread i = -1
putstatic   i   // thread 2 - store the result into static variable i; static i = -1

Note that synchronized statement blocks guarantee both the atomicity of the code inside them and the visibility of the variables they use. The drawback is that synchronized is a heavyweight operation with relatively lower performance.

If you add System.out.println() to the loop in the previous example, thread t correctly sees changes to the run variable even without the volatile modifier. Think about why.

3. Ordering

3.1 Weird results

int num = 0;
boolean ready = false;

// Thread 1 executes this method
public void actor1(I_Result r) {
    if (ready) {
        r.r1 = num + num;
    } else {
        r.r1 = 1;
    }
}

// Thread 2 executes this method
public void actor2(I_Result r) {
    num = 2;
    ready = true;
}

I_Result is an object with a property r1 that holds the result. How many possible results are there?

Some students analyze it like this:

Case 1: thread 1 runs first; ready = false, so it takes the else branch and the result is 1.

Case 2: thread 2 runs first but only gets as far as num = 2 and has not yet executed ready = true; thread 1 then runs, still takes the else branch, and the result is 1.

Case 3: thread 2 runs through ready = true before thread 1 runs; thread 1 then takes the if branch and the result is 4 (because num = 2 has already executed). But I am telling you, the result can also be 0 😁😁😁, believe it or not!

In that case, thread 2 executes ready = true, execution switches to thread 1, which enters the if branch and computes 0 + 0, and then switches back to thread 2 to execute num = 2.

I believe many people have fainted 😵😵😵

This phenomenon, called instruction reordering, is a run-time optimization of the JIT compiler that requires extensive testing to reproduce:

This can be reproduced with the Java concurrency stress-testing tool jcstress: wiki.openjdk.java.net/display/Cod…

mvn archetype:generate -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jcstress \
  -DarchetypeArtifactId=jcstress-java-test-archetype \
  -DgroupId=org.sample -DartifactId=test -Dversion=1.0

This creates a Maven project; add the following test class:

@JCStressTest
@Outcome(id = {"1", "4"}, expect = Expect.ACCEPTABLE, desc = "ok")
@Outcome(id = "0", expect = Expect.ACCEPTABLE_INTERESTING, desc = "!!!!")
@State
public class ConcurrencyTest {

    int num = 0;
    boolean ready = false;

    @Actor
    public void actor1(I_Result r) {
        if (ready) {
            r.r1 = num + num;
        } else {
            r.r1 = 1;
        }
    }

    @Actor
    public void actor2(I_Result r) {
        num = 2;
        ready = true;
    }
}

Run:

mvn clean install
java -jar target/jcstress.jar

The output flags the results we are interested in; here is an extract:

*** INTERESTING tests
  Some interesting behaviors observed. This is for the plain curiosity.

  2 matching test results.

  [OK] test.ConcurrencyTest
    (JVM args: [-XX:-TieredCompilation])
    Observed state   Occurrences   Expectation              Interpretation
                 0         1.729   ACCEPTABLE_INTERESTING   !!!!
                 1    42.617.915   ACCEPTABLE               ok
                 4     5.146.627   ACCEPTABLE               ok

  [OK] test.ConcurrencyTest
    (JVM args: [])
    Observed state   Occurrences   Expectation              Interpretation
                 0         1.652   ACCEPTABLE_INTERESTING   !!!!
                 1    46.460.657   ACCEPTABLE               ok
                 4     4.571.072   ACCEPTABLE               ok

As you can see, a result of 0 did occur; it is relatively rare, but it happens.

3.2 Solutions

Declaring ready as volatile disables the instruction reordering:

@JCStressTest
@Outcome(id = {"1", "4"}, expect = Expect.ACCEPTABLE, desc = "ok")
@Outcome(id = "0", expect = Expect.ACCEPTABLE_INTERESTING, desc = "!!!!")
@State
public class ConcurrencyTest {

    int num = 0;
    volatile boolean ready = false;

    @Actor
    public void actor1(I_Result r) {
        if (ready) {
            r.r1 = num + num;
        } else {
            r.r1 = 1;
        }
    }

    @Actor
    public void actor2(I_Result r) {
        num = 2;
        ready = true;
    }
}

The result is:

*** INTERESTING tests
  Some interesting behaviors observed. This is for the plain curiosity.

  0 matching test results.

3.3 Understanding ordering

The JVM can adjust the order in which statements are executed without compromising correctness. Consider the following code

static int i;
static int j;

// Perform the following assignments in some thread
i = ...; // a relatively time-consuming operation
j = ...;

As you can see, it makes no difference whether i or j is assigned first. So when the code above actually executes, it can be either

i = ...; // a relatively time-consuming operation
j = ...;

It can also be

j = ...;
i = ...; // a relatively time-consuming operation

This feature is called instruction reordering. In multithreaded code, instruction reordering can affect correctness, for example in the famous double-checked locking implementation of a singleton:

public final class Singleton {
    private Singleton() { }
    private static Singleton INSTANCE = null;

    public static Singleton getInstance() {
        // Enter the synchronized block only if the instance has not been created yet
        if (INSTANCE == null) {
            synchronized (Singleton.class) {
                // Another thread may have created the instance in the meantime, so check again
                if (INSTANCE == null) {
                    INSTANCE = new Singleton();
                }
            }
        }
        return INSTANCE;
    }
}

The implementation above has these features:

  • Lazy instantiation
  • synchronized is only entered during the first calls to getInstance()

However, in a multithreaded environment the code above is problematic. INSTANCE = new Singleton() corresponds to the following bytecode:

0: new            #2     // class cn/itcast/jvm/t4/Singleton
3: dup
4: invokespecial  #3     // Method "<init>":()V
7: putstatic      #4     // Field INSTANCE:Lcn/itcast/jvm/t4/Singleton;

The order of steps 4 and 7 is not fixed: the JVM may optimize by assigning the reference address to the INSTANCE variable before executing the constructor. If two threads t1 and t2 execute in the following order:

time1  thread t1 executes INSTANCE = new Singleton()
time2  thread t1 allocates space and generates a reference address for the Singleton object (0)
time3  thread t1 assigns the reference address to INSTANCE; now INSTANCE != null (7)
time4  thread t2 enters getInstance(), finds INSTANCE != null (outside the synchronized block), and returns INSTANCE directly
time5  thread t1 executes the Singleton constructor (4)

At time4, t1 has not yet finished executing the constructor; if the constructor does a lot of initialization, t2 gets a singleton that has not been fully initialized.

Instruction reordering can be disabled by declaring INSTANCE volatile, but note that this only works on JDK 5 and later.
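
A sketch of the fix: the same class as above with INSTANCE declared volatile, so the reference assignment cannot be reordered before the constructor finishes:

public final class Singleton {
    private Singleton() { }
    private static volatile Singleton INSTANCE = null;

    public static Singleton getInstance() {
        if (INSTANCE == null) {
            synchronized (Singleton.class) {
                if (INSTANCE == null) {
                    INSTANCE = new Singleton();
                }
            }
        }
        return INSTANCE;
    }
}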

3.4 happens-before

happens-before specifies which writes to a shared variable are visible to other threads' reads of that variable. It is a summary of the rules for visibility and ordering. Apart from the following happens-before rules, the JMM does not guarantee that one thread's write to a shared variable is visible to another thread's read of that variable.

  • A thread's writes to a variable before it unlocks m are visible to other threads that subsequently lock m and read that variable
static int x;
static Object m = new Object();

new Thread(() -> {
    synchronized (m) {
        x = 10;
    }
}, "t1").start();

new Thread(() -> {
    synchronized (m) {
        System.out.println(x);
    }
}, "t2").start();
  • A thread's write to a volatile variable is visible to other threads' subsequent reads of that variable
volatile static int x;

new Thread(()->{
    x = 10;
},"t1").start();

new Thread(()->{
    System.out.println(x);
},"t2").start();
  • A write to a variable before a thread is started is visible to that thread's reads of the variable after it starts
static int x;

x = 10;

new Thread(()->{
    System.out.println(x);
},"t2").start();
  • A thread's write to a variable before it terminates is visible to reads by other threads after they learn it has terminated (for example, by calling t1.isAlive() or t1.join() to wait for it to end)
static int x;

Thread t1 = new Thread(()->{
    x = 10;
},"t1");
t1.start();

t1.join();
System.out.println(x);
  • A write to a variable by thread t1 before it interrupts t2 is visible to reads by other threads after they learn that t2 has been interrupted (via t2.interrupted() or t2.isInterrupted())
static int x;
public static void main(String[] args) {
    Thread t2 = new Thread(()->{
        while(true) {
            if(Thread.currentThread().isInterrupted()) {
                System.out.println(x);
                break; }}},"t2");
    t2.start();
    
    new Thread(()->{
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        x = 10;
        t2.interrupt();
    },"t1").start();

    while (!t2.isInterrupted()) {
        Thread.yield();
    }
    System.out.println(x);
}
  • The write of a variable's default value (0, false, null) is visible to other threads that read the variable
  • It is transitive: if x hb→ y and y hb→ z, then x hb→ z. For example:
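
A minimal sketch (variable names are illustrative): the write to y happens-before the volatile write to x, which happens-before t2's read of x, so by transitivity t2 also sees y = 10.

volatile static int x;
static int y;

new Thread(() -> {
    y = 10; // ordinary write
    x = 20; // volatile write
}, "t1").start();

new Thread(() -> {
    if (x == 20) {             // volatile read sees 20
        System.out.println(y); // guaranteed to print 10 by transitivity
    }
}, "t2").start();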

Here, "variable" means a member variable or a static member variable.

4. CAS and atomic classes

4.1 CAS

CAS stands for Compare and Swap, which embodies the idea of optimistic locking, such as multiple threads performing +1 operations on a shared integer variable:

// keep retrying
while (true) {
    int oldValue = sharedVariable; // get the current value, e.g. 0
    int result = oldValue + 1;     // add 1 to the old value 0

    /* If another thread has changed the shared variable to 5 in the meantime,
       this thread's result 1 is stale: compareAndSwap returns false and it retries,
       until compareAndSwap returns true, meaning no other thread interfered
       while this thread was making its change */
    if (compareAndSwap(oldValue, result)) {
        // success, exit the loop
        break;
    }
}

When reading the shared variable, volatile is needed to keep it visible. CAS combined with volatile enables lock-free concurrency, which suits scenarios with low contention and multi-core CPUs.

  • Because synchronized is not used, threads don’t get blocked, which is one of the efficiency gains
  • However, if the competition is intense, you can imagine that retries will occur frequently and efficiency will suffer

Under the hood, CAS relies on the Unsafe class, which invokes the CPU's underlying CAS instructions directly. Here is an example of using the Unsafe object directly to protect thread safety:

import sun.misc.Unsafe;
import java.lang.reflect.Field;

public class TestCAS {
    public static void main(String[] args) throws InterruptedException {
        DataContainer dc = new DataContainer();
        int count = 5;
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                dc.increase();
            }
        });
        t1.start();
        t1.join();
        System.out.println(dc.getData());
    }
}

class DataContainer {
    private volatile int data;
    static final Unsafe unsafe;
    static final long DATA_OFFSET;

    static {
        try {
            // The Unsafe object cannot be obtained directly, only via reflection
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            unsafe = (Unsafe) theUnsafe.get(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new Error(e);
        }
        try {
            // Offset of the data field within a DataContainer object, used by Unsafe to access the field directly
            DATA_OFFSET = unsafe.objectFieldOffset(DataContainer.class.getDeclaredField("data"));
        } catch (NoSuchFieldException e) {
            throw new Error(e);
        }
    }

    public void increase() {
        int oldValue;
        while (true) {
            // Get the old value of the shared variable; add a breakpoint here and modify data while debugging to see the retry
            oldValue = data;
            // CAS tries to change data to oldValue + 1; returns false if another thread changed the old value in the meantime
            if (unsafe.compareAndSwapInt(this, DATA_OFFSET, oldValue, oldValue + 1)) {
                return;
            }
        }
    }

    public void decrease() {
        int oldValue;
        while (true) {
            oldValue = data;
            if (unsafe.compareAndSwapInt(this, DATA_OFFSET, oldValue, oldValue - 1)) {
                return;
            }
        }
    }

    public int getData() {
        return data;
    }
}

4.2 Optimistic locks and pessimistic locks

  • CAS is based on the idea of optimistic locking: make the most optimistic assumption and do not fear other threads modifying the shared variable; if it does get modified, no harm done, just take the small cost and retry.

  • synchronized is based on the idea of pessimistic locking: make the most pessimistic assumption and guard against other threads modifying the shared variable; once I lock it, nobody else can change it, and only after I finish and release the lock do others get a chance.

4.3 Atomic operation classes

J.U.C (java.util.concurrent) provides atomic operation classes that offer thread-safe operations, such as AtomicInteger and AtomicBoolean; they are implemented with CAS plus volatile.

AtomicInteger can be used to rewrite the previous example:

// Create an atomic integer object
private static AtomicInteger i = new AtomicInteger(0);

public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) {
            i.getAndIncrement(); // get and increment, i++
            // i.incrementAndGet(); // increment and get, ++i
        }
    });
    Thread t2 = new Thread(() -> {
        for (int j = 0; j < 5000; j++) {
            i.getAndDecrement(); // get and decrement, i--
        }
    });
    t1.start(); t2.start();
    t1.join(); t2.join();
    System.out.println(i);
}

5. Synchronized optimization

In the Java HotSpot virtual machine, every object has an object header (containing a class pointer and a Mark Word). The Mark Word normally stores the object's hash code and GC generation age; when the object is locked, this information is replaced, depending on the situation, by flag bits, a pointer to a thread's lock record, a pointer to a heavyweight lock, a thread ID, and other contents.

5.1 Lightweight locks

If multithreaded accesses are staggered in time (i.e. there is no contention), they can be optimized with lightweight locks. An analogy:

A student (thread A) reserves a seat with a textbook, attends half the class, and then leaves the room for a while (its CPU time slice runs out). When it comes back, it finds the textbook unchanged, which means there was no contention, so it continues the class.

If another student (thread B) arrives during this time, thread A is informed that there is concurrent access; thread A then upgrades the lock to a heavyweight lock and enters the heavyweight locking process.

A heavyweight lock is no longer as simple as holding a seat with a textbook; you can imagine thread A putting an iron fence around the seat before leaving.

Suppose two methods contain synchronized blocks that lock on the same object:

static Object obj = new Object();

public static void method1() {
    synchronized (obj) {
        // synchronized block A
        method2();
    }
}
public static void method2() {
    synchronized (obj) {
        // synchronized block B
    }
}

Each thread’s stack frame contains a lock record structure, which can store the Mark Word of the locked object

5.2 Lock inflation

If the CAS operation fails while trying to add a lightweight lock, one possibility is that another thread has already put a lightweight lock on the object (contention); lock inflation is then needed, turning the lightweight lock into a heavyweight lock.

static Object obj = new Object();

public static void method1() {
    synchronized (obj) {
        // synchronized block
    }
}

5.3 Heavyweight locks

Spin can also be used to optimize for heavyweight lock contention, so that if the current thread spins successfully (i.e. the thread holding the lock has exited the block and released the lock), then the current thread can avoid blocking.

Since Java 6, spinning is adaptive: for example, if a spin on an object has just succeeded, the JVM assumes the probability of success is high and allows more spins; otherwise it spins less or skips spinning entirely. In short, it is smarter.

  • Spinning consumes CPU time; on a single-core CPU spinning is a waste, and it only pays off on multi-core CPUs.
  • It is like whether a car keeps its engine running at a red light: keeping it running is like spinning (worthwhile for a short wait), turning it off is like blocking (worthwhile for a long wait).
  • Since Java 7 you can no longer control whether spinning is enabled.

Spin retry successful

Spin retry failure

5.4 Biased locking

Even without contention (only the owning thread relocking), a lightweight lock still needs a CAS operation on every reentry. Java 6 introduced biased locking as a further optimization: CAS is used only the first time, to install the thread ID into the object's Mark Word; afterwards, if the thread ID is found to be its own, there is no contention and no further CAS is needed.

  • Revoking the bias requires upgrading the lock of the thread holding it to a lightweight lock; during this process all threads are paused (STW)
  • Accessing an object's hashCode also revokes its biased lock
  • If an object is accessed by multiple threads but without contention, an object biased toward thread t1 can be rebiased toward thread t2; rebiasing resets the thread ID in the object's Mark Word
  • Bias revocation and rebiasing are done in bulk, per class
  • If bias revocations for a class exceed a certain threshold, all objects of that class become non-biasable
  • You can explicitly disable biased locking with -XX:-UseBiasedLocking

You can refer to this paper: www.oracle.com/technetwork… 149958.pdf

Again, suppose two methods contain synchronized blocks that lock on the same object:

static Object obj = new Object();

public static void method1() {
    synchronized (obj) {
        // synchronized block A
        method2();
    }
}
public static void method2() {
    synchronized (obj) {
        // synchronized block B
    }
}

5.5 Other Optimizations

1. Reduce the lock time

Keep synchronized code blocks as short as possible.
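
A minimal sketch (the class and method names are illustrative, not from the original notes): keep the time-consuming work outside the lock and hold the lock only for the shared-state update.

class Holder {
    private int value;

    public void update(int input) {
        int processed = expensiveCompute(input); // long-running work stays outside the lock
        synchronized (this) {
            value = processed;                   // the lock is held only for the shared write
        }
    }

    private int expensiveCompute(int input) {
        return input * 2; // placeholder for the time-consuming part
    }
}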

2. Reduce the granularity of locks

Splitting a lock into multiple locks improves concurrency, for example:

  • ConcurrentHashMap
  • LongAdder is split into a base and an array of cells. When there is no contention, or while the cells array is still being initialized, values are accumulated into base with CAS; under contention, the cells array is created so that as many cells as there are threads can be modified in parallel, and the final value is the sum of every cell in the array plus base (see the usage sketch after this list)
  • LinkedBlockingQueue uses different locks for enqueue and dequeue; compared with LinkedBlockingArray, which uses only one lock, this is more efficient
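
A small usage sketch of LongAdder (the demo class is illustrative):

import java.util.concurrent.atomic.LongAdder;

public class LongAdderDemo {
    public static void main(String[] args) {
        LongAdder adder = new LongAdder();
        adder.increment();               // no contention: CAS onto base; under contention: CAS onto a cell
        adder.increment();
        System.out.println(adder.sum()); // base plus the sum of all cells = 2
    }
}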

3. Lock coarsening

Entering and exiting a synchronized block repeatedly inside a loop is worse than putting the whole loop inside one synchronized block.

In addition, the JVM may coarsen the locks of several consecutive append calls into a single lock (they all lock the same object, so there is no need to enter and exit the lock repeatedly):

new StringBuffer().append("a").append("b").append("c");

4. Lock elimination

The JVM performs escape analysis on the code; if a lock object is a local variable of a method and can never be accessed by another thread, the just-in-time compiler eliminates all synchronization operations on it.
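
A minimal sketch of code that escape analysis can optimize (the method is illustrative): the lock object o never escapes the method, so the JIT compiler can drop the synchronization.

public static int sum() {
    Object o = new Object(); // o is a local variable; no other thread can ever lock it
    int result = 0;
    for (int i = 0; i < 10; i++) {
        synchronized (o) {   // after escape analysis the JIT can eliminate this lock entirely
            result += i;
        }
    }
    return result;
}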

5. Read and write separation

CopyOnWriteArrayList

CopyOnWriteArraySet
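
A small usage sketch of CopyOnWriteArrayList (the demo class is illustrative): reads iterate over a snapshot without locking, while each write copies the underlying array.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteDemo {
    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("a"); // write: copies the array, then installs the new copy
        list.add("b");
        for (String s : list) { // read: iterates over a snapshot, never blocks writers
            System.out.println(s);
        }
    }
}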
