Three main features of concurrent programming

Atomicity

For one or more operations, either all of them are performed without interruption, or none of them are performed.

For access to basic data types, reads and writes are atomic (with possible exceptions for longs and doubles).

If you need a wider range of atomicity guarantees, you can use the synchronized keyword.
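As a sketch of widening the atomicity guarantee with synchronized (the counter class here is hypothetical, made up for illustration): a check-then-act sequence is not atomic on its own, but wrapping it in a synchronized method makes the whole compound operation atomic with respect to other threads using the same lock.

```java
// A minimal sketch: check-then-act made atomic with synchronized.
public class BoundedCounter {
    private int count = 0;
    private final int max;

    public BoundedCounter(int max) {
        this.max = max;
    }

    // Without synchronized, two threads could both pass the check
    // and push count past max; with it, the check and the increment
    // execute as one uninterruptible unit.
    public synchronized boolean incrementIfBelowMax() {
        if (count < max) {
            count++;
            return true;
        }
        return false;
    }

    public synchronized int get() {
        return count;
    }
}
```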

Visibility

When a thread modifies a shared variable, other threads can immediately see the updated value.

In addition to volatile, which guarantees visibility of shared variables, both synchronized and final provide visibility.

synchronized: a variable must be synchronized back to main memory before unlock is executed.

final: once a final field is initialized in the constructor, and the constructor does not leak a reference to this, the value of the final field is visible to other threads.
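A sketch of the final-field guarantee described above (the class name and field are hypothetical):

```java
// If the final field is fully initialized in the constructor and
// `this` does not escape during construction, any thread that later
// obtains a reference to the object is guaranteed to see the
// initialized value of the final field, with no synchronization.
public class ImmutableConfig {
    private final int timeoutMillis; // safely published via final

    public ImmutableConfig(int timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
        // Do NOT register `this` with anything here; leaking `this`
        // from the constructor would void the final-field guarantee.
    }

    public int getTimeoutMillis() {
        return timeoutMillis;
    }
}
```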

Ordering

That is, the program executes in the order in which the code was written. volatile ensures ordering by prohibiting instruction reordering. The synchronized keyword also guarantees ordering, via the rule that a lock can be held by only one thread at a time.

What is the CPU cache model

Why do caches exist?

When a computer runs a program, each instruction is executed in the CPU, and executing instructions inevitably involves reading and writing data. Temporary data produced while a program runs is stored in main memory (physical memory). This creates a problem: the CPU executes instructions very quickly, while reading data from and writing data to main memory is much slower by comparison. If every data operation had to go through main memory, instruction execution would be slowed down dramatically.

In order to solve the problem of CPU processing speed and memory mismatch, CPU Cache appeared.

Photo source: JavaGuide

Cache consistency issues

When a program is running, it copies the data needed for an operation from main memory to the CPU’s cache. Then the CPU can read and write data directly from its cache when performing a calculation. When the operation is complete, the data in the cache is refreshed to main memory.

Running in a single thread is fine, but running in a multi-threaded environment is problematic. For a simple example, look at this code:

 

```java
i = i + 1;
```

According to the above analysis, it can be divided into the following steps:

  • Read the value of i from main memory and copy it into the cache.
  • The CPU executes the add-1 operation on i and writes the result to the cache.
  • After the operation, the data in the cache is flushed back to main memory.

What can happen in a multithreaded environment?

  • Initially, both threads read the value of i (0) and store it in their respective CPU caches.
  • Thread T1 increments it by 1 and writes the latest value of i, 1, to memory.
  • The value of i in thread T2's cache is still 0; T2 performs its own increment and also writes 1 to memory.

The final result is i = 1 instead of i = 2. The conclusion: if a variable is cached by multiple CPUs (which is typically the case in multithreaded programming), cache inconsistencies are possible.
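The lost-update scenario above can be reproduced (non-deterministically) with a sketch like this; the class name is made up for illustration:

```java
// Two threads increment a shared, unsynchronized counter.
// Each i = i + 1 is a read-modify-write; when the threads
// interleave, updates are lost and the final value is often
// below 2 * n.
public class LostUpdateDemo {
    static int i = 0; // shared, no synchronization

    public static int run(int n) {
        i = 0;
        Runnable task = () -> {
            for (int k = 0; k < n; k++) {
                i = i + 1; // read i, add 1, write back: three steps
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return i; // at most 2 * n, frequently less
    }
}
```

Running this repeatedly with a large n typically prints a different, smaller-than-expected total each time.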

How to resolve cache inconsistency

There are generally two solutions to cache inconsistency (both at the hardware level):

By locking the bus with LOCK#

In the early days of CPUs, cache inconsistency was solved by asserting a LOCK# signal on the bus. Since the CPU communicates with other components through the bus, locking the bus prevents other CPUs from accessing components such as memory, so only one CPU can use the variable's memory. For example, if a thread is executing i = i + 1 and a LOCK# signal is asserted on the bus during the execution of this code, other CPUs cannot read the variable i from memory until the code has fully executed, after which they can perform their own operations. This solves the cache inconsistency problem.

However, there is a problem: while the bus is locked, other CPUs cannot access memory at all, which is inefficient; hence the cache coherence protocols below.

Through a cache coherence protocol

The best known is Intel's MESI protocol, which ensures that the copies of shared variables in each cache are consistent.

When a CPU writes data and finds that the variable being operated on is a shared variable, i.e., copies of the variable also exist in other CPUs' caches, it sends a signal notifying the other CPUs to set the cache line for that variable to the invalid state. When another CPU later needs to read the variable, it finds that the cache line for that variable in its own cache is invalid and re-reads the value from main memory. This relies on the sniffing (snooping) mechanism: each processor checks whether its own cached value is out of date by sniffing the data propagated across the bus.

Under the MESI coherence protocol, each processor must constantly snoop the bus; combined with endless CAS retries, the resulting useless interactions can cause bus bandwidth spikes, known as a bus storm.

What is the JMM memory model

JMM (Java Memory Model): the Java Memory Model is defined in the Java Virtual Machine Specification. It is a standard that masks the differences between underlying hardware, so that Java programs achieve consistent memory-access behavior across platforms.

It describes the rules for accessing various variables (thread shared variables) in a Java program, as well as the low-level details of storing and reading variables from memory in the JVM.

The Java memory model does not restrict the execution engine from using the processor's registers or cache to speed up instruction execution, nor does it restrict the compiler from reordering instructions. That is, the Java memory model is also subject to cache consistency issues and instruction reordering issues.

The provisions of the JMM

All shared variables are stored in main memory. Variables here refer to instance variables and class variables. Local variables are not included, because local variables are thread private, so there is no contention problem.

Each thread has its own working memory (similar to the cache above). All thread operations on variables must be done in working memory, not directly in main memory.

Each thread cannot access the working memory of other threads.

Java guarantees three major features

Atomicity

In Java, reads and assignments of variables of primitive data types are atomic operations; that is, they are uninterruptible: they either complete or do not happen at all.

To better understand the above statement, take a look at these four examples:

 

```java
x = 10;     // 1
y = x;      // 2
x++;        // 3
x = x + 1;  // 4
```
  1. Only statement 1 is atomic: it assigns the value 10 directly to x, meaning the thread executing the statement writes the value 10 directly into working memory.
  2. Statement 2 actually consists of two operations: reading the value of x, then writing that value into y in working memory. Each step is atomic on its own, but together they are not.
  3. Statements 3 and 4 each consist of three operations: read the value of x, add 1, and write the new value back.

Points to note:

  • On a 32-bit platform, reading and assigning 64-bit data (long, double) takes two operations, so atomicity cannot be guaranteed. On today's 64-bit JVMs, reads and assignments of 64-bit data are atomic.
  • The Java memory model only guarantees that basic reads and assignments are atomic; atomicity for broader operations can be achieved with synchronized or Lock.
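A sketch of achieving atomicity for a compound operation with an explicit Lock instead of synchronized (the counter class is hypothetical):

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// The read-modify-write on x is not atomic by itself; guarding it
// with a ReentrantLock makes it atomic with respect to other
// threads acquiring the same lock.
public class LockedCounter {
    private final Lock lock = new ReentrantLock();
    private int x = 0;

    public void increment() {
        lock.lock();
        try {
            x = x + 1; // atomic under the lock
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    public int get() {
        lock.lock();
        try {
            return x;
        } finally {
            lock.unlock();
        }
    }
}
```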

Visibility

Java provides the volatile keyword to ensure visibility.

When a shared variable is declared volatile, it is guaranteed that a modified value is immediately written back to main memory, and that when another thread needs to read the variable, it reads the new value from main memory.

Visibility is also guaranteed by synchronized and Lock, which ensure that only one thread at a time acquires the lock and executes the synchronized code, and that changes to variables are flushed to main memory before the lock is released. So visibility is guaranteed.
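A sketch of visibility through a common lock (the class is made up for illustration): because the writer and reader synchronize on the same monitor, the writer's change is flushed on unlock and the reader sees it after lock.

```java
// Visibility via a common lock: the write is flushed to main memory
// when the writer releases the monitor, and the reader gets a fresh
// value after acquiring the same monitor.
public class SharedFlag {
    private boolean ready = false; // note: not volatile

    public synchronized void markReady() {
        ready = true; // flushed to main memory on monitor exit
    }

    public synchronized boolean isReady() {
        return ready; // fresh read after monitor entry
    }
}
```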

Ordering

In the Java memory model, the compiler and processor are allowed to reorder instructions. Reordering does not affect the result of a single-threaded program, but it can affect the correctness of multithreaded execution.

In Java, order can be guaranteed through the volatile keyword, as well as through synchronized and Lock.

The Java memory model also has some innate ordering, i.e., ordering that is guaranteed without any synchronization, as long as two operations satisfy the happens-before principle (from Understanding the Java Virtual Machine in Depth):

  • Program order rule: within a single thread, an operation earlier in program order happens-before a later operation. (The program only appears to execute in code order; the VM may still reorder instructions with no data dependencies. This guarantees correct results only for single-threaded execution, not for multithreaded execution.)
  • Monitor lock rule: an unlock operation happens-before a subsequent lock operation on the same lock.
  • volatile variable rule: a write to a volatile variable happens-before a subsequent read of that variable.
  • Transitivity rule: if operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.
  • Thread start rule: the start() method of a Thread object happens-before every action of that thread.
  • Thread interruption rule: a call to Thread.interrupt() happens-before the interrupted thread's code detects the interruption.
  • Thread termination rule: all operations in a thread happen-before the termination of that thread. Termination can be detected by the return of Thread.join() or the return value of Thread.isAlive().
  • Object finalization rule: the completion of an object's initialization happens-before the start of its finalize() method.

If the order of two operations cannot be deduced from the happens-before principle, they are not guaranteed to be ordered, and the virtual machine can reorder them at will.
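Two of the rules above, the thread start rule and the thread termination rule, can be sketched like this (the class is hypothetical):

```java
// Thread start rule: writes made before t.start() are visible
// inside t. Thread termination rule: writes made inside t are
// visible after t.join() returns.
public class HappensBeforeDemo {
    static int before = 0; // written before start()
    static int inside = 0; // written inside the thread

    public static int[] run() {
        before = 42;                 // happens-before t.start()
        Thread t = new Thread(() -> {
            int seen = before;       // guaranteed to see 42
            inside = seen + 1;
        });
        t.start();
        try {
            t.join();                // t's writes happen-before this
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { before, inside };
    }
}
```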

The problems solved by volatile

  • It ensures visibility when different threads operate on shared variables (instance fields and static fields of a class): when one thread changes a variable's value, the new value is immediately visible to other threads.
  • It disallows instruction reordering.

For a simple example, look at this code:

 

```java
// thread 1
volatile boolean stop = false;

while (!stop) {
    doSomething();
}

// thread 2
stop = true;
```
  1. Thread 1 and thread 2 each have their own working memory, and each first copies the value of the stop variable into it.
  2. The shared stop variable is volatile, so when thread 2 changes stop to true, the change is immediately written to main memory.
  3. Thread 2's write to main memory invalidates the cache line for stop in thread 1's working memory.
  4. Because that cache line is invalid, thread 1 reads the value of stop from main memory again.

Does volatile guarantee atomicity? How to solve it?

volatile does not guarantee atomicity. For example, i++ on a volatile variable is not guaranteed to be correct across multiple threads.

Solutions:

  • Use the synchronized keyword or Lock to ensure that a block of code can only be executed by one thread at a time.
  • Use the atomic classes in the java.util.concurrent (JUC) package, such as AtomicInteger; the atomic classes use CAS to perform atomic operations.
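A sketch of the second solution (the demo class is made up for illustration): where an increment on a volatile int would lose updates, AtomicInteger's incrementAndGet() performs the read-modify-write atomically via CAS.

```java
import java.util.concurrent.atomic.AtomicInteger;

// volatile does not make i++ atomic, but AtomicInteger does:
// incrementAndGet() retries a CAS (compare-and-swap) until the
// update succeeds, so no increments are lost.
public class AtomicCounterDemo {
    public static int countWithTwoThreads(int n) {
        AtomicInteger counter = new AtomicInteger(0);
        Runnable task = () -> {
            for (int k = 0; k < n; k++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get(); // always exactly 2 * n
    }
}
```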

How volatile is implemented

The following excerpt is from Understanding the Java Virtual Machine:

Comparing the assembly code generated with and without the volatile keyword shows that volatile adds a lock-prefixed instruction.

The lock-prefixed instruction effectively acts as a memory barrier (also known as a memory fence), which provides three functions:

  • It ensures that reordering does not move subsequent instructions before the memory barrier, nor earlier instructions after it; that is, by the time the barrier instruction executes, all operations before it have completed.
  • It forces changes in the cache to be written to main memory immediately.
  • If it is a write operation, it invalidates the corresponding cache line in other CPUs.

The difference between volatile and synchronized

Reads of a volatile variable cost about the same as reads of a normal variable, but writes are slower because memory-barrier instructions must be inserted into the native code to prevent out-of-order execution. Even so, in most scenarios the overall cost of volatile is lower than that of locking.

  • Volatile applies only to variables, while synchronized modifies methods and blocks of code.
  • Volatile guarantees visibility, but not atomicity. Synchronized can guarantee both. If you just do multiple thread assignments to a shared variable and nothing else, volatile is recommended because it is more lightweight.
  • The volatile keyword addresses the visibility of variables across multiple threads, while the synchronized keyword addresses the synchronization of access to resources across multiple threads.

Conditions for use of volatile

Two conditions must be met to use volatile:

  • Writes to the variable do not depend on its current value.
  • The variable is not part of an invariant involving other variables.
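A use of volatile that satisfies both conditions (the worker class is hypothetical): the write does not depend on the current value, and the flag participates in no invariant with other variables.

```java
// A shutdown flag is the classic valid use of volatile:
// the write (shutdown = true) is unconditional, and the flag
// stands alone, so visibility is all that is needed.
public class Worker {
    private volatile boolean shutdown = false;

    public void requestShutdown() {
        shutdown = true; // independent of the current value
    }

    public boolean isShutdownRequested() {
        return shutdown;
    }
}
```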

Volatile and double-checked locks implement singletons

Implement the singleton pattern with double-checked locks:

 

```java
public class Singleton {

    private volatile static Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```

Reason for using volatile: Prevents instruction reordering.

The statement instance = new Singleton(); is an instantiation process that actually consists of three steps:

  1. Allocate memory for the instance: memory = allocate();
  2. Initialize the object: ctorInstance(memory);
  3. Point instance to the allocated memory address: instance = memory;

Due to JVM reordering, the instructions may execute in the order 1, 3, 2. In a multithreaded environment, a thread might then obtain an uninitialized instance.

Example: after thread A executes steps 1 and 3 (under the reordered 1, 3, 2 order), thread B calls getInstance, finds that instance is not null, and returns an instance for which step 2 (initialization) has not yet executed.