The volatile keyword is arguably the lightest synchronization mechanism the Java virtual machine provides, but it is not easy to understand correctly and completely, so many programmers habitually avoid it and fall back on synchronized when handling multithreaded data contention. Understanding the semantics of volatile variables is also important for understanding other multithreading features, and in this article we will show what those semantics are. Because volatile is closely tied to the Java Memory Model (JMM), we will cover the Java Memory Model before introducing volatile.

1. Memory model

The cause-and-effect relationship between "letting the computer execute several computing tasks concurrently" and "exploiting the power of the processor more fully" seems logical, but in reality the relationship is not as simple as it looks. One important source of complexity is that the vast majority of computing tasks cannot be completed by the processor alone: the processor must at least interact with memory, for example to read operands and store results, and this I/O is very difficult to eliminate (you cannot do all computation with registers alone). Because the computer's memory and the processor differ in speed by several orders of magnitude, modern computer systems insert a layer of cache, which reads and writes data as close to the processor's speed as possible, as a buffer between memory and the processor: the data needed for an operation is copied into the cache so that the operation can proceed quickly, and when the operation completes the result is synchronized from the cache back to memory, so the processor does not have to wait for slow memory reads and writes.

Cache-based storage interaction resolves the processor/memory speed conflict well, but it also adds complexity to computer systems because it introduces a new problem: Cache Coherence. In a multiprocessor system each processor has its own cache while they share the same Main Memory, as shown in the figure below. When multiple processors work on the same main memory area, their cached data may become inconsistent. If this happens, whose cached data should be used when synchronizing back to main memory? To preserve consistency, each processor must follow certain protocols when accessing the cache and operate according to them when reading and writing. Such protocols include MSI, MESI (the Illinois protocol), MOSI, Synapse, Firefly, and the Dragon protocol. The term "memory model," which will come up several times in this article, can be understood as an abstraction of the process of reading and writing a particular memory or cache under a particular operating protocol. Physical machines with different architectures can have different memory models, and the Java virtual machine also has its own memory model, whose memory access operations are highly comparable to the cache access operations of the hardware.

The interaction between processor, cache, and main memory

Besides adding caches, the processor may also apply Out-of-Order Execution to the input code in order to keep as many of its internal execution units as possible busy. After the computation, the processor reorganizes the out-of-order results to ensure they are consistent with those of sequential execution, but it does not guarantee that each statement in the program is computed in the same order as it appears in the code. Therefore, if one computation depends on the intermediate result of another, the ordering cannot be guaranteed merely by the order of the code. Similar to the processor's out-of-order execution optimization, the just-in-time compiler of the Java virtual machine performs an analogous optimization called Instruction Reordering.

1.1 JMM Memory model

The Java Virtual Machine specification attempts to define a Java memory model that masks the differences in memory access across hardware and operating systems, so that Java programs achieve consistent memory access effects on all platforms. Before this, mainstream programming languages (such as C/C++) used the memory model of the physical hardware and the operating system directly; as a result, because of differences between memory models, a program that ran concurrently with no problems on one platform could frequently fail under concurrent access on another, so in some scenarios programs had to be written separately for different platforms.

1.2 Main Memory and Working Memory

The main goal of the Java memory model is to define the access rules for variables in a program, that is, the low-level details of storing variables into and reading them out of memory in the virtual machine. Variables here include instance fields, static fields, and the elements that make up array objects, but exclude local variables and method parameters, which are private to a thread and never shared, so they cannot be contended.

The Java memory model specifies that all variables are stored in Main Memory. Each thread also has its own Working Memory (analogous to the processor cache mentioned above), which holds copies of the main-memory variables used by that thread. All of a thread's operations on variables (reads, assignments, etc.) must be carried out in working memory; a thread cannot read or write variables in main memory directly. Different threads cannot access variables in each other's working memory either, so the transfer of variable values between threads must be completed through main memory. The interaction among threads, main memory, and working memory is shown in the figure.

1.3 Memory Interaction

The Java memory model defines the following eight operations to specify the protocol of interaction between main memory and working memory, that is, the implementation details of how a variable is copied from main memory into working memory and synchronized from working memory back to main memory. Virtual machine implementations must ensure that each of these operations is atomic and indivisible (on some platforms, exceptions are permitted for the load, store, read, and write operations on variables of type double and long).

  • Lock: acts on a variable in main memory; it marks the variable as exclusively owned by one thread.
  • Unlock: acts on a variable in main memory; it releases a locked variable so that it can be locked by another thread.
  • Read: acts on a variable in main memory; it transfers the value of the variable from main memory into the thread's working memory for the subsequent load operation.
  • Load: acts on a variable in working memory; it puts the value obtained from main memory by the read operation into the working-memory copy of the variable.
  • Use: acts on a variable in working memory; it passes the value of the variable in working memory to the execution engine. This operation is performed whenever the virtual machine encounters a bytecode instruction that needs the variable's value.
  • Assign: acts on a variable in working memory; it assigns a value received from the execution engine to the variable in working memory. This operation is performed whenever the virtual machine encounters a bytecode instruction that assigns to the variable.
  • Store: acts on a variable in working memory; it transfers the value of the variable in working memory to main memory for the subsequent write operation.
  • Write: acts on a variable in main memory; it puts the value obtained from the store operation into the variable in main memory.

To copy a variable from main memory to working memory, read and load are performed in order; to synchronize a variable from working memory back to main memory, store and write are performed in order. Note that the Java memory model only requires that these pairs be executed in order, not consecutively: other instructions may be inserted between read and load and between store and write. For example, when accessing variables a and b in main memory, one possible order is read a, read b, load b, load a. In addition, the Java memory model specifies that the eight basic operations above must obey the following rules:

  • Neither of the pairs read/load and store/write is allowed to occur alone; that is, a variable may not be read from main memory and then rejected by working memory, nor may working memory initiate a write back that main memory does not accept.
  • A thread is not allowed to discard its most recent assign operation; that is, after a variable has been changed in working memory, the change must be synchronized back to main memory.
  • A thread is not allowed to synchronize data from its working memory back to main memory for no reason (without any assign operation having occurred).
  • A new variable can only be "born" in main memory; it is not allowed to use an uninitialized variable in working memory. In other words, before use or store is performed on a variable, assign and load must have been performed on it first.
  • A variable can be locked by only one thread at a time. However, the lock operation can be repeated by the same thread several times; after locking several times, the variable is unlocked only after the same number of unlock operations have been performed.
  • If a lock operation is performed on a variable, the value of that variable is cleared from working memory; before the execution engine uses the variable, a load or assign operation must be performed again to initialize its value.
  • It is not allowed to unlock a variable that has not previously been locked by a lock operation, nor to unlock a variable that is locked by another thread.
  • Before an unlock operation can be performed on a variable, the variable must first be synchronized back to main memory (store, write).

These eight memory access operations, the rule restrictions above, and the special provisions for volatile described later completely determine which memory accesses in a Java program are safe under concurrency. Because this definition is rigorous but cumbersome and hard to apply in practice, an equivalent judgment criterion, the happens-before principle, is introduced later to determine whether an access is safe in a concurrent environment.

2. The three features of Volatile

Volatile guarantees visibility and order, but not atomicity.

2.1 Atomicity

Atomicity: an operation, or a set of operations, is either performed in its entirety without interruption, or not performed at all.

Let's first look at which operations are atomic and which are not, just to build intuition:

int k = 5;  // code 1
k++;        // code 2
int j = k;  // code 3
k = k + 1;  // code 4

Of the four lines above, only code 1 is an atomic operation.

  • Code 2: contains three operations. 1. Read the value of variable k. 2. Increase the value of k by 1. 3. Assign the calculated value back to variable k.
  • Code 3: contains two operations. 1. Read the value of variable k. 2. Assign the value of k to j.
  • Code 4: contains three operations. 1. Read the value of variable k. 2. Increase the value of k by 1. 3. Assign the calculated value back to variable k.

Note: when this code is actually compiled to bytecode, the number of bytecode instructions may differ from the operation counts above, but the operations are meant to be illustrative, so they are used here for analysis.

The example above simply analyzes a few common cases. In terms of the low-level instructions (the eight inter-memory operations mentioned above), the atomic variable operations directly guaranteed by the Java memory model are read, load, assign, use, store, and write, so we can generally assume that accesses and reads of primitive data types are atomic. If a wider atomicity guarantee is needed, the Java memory model also provides the lock and unlock operations. Although the virtual machine does not expose these operations to the user directly, it does provide the higher-level bytecode instructions monitorenter and monitorexit to use them implicitly, which surface in Java code as the synchronized keyword; thus operations inside synchronized blocks are also atomic.
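The non-atomicity of k++ can be made visible with a small experiment. This is a minimal sketch (class name and thread counts are my own, chosen for illustration): a volatile counter still loses increments under contention, while java.util.concurrent.atomic.AtomicInteger, which performs the read-modify-write as one indivisible operation, does not.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicityDemo {
    // volatile guarantees visibility of k, but k++ is still read / add / write back
    static volatile int k = 0;
    // AtomicInteger performs the whole increment atomically
    static final AtomicInteger atomicK = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    k++;                       // not atomic: concurrent increments can be lost
                    atomicK.incrementAndGet(); // atomic: no increments are lost
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("volatile k = " + k + " (often less than 100000)");
        System.out.println("atomicK    = " + atomicK.get() + " (always 100000)");
    }
}
```

Running this repeatedly, atomicK always ends at 100000, while the volatile counter usually falls short, because two threads can both read the same value of k before either writes its incremented result back.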

2.2 Visibility

Visibility: when one thread changes the value of a shared variable, other threads are immediately aware of the change.

To get an idea of visibility, let’s take a look at the following example:

// executed by thread A
int k = 0; // 1
k = 5;     // 2

// executed by thread B
int j = k; // 3

In the example above, if thread A executes first and thread B executes later, what is the value of j?

The answer is: uncertain. Even though thread A has updated k to 5, that operation took place in thread A's working memory, and a variable updated in working memory is not immediately synchronized back to main memory, so the value of k that thread B obtains from main memory is indeterminate. This is the visibility problem: after thread A modifies variable k, thread B does not immediately see the value thread A changed.

The Java memory model provides visibility by relying on main memory as the transfer medium: new values are synchronized back to main memory after a variable is modified, and values are refreshed from main memory before a variable is read. This applies to volatile and ordinary variables alike. The special rules for volatile ensure that new values are synchronized back to main memory immediately and that the value is refreshed from main memory immediately before each use; thus we can say volatile guarantees the visibility of a variable to multithreaded operations in a way that ordinary variables do not.
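The classic demonstration of this guarantee is a stop flag read in a loop. The sketch below (class and field names are illustrative) uses a volatile flag; each read is refreshed from main memory, so the worker observes the writer's update promptly. Without volatile, the JIT compiler may hoist the flag read out of the loop and the worker can spin forever.

```java
public class VisibilityDemo {
    // volatile: every read is refreshed from main memory,
    // every write is synchronized back to main memory immediately
    static volatile boolean stop = false;
    static volatile boolean observed = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the writer's update becomes visible
            }
            observed = true; // the worker saw stop change to true
        });
        worker.start();
        Thread.sleep(100);  // let the worker start spinning
        stop = true;        // volatile write by the main thread
        worker.join(2000);  // with volatile, the worker terminates promptly
        System.out.println("worker observed stop: " + observed);
    }
}
```

Try removing the volatile modifier from stop: on many JVMs the worker thread then never terminates, which is exactly the visibility problem described above.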

Besides volatile, Java has two other keywords that provide visibility: synchronized and final. The visibility of synchronized comes from the rule that a variable must be synchronized back to main memory (store, write) before unlock is performed on it. The visibility of final means that once a final field has been initialized in the constructor, and the constructor has not leaked a reference to "this" (this-reference escape is dangerous, because other threads could access the "half-initialized" object through it), the field's value is visible to other threads.

2.3 Ordering

Ordering: the operations of a program execute in the order in which they are written in the code.

To get an idea of order, let’s look at the following example:

int k = 0;
int j = 1;

k = 5; // code 1
j = 6; // code 2

By the ordering property, code 1 in this example should be executed before code 2, but is that really the case?

The answer is no. The JVM does not guarantee the order in which code 1 and code 2 are executed, because there is no data dependence between the two lines of code.

int k = 1; // code 1
int j = k; // code 2

In a single thread, will code 1 be executed before code 2?

The answer is yes, because code 2 depends on the result of code 1, so the JVM will not reorder these two lines.

The Java language provides the keywords volatile and synchronized to guarantee ordering between threads. Volatile carries semantics that forbid instruction reordering. The ordering of synchronized comes from the rule that a variable may be locked by only one thread at a time, which means that two synchronized blocks holding the same lock can only be entered serially.
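The reorder-forbidding semantics of volatile are what make the following safe-publication idiom work. This is a sketch (class and field names are my own): because ready is volatile, the ordinary write to data cannot be reordered after the volatile write to ready, and the reader's volatile read of ready establishes a happens-before edge, so the reader is guaranteed to see data = 42.

```java
public class OrderingDemo {
    static int data = 0;                    // ordinary, non-volatile field
    static volatile boolean ready = false;  // volatile write/read acts as the ordering fence

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the writer's volatile write becomes visible
            }
            // the write to data happens-before the volatile write to ready,
            // which happens-before this read, so data is guaranteed to be 42
            System.out.println("data = " + data);
        });
        Thread writer = new Thread(() -> {
            data = 42;    // 1: ordinary write
            ready = true; // 2: volatile write; 1 cannot be reordered after 2
        });
        reader.start();
        writer.start();
        writer.join();
        reader.join();
    }
}
```

If ready were not volatile, the two writes in the writer thread could be reordered, and the reader could observe ready == true while data is still 0.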

Having introduced the three important properties of concurrency, notice that the synchronized keyword appears to be a "universal" answer to all three needs. Indeed, much concurrency control can be done with synchronized. This versatility also indirectly contributes to its abuse by programmers, and the more universal a concurrency control mechanism is, the greater its performance impact usually is.

3. Reordering

3.1 What is reordering

Reordering is a process by which compilers and processors reorder instruction sequences to optimize program performance.

3.2 Types of reordering

  • Compiler optimized reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
  • Instruction-level parallel reordering. Modern processors use instruction-level parallelism to overlap the execution of multiple instructions. If there is no data dependency, the processor can change the execution order of the machine instructions corresponding to the statements.
  • Memory system reordering. Because the processor uses caching and read/write buffers, this makes the load and store operations appear to be out of order.

3.3 Why Reorder

To improve performance.

3.4 How Do I Disable Reordering

You can disable particular types of processor reordering by inserting memory barrier instructions. The volatile keyword discussed in this article, for example, works this way.
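A well-known place where this barrier matters is the double-checked locking singleton, a common pattern (not from the original text, added here as an illustration). Without volatile, "allocate memory → publish the reference → run the constructor" reordering inside `new Singleton()` could let another thread see a non-null but half-initialized object; the volatile write inserts the barrier that forbids this.

```java
public class Singleton {
    // volatile forbids reordering the reference publication
    // before the constructor has finished
    private static volatile Singleton instance;

    private Singleton() {
    }

    public static Singleton getInstance() {
        if (instance == null) {                 // first check, without locking
            synchronized (Singleton.class) {
                if (instance == null) {         // second check, under the lock
                    instance = new Singleton(); // safe only because instance is volatile
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        Singleton a = Singleton.getInstance();
        Singleton b = Singleton.getInstance();
        System.out.println("same instance: " + (a == b));
    }
}
```

The synchronized block alone already guarantees atomicity and visibility for the threads that enter it; the volatile modifier is needed for the threads that take the fast path and read instance without locking.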

4. Happens-before

The Java language has a happens-before principle. This principle is very important: it is the primary basis for determining whether data is contended and whether a thread is safe, and it lets us resolve, with a few rules, all questions of whether two operations might conflict in a concurrent environment.

Now let's look at what "happens-before" means. Happens-before is a partial-order relation between two operations defined in the Java memory model: if operation A happens-before operation B, then before B occurs, the effects of A can be observed by B, where "effects" include modifying the values of shared variables in memory, sending messages, calling methods, and so on. That is not hard to state, but what does it imply? We can illustrate with the three lines of pseudocode shown below.

// executed in thread A
k = 1;

// executed in thread B
j = k;

// executed in thread C
k = 2;

Assume the operation "k = 1" in thread A happens-before the operation "j = k" in thread B; then we can determine that after thread B's operation executes, the value of variable j must be 1. This conclusion follows from two facts: first, by the happens-before principle, the result of "k = 1" is observable by thread B; second, thread C has not yet "shown up", and no other thread modifies k after thread A's operation. Now consider thread C: we keep the happens-before relation between threads A and B, and thread C appears between the operations of thread A and thread B, but thread C has no happens-before relation with thread B. What will the value of j be then? The answer is: uncertain! Both 1 and 2 are possible, because thread C's effect on variable k may or may not be observed by thread B. In that case thread B risks reading stale data, and the code is not thread-safe.

Below are some "natural" happens-before relations under the Java memory model that exist without the assistance of any synchronizer and can be relied on directly in code. If the relation between two operations is not in this list and cannot be deduced from these rules, they have no ordering guarantee, and the virtual machine may reorder them at will.

  • Program Order Rule: within a thread, in program code order, operations written earlier happen-before operations written later. More precisely, this is control-flow order rather than program code order, since branches, loops, and so on must be considered.
  • Monitor Lock Rule: an unlock operation happens-before a subsequent lock operation on the same lock. It must be stressed that this is the same lock, and "subsequent" refers to chronological order.
  • Volatile Variable Rule: a write to a volatile variable happens-before a subsequent read of that variable, where "subsequent" again refers to chronological order.
  • Thread Start Rule: the start() method of a Thread object happens-before every action of that thread.
  • Thread Termination Rule: all operations in a thread happen-before the detection of that thread's termination. We can detect that a thread has terminated by Thread.join() returning, by Thread.isAlive() returning false, and so on.
  • Thread Interruption Rule: a call to a thread's interrupt() method happens-before the point at which the interrupted thread's code detects that the interruption occurred (for example, via Thread.interrupted()).
  • Finalizer Rule: the completion of an object's initialization (the end of constructor execution) happens-before the start of its finalize() method.
  • Transitivity: if operation A happens-before operation B and operation B happens-before operation C, then operation A happens-before operation C.
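The thread start and termination rules can be seen directly in code. In this sketch (class and field names are illustrative), the shared field is deliberately neither volatile nor locked, yet every read is safe: the write before start() is visible to the child thread by the start rule, and the child's write is visible after join() by the termination rule.

```java
public class StartJoinDemo {
    static int shared = 0; // deliberately not volatile and not synchronized

    public static void main(String[] args) throws InterruptedException {
        shared = 1;                  // happens-before t.start() (thread start rule)
        Thread t = new Thread(() -> {
            int seen = shared;       // guaranteed to see 1
            shared = seen + 1;       // happens-before the join() below returning
        });
        t.start();
        t.join();                    // thread termination rule: t's writes are now visible
        System.out.println("shared = " + shared); // guaranteed to print 2
    }
}
```

No data race exists here, not because the accesses are atomic, but because every access is ordered by one of the happens-before rules above.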

These are the only happens-before rules in the Java language that require no synchronization assistance. Below we use them to determine whether operations are ordered, that is, whether reads and writes of a shared variable are thread-safe. The reader can also get a sense of the difference between chronological order and happens-before from the following example.

private int value = 0;

public void setValue(int value) {
    this.value = value;
}

public int getValue() {
    return value;
}

The code above is a perfectly ordinary pair of getter/setter methods. Suppose threads A and B exist, thread A calls setValue(1) first, and then thread B calls getValue() on the same object. What value does thread B receive?

Since the two methods are called by thread A and thread B respectively and are not in the same thread, the program order rule does not apply here. Because there is no synchronized block, lock and unlock operations do not occur, so the monitor lock rule does not apply. Because value is not modified by the volatile keyword, the volatile variable rule does not apply. The thread start, termination, and interruption rules and the finalizer rule are completely unrelated here. Since no applicable happens-before rule exists, transitivity is out of the question as well. Therefore, although thread A is ahead of thread B in chronological order, we cannot determine the return value of thread B's getValue() call; in other words, the operation is not thread-safe.

So how do we fix this? There are at least two simple alternatives: either define the getter/setter methods as synchronized methods, so the monitor lock rule applies; or declare value as volatile. Since the setter's modification of value does not depend on its original value, the scenario is suitable for volatile, and the volatile variable rule then establishes the happens-before relation.
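The second fix is a one-word change. Wrapped in an illustrative class (the class name is my own; the method bodies are from the text), it looks like this:

```java
public class ValueHolder {
    // volatile applies the volatile variable rule: a write by one thread
    // happens-before a subsequent read of this field by another thread
    private volatile int value = 0;

    public void setValue(int value) {
        this.value = value;
    }

    public int getValue() {
        return value;
    }
}
```

Note that this fix is only valid because the setter performs a plain assignment; if setValue computed something like `this.value = this.value + delta`, the read-modify-write would not be atomic and volatile alone would no longer be enough, so the synchronized variant would be required instead.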

From the example above we can draw a conclusion: an operation that is "earlier in time" does not necessarily "happen-before" a later one. And if an operation "happens-before" another, can we infer that it is "earlier in time"? Unfortunately, this inference does not hold either; a typical example is the much-discussed "instruction reordering", illustrated by the code below.

int i = 1;
int j = 2;

The two assignment statements in the listing are in the same thread. By the program order rule, "int i = 1" happens-before "int j = 2", but the code "int j = 2" may well be executed first by the processor. This does not affect the correctness of the happens-before principle, because we have no way to perceive the difference within this thread.

The two examples together prove a conclusion: chronological order and the happens-before principle have essentially nothing to do with each other, so when we assess concurrency safety we should not be distracted by chronological order; everything must be judged by the happens-before principle.

Conclusion

  • Each thread has its own working memory, and the data in working memory is not flushed back to main memory in real time. Under concurrency, thread A may therefore change the value of a member variable k while thread B fails to read the modified value, because thread A's working memory has not yet been flushed back to main memory.
  • In working memory, each use of a volatile variable must first refresh the latest value from main memory. This ensures that the current thread can see changes other threads have made to the volatile variable.
  • In working memory, each change to a volatile variable must be synchronized back to main memory immediately. This ensures that other threads can see the current thread's changes to the volatile variable.
  • Volatile variables are not optimized by instruction reordering, ensuring that code is executed in the same order as the program.
  • Volatile guarantees visibility and ordering, but not atomicity; its ordering guarantee applies only to accesses of the volatile variable itself.
  • The purpose of instruction reordering is to improve performance. Instruction reordering only guarantees that the final execution result will not be changed under single thread, but cannot guarantee the execution result under multi-thread.
  • To implement the memory semantics of volatile, the compiler inserts a memory barrier into the instruction sequence to prevent reordering when the bytecode is generated.
