Preface

Previously we discussed the causes of thread-safety problems in concurrent programming, which stem mainly from the visibility, reordering, and atomicity of operations on shared variables, and we touched briefly on the memory model. What is the memory model, and why is it necessary to understand the Java memory model? This article covers both questions.

What is the Java memory model

The main goal of the Java memory model (JMM) is to define the access rules for variables in a program, that is, the low-level details of how the virtual machine stores variables to memory and reads them back. "Variables" here has a narrower meaning than in everyday Java programming: it covers only instance fields, static fields, and the elements that make up array objects. In Java terms these are the member variables of a class. Local variables and method parameters are not included, because they are private to a thread and are never shared.

The memory model also describes how multiple threads read and write shared variables. We covered this in earlier articles, so here is a brief recap. The Java memory model specifies that all variables are stored in main memory and that each thread has its own working memory. A thread's working memory holds copies of the main-memory variables that the thread uses. All of a thread's operations on shared variables must be carried out in working memory; a thread cannot operate on main-memory variables directly. Instead, it copies the variable into working memory, operates on the copy, and then synchronizes the result back to main memory. Different threads cannot directly access variables in each other's working memory.

Note that the division into main memory and working memory is not the same partitioning as the heap, stack, and method area of the JVM runtime data areas. At a lower level, main memory corresponds directly to physical hardware memory, while working memory is something the virtual machine (or the hardware, or the operating system) may prefer to keep in registers and caches.

JMM’s solution to concurrency problems

The Java memory model is defined around the atomicity, visibility, and reordering of operations on variables, with rules designed to guarantee atomicity, visibility, and ordering for shared variables.

  • Atomicity: An operation on a shared variable must be atomic. If it is not, then without additional synchronization, multiple threads can leave the data in an inconsistent state.

  • Visibility: This usually means memory visibility. If two threads A and B operate on a shared variable D at the same time, and thread B does not see the result of thread A's operation on D in time, we say that thread A's operation is not visible to thread B.

  • Reordering: In a multithreaded environment, non-atomic operations on shared variables can be observed out of order.

Java provides two language-level keywords, synchronized and volatile, to address reordering, atomicity, and visibility problems.

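The static keyword can also solve the problem in some cases, but it does so through the class loading mechanism rather than through the memory model's rules for volatile and synchronized: static fields are initialized during class initialization, which the JVM guarantees happens only once.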
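As a minimal sketch of the class-loading behavior mentioned above, the initialization-on-demand holder idiom builds a lazily created singleton using only static. The names HolderSingleton and Holder are hypothetical and chosen for illustration.

```java
// Hypothetical example class: a lazily initialized singleton that relies on
// class initialization instead of volatile or synchronized.
class HolderSingleton {

    private final int a;

    private HolderSingleton() {
        a = 4;
    }

    // The nested class is initialized only when getInstance() first uses it.
    // The JVM runs class initialization once, under its own internal lock.
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }

    int getA() {
        return a;
    }
}
```

Every call to HolderSingleton.getInstance() returns the same object, and no explicit locking appears in the code.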

The volatile keyword is the lightest synchronization mechanism provided by the Java memory model. It guarantees ordering and visibility for multithreaded operations on a shared variable. The synchronized keyword, on the other hand, is a mutual-exclusion lock: it ensures that only one thread at a time can enter the method or code block that forms the critical section, and it guarantees memory visibility of the shared variables. Because only one thread can be inside the protected region at a time, atomicity is guaranteed as well. You can see synchronized in action in the picture below.

Analyzing DCL with the Java memory model

Let's look at volatile and synchronized through a concrete example.

The following is a double-checked locking (DCL) singleton. We rarely write it this way in practice, but it is useful for analyzing synchronized and volatile. In practice, prefer singletons based on enums, static initializer blocks, or static inner classes.

public class Singleton {

    private static Singleton instance = null;

    private int a;

    private Singleton() {
        a = 4;
    }

    public static Singleton getInstance() {
        if (instance == null) {                 // 1. First check
            synchronized (Singleton.class) {    // 2
                if (instance == null) {         // 3. Second check
                    instance = new Singleton(); // 4
                }
            }
        }
        return instance;
    }
}


The code above is not correct: with multiple threads it is not guaranteed to behave as a proper singleton. Let's analyze why.

The statement instance = new Singleton(); is not an atomic operation. Creating an object with new involves the following steps.

A. Create the object instance and allocate its memory.

B. Initialize the object header and fields.

C. Assign the reference to the object to instance.

Suppose thread A has reached step 4. Because step 4 is not atomic, operations B and C may be reordered, so the reference is assigned to instance before the object's fields have been initialized. If thread B then reaches step 1, it finds that instance is not null and returns it directly, but what thread B gets is an incompletely initialized object. The fix is to declare instance as volatile: with volatile, the operations in step 4 are not reordered in this way.
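The fix described above can be sketched as follows. This is one possible version, not the only one: the class name SafeSingleton and the local variable are illustrative choices, and the only change that matters for correctness is the volatile modifier on instance.

```java
// Hypothetical corrected variant of the DCL singleton from the article.
class SafeSingleton {

    // volatile prevents the write to instance from being reordered
    // ahead of the writes made while constructing the object.
    private static volatile SafeSingleton instance;

    private int a;

    private SafeSingleton() {
        a = 4;
    }

    static SafeSingleton getInstance() {
        SafeSingleton local = instance;            // 1. First check (volatile read)
        if (local == null) {
            synchronized (SafeSingleton.class) {   // 2
                local = instance;
                if (local == null) {               // 3. Second check
                    local = new SafeSingleton();
                    instance = local;              // 4. volatile write publishes the object
                }
            }
        }
        return local;
    }

    int getA() {
        return a;
    }
}
```

Copying instance into a local variable is optional. It only keeps the common path down to a single volatile read.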

Memory model guarantees atomicity

Here I would like to introduce atomicity separately.

The Java memory model defines eight atomic operations for working with memory.

  1. Read: Acts on a main-memory variable. Transfers the value of the variable from main memory to the thread's working memory for the subsequent load.
  2. Load: Acts on a working-memory variable. Puts the value obtained from main memory by the read operation into the copy of the variable in working memory.
  3. Use: Acts on a working-memory variable. Passes the value of the variable in working memory to the execution engine. It is performed whenever the virtual machine reaches a bytecode instruction that needs the variable's value.
  4. Assign: Acts on a working-memory variable. Assigns a value received from the execution engine to the variable in working memory. It is performed whenever the virtual machine reaches a bytecode instruction that assigns to the variable.
  5. Store: Acts on a working-memory variable. Transfers the value of the variable from working memory to main memory for the subsequent write.
  6. Write: Acts on a main-memory variable. Puts the value obtained from working memory by the store operation into the variable in main memory.
  7. Lock: Acts on a main-memory variable. Marks the variable as exclusively owned by one thread.
  8. Unlock: Acts on a main-memory variable. Releases a variable that is in the locked state.

Operations 1-6 above are the low-level, instruction-like operations through which a Java program moves values between main memory and working memory.

int x = 10;               // statement 1
int y = x;                // statement 2
x++;                      // statement 3
x = x + 1;                // statement 4
Object z = new Object();  // statement 5


Of the statements above, only statement 1 is an atomic operation. Statement 1 needs only an assign operation, because it simply assigns a constant to x. The other statements each require several operations. Although the memory model guarantees the atomicity of each of the eight individual operations, it cannot guarantee the atomicity of several operations combined. Atomicity for such combinations can only be achieved by additional means, such as locking.

The memory model requires that copying a variable from main memory to working memory performs read and then load, in that order, and that synchronizing a variable back to main memory performs store and then write, in that order. Note that the memory model only requires these operations to execute in order, not consecutively. That is, other operations can be inserted between read and load, or between store and write. For example, when accessing main-memory variables a and b, one possible order is read a, read b, load b, load a. This also explains why a lack of atomicity leads to thread-safety problems.
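To make a compound operation such as x++ atomic, some extra mechanism is needed. The sketch below shows two common options under stated assumptions: a synchronized method, and java.util.concurrent.atomic.AtomicInteger from the standard library. The class name Counters and its methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: two ways to make "read, add one, write back" atomic.
class Counters {

    private int locked;                                        // guarded by the lock on "this"
    private final AtomicInteger atomic = new AtomicInteger();

    synchronized void incrementLocked() {
        locked++;                                              // whole read-modify-write runs under the lock
    }

    synchronized int getLocked() {
        return locked;
    }

    void incrementAtomic() {
        atomic.incrementAndGet();                              // atomic read-modify-write without a lock
    }

    int getAtomic() {
        return atomic.get();
    }

    // Starts the given number of threads; each increments both counters perThread times.
    static Counters run(int threads, int perThread) {
        Counters c = new Counters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.incrementLocked();
                    c.incrementAtomic();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c;
    }
}
```

With either approach, Counters.run(4, 10000) ends with both counters equal to 40000. A plain int incremented with ++ and no lock could lose updates under the same load.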

Happens-before in the memory model

If all ordering in your code had to be established through volatile and synchronized, writing code would be cumbersome. In practice we rarely have to think about reordering when we write code, because the Java language has a happens-before principle that we can use to determine whether a data race exists. With a few simple rules, it settles in one package every question about whether two operations may conflict in a concurrent environment. Here are the rules:

  1. Program order rule: Within a thread, an action written earlier in code order happens-before an action written later.
  2. Monitor lock rule: An unlock of a monitor lock happens-before every subsequent lock of that same monitor lock.
  3. Volatile variable rule: A write to a volatile variable happens-before every subsequent read of that variable.
  4. Transitivity: If A happens-before B, and B happens-before C, then A happens-before C.
  5. Thread start rule: A call to start() on a thread happens-before every action in the started thread.
  6. Thread termination rule: Every action in a thread happens-before any other thread detects that the thread has terminated.
  7. Finalizer rule: The completion of an object's constructor happens-before the start of that object's finalizer.
  8. Interrupt rule: A call to a thread's interrupt() method happens-before the interrupted thread's code detects the interrupt.

Happens-before describes memory visibility between operations. If the result of one operation needs to be visible to another operation, there must be a happens-before relationship between the two operations. The two operations can be within one thread or in different threads. A happens-before relationship between two operations does not mean that they must actually execute in the order the happens-before principle specifies. If the result after reordering is the same as the result of executing in happens-before order, the reordering is not illegal.
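As an illustration of how the rules combine, the sketch below uses only plain (non-volatile) fields and relies on the program order, thread start, and thread termination rules, linked by transitivity. The class name StartJoinVisibility and its fields are hypothetical.

```java
// Hypothetical example: visibility obtained from start() and join() alone.
class StartJoinVisibility {

    private int input;    // plain fields on purpose: no volatile, no lock
    private int output;

    int compute(int value) {
        input = value;                        // (1) happens-before start() by program order
        Thread worker = new Thread(() -> {
            output = input * 2;               // (2) sees (1) by the thread start rule
        });
        worker.start();
        try {
            worker.join();                    // (3) returning from join() detects termination
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return output;                        // (4) sees (2) by the thread termination rule
    }
}
```

Calling new StartJoinVisibility().compute(21) returns 42, because each step is ordered before the next by one of the rules listed above.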

As-if-serial in the memory model

As-if-serial is much easier to understand than happens-before. Operations may be reordered for optimization, but the result of the program must not change. The compiler, runtime, and processor must all obey as-if-serial semantics. Note that as-if-serial is guaranteed only for a single thread, not across threads.

How should we read the statement above? For example:

int a = 1;

int b = 2;


There is no data dependency between the two assignments, so they can be reordered: b may be assigned before a. But if the operations are

int a = 1;

int b = a + 1;


These two operations cannot be reordered, because the value of b depends on a. Again, this holds only within a single thread. Across threads there is no such guarantee.

Volatile in the memory model

The JMM defines the following memory semantics for volatile reads and writes:

  1. Write memory semantics: When a volatile variable is written, the JMM flushes the shared variables in the thread's local memory to main memory as soon as the write completes.
  2. Read memory semantics: When a volatile variable is read, the JMM invalidates the thread's local memory, so the thread then reads shared variables from main memory.
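A common use of these semantics is a stop flag shared between two threads. The sketch below is a minimal example under stated assumptions: the class name StopFlag, the field name, and the five-second timeout are illustrative, not part of any API.

```java
// Hypothetical example: one thread signals another through a volatile flag.
class StopFlag {

    // Without volatile, the worker might never observe the write below.
    private volatile boolean stopped;

    // Starts a worker that spins on the flag, sets the flag, and reports
    // whether the worker terminated within the timeout.
    boolean runAndStop() {
        Thread worker = new Thread(() -> {
            while (!stopped) {
                // busy-wait; each iteration performs a volatile read
            }
        });
        worker.setDaemon(true);          // do not keep the JVM alive if something goes wrong
        worker.start();

        stopped = true;                  // volatile write: flushed to main memory
        try {
            worker.join(5_000);          // wait up to five seconds for the worker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }
}
```

Because the flag is volatile, the worker's next read observes the write and the loop ends, so runAndStop() returns true.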

JMM reordering rules for volatile variables

  1. If the first operation is a volatile read, then no matter what the second operation is, the two cannot be reordered. This rule ensures that operations after a volatile read are not moved by the compiler to before the volatile read.
  2. If the second operation is a volatile write, then no matter what the first operation is, the two cannot be reordered. This rule ensures that operations before a volatile write are not moved by the compiler to after the volatile write.
  3. If the first operation is a volatile write and the second operation is a volatile read, the two cannot be reordered.

How does the JMM implement the semantic rules for volatile

Implementation of visibility

When a variable declared volatile is written, the JVM sends the processor an instruction with a Lock prefix (on x86 processors, for example), which writes the data in the variable's cache line back to system memory, so main memory is updated immediately. In a multiprocessor environment, a cache coherence protocol keeps each processor's cache consistent: each processor sniffs the data propagated on the bus to check whether its own cached values are stale. When a processor finds that the memory address corresponding to one of its cache lines has been modified, it marks that cache line as invalid. When the processor next operates on that data, it is forced to re-read it from system memory into its cache. This ensures that other threads obtain the latest value of a volatile variable from main memory.

Implementation of orderliness

When generating the instruction sequence, the compiler inserts memory barriers to prevent particular kinds of processor reordering. The JMM adopts a conservative strategy with the following rules:

  • Insert a StoreStore barrier before each volatile write. This ensures that all normal writes preceding the volatile write are flushed to main memory before the volatile write.
  • Insert a StoreLoad barrier after each volatile write. This prevents the volatile write from being reordered with volatile reads or writes that may follow it.
  • Insert a LoadLoad barrier after each volatile read. This prevents the processor from reordering the volatile read with the normal reads that follow it.
  • Insert a LoadStore barrier after each volatile read. This prevents the processor from reordering the volatile read with the normal writes that follow it.

Final in the memory model

For final fields, the compiler and processor obey two reordering rules.

  1. A write to a final field inside a constructor cannot be reordered with the subsequent assignment of the reference to the constructed object to a reference variable.
  2. The first read of a reference to an object containing a final field cannot be reordered with the subsequent first read of that final field.

Let's look at these two rules using the example below and the analysis that follows.

public class FinalTest {

    int i;                 // normal field
    final int j;           // final field
    static FinalTest obj;

    public FinalTest() {   // constructor
        i = 1;             // write the normal field
        j = 2;             // write the final field
    }

    public static void writer() {  // executed by writer thread A
        obj = new FinalTest();
    }

    public static void reader() {  // executed by reader thread B
        FinalTest object = obj;    // read the object reference
        int a = object.i;          // read the normal field
        int b = object.j;          // read the final field
    }
}


Reordering rules for writing final fields

The reordering rule for writing final fields forbids moving a write to a final field out of the constructor. The implementation of this rule has two aspects.

  1. The JMM forbids the compiler from reordering a write to a final field to outside the constructor.

  2. The compiler inserts a StoreStore barrier after the write to the final field and before the constructor returns. This barrier prevents the processor from reordering the write to the final field to outside the constructor.

The following execution sequence is one that might occur.

The write to the normal field is reordered by the compiler to outside the constructor, so reader thread B reads the normal field's value from before initialization. This does not happen with the final field: the reordering rule keeps its write inside the constructor, so the thread reading it gets the correct value. The reordering rule for writing final fields ensures that an object's final fields are properly initialized before the object's reference becomes visible to any thread, a guarantee that normal fields do not have.

Reordering rules for reading final fields

Within a thread, the JMM forbids the processor from reordering the first read of an object reference with the first read of a final field contained in that object (note that this rule applies only to processors). The compiler inserts a LoadLoad barrier before the read of the final field.

There is an indirect dependency between the first read of the object reference and the first read of a final field in that object. Because the compiler respects indirect dependencies, it does not reorder these two operations. Most processors respect indirect dependencies as well and do not reorder them either. The rule exists for the few processors that do allow reordering of operations with an indirect dependency (such as Alpha processors).

In the example, the read of the object's normal field may be reordered by the processor to before the read of the object reference. It then reads a normal field that writer thread A has not yet written, which is an incorrect read. The reordering rule for reading final fields, by contrast, places the read of the final field after the read of the object reference; by then the final field has been initialized by thread A, so this is a correct read.

The reordering rule for reading final fields ensures that the reference to the object containing a final field is read before the final field itself. In this example program, if the reference is not null, the final field of the referenced object must already have been initialized by thread A.

If the final field is of a reference type, the reordering rule for writing final fields adds the following constraint on the compiler and processor:

A write, inside a constructor, to a member of the object referenced by a final field cannot be reordered with the subsequent assignment, outside the constructor, of the reference to the constructed object to a reference variable. Seen from the other direction, this rule also confirms why volatile is necessary in the DCL singleton.
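The reference-type case can be sketched with a final array field. This is an illustrative example only: the class FinalArrayHolder and its members are hypothetical, and the static field is published without synchronization on purpose to show what the final rules cover.

```java
// Hypothetical example: a final field of reference type written in a constructor.
class FinalArrayHolder {

    private final int[] values;          // final field of reference type
    static FinalArrayHolder shared;      // deliberately published without synchronization

    FinalArrayHolder() {
        values = new int[1];             // (1) write the final field
        values[0] = 1;                   // (2) write a member of the referenced object
    }

    static void writer() {
        shared = new FinalArrayHolder(); // (3) cannot be reordered before (1) or (2)
    }

    // Returns -1 if the object is not visible yet; otherwise values[0].
    static int reader() {
        FinalArrayHolder h = shared;     // first read of the object reference
        return (h == null) ? -1 : h.values[0];
    }
}
```

A reader thread that sees a non-null shared is guaranteed to see values[0] as 1. If values were not final, the write in (2) could become visible after the reference published in (3).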

Conclusion

With that, we have a reasonably clear picture of the Java memory model: what it is and what it does. But in real development we only use a few keywords, so why learn the memory model?

In my personal view, even though day-to-day development only requires writing these keywords, it is still worth understanding how the memory model implements them and what rules it specifies. Understanding the Java memory model helps us write concurrent programs, judge whether code is thread-safe, and avoid unnecessary errors. It also helps us locate concurrency problems quickly and fix them promptly. So understanding the memory model is an essential step toward writing good concurrent Java programs.
