1. Two key problems with the concurrent programming model

Two key issues need to be addressed in concurrent programming:

  1. How do threads communicate with each other?
  2. How are threads synchronized?

Let's explore these two questions.

Communication refers to the mechanism by which threads exchange information. In imperative programming, there are two communication mechanisms between threads: shared memory and message passing. In the shared-memory concurrency model, threads share the program's common state and communicate implicitly by writing and reading that state in memory. In the message-passing concurrency model, there is no common state between threads, which must communicate explicitly by sending messages.
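As a concrete contrast, here is a minimal sketch of the message-passing style in Java (class and method names are my own, not from the text): a `BlockingQueue` serves as the channel, so synchronization is implicit in `put()`/`take()` and no shared mutable state is touched directly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Message passing: the producer hands a value to the consumer through a
// queue, so synchronization is implicit in put()/take().
public class MessagePassingDemo {
    static int exchange() throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            try {
                channel.put(42);             // send the message
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int received = channel.take();       // blocks until the message arrives
        producer.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(exchange());      // 42
    }
}
```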


Synchronization is the mechanism a program uses to control the relative order in which operations occur in different threads. In the shared-memory concurrency model, synchronization is performed explicitly: the programmer must specify that a method or block of code is to be executed mutually exclusively between threads. In the message-passing concurrency model, synchronization is implicit, because a message must be sent before it can be received.


Java's concurrency model is a shared-memory model, in which communication between Java threads is always implicit and entirely transparent to the programmer. If we don't understand how this implicit communication works, we are likely to run into all sorts of strange memory-visibility problems.
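To make the visibility problem concrete, here is a minimal sketch (names are illustrative, not from the text) in which a volatile flag is used to publish a write. With a plain non-volatile flag, the reading thread could spin forever or observe a stale value of `data`; the volatile write/read pair makes the update visible.

```java
// A non-volatile 'ready' flag here could let the reader spin forever or
// read a stale 'data'. Marking the flag volatile publishes the write.
public class VisibilityDemo {
    static int data = 0;
    static volatile boolean ready = false;

    static int run() throws InterruptedException {
        Thread writer = new Thread(() -> {
            data = 42;        // ordinary write
            ready = true;     // volatile write: publishes 'data' as well
        });
        writer.start();
        while (!ready) {      // volatile read: guaranteed to see the write
            Thread.onSpinWait();
        }
        writer.join();
        return data;          // guaranteed to be 42, never a stale 0
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```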

2. Abstract structure of Java memory model

In Java, all instance fields, static fields, and array elements are stored in heap memory, which is shared between threads. Local variables, method parameters, and exception-handler parameters are not shared between threads; they have no memory-visibility issues and are not affected by the memory model.

Communication between Java threads is controlled by the Java Memory Model (JMM), which determines when a write by one thread to a shared variable becomes visible to another thread. From an abstract perspective, the JMM defines an abstract relationship between threads and main memory: shared variables are stored in main memory, and each thread has a private local memory where it keeps copies of the shared variables it reads and writes. Local memory is an abstraction of the JMM and does not really exist; it covers caches, write buffers, registers, and other hardware and compiler optimizations.

The following diagram illustrates the abstraction of local memory in the Java memory model.

[Figure: schematic diagram of the JMM's abstract structure]

As you can see from the figure above, thread A and thread B must go through the following two steps if they want to communicate.

  1. Thread A flushes the updated shared variable from local memory A to main memory.

  2. Thread B goes into main memory to read the shared variable that thread A has updated before.


As shown in the figure above, local memory A and local memory B hold copies of the shared variable X in main memory. Suppose that initially X is 0 in all three places. When thread A executes, it temporarily stores its updated X value in its own local memory A. When thread A and thread B need to communicate, thread A first flushes the modified X value from its local memory to main memory, so the X value in main memory becomes 1. Thread B then reads thread A's updated X value from main memory, and the X value in thread B's local memory also becomes 1. Taken as a whole, these two steps are essentially thread A sending a message to thread B, and this communication must pass through main memory. The JMM provides memory-visibility guarantees to Java programmers by controlling the interaction between main memory and each thread's local memory.
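The two-step X = 0 → 1 exchange described above can be sketched as follows (a simplified illustration with names of my own; in this sketch `Thread.join()` supplies the happens-before edge that forces the flush-then-read sequence):

```java
// Thread A updates the shared variable x; the main thread then reads
// the updated value. join() guarantees A's write is visible afterwards.
public class MainMemoryDemo {
    static int x = 0;   // shared variable, conceptually in main memory

    static int communicate() throws InterruptedException {
        Thread a = new Thread(() -> x = 1);  // step 1: A updates x
        a.start();
        a.join();   // step 2: A's write is flushed; the reader now sees it
        return x;   // guaranteed to be 1
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(communicate());
    }
}
```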

3. Reordering from source code to instruction sequence

When executing a program, the compiler and processor often reorder instructions to improve performance. There are three types of reordering:

  1. Compiler-optimized reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
  2. Instruction-level parallel reordering. Modern processors use instruction-level parallelism to overlap the execution of multiple instructions. If there is no data dependency, the processor can change the order in which the machine instructions corresponding to the statements execute.
  3. Memory-system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to execute out of order.

The sequence of instructions from Java source code to actual execution passes through the three reorderings above: type 1 is a compiler reordering, while types 2 and 3 are processor reorderings. These reorderings can cause memory-visibility problems in multithreaded programs. For the compiler, the JMM's compiler-reordering rules prohibit certain kinds of compiler reordering. For processor reordering, the JMM's rules require the Java compiler to insert memory-barrier instructions of specific types when generating the instruction sequence, which in turn disable particular kinds of processor reordering. The JMM is a language-level memory model: by disallowing certain types of compiler and processor reordering, it guarantees programmers consistent memory visibility across compilers and processor platforms.

4. Classification of concurrent programming models

Modern processors use write buffers to temporarily hold data being written to memory. Write buffers keep the instruction pipeline running and avoid the delays incurred when the processor stalls to write data to memory. At the same time, they reduce pressure on the memory bus by flushing the write buffer in batches and merging multiple writes to the same memory address. For all its benefits, the write buffer on each processor is visible only to that processor. This feature has a significant impact on the order in which memory operations execute: the order in which a processor reads/writes memory may not match the order in which the memory reads/writes actually occur!
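The effect of per-processor write buffers can be observed with the classic store-buffer experiment, sketched below (names are my own). Each thread writes one variable and then reads the other; because both writes may still be sitting in write buffers when the reads execute, both reads returning 0 is a legal outcome of a single trial. The outcome varies from run to run, so the sketch only reports one trial.

```java
// Classic store-buffer litmus test: (r1, r2) == (0, 0) is possible
// because each thread's write may not yet be visible to the other.
public class StoreBufferDemo {
    static int a, b, r1, r2;

    static int[] trial() throws InterruptedException {
        a = 0; b = 0;
        Thread t1 = new Thread(() -> { a = 1; r1 = b; });
        Thread t2 = new Thread(() -> { b = 1; r2 = a; });
        t1.start(); t2.start();
        t1.join();  t2.join();
        return new int[] { r1, r2 };   // any of (0,0),(0,1),(1,0),(1,1)
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = trial();
        System.out.println(r[0] + "," + r[1]);
    }
}
```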

5. Introduction to happens-before

Starting with JDK 1.5, Java uses the new JSR-133 memory model. JSR-133 uses the concept of happens-before to describe memory visibility between operations. In the JMM, if the result of one operation needs to be visible to another, there must be a happens-before relationship between the two operations. The two operations can be within a single thread or in different threads.

  1. Program order rule: every action in a thread happens-before any subsequent action in that thread.
  2. Monitor lock rule: an unlock of a lock happens-before every subsequent lock of that same lock.
  3. Volatile variable rule: a write to a volatile field happens-before every subsequent read of that volatile field.
  4. Transitivity: if A happens-before B, and B happens-before C, then A happens-before C.

Note: a happens-before relationship between two operations does not mean that the first operation must literally execute before the second! Happens-before only requires that the result of the first operation be visible to the second, and that the first be ordered before the second as far as the second can observe.
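The monitor lock rule can be illustrated with a small counter (a sketch with names of my own): the unlock at the end of each synchronized block happens-before the next lock of the same monitor, so each thread sees the other's increments and no update is lost.

```java
// Monitor lock rule: unlock happens-before the next lock, so the two
// threads' increments are never lost and the final count is exact.
public class MonitorRuleDemo {
    static int count = 0;
    static final Object lock = new Object();

    static int countTo(int perThread) throws InterruptedException {
        count = 0;
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                synchronized (lock) {   // lock: sees prior unlocks' writes
                    count++;
                }                       // unlock: publishes the write
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        return count;   // always 2 * perThread
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countTo(1000));
    }
}
```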

6. Reordering

We talked about compiler reordering and processor reordering, but what exactly is reordering?

Reordering is the process by which compilers and processors rearrange an instruction sequence to optimize program performance.

6.1 Data Dependencies

If two operations access the same variable, and one of them is a write operation, there is a data dependency between the two operations. There are three types of data dependencies:

| Name | Code example | Description |
| --- | --- | --- |
| Read after write | `a = 1; b = a;` | Write a variable, then read it |
| Write after write | `a = 1; a = 2;` | Write a variable, then write it again |
| Write after read | `a = b; b = 1;` | Read a variable, then write it |

In all three cases, simply reordering the two operations changes the result of the program's execution.
As mentioned earlier, the compiler and processor may reorder operations. The compiler and processor adhere to data dependencies when reordering, and do not change the order in which two operations with data dependencies are executed.
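The three dependency patterns can be checked in a single-threaded sketch (names are my own): because each pair of statements is data-dependent, neither the compiler nor the processor may swap them, so the results are fixed.

```java
// The three data-dependency patterns in one thread: each pair may not
// be reordered, so the returned values are fully determined.
public class DataDependencyDemo {
    static int[] results() {
        int a = 1; int b = a;   // read after write: b must see 1
        int c = 1; c = 2;       // write after write: c must end as 2
        int d = b; b = 9;       // write after read: d must see the old b (1)
        return new int[] { b, c, d };
    }

    public static void main(String[] args) {
        int[] r = results();
        System.out.println(r[0] + "," + r[1] + "," + r[2]);  // 9,2,1
    }
}
```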
6.2 As-if-serial semantics

The as-if-serial semantics mean that no matter how instructions are reordered, the execution result of a single-threaded program must not change. The compiler, runtime, and processor must all comply with the as-if-serial semantics.

To comply with the as-if-serial semantics, the compiler and processor do not reorder operations that have data dependencies, because such reordering would change the result of program execution. But if there is no data dependency between operations, the compiler and processor may reorder them.

Let’s use an example to illustrate

double pi = 3.14;         // A
double r = 1.0;           // B
double area = pi * r * r; // C

/*
 * There is a data dependency between A and C, and between B and C.
 * Therefore C cannot be reordered before A or B in the final
 * instruction sequence. But there is no data dependency between A
 * and B, so the compiler and processor may reorder A and B:
 *   A -> B -> C
 *   B -> A -> C
 */

The as-if-serial semantics protect single-threaded programs. Compilers, runtimes, and processors that adhere to the as-if-serial semantics collectively create the illusion that a single-threaded program executes in program order. Compliance with the as-if-serial semantics lets us write single-threaded programs without worrying about reordering or memory visibility.

6.3 Program sequence rules

From the above example, we can see that there are three happens-before relationships:

  1. A happens-before B
  2. B happens-before C
  3. A happens-before C

The third happens-before relationship is derived from the transitivity of happens-before. Although A happens-before B, B may actually be executed before A at run time: if A happens-before B, the JMM does not require that A be executed before B. The JMM only requires that the result of the first operation be visible to the second, and that the first be ordered before the second as far as the second can observe. Here the result of operation A does not need to be visible to B, and reordering A and B produces the same execution result as running them in happens-before order. In this case, the JMM considers the reordering not illegal, and the JMM allows it.

7. Sequential consistency

7.1 Data races and sequential consistency

A data race can occur when a program is not properly synchronized. The Java Memory Model specification defines a data race as:

    a write to a variable in one thread, a read of the same variable in another thread, where the write and the read are not ordered by synchronization.

When code contains data races, program execution often produces counterintuitive results. If a multithreaded program is correctly synchronized, it is free of data races. The JMM makes the following guarantee about memory consistency for correctly synchronized multithreaded programs: if a program is correctly synchronized, its execution will be sequentially consistent, that is, identical to an execution of the program in a sequentially consistent memory model.
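As a minimal sketch of "correctly synchronized" (names are my own): the write and the read of the shared variable below are ordered by the same lock, so the program is data-race-free and its result matches a sequentially consistent execution.

```java
// Correctly synchronized write/read: both accesses to 'shared' are
// guarded by the same lock, so there is no data race.
public class SynchronizedAccessDemo {
    static long shared = 0;
    static final Object lock = new Object();

    static long writeThenRead(long value) throws InterruptedException {
        Thread writer = new Thread(() -> {
            synchronized (lock) { shared = value; }
        });
        writer.start();
        writer.join();                        // ensures the write completed
        synchronized (lock) { return shared; }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(writeThenRead(7L));
    }
}
```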

7.2 Sequentially consistent memory model

The sequentially consistent memory model has two characteristics:

  1. All operations in a thread must be executed in program order.
  2. (Whether or not the threads are synchronized) all threads see only a single order of operations. In the sequentially consistent memory model, every operation must be performed atomically and be immediately visible to all threads.