
This article introduces in detail the concept, origin, and specific rules of the happens-before principle of the JMM (Java memory model).

The Java Memory Model (JMM) is a set of specifications and mechanisms defined by the Java Virtual Machine specification. It is an abstract concept; it does not physically exist. The goal of the JMM is to provide Java programmers with memory-visibility guarantees by controlling the interaction between main memory and each thread's local (working) memory, so that multiple threads can access shared variables correctly. Java uses the concrete, improved, and easier-to-understand happens-before principle to achieve this goal.

1. Two problems with concurrent programming

Concurrent programming must address two key issues: how threads communicate with each other and how threads synchronize with each other.

1.1 Communication

Communication refers to the mechanism by which threads exchange information.

  1. In imperative programming, there are two communication mechanisms between threads: shared memory and message passing.
  2. In the shared-memory concurrency model, threads share the program's common state and communicate implicitly by reading and writing that shared state in memory. Typical shared-memory communication goes through shared objects.

In the message-passing concurrency model, there is no shared state between threads; threads must communicate explicitly by sending messages. Typical message-passing mechanisms in Java are wait() and notify().
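To make the wait()/notify() style concrete, here is a minimal, hedged sketch of a one-slot message box; the class and method names are illustrative and not from any library:

```java
// Minimal sketch of message-passing-style communication with wait()/notify().
// MessageBox, take(), and put() are illustrative names, not a standard API.
public class MessageBox {
    private String message;       // the "message" exchanged between threads
    private boolean empty = true; // true until a sender deposits a message

    // Called by the receiving thread; blocks until a message is available.
    public synchronized String take() throws InterruptedException {
        while (empty) {           // loop guards against spurious wakeups
            wait();
        }
        empty = true;
        notifyAll();              // wake any sender waiting for an empty box
        return message;
    }

    // Called by the sending thread; blocks until the box is empty.
    public synchronized void put(String msg) throws InterruptedException {
        while (!empty) {
            wait();
        }
        message = msg;
        empty = false;
        notifyAll();              // wake any receiver waiting for a message
    }
}
```

Note that wait() and notify() must be called while holding the object's monitor (here, via the synchronized methods); otherwise the JVM throws IllegalMonitorStateException.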

1.2 Synchronization

Synchronization is the mechanism that a program uses to control the relative order in which operations occur between different threads.

  1. In the shared-memory concurrency model, synchronization is explicit: the programmer must explicitly specify that a method or piece of code is to be executed mutually exclusively between threads.
  2. In the message-passing concurrency model, synchronization is implicit, because a message must be sent before it can be received.

Concurrency in Java uses the shared-memory model, where communication between threads is implicit and completely transparent to the programmer. At the bottom, synchronization relies on critical sections: when a thread executes a code segment that accesses a shared resource, it must have exclusive use of that resource, and other threads are not allowed to access it at the same time.
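As a minimal sketch of such a critical section (the names are illustrative), a synchronized counter makes a read-modify-write sequence exclusive to one thread at a time:

```java
// Minimal sketch of a critical section guarded by an object's monitor.
// Without "synchronized", the two threads' increments could interleave
// and lose updates.
public class Counter {
    private int count = 0;

    public synchronized void increment() { // only one thread at a time
        count++;                           // read-modify-write is now exclusive
    }

    public synchronized int get() {
        return count;
    }

    // Runs two threads that each increment the counter n times.
    public static int raceFree(int n) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> { for (int i = 0; i < n; i++) c.increment(); };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return c.get();                    // deterministic: always 2 * n
    }
}
```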

2. Java memory model

2.1 Abstraction of memory model

The term “memory model” can be understood as an abstraction of the process of reading and writing a particular memory or cache under a particular operating protocol. It defines the behavioral rules for read and write operations on shared memory in a multithreaded program, and those rules regulate memory accesses so as to guarantee the correctness of instruction execution. It concerns the processor, the cache, concurrency, and the compiler. It addresses the memory-access problems caused by multi-level CPU caches, processor optimizations, and instruction reordering, and it ensures the correctness (consistency, atomicity, and ordering) of shared-memory data in concurrent scenarios.

Physical computers with different architectures can have different memory models. The Java virtual machine is itself a model of a complete computer, so this model naturally includes a memory model, known as the Java memory model, whose memory-access operations are highly comparable to the cache-access operations of hardware.

The Java Memory Model (JMM) is a set of specifications and mechanisms defined by the Java Virtual Machine specification. It is an abstract concept; it does not physically exist. It shields the memory-access differences of the various hardware platforms and operating systems, so that Java programs achieve consistent memory-access behavior (visibility, ordering, atomicity) on every platform, with no need to tailor programs to the memory models of the physical machines of different platforms.

The original Java memory model was not very efficient, so it was reworked in Java 1.5; that revised model is still in use in Java 8.

2.2 Main memory and Working Memory

The main goal of the Java memory model is to define the access rules for the variables in a program, that is, such low-level details as how the virtual machine stores variables into memory and retrieves them from memory: how and when one thread's changes to shared variables become visible to other threads, and how and when access to shared variables must be synchronized. "Variables" here differ from variables in Java source code: they include instance fields, static fields, and the elements that make up array objects, but not local variables and method parameters, which are private to a thread and cannot be shared.

The JMM defines an abstract relationship between threads and main memory: all shared variables are stored in main memory (the name is the same as the main memory of physical hardware and can be used by analogy, but here it refers only to a portion of virtual machine memory). Each thread also has its own working memory (analogous to the processor cache mentioned earlier, but only an abstract concept; it does not physically exist). A thread's working memory holds copies of the main-memory variables the thread uses (what is copied may be an object reference or a field of an object; a thread never copies an entire object at once). All of a thread's operations on variables (reads, assignments, and so on) must be performed in working memory; a thread cannot read or write variables in main memory directly. (Even volatile variables have copies in working memory, but because of the special ordering rules on their operations, they look as if they were read and written directly in main memory.)

Different threads cannot directly access variables in each other’s working memory, and the transfer of variable values between threads needs to be completed through the main memory. The JMM provides Java programmers with memory visibility assurance by controlling the interaction between main memory and the local memory (working memory) of each thread.

The interaction among thread, main memory and working memory is shown as follows:

The main memory and working memory here are not at the same level of memory division as the Java heap, stack, and method area of the Java runtime memory areas; the two are basically unrelated. If the two must be forced into correspondence, then, from the definitions of variable, main memory, and working memory, main memory mainly corresponds to the object instance data in the Java heap, while working memory corresponds to parts of the virtual machine stack. At a lower level, main memory corresponds directly to physical hardware memory, and a virtual machine (or even the hardware system's own optimizations) may preferentially keep working memory in registers and caches for better performance, since it is working memory that the program mainly reads and writes.

2.3 Memory Interaction

Just as there is a protocol for the interaction between a physical machine's cache and main memory, there is a specific protocol in Java between main memory and working memory, that is, the implementation details of how a variable is copied from main memory into working memory and how it is synchronized from working memory back into main memory. The Java memory model defines the following eight operations to complete this interaction, and virtual machine implementations must ensure that each of them is atomic and indivisible (with one exception: on some 32-bit virtual machine platforms, the load, store, read, and write operations on variables of type long and double are allowed to be non-atomic).

  1. Lock: acts on a main-memory variable; it marks a variable as exclusively owned by one thread. A variable can be locked by only one thread at a time.
  2. Unlock: acts on a main-memory variable; it releases a locked variable so that it can then be locked by another thread.
  3. Read: acts on a main-memory variable; it transfers the value of the variable from main memory into working memory for the subsequent load operation.
  4. Load: acts on a working-memory variable; it places the value obtained by read from main memory into the working-memory copy of the variable (the copy is relative to the main-memory variable).
  5. Use: acts on a working-memory variable; it passes the value of the variable in working memory to the execution engine. It is performed whenever the virtual machine reaches a bytecode instruction that needs the variable's value.
  6. Assign: acts on a working-memory variable; it assigns a value received from the execution engine to the working-memory variable. It is performed whenever the virtual machine reaches a bytecode instruction that assigns to the variable.
  7. Store: acts on a working-memory variable; it transfers the value of the variable in working memory to main memory for the subsequent write operation.
  8. Write: acts on a main-memory variable; it places the value obtained by store from working memory into the main-memory variable.
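As a rough, purely conceptual illustration (the JVM does not expose these operations to Java code; the class and field names are illustrative), a simple statement on a shared field can be annotated with the operations it involves:

```java
// Conceptual mapping of "shared = shared + 1" onto the eight JMM operations.
// The operations are performed by the JVM; the comments only annotate where
// each one conceptually happens.
public class InteractionSketch {
    static int shared = 0; // a shared variable, conceptually in main memory

    static void incrementOnce() {
        // read : transfer the value of shared from main memory
        // load : place that value into this thread's working-memory copy
        // use  : hand the copy's value to the execution engine
        int r = shared;

        r = r + 1;     // computed inside the execution engine

        // assign: write the engine's result into the working-memory copy
        // store : transfer the copy's value toward main memory
        // write : place that value into the main-memory variable
        shared = r;
    }
}
```

(lock and unlock are not shown; they come into play only when the code uses synchronized.)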

2.4 Synchronization rules for Interactive Operations

To ensure data consistency between working memory and main memory, the JMM requires that the eight operations above obey the following rules:

  1. To copy a variable from main memory into working memory, the read and load operations must be performed in order; to synchronize a variable from working memory back to main memory, the store and write operations must be performed in order. These are the steps the JVM takes on a main-memory variable; lock and unlock are not exposed to the programmer directly but are available through bytecode instructions (monitorenter and monitorexit, i.e. synchronized), and lock and unlock operations on variables also involve memory barriers. Note that the JMM requires only that these operations be performed sequentially, not consecutively: other instructions may be inserted between them.
  2. Neither of the pairs read/load and store/write is allowed to occur alone; that is, a variable may not be read from main memory and then rejected by working memory, nor may working memory initiate a write-back that main memory does not accept.
  3. A thread is not allowed to discard its most recent assign operation; that is, after a variable has changed in working memory, the change must be synchronized back to main memory.
  4. A thread is not allowed to synchronize a variable from its working memory back to main memory without having performed any assign operation on it.
  5. A new variable can only be born in main memory; working memory is not allowed to use a variable that has not been initialized (by load or assign). In other words, the load or assign operation must be performed on a variable before use or store may be performed on it.
  6. A variable can be locked by only one thread at a time, but the lock operation can be repeated several times by the same thread; the variable is unlocked only after the same number of unlock operations have been performed.
  7. If a lock operation is performed on a variable, its value is cleared from working memory, and a load or assign operation must be performed again to initialize the value before the execution engine can use the variable.
  8. It is not allowed to unlock a variable that has not previously been locked by a lock operation, nor to unlock a variable that is locked by another thread.
  9. Before an unlock operation can be performed on a variable, the variable must first be synchronized back to main memory (store, write).

The eight memory-access operations above, together with these nine rule restrictions and some special rules for volatile, completely determine which memory-access operations in a Java program are safe under concurrency; this is equivalent in effect to the happens-before principle.

3. Improved happens-before principle

3.1 JMM design intent

From a JMM designer’s perspective, there are two key considerations when designing the JMM:

  1. Programmer’s use of memory models. Programmers want the memory model to be easy to understand and easy to program. Programmers want to write code based on a strong memory model.
  2. Compiler and processor implementations of memory models. Compilers and processors want the memory model to tie them down as little as possible so they can make as many optimizations as possible to improve performance. The compiler and processor want to implement a weak memory model.

Because these two factors conflict, the core goal of the JSR-133 expert group in designing the JMM was to find the right balance: on the one hand, to provide programmers with sufficient memory-visibility guarantees; on the other hand, to restrict compilers and processors as little as possible. JSR-133 uses the modified happens-before principle to achieve this goal. For the relationship between the Java memory model, the sequentially consistent memory model, and the original happens-before memory model, see: Introduction to the relationship between the JMM and the sequential-consistency and happens-before models.

Here’s the code:

double pi = 3.14;         // A
double r = 1.0;           // B
double area = pi * r * r; // C

The example code above for calculating the area of a circle has three happens-before relationships, as follows.

  1. A happens-before B.
  2. B happens-before C.
  3. A happens-before C.

In these 3 happens-before relationships, 2 and 3 are required, but 1 is not necessary. Therefore, the JMM divides the reorderings prohibited by happens-before into the following two categories.

  1. Reordering that alters the results of program execution.
  2. Reordering that does not change the results of program execution.

The JMM has different strategies for these two different types of reordering, as follows.

  1. The JMM requires that the compiler and processor must forbid reordering that changes the results of program execution.
  2. For reordering that does not change the result of program execution, the JMM imposes no requirements on the compiler and processor (the JMM allows such reordering).

Here is a schematic of the JMM design:

Two things can be seen from the figure above:

  1. The happens-before rules that the JMM provides to programmers meet their needs. The JMM's happens-before rules are not only straightforward, but also give programmers a strong enough guarantee of memory visibility (some of the memory-visibility guarantees are not necessarily real orderings, such as A happens-before B above).
  2. The JMM has as few constraints on the compiler and processor as possible. From the above analysis, we can see that the JMM is following a basic principle: the compiler and processor can be optimized as long as the execution results of the program are not changed (i.e., single-threaded programs and properly synchronized multithreaded programs). For example, if the compiler, after careful analysis, determines that a lock can only be accessed by a single thread, that lock can be eliminated. For example, if the compiler determines, after careful analysis, that a volatile variable can only be accessed by a single thread, the compiler can treat the volatile variable as a normal variable. These optimizations will not change the execution result of the program, but also improve the execution efficiency of the program.

3.2 Definition of modified happens-before

Starting with JDK 1.5, Java uses the new JSR-133 memory model. The current Java memory model is based on the happens-before memory model rather than the sequentially consistent memory model, with some enhancements. Because the happens-before memory model is a weakly constrained memory model, when multiple threads compete for access to shared data it can produce unexpected results; some of these the Java memory model accepts, and some it does not.

JSR-133 uses the concept of happens-before to specify the order of execution between two operations and thereby describe memory visibility between operations. The two operations may be within a single thread or in different threads. Through happens-before relationships, the JMM provides programmers with cross-thread memory-visibility guarantees.

In the JMM, if the results of one operation need to be visible to another, there must be a happens-before relationship between the two operations.

JSR-133: Java™ Memory Model and Thread Specification defines the happens-before relationship as follows:

  1. If one action happens-before the other, the execution result of the first action will be visible to the second action, and the execution order of the first action precedes the second action.
  2. The existence of a happens-before relationship between two operations does not mean that a concrete implementation of the Java platform must execute them in the order the relationship specifies. A reordering is not illegal (that is, the JMM allows it) if its result is consistent with the result of executing in happens-before order.

1) above is the JMM’s promise to programmers. From A programmer’s perspective, the happens-before relationship can be understood this way: If A happens-before B, the Java memory model guarantees the programmer that the results of A’s operations will be visible to B, and that A takes precedence over B in execution order. Note that this is just a guarantee made by the Java memory model to the programmer that the order of execution is not necessarily as expected!

2) above is the JMM’s constraint on compiler and processor reordering. As mentioned earlier, the JMM follows a basic principle: the compiler and processor can be optimized as long as the execution results of the program (i.e., single-threaded programs and properly synchronized multithreaded programs) are not changed. The reason for this is that the programmer does not care whether the two operations are actually reordered, but that the semantics of the program execution cannot be changed (that is, the execution result cannot be changed). Thus, the happens-before relationship is essentially the same thing as the as-if-serial semantics.

3.3 as-if-serial and happens-before

The as-if-serial semantics guarantee that the execution result of a single-threaded program is not changed; the happens-before relationship guarantees that the execution result of a properly synchronized multithreaded program is not changed.

The as-if-serial semantics create an illusion for programmers writing single-threaded programs: that single-threaded programs execute in program order. The happens-before relationship creates an illusion for programmers writing properly synchronized multithreaded programs: that such programs execute in the order specified by happens-before.

The purpose of this is to make the execution of a program as efficient as possible without changing the result of its execution.
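A minimal single-threaded sketch of the freedom as-if-serial grants (the statement labels are illustrative): independent statements may be reordered, dependent ones may not, and the observable result is identical either way:

```java
// Single-threaded illustration of as-if-serial semantics.
public class AsIfSerial {
    public static int compute() {
        int a = 1;     // A: does not depend on B
        int b = 2;     // B: does not depend on A, so the compiler/processor
                       //    may freely reorder A and B
        int c = a + b; // C: depends on both A and B, so C can never be
                       //    moved before either of them
        return c;      // the result is the same under any legal reordering
    }
}
```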

3.4 Happens-before for single and multiple threads

Happens-before under a single thread: the bytecode order naturally implies happens-before relationships. Because a single thread uses a single working memory, there is no data-consistency problem.

Within a thread, the bytecode that comes first in the control-flow path happens-before the bytecode that comes later; that is, after the earlier bytecode has executed, its result is visible to the later bytecode. This does not mean, however, that the former is necessarily executed before the latter: in fact, they may be reordered if the latter does not depend on the result of the former.

Happens-before under multiple threads: since each thread keeps its own copy of shared variables, if thread 1 updates a shared variable in operation A and thread 2 then performs operation B, the result of A may not be visible to B when the shared variable is not synchronized. Program order alone establishes no happens-before relationship across the two threads; it is not enough.
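One hedged way to repair that scenario is to publish through a volatile flag, so that the volatile write/read pair establishes a happens-before edge between the two threads (all names here are illustrative):

```java
// Cross-thread publication via a volatile flag. The volatile write of
// "flag" by the writer happens-before the reader's volatile read of
// "flag", so the earlier plain write to "value" becomes visible too.
public class Publication {
    static int value = 0;                 // plain shared variable
    static volatile boolean flag = false; // volatile flag carries the hb edge
    static int observed = -1;             // what the reader actually saw

    public static int publishAndRead() throws InterruptedException {
        Thread writer = new Thread(() -> {
            value = 42;   // A: plain write
            flag = true;  // volatile write: happens-before any later
        });               //    volatile read that sees true
        Thread reader = new Thread(() -> {
            while (!flag) {      // volatile read; spin until published
                Thread.yield();
            }
            observed = value;    // guaranteed to see 42: A happens-before here
        });
        reader.start();
        writer.start();
        writer.join();
        reader.join();
        return observed;
    }
}
```

Without the volatile modifier on flag, the reader could spin forever or observe a stale value of 0; with it, the result is deterministic.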

3.5 Specific happens-before principles in the Java memory model

Below are the “natural” happens-before relationships under the Java memory model; they already exist without the assistance of any synchronizer and can be relied on directly in code. If the relationship between two operations is not in this list and cannot be deduced from the following rules, their order is not guaranteed and the virtual machine may reorder them at will.

  1. Program Order Rule: within a thread, in program-code order, operations written earlier happen-before operations written later. More precisely, this is control-flow order rather than program-code order, since branches, loops, and so on must be considered.
  2. Monitor Lock Rule: an unlock operation on a lock happens-before a subsequent lock operation on the same lock. It must be emphasized that this is the same lock, and "subsequent" refers to chronological order.
  3. Volatile Variable Rule: a write to a volatile variable happens-before a subsequent read of that variable, again in chronological order.
  4. Thread Start Rule: the start() method of a Thread object happens-before every action of the started thread.
  5. Thread Termination Rule: all operations in a thread happen-before the detection of that thread's termination. We can detect that a thread has terminated by the return of Thread.join(), the return value of Thread.isAlive(), and so on.
  6. Thread Interruption Rule: a call to a thread's interrupt() method happens-before the point where the interrupted thread's code detects the interruption (for example, via Thread.interrupted() or isInterrupted()).
  7. Finalizer Rule: the completion of an object's initialization (the end of constructor execution) happens-before the start of its finalize() method.
  8. Transitivity: if operation A happens-before operation B and operation B happens-before operation C, then operation A happens-before operation C. From the rules above it is clear that, in practice, memory visibility is usually achieved by using either the volatile keyword or the locking mechanism.
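The thread start and termination rules above can be sketched together (the class and field names are illustrative): a write made before start() is visible inside the new thread, and the thread's writes are visible after join() returns:

```java
// Demonstrates the Thread Start Rule and the Thread Termination Rule.
public class StartJoinRules {
    static int input;   // written before start(), read by the child thread
    static int result;  // written by the child, read after join()

    public static int run() throws InterruptedException {
        input = 21;                 // happens-before t.start() ...
        Thread t = new Thread(() -> {
            result = input * 2;     // ... so the child is guaranteed to
        });                         //     see input == 21
        t.start();
        t.join();                   // all of t's actions happen-before
                                    // join() returning ...
        return result;              // ... so result is guaranteed visible here
    }
}
```

No volatile or synchronized is needed here: the start() and join() calls themselves carry the happens-before edges.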

Note:

  1. The existence of a happens-before relationship between two operations does not mean that a concrete implementation of the Java platform must execute them in the order the relationship specifies. A reordering is not illegal (that is, the JMM allows it) if its result is consistent with the result of executing in happens-before order, in which the earlier operation is visible to the later one.
  2. The JMM follows a basic principle: as long as you don’t change the results of your program (single-threaded programs and properly synchronized multithreaded programs), the compiler and processor can optimize as much as they like. The reason for this is that the programmer does not care whether the two operations are actually reordered, but that the semantics of the program execution cannot be changed (that is, the execution result cannot be changed). Thus, the happens-before relationship is essentially the same thing as the as-if-serial semantics.

3.6 Happens-before relationship with JMM

As shown in the figure above, a happens-before rule corresponds to one or more compiler and processor reordering rules. For Java programmers, the happens-before rules are straightforward; they spare programmers from having to learn the complex reordering rules and their implementations in order to understand the memory-visibility guarantees the JMM provides.

In computers, software technology and hardware technology have a common goal: to maximize parallelism without changing the results of program execution. Compilers and processors follow this goal, and as you can see from the definition of happens-before, the JMM follows this goal as well.

References:

  1. JSR133 Specification
  2. In-depth Understanding of the Java Virtual Machine
  3. The Beauty of Concurrent Programming in Java
  4. The Art of Concurrent Programming in Java
  5. Practical Java High Concurrency Programming

If anything is unclear or you would like to discuss it, feel free to leave a comment. I would also appreciate likes, bookmarks, and follows; I will keep publishing Java learning posts of all kinds!