This article was first published on the 51CTO Technology Stack public account. Author: Chen Caihua. For reprints or discussion, please contact [email protected]

Recently I reread “In-depth Study of Java Virtual Machine” and sorted out the parts of the Java memory model that I had previously understood only vaguely. This article covers the background from which the model arose, the problems it has to solve, the approach it takes, the relevant implementation rules, and how these pieces connect. I hope readers come away with a reasonably clear picture of the Java memory model, knowing both what the rules are and why they exist.

1 Background of the memory model

Before introducing the Java memory model, let’s look at concurrency problems in physical computers. Understanding them clarifies the context in which the memory model arose. The concurrency problems of physical machines and virtual machines have a lot in common, and the solutions used by physical machines are a valuable reference for virtual machine implementations.

Concurrency problems in physical machines

  • Hardware efficiency issues

Most of the tasks that a computer processor performs cannot be “calculated” by the processor alone. At a minimum, the processor needs to interact with memory, such as reading the data of the operation and storing the results of the operation. This I/O operation is difficult to eliminate.

Because a computer’s storage devices are several orders of magnitude slower than the processor, modern computer systems add a layer of cache whose read and write speed is as close as possible to that of the processor, so that the processor does not have to wait for slow memory reads and writes to complete. The cache acts as a buffer between memory and the processor: the data needed for an operation is copied into the cache so that the operation can run quickly, and the result is synchronized from the cache back to memory when the operation finishes.

  • Cache consistency issues

Cache-based storage interaction resolves the speed mismatch between processor and memory well, but it also makes the computer system more complex, because it introduces a new problem: cache consistency.

In a multiprocessor system (or a single-processor multi-core system), each processor (each core) has its own cache, and they share the same Main Memory. When multiple processors’ computing tasks involve the same main memory area, their cache data may be inconsistent. Therefore, each processor needs to follow some protocols when accessing the cache and operate according to the protocols when reading and writing to maintain the consistency of the cache.

  • Out-of-order execution optimization issues

To make full use of its internal execution units and improve efficiency, the processor may execute the input code out of order and reassemble the results of the out-of-order execution afterwards. Within a single thread, this optimization guarantees that the result is the same as that of sequential execution. It does not guarantee, however, that the statements of the program are computed in the same order as they appear in the input code.

Out-of-order execution is an optimization that the processor applies against the original order of the code in order to run faster. In the single-core era, the processor could guarantee that these optimizations would not make the execution result deviate from the expected one, but that is no longer the case in a multi-core environment.

In a multi-core environment, if a computing task on one core depends on an intermediate result produced by a task on another core, and the reads and writes of the related data are not protected in any way, the order of the code can no longer guarantee the order of execution, and the final result produced by the processors may differ considerably from what the program logic expects.

As shown in the figure above, logic B on Core2 of the CPU depends on logic A on Core1 having executed first:

  • Normally, logic A is executed before logic B is executed.
  • If the processor performs the optimization out of order, flag may be set to true in advance, causing logic B to execute before logic A.

2 Composition of the Java memory model

Memory model concept

The memory model was proposed to address the problems described above. We can think of a memory model as an abstraction of the process of reading and writing a particular memory or cache under a particular operating protocol.

Physical computers of different architectures can have different memory models, and the Java Virtual Machine has its own. The Java Virtual Machine specification defines a Java Memory Model (JMM) to mask the memory access differences between hardware and operating systems, so that Java programs get consistent memory access behavior on every platform and do not have to be customized for each platform because of differences in the memory models of the underlying physical machines.

More specifically, the goal of the Java memory model is to define the access rules for the variables in a program, that is, the low-level details of how the virtual machine stores variables to memory and fetches them from memory. The variables meant here are not the same as variables in everyday Java programming: they include instance fields, static fields, and the elements that make up array objects, but not local variables and method parameters, which are thread-private. (If a local variable is of a reference type, the object it refers to lives in the Java heap and can be shared by threads, but the reference itself sits in the local variable table of the Java stack and is thread-private.)
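To make the distinction concrete, here is a minimal sketch of which kinds of variables fall under the memory model and which are thread-private. The class `VariableKinds` and all of its members are hypothetical names chosen for illustration.

```java
// Illustrative only: the class and member names are hypothetical.
class VariableKinds {
    static int staticCounter = 0;       // static field: shared, covered by the JMM
    int instanceValue = 0;              // instance field: shared, covered by the JMM
    final int[] elements = new int[4];  // array elements: shared, covered by the JMM

    // The parameter `delta` and the local `sum` live in the calling thread's
    // stack frame, so no other thread can observe them.
    int addLocally(int delta) {
        int sum = instanceValue + delta;
        return sum;
    }

    // `local` is a thread-private reference, but the array it points to is on
    // the heap and is reachable by other threads through `elements`.
    void writeThroughLocalReference() {
        int[] local = elements;
        local[0] = 42;
    }

    public static void main(String[] args) {
        VariableKinds v = new VariableKinds();
        v.writeThroughLocalReference();
        System.out.println(v.elements[0]);   // prints 42
        System.out.println(v.addLocally(5)); // prints 5
    }
}
```

Only the fields and the array elements can ever be the subject of a visibility or ordering problem; the locals and parameters cannot.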

Composition of the Java memory model

  • Main memory: The Java memory model specifies that all variables are stored in main memory. (The name is the same as the main memory of physical hardware and the two are analogous, but here it refers only to a part of the virtual machine’s memory.)

  • Working memory: Each thread has its own working memory, which holds copies of the main-memory shared variables that the thread uses. Working memory is an abstraction of the JMM and does not physically exist; it covers caches, write buffers, registers, and other hardware and compiler optimizations.

The abstract diagram of the Java memory model is as follows:

Concurrency issues with JVM memory operations

The rules of the Java memory model described below focus on solving these two problems:

  • 1 Data consistency of working memory: Each thread works on its own copy of the shared variables held in main memory. When the computing tasks of several threads involve the same shared variable, their copies may diverge. If that happens, whose copy should be taken as correct when the data is synchronized back to main memory? The Java memory model ensures data consistency mainly through a series of data synchronization protocols and rules, described in detail later.

  • 2 Instruction reordering optimization: In Java, reordering usually means that the compiler or the runtime environment rearranges the execution order of instructions to optimize program performance. There are two kinds, compile-time reordering and run-time reordering, corresponding to the compile-time and run-time environments respectively. Instructions cannot be reordered arbitrarily; reordering must satisfy the following two conditions:

    • 1 The just-in-time compiler (and the processor) must ensure that the program keeps the as-if-serial property. In plain terms, within a single thread the program must appear to execute sequentially: the result of the reordered execution must be the same as the result of sequential execution.
    • 2 Operations that have a data dependency on each other must not be reordered.

In a multi-threaded environment, if the logic in different threads depends on each other, instruction reordering may produce a result different from the expected one. How the Java memory model addresses this is covered later.
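The two conditions can be illustrated with three statements. This is only a sketch (the class name `AsIfSerialDemo` is hypothetical), and whether a particular compiler or processor actually swaps the first two statements is not something the code can show; the comments mark what would be permitted.

```java
// Hypothetical example: marks which statements may and may not be reordered.
class AsIfSerialDemo {
    static int compute() {
        int a = 1;     // (1) independent of (2)
        int b = 2;     // (2) independent of (1), so (1) and (2) may be swapped
        int c = a + b; // (3) depends on (1) and (2), so it must stay after both
        return c;
    }

    public static void main(String[] args) {
        System.out.println(compute()); // prints 3
    }
}
```

Whichever of (1) and (2) runs first, a single thread always observes the result 3, which is exactly the as-if-serial guarantee.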

3 Java memory interaction

Before looking at the protocols and special rules of the Java memory model, let’s first understand how main memory and working memory interact in Java.

Interaction process

To better understand memory interaction, let’s look at how to synchronize values between threads using thread communication as an example:

Thread 1 and thread 2 each hold a copy of the shared variable x from main memory. Initially x is 0 in all three places. After thread 1 updates x to 1, getting the new value to thread 2 takes two steps:

  • Thread 1 flushes the updated value of x from its working memory to main memory
  • Thread 2 reads from main memory the value of x that thread 1 has just updated

Taken together, these two steps amount to thread 1 sending a message to thread 2, and this communication must pass through main memory. All operations (reads, assignments) that a thread performs on a variable must take place in working memory. Threads cannot directly access variables in each other’s working memory; variable values are passed between threads through main memory, which is how the visibility of shared variables is provided to each thread.

Basic operations of memory interaction

The Java memory model defines the following eight operations to implement the specific protocol of interaction between main memory and working memory, i.e. how a variable is copied from main memory to working memory and synchronized back from working memory to main memory.

Virtual machine implementations must ensure that each of the operations described below is atomic and indivisible. (For variables of type double and long, the load, store, read, and write operations are allowed exceptions on some platforms, as described later.)

Eight basic operations

  • lock: acts on a variable in main memory; marks the variable as exclusively owned by one thread.
  • unlock: acts on a variable in main memory; releases a locked variable so that it can be locked by another thread.
  • read: acts on a variable in main memory; transfers the value of the variable from main memory to the thread’s working memory for the subsequent load.
  • load: acts on a variable in working memory; puts the value obtained from main memory by read into the working-memory copy of the variable.
  • use: acts on a variable in working memory; passes the value of the variable in working memory to the execution engine. It is performed whenever the virtual machine encounters a bytecode instruction that needs the value of the variable.
  • assign: acts on a variable in working memory; assigns a value received from the execution engine to the variable in working memory. It is performed whenever the virtual machine encounters a bytecode instruction that assigns a value to the variable.
  • store: acts on a variable in working memory; transfers the value of the variable in working memory to main memory for the subsequent write.
  • write: acts on a variable in main memory; puts the value obtained from working memory by store into the variable in main memory.
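The data flow of these operations can be sketched with a toy, single-threaded simulation that replays the x example from the interaction process. This is not how a JVM is implemented and none of these operations are exposed as an API; the classes `ToyMainMemory`, `ToyWorkingMemory`, and `ToyJmmDemo` are hypothetical, and lock and unlock are left out to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: real JVMs do not expose these operations as an API.
class ToyMainMemory {
    final Map<String, Integer> variables = new HashMap<>();
}

class ToyWorkingMemory {
    private final ToyMainMemory main;
    private final Map<String, Integer> copies = new HashMap<>();
    private Integer inTransit; // value travelling between the two memories

    ToyWorkingMemory(ToyMainMemory main) { this.main = main; }

    void read(String name)  { inTransit = main.variables.get(name); }  // main memory -> in transit
    void load(String name)  { copies.put(name, inTransit); }           // in transit -> working copy
    int use(String name)    { return copies.get(name); }               // working copy -> execution engine
    void assign(String name, int value) { copies.put(name, value); }   // execution engine -> working copy
    void store(String name) { inTransit = copies.get(name); }          // working copy -> in transit
    void write(String name) { main.variables.put(name, inTransit); }   // in transit -> main memory
}

class ToyJmmDemo {
    // Returns {value thread 2 sees before synchronization, value it sees after}.
    static int[] run() {
        ToyMainMemory main = new ToyMainMemory();
        main.variables.put("x", 0);
        ToyWorkingMemory t1 = new ToyWorkingMemory(main);
        ToyWorkingMemory t2 = new ToyWorkingMemory(main);

        // Both "threads" start with a copy of x = 0.
        t1.read("x"); t1.load("x");
        t2.read("x"); t2.load("x");

        t1.assign("x", 1);            // thread 1 updates only its own copy
        int stale = t2.use("x");      // thread 2 still sees its old copy

        t1.store("x"); t1.write("x"); // step 1: flush to main memory
        t2.read("x"); t2.load("x");   // step 2: refresh from main memory
        int fresh = t2.use("x");

        return new int[] { stale, fresh };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "0 1"
    }
}
```

The stale read in the middle is the situation the rest of the rules are designed to control: until store/write on one side and read/load on the other have happened, thread 2 cannot see thread 1’s update.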

4 Java memory model running rules

4.1 Three properties of the basic memory interaction operations

Before introducing the specific rules for the eight basic memory interaction operations, it is necessary to introduce three properties of the operations. The Java memory model is built around how to handle these three properties during concurrency. Here is a brief definition and basic implementation of each; the analysis then proceeds step by step.

  • Atomicity: one or more operations are either all performed, without being interrupted by anything, or not performed at all. Even when multiple threads run together, once an operation has started it is not disturbed by other threads.

  • Visibility: when multiple threads access the same variable and one thread modifies its value, the other threads can immediately see the modified value. As explained in the interaction process above, the JMM implements visibility with main memory as the medium: after thread 1 modifies the variable in its working memory, the new value is synchronized back to main memory, and thread 2 refreshes the value from main memory before reading the variable.

  • Ordering: ordering has to be considered in two scenarios, within a thread and between threads

    • Within a thread, instructions execute in the manner known as as-if-serial, the same model used by sequential programming languages.
    • When one thread observes other threads concurrently executing unsynchronized code, any of that code may appear interleaved because of instruction reordering optimization. The only constraints that hold are that synchronized methods, synchronized blocks, and operations on volatile fields keep their relative order.

The rules of the Java memory model may look tedious, but they all revolve around atomicity, visibility, and ordering. In the end their purpose is to make programs behave as expected in an environment with shared variables, copies in the working memory of multiple threads, concurrent threads, and instruction reordering optimization.

4.2 Happens-before relationship

The happens-before relationship describes memory visibility between two operations: if operation A happens-before operation B, then the result of A is visible to B. The analysis has to be split into the single-threaded and the multi-threaded case:

  • Happens-before within a single thread: the bytecode sequence of a single thread naturally carries happens-before relationships. Because all bytecodes of a single thread share the same working memory, there is no data consistency problem. A bytecode that comes earlier on the program control flow path happens-before a later one, i.e. once the earlier bytecode has finished executing, its result is visible to the later bytecode. This does not mean that the former necessarily executes before the latter, however: if the latter does not depend on the result of the former, the two may be reordered.

  • Happens-before across multiple threads: because each thread has its own copy of the shared variables, if no synchronization is applied to a shared variable, then after thread 1 performs operation A (updating the shared variable) and thread 2 starts operation B, the result of A may not be visible to B.

To make programs easier to develop, the Java memory model defines the following happens-before rules:

  • Program order rule: Within a thread, following the order of the code, an operation written earlier happens-before an operation written later.
  • Lock rule: An unlock operation happens-before every subsequent lock operation on the same lock.
  • Volatile variable rule: A write to a volatile variable happens-before every subsequent read of that variable.
  • Transitivity: If operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.
  • Thread start rule: A call to the start() method of a Thread object happens-before every action of the started thread.
  • Thread interruption rule: A call to Thread.interrupt() happens-before the point where the code of the interrupted thread detects that the interrupt has occurred.
  • Thread termination rule: All actions in a thread happen-before the detection of that thread’s termination. Termination can be detected by Thread.join() returning or by the return value of Thread.isAlive().
  • Object finalization rule: The completion of an object’s initialization happens-before the start of its finalize() method.
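As a small illustration of how these rules are relied on in practice, the sketch below uses only the thread start rule, the thread termination rule, and program order to pass data between two threads through plain (non-volatile) fields. The class `HappensBeforeDemo` and its field names are hypothetical.

```java
// Hypothetical example: visibility obtained purely from start() and join().
class HappensBeforeDemo {
    static int input;  // plain fields on purpose: no volatile, no locks
    static int result;

    static int run() {
        input = 20;                  // (1) written before start()
        Thread worker = new Thread(() -> {
            result = input + 1;      // (2) sees 20: start() happens-before the worker's actions
        });
        worker.start();
        try {
            worker.join();           // (3) returns only after the worker has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;               // (4) sees 21: the worker's actions happen-before join() returning
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 21
    }
}
```

By transitivity, (1) happens-before (2) and (2) happens-before (4), so no further synchronization is needed for this particular hand-off.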

4.3 Memory Barrier

How are ordering and visibility guaranteed at the lowest level in Java? Through memory barriers.

A memory barrier is an instruction inserted between two CPU instructions that prevents the processor from reordering instructions across it (like a fence), which guarantees ordering. In addition, to achieve the barrier effect, the processor synchronizes its cache with main memory before the subsequent writes or reads and clears the invalidation queue, which guarantees visibility.

Here’s an example:

    Store1;
    Store2;
    Load1;
    StoreLoad; // Memory barrier
    Store3;
    Load2;
    Load3;

For the above sequence of CPU instructions (Store is a write, Load is a read), the Store instructions before the StoreLoad barrier cannot be reordered with the Load instructions after it. Instructions on the same side of the barrier may still be reordered with each other: Store1 may be swapped with Store2, and Load2 with Load3.

There are four common barriers:

  • LoadLoad barrier: for a sequence such as Load1; LoadLoad; Load2, the data read by Load1 is guaranteed to have been read before Load2 and any subsequent reads access their data.
  • StoreStore barrier: for a sequence such as Store1; StoreStore; Store2, the write of Store1 is guaranteed to be visible to other processors before Store2 and any subsequent writes are performed.
  • LoadStore barrier: for a sequence such as Load1; LoadStore; Store2, the data read by Load1 is guaranteed to have been read before Store2 and any subsequent writes are performed.
  • StoreLoad barrier: for a sequence such as Store1; StoreLoad; Load2, the write of Store1 is guaranteed to be visible to all processors before Load2 and any subsequent reads are performed. It has the largest overhead of the four barriers (it flushes the write buffer and clears the invalidation queue). On most processors this barrier is a general-purpose barrier that also has the effect of the other three.

In ordinary Java code, memory barriers are not used directly and are not easy to spot. They come into play through code that uses the volatile and synchronized keywords (described later), and they can also be issued through the Unsafe class.
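Besides the Unsafe class mentioned above, Java 9 and later expose explicit fences as static methods on `java.lang.invoke.VarHandle`. The sketch below uses those instead of Unsafe, which is a substitution on my part, and assumes a Java 9+ runtime; the class `FenceDemo` and its fields are hypothetical. It only shows where the fences sit relative to the reads and writes. In this demo the reader runs after `join()`, so visibility is already guaranteed by the thread termination rule and the fences are illustrative rather than required.

```java
import java.lang.invoke.VarHandle;

// Hypothetical example showing where explicit fences would be placed (Java 9+).
class FenceDemo {
    static int data;      // plain fields; ordering comes from the explicit fences
    static boolean ready;

    static void publish(int value) {
        data = value;
        VarHandle.storeStoreFence(); // StoreStore: the data write is ordered before the ready write
        ready = true;
        VarHandle.fullFence();       // full fence: also covers the StoreLoad case
    }

    static int consume() {
        boolean seen = ready;
        VarHandle.loadLoadFence();   // LoadLoad: the ready read is ordered before the data read
        return seen ? data : -1;
    }

    static int run() {
        Thread writer = new Thread(() -> publish(42));
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consume();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

In application code the volatile and synchronized keywords are almost always the better choice; explicit fences are mainly a tool for library authors.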

4.4 Synchronization rules for the eight operations

To ensure data consistency between main memory and working memory, the JMM requires the following rules to be observed when the eight basic operations above are performed:

  • Rule 1: To copy a variable from main memory to working memory, perform read and load in that order. To synchronize a variable from working memory back to main memory, perform store and write in that order. The Java memory model only requires that these operations be performed in order; it does not require that they be performed consecutively.
  • Rule 2: read and load, and store and write, must each appear as a pair; none of these operations is allowed to appear alone.
  • Rule 3: A thread is not allowed to discard its most recent assign operation, that is, a variable changed in working memory must be synchronized to main memory.
  • Rule 4: A thread is not allowed to synchronize data from working memory back to main memory for no reason (when no assign operation has occurred).
  • Rule 5: A new variable can only be created in main memory. Working memory must not use a variable that has not been initialized (by load or assign); in other words, a load or assign must be performed on a variable before use or store can be performed on it.
  • Rule 6: Only one thread may lock a variable at a time, but the same thread may repeat the lock operation several times. After locking several times, the variable is unlocked only after the same number of unlock operations, so lock and unlock must appear in pairs.
  • Rule 7: Performing a lock operation on a variable clears its value from working memory; before the execution engine uses the variable, a load or assign must be performed again to initialize its value.
  • Rule 8: A variable that has not been locked by a lock operation must not be unlocked, and a thread must not unlock a variable that is locked by another thread.
  • Rule 9: Before a variable is unlocked, it must be synchronized to main memory (store and write).

These rules may look cumbersome, but they are not hard to understand:

  • Rules 1 and 2: Shared variables in working memory are copies of main memory. Synchronizing a value from main memory to working memory requires read and load together, and synchronizing a value from working memory back to main memory requires store and write together. Within each pair the two operations run in a fixed order, and neither may appear alone.
  • Rules 3 and 4: Because shared variables in working memory are copies of main memory, to ensure data consistency a variable in working memory must be synchronized back to main memory once the execution engine has reassigned it. If a variable in working memory has not been updated, synchronizing it back to main memory for no reason is not allowed.
  • Rule 5: Because shared variables in working memory are copies of main memory, they must originate in main memory.
  • Rules 6, 7, 8, and 9: To use a variable safely under concurrency, a thread can take exclusive ownership of the variable in main memory with the lock operation. Other threads are not allowed to use or unlock the variable until the owning thread unlocks it.

4.5 Special Rules for Volatile variables

Volatile means unstable or changeable. A variable is declared volatile to guarantee its visibility across threads.

Semantics of volatile

Volatile has two main semantics:

Semantics 1: guarantees visibility

Operations on a volatile variable by different threads are visible to each other in memory.

Guaranteeing visibility is not the same as guaranteeing that concurrent operations on a volatile variable are safe.

The process of a thread writing a volatile variable:

  • 1 Change the value of the copy of the volatile variable in the thread’s working memory
  • 2 Flush the changed copy from working memory to main memory

The process of a thread reading a volatile variable:

  • 1 Read the latest value of the volatile variable from main memory into the thread’s working memory
  • 2 Read the copy of the volatile variable from working memory

However, if multiple threads flush their updated values back to main memory at the same time, the resulting value may not be the expected one:

For example, define volatile int count = 0 and let two threads execute count++ concurrently. The final value can be smaller than the total number of increments, because each count++ takes the following three steps:

  • Step 1 The thread reads the latest count value from main memory
  • Step 2 The execution engine increments the count value by 1 and assigns the result to the thread’s working memory
  • Step 3 The thread’s working memory saves the count value to main memory

It is possible that at some moment both threads read the value 100 in step 1 and both obtain 101 after step 2; 101 is then flushed to main memory twice, and one increment is lost.
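The lost update can be reproduced with a short sketch. It increments a volatile counter and, for comparison, a `java.util.concurrent.atomic.AtomicInteger`, which is one standard way to make the increment atomic; the comparison is an addition of mine and is not part of the original example. The class `CounterDemo` and the thread and iteration counts are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: volatile gives visibility, not atomicity of count++.
class CounterDemo {
    static volatile int volatileCount;
    static final AtomicInteger atomicCount = new AtomicInteger();

    // Returns {final volatile count, final atomic count}.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    volatileCount++;               // read, add, write: three separate steps
                    atomicCount.incrementAndGet(); // a single atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] r = run(2, 100_000);
        System.out.println("volatile: " + r[0] + ", atomic: " + r[1]);
    }
}
```

The atomic counter always ends at threads × incrementsPerThread, while the volatile counter can end anywhere up to that value, and on a multi-core machine it usually ends below it.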

Semantics 2: disallows instruction reordering

Specifically, the rules against reordering are as follows:

  • When the program reaches a read or write of a volatile variable, all changes made by the operations before it must already have taken place and their results must be visible to the operations after it, while the operations after it must not yet have taken place;
  • During instruction optimization, statements before the access to the volatile variable must not be moved after it, and statements after it must not be moved before it.

Ordinary variables only guarantee that every place in the method that depends on the result of an assignment obtains the correct result; they do not guarantee that assignments are performed in the same order as in the program code.

Here’s an example:

    volatile boolean initialized = false;

    // The following code runs in thread A: it reads the configuration information
    // and, when the read is complete, sets initialized to true to notify other
    // threads that the configuration is available
    doSomethingReadConfg();
    initialized = true;

    // The following code runs in thread B: it waits for initialized to become true,
    // which means thread A has initialized the configuration information
    while (!initialized) {
        sleep();
    }
    // Use the configuration information initialized by thread A
    doSomethingWithConfig();

In the above code, if initialized were not declared volatile, instruction reordering optimization could cause the last statement in thread A, “initialized = true”, to be executed before “doSomethingReadConfg()”. The code in thread B that uses the configuration information could then go wrong. The volatile keyword prevents this because of its no-reordering semantics.
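A runnable variant of this pattern might look like the following sketch. The class `ConfigDemo`, the `config` field, and the use of `Thread.yield()` in place of the original `sleep()` are my own simplifications for illustration.

```java
// Hypothetical runnable version of the configuration example.
class ConfigDemo {
    static volatile boolean initialized = false;
    static String config; // plain field, published through the volatile flag

    static String run() {
        initialized = false;
        config = null;
        String[] seen = new String[1];

        Thread b = new Thread(() -> {
            while (!initialized) { // volatile read: guaranteed to observe the write eventually
                Thread.yield();
            }
            seen[0] = config;      // sees "loaded": the write to config precedes the volatile write
        });
        Thread a = new Thread(() -> {
            config = "loaded";     // stands in for reading the configuration
            initialized = true;    // volatile write: cannot be reordered before the line above
        });

        b.start();
        a.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "loaded"
    }
}
```

Note that only the flag is volatile. The configuration data itself is an ordinary field and becomes visible to thread B because its write comes before the volatile write in program order.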

Implementation principle of volatile variables

When the code is compiled, memory barriers are inserted into the instruction sequence to guarantee the semantics above. The following is the conservative memory barrier insertion strategy of the JMM:

  • Insert a StoreStore barrier before each volatile write. It prevents the ordinary writes before the barrier from being reordered with the volatile write after it, and ensures that those earlier writes are committed before the volatile write.

  • Insert a StoreLoad barrier after each volatile write. Besides preventing the volatile write from being reordered with subsequent reads, this barrier flushes the processor cache so that the update to the volatile variable is visible to other threads.

  • Insert a LoadLoad barrier after each volatile read. Besides preventing the volatile read from being reordered with subsequent reads, this barrier refreshes the processor cache so that the volatile variable is read with its latest value.

  • Insert a LoadStore barrier after each volatile read. Besides preventing the volatile read from being reordered with any subsequent write, this barrier refreshes the processor cache so that updates to the volatile variable written by other threads are visible to the thread performing the volatile read.

Usage scenarios of volatile variables

In short, the pattern is “write once, read everywhere”: one thread updates the variable, and the other threads only read it (without updating it) and run their logic based on the new value. Examples are updating a status flag and publishing a variable’s value in the observer pattern.

4.6 Special Rules for Final variables

As we know, a final member variable must be initialized either at its declaration or in the constructor, otherwise compilation fails. The visibility provided by the final keyword means that once a final field has been initialized, and provided the constructor has not let the this reference escape, other threads can see the correct value of the field without any synchronization. This is because the value of a final variable is written back to main memory as soon as its initialization completes.
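The guarantee can be sketched as follows. The classes `ImmutablePoint` and `FinalFieldDemo` are hypothetical, and the object is deliberately published through a plain field without synchronization. The reader may or may not see the reference, but if it does, the final fields are guaranteed to hold their constructor-assigned values.

```java
// Hypothetical example: final fields stay correct even under racy publication.
class ImmutablePoint {
    final int x;
    final int y;

    ImmutablePoint(int x, int y) { // `this` does not escape the constructor
        this.x = x;
        this.y = y;
    }
}

class FinalFieldDemo {
    static ImmutablePoint shared;  // deliberately plain: unsynchronized publication

    // Returns -1 if the reader did not see the object, otherwise x + y.
    static int run() {
        shared = null;
        int[] observed = { -1 };
        Thread writer = new Thread(() -> {
            shared = new ImmutablePoint(3, 4);
        });
        Thread reader = new Thread(() -> {
            ImmutablePoint p = shared;   // racy read: may still be null
            if (p != null) {
                observed[0] = p.x + p.y; // never a half-initialized point
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints -1 or 7, never a partial sum
    }
}
```

If x and y were not final, the reader could in principle observe the reference while one or both fields still held the default value 0.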

4.7 Special Rules for synchronized

A region of code enclosed by the synchronized keyword controls how data is read and written:

  • Reading data: when a thread enters the region and reads a variable, it may not take the value from its working memory; it must read from main memory, which guarantees that it reads the latest value.
  • Writing data: when a thread that has written to a variable inside the synchronized region leaves the region, the data in its working memory is flushed to main memory, which guarantees that the update is visible to other threads.
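As a minimal sketch of these two rules at work, the counter below is an ordinary field that is only ever touched inside synchronized methods. The class `SynchronizedCounter` and the thread counts are hypothetical.

```java
// Hypothetical example: a plain field made safe by synchronized methods.
class SynchronizedCounter {
    private int count; // plain field, guarded by the lock on `this`

    synchronized void increment() {
        count++;       // entering re-reads count, leaving flushes the new value
    }

    synchronized int get() {
        return count;
    }

    static int run(int threads, int incrementsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 50_000)); // prints 200000
    }
}
```

Unlike volatile, synchronized also makes the whole read-modify-write sequence mutually exclusive, so no increments are lost.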

4.8 Special Rules for Long and Double variables

The Java memory model requires the eight operations lock, unlock, read, load, assign, use, store, and write to be atomic. For the 64-bit data types (long and double), however, the model defines a relatively loose rule: a virtual machine is allowed to split a read or write of 64-bit data that is not declared volatile into two 32-bit operations. In other words, the virtual machine may choose not to guarantee atomicity of load, store, read, and write for 64-bit data types. Because of this non-atomicity, another thread may read a “half variable”, a value of which only 32 bits have been updated.

In practice, however, the Java memory model strongly recommends that virtual machines implement reads and writes of 64-bit data as atomic operations, and current commercial virtual machines on all platforms choose to do so. Therefore we do not usually need to declare long and double variables volatile just for this reason.

5 Conclusion

Because the Java memory model involves a whole series of rules, most articles online simply walk through the rules, and many do not explain why the rules are needed or what they are for. This does beginners no favors: it is easy to get lost in the complicated rules without knowing why they exist. Here is a bit of my personal experience in learning:

Learning is not just understanding and memorizing knowledge; it is building the connection between the input and the output of the problem that the knowledge solves. Knowledge is in essence problem solving, so before learning something, first understand the problem and make its input and output clear; the knowledge itself is the mapping from that input to that output. Work through plenty of examples to understand this mapping, and then compress what you have learned. Hua Luogeng put it this way: “Read a book until it is thick, then read it until it is thin.”

Take learning the Java memory model as an example:

  • Understand the problem and define its input and output: first understand what the Java memory model is, what it is for, and what problems it solves
  • Understand the memory model’s protocols: work through plenty of examples to understand the protocol rules
  • Compress the knowledge: the many rules boil down to ensuring data consistency between the memory copies through data synchronization protocols, while preventing reordering from affecting the program.

I hope that’s helpful.

