Java memory regions are not the same thing as the Java memory model: the memory regions describe how the JVM runtime partitions the memory space in which it stores data.

The Java Memory Model (JMM) defines the abstract relationship between threads and main memory, that is, how the JVM works with computer memory (RAM). To understand concurrent programming in Java, we must first understand the Java memory model.

Java runtime data area

It is well known that the Java virtual machine has an automatic memory management mechanism. To diagnose memory leaks and overflow problems, you must understand how the VM uses memory.

Below is the JVM memory layout after JDK8.

 

The memory area before JDK8 looks like this:

 

In the HotSpot JVM, the permanent generation is used to hold metadata for classes and methods (such as Class and Method structures), as well as constant pools. Whenever a class is first loaded, its metadata is put into the permanent generation.

The permanent generation has a size limit, so loading too many classes can exhaust it, producing the dreaded java.lang.OutOfMemoryError: PermGen space and forcing us to tune the virtual machine.

So why was PermGen removed from HotSpot JVM in Java 8? I summarize two main reasons:

  • Because PermGen memory often overflows, causing the annoying java.lang.OutOfMemoryError: PermGen space, JVM developers wanted this area of memory to be managed more flexibly so the error occurs less often. Removing PermGen also facilitates merging the HotSpot JVM with the JRockit VM, since JRockit has no permanent generation.
  • For the reasons above, PermGen was eventually removed: the method area moved to Metaspace, and string constants moved to the Java heap.

Quote from https://www.sczyh30.com/posts/Java/jvm-metaspace/

Program counter

The Program Counter Register is a small memory space that can be thought of as a line number indicator of the bytecode being executed by the current thread.

Because the Java virtual machine implements multithreading by having threads take turns being allocated processor execution time, at any given moment one processor core executes instructions from only one thread.

Therefore, in order to restore the correct execution position after a thread switch, each thread needs its own independent program counter; the counters of different threads do not affect each other and are stored independently. We call this kind of memory area "thread-private" memory.

If the thread is executing a Java method, this counter records the address of the virtual machine bytecode instruction being executed; if it is executing a native method, the counter value is undefined. This memory region is the only one for which the Java Virtual Machine specification does not specify any OutOfMemoryError conditions.

Java virtual machine stack

Like program counters, the Java Virtual Machine Stack is thread-private and has the same lifetime as a thread.

The virtual machine Stack describes the memory model of Java method execution: each method execution creates a Stack Frame (the basic data structure of method execution) for storing information about local variables, operand stacks, dynamic links, method exits, and so on. The process of each method from invocation to completion corresponds to the process of a stack frame being pushed into and out of the virtual machine stack.

In an active thread, only the frame at the top of the stack is valid; it is called the current stack frame, and the method it belongs to is the current method. The stack frame is the basic structure in which a method runs, and while the execution engine is running, all instructions can operate only on the current stack frame.

 


1. Local variable table

The local variable table is the area where method parameters and local variables are stored. Unlike class fields, local variables have no preparation phase and must be explicitly initialized. If the method is non-static, a reference to the instance on which the method was invoked is stored at index 0, followed by the parameters and local variables. The STORE family of bytecode instructions writes a value computed on the operand stack back into the local variable table's storage.

The virtual machine stack specifies two exceptions: a StackOverflowError is thrown if the stack depth requested by the thread exceeds the depth the virtual machine allows; if the virtual machine stack can be dynamically extended (as most Java virtual machines allow), an OutOfMemoryError is thrown when sufficient memory cannot be allocated during extension.
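The first of these exceptions is easy to provoke. The sketch below triggers a StackOverflowError with unbounded recursion; the exact depth reached depends on the -Xss stack size and the frame size, so the printed number will vary between runs and JVMs.

```java
// Demonstrates StackOverflowError: each recursive call pushes a new stack
// frame until the thread's stack depth limit is exceeded.
public class StackDepthDemo {
    static int depth = 0;

    private static void recurse() {
        depth++;     // each call adds one frame to the virtual machine stack
        recurse();   // never returns normally
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError at depth " + depth);
        }
    }
}
```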

2. Operand stack

The operand stack is a last-in-first-out stack that is initially empty. During method execution, bytecode instructions push values onto and pop values off it. The JVM's execution engine is a stack-based execution engine, where "stack" refers to the operand stack. The bytecode instruction set is defined around this stack model, and the maximum depth of the stack is recorded in the max_stack value of the method's Code attribute.

i++ and ++i:

  • i++: first the value of i is pushed from the local variable table onto the operand stack; then i in the local variable table is incremented by 1; finally, the value on top of the operand stack is popped for use. The thread therefore reads the value from before the increment.
  • ++i: first i in the local variable table is incremented by 1; then the new value is pushed onto the operand stack and popped for use. The thread therefore reads the value from after the increment.

i++ is not an atomic operation, and even volatile does not make it thread-safe, because it breaks into three steps: i is read from its storage (memory) and pushed onto the operand stack (registers), incremented on the operand stack, and the top of the stack is written back to storage (a register update written into memory). volatile guarantees visibility, ensuring each read sees the latest value, but another thread's three steps may interleave with these, overwriting data so that the final value of i is smaller than expected.
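The lost-update problem described above can be reproduced directly. In this sketch, two threads each increment a volatile counter 100,000 times; because the read-increment-write steps interleave, the final value is often below 200,000 (the exact shortfall varies per run):

```java
// Shows that i++ on a volatile field is not atomic: increments from
// two threads can interleave, so some updates are lost.
public class VolatileCounterDemo {
    static volatile int i = 0;   // volatile gives visibility, not atomicity

    public static int run() throws InterruptedException {
        i = 0;
        Runnable task = () -> {
            for (int n = 0; n < 100_000; n++) {
                i++;             // three steps: read, add 1, write back
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return i;                // at most 200,000, usually less
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("final value: " + run());
    }
}
```

Replacing the field with an AtomicInteger, or guarding the increment with synchronized, makes the count reach exactly 200,000.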

3. Dynamic linking

Each stack frame contains a reference to the constant pool entry for the current method, to support dynamic linking during method invocation.

4. Method return address

There are two exits when a method executes:

  • Normal exit, that is, executing one of the return bytecode instructions, such as RETURN, IRETURN, or ARETURN.
  • Abnormal exit.

In either case, control returns to the point from which the method was invoked. A method exit is equivalent to popping the current stack frame, and there are three things that can happen on exit:

  • The return value is pushed onto the operand stack of the caller's stack frame.
  • The exception message is thrown to a stack frame that can handle it.
  • The PC counter points to the next instruction after the method call.

Native method stack

The native method stack is very similar to the virtual machine stack. The difference is that the virtual machine stack executes Java methods (that is, bytecode) for the virtual machine, while the native method stack serves the native methods the virtual machine uses. The Sun HotSpot virtual machine simply combines the native method stack with the virtual machine stack. Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError exceptions.

When a thread starts calling native methods, it enters a world no longer bound by the JVM. Native methods can access the virtual machine's runtime data areas through the Java Native Interface (JNI) and can even use the processor's registers directly, with the same capabilities and permissions as the JVM itself. When native methods are heavily used, the JVM's control over the system inevitably weakens, since their errors are black boxes to it. If the native method stack runs short of memory, a native heap OutOfMemoryError is still thrown.

The best-known JNI native method is System.currentTimeMillis(), which lets Java make deep use of operating system features and reuse non-Java code. But if a project implements a lot of functionality in other languages via JNI, it loses its cross-platform nature.
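A call to this method looks like any other static call, but the JDK declares it `public static native long currentTimeMillis()`, so execution crosses into C code via JNI:

```java
// Calling a JNI native method: the body lives in native code, not bytecode.
public class NativeCallDemo {
    public static void main(String[] args) {
        long before = System.currentTimeMillis();  // native call into the OS clock
        long after  = System.currentTimeMillis();
        // Milliseconds since the Unix epoch; the second read never precedes the first.
        System.out.println("elapsed ms: " + (after - before));
    }
}
```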

The Java heap

For most applications, the Java Heap is the largest chunk of memory managed by the Java virtual machine. The Java heap is an area of memory that is shared by all threads and is created when the virtual machine is started. The sole purpose of this memory area is to hold object instances, and almost all object instances are allocated memory here.

The heap is the primary area managed by the garbage collector and is often referred to as the "garbage-collected heap." From the perspective of memory collection, the Java heap can be subdivided into the new generation and the old generation, and in more detail into Eden space, From Survivor space, To Survivor space, and so on. From the perspective of memory allocation, the thread-shared Java heap may contain multiple Thread-Local Allocation Buffers (TLABs).

The Java heap may occupy physically discontinuous memory, as long as it is logically contiguous, and most current virtual machines implement it as extensible (controlled by -Xmx and -Xms). An OutOfMemoryError is thrown if the heap has no memory left to complete an instance allocation and can no longer be extended.
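The two flags mentioned above are set on the java command line. The sizes and the jar name below are illustrative placeholders, not recommendations:

```shell
# -Xms: initial heap size; -Xmx: maximum heap size.
# If live objects cannot fit within -Xmx, the JVM throws
# java.lang.OutOfMemoryError: Java heap space.
java -Xms256m -Xmx1024m -jar app.jar
```

Setting -Xms equal to -Xmx is a common choice on servers, since it avoids pauses caused by the heap growing at run time.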

Method area

The Method Area, like the Java heap, is an Area of memory shared by threads that stores information about classes loaded by the virtual machine, constants, static variables, code compiled by the just-in-time compiler, and so on. Although the Java Virtual Machine specification describes the method area as a logical part of the Heap, it has an alias called non-heap, which is supposed to distinguish it from the Java Heap.

The Java Virtual Machine specification is very relaxed about the method area: like the Java heap, it does not require contiguous memory, can be either fixed-size or extensible, and may even choose not to implement garbage collection. Garbage collection is relatively rare in this region, where the main targets are constant pool collection and the unloading of types. An OutOfMemoryError is thrown when the method area cannot meet a memory allocation request.

Before JDK 8, the method area implementation in HotSpot was Perm (the permanent generation); JDK 8 switched to Metaspace. String constants previously in Perm moved to the heap, and the remaining content moved to Metaspace, which is allocated directly in native memory.
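Because Metaspace lives in native memory, it grows until the operating system runs out unless it is capped. The HotSpot flags below control this; the values and the jar name are illustrative placeholders:

```shell
# -XX:MetaspaceSize: threshold at which the first Metaspace GC is triggered.
# -XX:MaxMetaspaceSize: hard cap; exceeding it raises
# java.lang.OutOfMemoryError: Metaspace.
java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=256m -jar app.jar
```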

Why use meta-spaces instead of permanent generation implementations?

  1. Strings stored in the permanent generation were prone to performance problems and memory overflow.
  2. It is difficult to determine the size of information about classes and methods, so it is difficult to specify the size of permanent generation. If the size is too small, it is easy to overflow the permanent generation, while if the size is too large, it is easy to overflow the old generation.
  3. Persistent generation introduces unnecessary complexity to the GC and is inefficient for collection.
  4. Combine HotSpot with JRockit.

Run-time constant pool

The Runtime Constant Pool is part of the method area. The constant pool table in a Class file stores the various literals and symbolic references generated at compile time; this table is placed in the runtime constant pool when the class is loaded into the method area.

In general, in addition to storing symbolic references described in Class files, translated direct references are also stored in the runtime constant pool.

Another important feature of the runtime constant pool, compared with the Class file constant pool, is that it is dynamic. The Java language does not require constants to be produced only at compile time: content not preset in the Class file's constant pool can also enter the method area's runtime constant pool at run time. The feature developers use most often for this is the intern() method of the String class.
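A quick sketch of intern() at work: a string literal is pooled when the class is loaded, while new String(...) always creates a fresh heap object; intern() returns the pooled instance.

```java
// intern() returns the canonical, pooled instance of an equal string.
public class InternDemo {
    public static void main(String[] args) {
        String literal = "hello";                // placed in the constant pool
        String copy = new String("hello");       // distinct object on the heap
        System.out.println(literal == copy);            // false: different objects
        System.out.println(literal == copy.intern());   // true: same pooled instance
    }
}
```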

Since the runtime constant pool is part of the method area and is naturally limited by the method area's memory, an OutOfMemoryError is thrown when the constant pool can no longer obtain memory.

Direct memory

Direct Memory is not part of the run-time data region of the virtual machine, nor is it defined in the Java Virtual Machine specification.

NIO was added in JDK 1.4, introducing a Channel and Buffer based I/O approach that uses Native libraries to allocate out-of-heap memory directly. It then operates through a DirectByteBuffer object stored in the Java heap as a reference to this memory. This can significantly improve performance in some scenarios because it avoids copying data back and forth between the Java heap and Native heap.
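The API involved is java.nio.ByteBuffer. allocateDirect reserves memory outside the Java heap; the DirectByteBuffer object on the heap is just a handle to that native block:

```java
import java.nio.ByteBuffer;

// Allocating and using direct (off-heap) memory through NIO.
public class DirectMemoryDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(64); // 64 bytes off-heap
        buf.putInt(42);                // write into native memory
        buf.flip();                    // switch from writing to reading
        System.out.println(buf.isDirect());  // true
        System.out.println(buf.getInt());    // 42
    }
}
```

The off-heap block is limited by -XX:MaxDirectMemorySize (defaulting to roughly the -Xmx value), not by the Java heap size.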

Obviously, allocation of direct memory is not limited by the size of the Java heap, but since it is still memory, it is limited by the total native memory (RAM plus swap or paging files) and by the processor's address space. When configuring VM parameters, administrators usually set -Xmx and similar flags based on the actual memory but often overlook direct memory. As a result, the sum of all memory regions can exceed the physical memory limit, and an OutOfMemoryError occurs during dynamic expansion.

 


Java memory model

The Java memory model is a concurrent model of shared memory, in which threads communicate implicitly through read-write shared variables (instance fields, static fields, and array elements in heap memory).

The Java Memory Model (JMM) controls communication between Java threads, determining when writes by one thread to a shared variable are visible to another thread.

Computer caching and cache consistency

Computers insert caches between the fast CPU and relatively slow storage devices, as a buffer between memory and the processor: data needed for a computation is copied into the cache so that the computation can run quickly, and when it finishes the results are synchronized from the cache back to main memory.

In a multi-processor system (or a single-processor, multi-core system), each processor core has its own cache, and they share the same Main Memory.

When the computation tasks of multiple processors all involve the same main memory area, the cache data of each processor may be inconsistent.

Therefore, each processor must follow certain protocols when accessing its cache, reading and writing according to those protocols to maintain cache consistency.

 

JVM main memory vs. working memory

The main goal of the Java memory model is to define the access rules for variables in a program, that is, the low-level details of how variables (those shared between threads) are stored into and read out of memory in the virtual machine.

In the Java memory model, all variables are stored in main memory, and each thread has its own working memory. All of a thread's operations on variables must be carried out in its working memory; a thread cannot read or write variables in main memory directly.

Working memory here is a JMM abstraction, also known as local memory, which stores copies of shared variables that the thread reads/writes.

Just as each processor core has its own cache, each thread in the JMM has its own local memory.

Different threads cannot directly access variables in each other's working memory. In general there are two ways for threads to communicate: message passing and shared memory; communication between Java threads uses shared memory. The interaction between threads, main memory, and working memory is shown in the following figure:

 

Main memory and working memory are not the same level of memory partition as the Java heap, stack, and method area of the Java memory regions; the two are basically unrelated. If a correspondence must be forced, then, going by the definition of variables, main memory corresponds primarily to the object instance data in the Java heap, while working memory corresponds to parts of the virtual machine stack.

Reordering and the happens-before rules

To improve performance, compilers and processors often reorder instructions when executing programs. There are three types of reordering:

  • Compiler-optimized reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
  • Instruction-level parallel reordering. Modern processors use instruction-level parallelism (ILP) to execute multiple instructions in an overlapped fashion. If there is no data dependency, the processor can change the order in which the machine instructions corresponding to a statement execute.
  • Memory system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to execute out of order.

From Java source code to the instruction sequence that is finally executed, instructions pass through these three kinds of reordering in turn:

 

The JMM is a language-level memory model that ensures consistent memory visibility for programmers across compilers and processor platforms by disallowing certain types of compiler reordering and processor reordering.

The Java compiler disallows particular processor reorderings by inserting memory barriers at appropriate positions in the generated instruction sequence (instructions after a barrier cannot be reordered to before it).

happens-before

Starting with JDK5, the Java memory model introduced the concept of happens-before to illustrate memory visibility between operations.

If the results of one operation need to be visible to another, there must be a happens-before relationship between the two operations. The two operations mentioned here can be within a thread or between different threads.

"Visibility" here means that when one thread changes the value of a variable, the new value is immediately visible to other threads.

If A happens-before B, then the Java memory model guarantees the programmer that the result of A’s operation will be visible to B, and that A takes precedence over B in execution order.

The important happens-before rules are as follows:

  • Program order rule: every action in a thread happens-before any subsequent action in that thread.
  • Monitor lock rule: the unlocking of a monitor lock happens-before every subsequent locking of that monitor lock.
  • Volatile variable rule: a write to a volatile field happens-before every subsequent read of that volatile field.
  • Transitivity: if A happens-before B, and B happens-before C, then A happens-before C.

Below is the relationship between happens-before and the JMM:

 

The volatile keyword

volatile is arguably the lightest synchronization mechanism the JVM provides. When a variable is declared volatile, it has two properties:

  • It ensures that the variable is visible to all threads. Ordinary variables cannot do this; their values are passed between threads through main memory with no immediacy guarantee.

Note that while volatile guarantees visibility, compound operations in Java (such as i++) are not atomic, so such operations on volatile variables are unsafe under concurrency. The synchronized keyword achieves thread safety through the rule that a variable can be locked by only one thread at a time.

  • It disallows instruction reordering optimizations. Ordinary variables only guarantee that correct results are obtained at every point in the method that depends on the assignment result; they do not guarantee that assignments occur in the same order as in the program code.
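The visibility property can be sketched with a flag shared between two threads. The reader spins until it observes the writer's update; with volatile this loop is guaranteed to terminate, whereas without it the JIT may hoist the read out of the loop and spin forever:

```java
// Volatile visibility: the writer's update is guaranteed to become
// visible to the spinning reader thread.
public class VolatileFlagDemo {
    static volatile boolean ready = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { }                  // spins until the write is visible
            System.out.println("saw ready = true");
        });
        reader.start();
        Thread.sleep(100);                      // let the reader start spinning
        ready = true;                           // volatile write happens-before the read
        reader.join();
    }
}
```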