JVM ≠ Japanese Video’s Man

The main reason for writing this post is so that I can put "familiar with JVM internals" on my resume, and the other reason is so that everyone who reads it can write the same thing, what a helpful guy… Joking aside, this is not just about interview prep; it is about building real JVM knowledge. Java developers need breadth across their technology stack, but depth in their JVM mastery.

Pretend there is an interview coming up

Read on with these questions in mind:

  • What are the JVM runtime data areas, and what does each one do?
  • What memory/generation improvements were made in Java 8?
  • Can you give an example of a stack overflow?
  • Can a stack overflow be avoided by adjusting the stack size?
  • Is allocating more stack memory always better?
  • Does garbage collection involve the virtual machine stack?
  • Are local variables defined inside a method thread-safe?

Runtime data area

Memory is a very important system resource. It is the intermediate warehouse and bridge between the hard disk and the CPU, and it carries the operating system and the applications that are running. The JVM memory layout defines how Java memory is allocated and managed at runtime, ensuring that the JVM runs efficiently and stably. Different JVMs divide and manage memory somewhat differently.

The following figure shows the overall architecture of the JVM, with the various runtime data areas defined by the Java Virtual Machine specification in the middle.

jvm-framework

The Java virtual machine defines several runtime data areas that are used by programs while they are running. Some of them are created when the virtual machine starts and destroyed when it exits; others are per-thread, created when a thread starts and destroyed when it ends.

  • Thread-private: program counter, virtual machine stack, native method stack
  • Thread-shared: heap, off-heap memory (permanent generation or Metaspace, code cache)

Let's go through these memory areas one by one, starting with the simplest.

1. Program counter

The Program Counter Register is named after the program counter register of a physical CPU: a register stores the instruction-related state of the running code, and the CPU can only execute after data has been loaded into its registers.

It does not refer to a physical register here; it would be more appropriate, and less likely to cause misunderstanding, to call it a program counter (PC counter or instruction counter). The PC register in the JVM is an abstract simulation of the physical PC register.

A program counter is a small memory space that can be viewed as a line number indicator of the bytecode being executed by the current thread.

1.1 Role

The PC register stores the address of the next instruction to execute, i.e. the instruction code about to be run. The execution engine reads the next instruction from it.

jvm-pc-counter

(Analysis: go to the directory containing the class file and run javap -v xx.class to disassemble it (or view it directly with the IDEA plug-in jclasslib, as above). You can see the corresponding Code area (bytecode instructions), local variable table, exception table, line-number-to-bytecode-offset mapping table, constant pool, and other information.)
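For instance, here is a hedged, minimal sketch (the class and file names are made up) that you can compile and disassemble yourself to see those tables:

public class PcDemo {
    public static void main(String[] args) {
        int a = 1;
        int b = 2;
        int c = a + b;         // each statement maps to a few bytecode instructions
        System.out.println(c);
    }
}

// javac -g PcDemo.java
// javap -v PcDemo.class    -> shows the Code area, LineNumberTable, LocalVariableTable and constant pool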

1.2 Overview

  • It is a small, almost negligible amount of memory, and also the fastest storage area
  • In the JVM specification, each thread has its own program counter, which is thread-private and has a lifetime consistent with that of the thread
  • Only one method executes per thread at any moment: the current method. If the thread is executing a Java method, the program counter records the address of the JVM bytecode instruction; if a native method is executing, the value is undefined.
  • It is an indicator of program control flow, and basic functions such as branching, looping, jumping, exception handling, thread recovery, and so on rely on this counter
  • The bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute
  • It is the only area for which the JVM specification does not define any OutOfMemoryError conditions

👨💻 : What is the use of using PC registers to store byte code instruction addresses? Why use a PC register to record the execution address of the current thread?

🙋♂️ : Because the CPU keeps switching between threads, after switching back a thread has to know where to resume execution. The JVM's bytecode interpreter changes the value of the PC register to determine which bytecode instruction should be executed next.

👨💻 : Why are PC registers set to be thread private?

🙋♂️ : At any given moment a CPU core executes only one thread's method, and the CPU keeps switching tasks, which inevitably causes frequent interruption and resumption. To accurately record the address of the bytecode instruction each thread is currently executing, each thread gets its own PC register, so the threads count independently without interfering with one another.


2. Virtual machine stack

2.1 Overview

Java Virtual Machine Stacks were also called Java stacks in the early days. Each thread creates a virtual machine stack when it is created, which stores stack frames, one per Java method call. The stack is thread-private and its life cycle matches that of the thread.

Function: Manages the operation of Java programs, saves local variables, partial results of methods, and participates in method calls and returns.

Features:

  • The stack is a fast and efficient way to allocate storage, second only to program counters in access speed
  • The JVM performs only two direct operations on the virtual machine stack: pushing a stack frame when a method is invoked, and popping it when the method completes
  • There is no garbage collection problem in the stack

Possible exceptions in the stack:

The Java Virtual Machine specification allows the size of the Java virtual machine stack to be dynamic or fixed

  • With a fixed size Java virtual machine stack, the size of the Java virtual machine stack for each thread can be selected independently at thread creation time. The Java virtual machine will throw a StackOverflowError if the thread request allocates more stack capacity than the maximum allowed by the Java virtual machine stack
  • An OutOfMemoryError will be thrown if the Java virtual machine stack can be dynamically extended and cannot allocate enough memory when attempting to extend it, or if there is not enough memory to create the corresponding virtual machine stack when creating a new thread

The maximum stack size of a thread can be set with the -Xss parameter. The stack size directly determines the maximum reachable depth of function calls (a small demo follows below).

Oracle provides an official reference for the available options: docs.oracle.com/javase/8/do…
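Coming back to the interview questions above, a minimal sketch (the class name is made up) that triggers a StackOverflowError: unbounded recursion keeps pushing stack frames until the limit set by -Xss (or the platform default) is hit. A larger -Xss only delays the error for unbounded recursion; it cannot prevent it.

public class StackDepthTest {
    private static int depth = 0;

    private static void recurse() {
        depth++;
        recurse();   // each call pushes another stack frame onto the thread's virtual machine stack
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // e.g. java -Xss256k StackDepthTest prints a smaller depth than java -Xss2m StackDepthTest
            System.out.println("stack overflow at depth " + depth);
        }
    }
}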

2.2 Storage unit of the stack

What is stored in the stack?

  • Each thread has its own Stack, and the data in the Stack is stored in the format of Stack frames
  • Each method executing on this thread has a corresponding stack frame
  • A stack frame is a block of memory, a data set that holds various data information during the execution of a method

2.3 Stack operation principle

  • The JVM performs only two direct operations on the Java stack, pushing and popping stack frames, following the first-in, last-out (FILO/LIFO) principle
  • In an active thread, there is only one active stack frame at a time. That is, only the stack Frame (top stack Frame) of the currently executing Method is valid. This stack Frame is called the Current Frame. The Method corresponding to the Current Frame is the Current Method, which is defined by the Current Class.
  • All bytecode instructions run by the execution engine operate only on the current stack frame
  • If other methods are called in this method, a new stack frame is created and placed at the top of the stack, called the new current stack frame
  • The stack frames contained in different threads are not allowed to reference each other, that is, it is impossible to reference another thread’s stack frame in one stack frame
  • If the current method calls another method, when the method returns, the current stack frame will return the execution result of this method to the previous stack frame, and then the virtual machine will discard the current stack frame, making the previous stack frame become the current stack frame again
  • A Java method can return in two ways: a normal return using a return instruction, or by throwing an exception that is not handled inside the method, which also causes the stack frame to be popped

When debugging in IDEA, you can watch stack frames being pushed and popped in the Frames panel of the debug window

2.4 Internal structure of stack frames

Each stack frame stores:

  • Local Variables
  • Operand Stack (or expression Stack)
  • Dynamic Linking: Method reference that points to the runtime constant pool
  • Return Address: indicates the Address at which a method exits normally or abnormally
  • Some additional information

jvm-stack-frame

Let's dig into each of the five parts of the stack frame.

2.4.1. Local variable table

  • The local variable table is also called the local variable array or local variable list
  • It is a set of storage space for variable values, used mainly to store method parameters and local variables defined inside the method body. This includes the Java basic data types known to the compiler (boolean, byte, char, short, int, float, long, double), object references (reference types, which are not the object itself but may be a pointer to the object's starting address, or a handle or other location representing the object), and the returnAddress type (which points to the address of a bytecode instruction and has since been superseded by the exception table)
  • Since the local variable table is built on the stack of the thread, it is the thread’s private data, so there is no data security problem
  • The size of the local variable table is determined at compile time and saved in the maximum local variables data item of the method's Code attribute. The size of the local variable table does not change during method execution
  • The number of nested calls to a method is determined by the stack size. In general, the larger the stack, the more nested method calls. For a function, the more parameters and local variables it has, which causes the local variable table to swell, the larger its stack frame will be to meet the need for more information to be passed through method calls. In turn, function calls take up more stack space, resulting in fewer nested calls.
  • Variables in the local variable table are only valid in the current method call. During method execution, the virtual machine passes the parameter values to the parameter variable list using the local variable table. When the method call ends, the local variable table is destroyed along with the method stack frame.
  • Parameter values are always stored starting at index 0 of the local variable array, up to the index equal to the array length minus 1
Slot
  • The most basic storage unit for a local variable table is Slot.
  • In the local variable table, types of at most 32 bits occupy a single slot (including the returnAddress type), while 64-bit types (long and double) occupy two consecutive slots (see the sketch after this list)
    • Byte, short, and char are converted to int before storage. Boolean is also converted to int, where 0 means false and non-0 means true
    • Long and double occupy both slots
  • The JVM assigns an access index to each slot in the local variable table, through which the value of the specified local variable is accessed. The index value ranges from 0 up to the maximum slot count of the local variable table
  • When an instance method is called, its method parameters and local variables defined inside the method body are copied to each Slot in the local variable table in sequence
  • To access the value of a 64-bit local variable in the table, only the smaller of its two indexes is used (for example, when accessing a long or double variable, it is not allowed to address either of its slots individually)
  • If the current frame is created by a constructor or instance method, the object reference to this will be placed in Slot with index 0, and the rest of the arguments will continue in the argument list order. Static methods cannot refer to this because this does not exist in the local variable table of the current method.
  • The slots in the local variable table in the stack frame can be reused. If a local variable goes out of its scope, the new local variable declared after its scope is likely to reuse the slots of the expired local variable, so as to achieve the purpose of saving resources. (In the figure below, this, A, B, and C should theoretically have 4 variables, and C uses the slot of B)

  • The part of the stack frame most relevant to performance tuning is the local variable table. When a method executes, the virtual machine uses the local variable table to pass parameters into the method
  • Variables in the local variable table are also important garbage collection root nodes, as long as objects referenced directly or indirectly in the local variable table are not collected
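A small sketch of the slot rules above (names are arbitrary); compile with javac -g and run javap -v to inspect the LocalVariableTable of demo():

public class SlotDemo {
    public void demo() {
        // slot 0 holds `this`, because demo() is an instance method
        long l = 1L;      // a long occupies two consecutive slots (1 and 2)
        {
            int a = 10;   // slot 3, valid only inside this block
        }
        int b = 20;       // declared after a has gone out of scope, so it may reuse slot 3
    }
}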

2.4.2. Operand stack

  • In addition to the local variable table, each stack frame contains a last-in-first-out operand stack, also known as the expression stack.
  • Operand stack. During the execution of a method, data is written to or extracted from the operand stack according to bytecode instructions, i.e. push and pop.
  • Some bytecode instructions push values onto the operand stack, while others pop operands off the stack, use them, and push the result back onto the stack, for example copy, swap, and arithmetic instructions; the sketch after this list shows an example
Overview
  • Operand stack, mainly used to store the intermediate results of the calculation process, and as a temporary storage space for variables during the calculation process
  • The operand stack is a workspace of the JVM’s execution engine. When a method is first executed, a new stack frame is created and the operand stack is empty
  • Each operand stack has an explicit stack depth for storing values. The maximum depth required is determined at compile time and stored in the max_stack data item of the method's Code attribute
  • Any element in the stack can be any Java data type
    • 32-bit types occupy one stack unit depth
    • 64-bit types occupy two stack units of depth
  • The operand stack is not accessed by index; data is accessed only through standard push and pop operations
  • If the called method has a return value, the return value is pushed into the operand stack of the current stack frame and updates the NEXT bytecode instruction to be executed in the PC register
  • The data types of the elements in the operand stack must exactly match the sequence of bytecode instructions, which is verified by the compiler at compile time and again during the data flow analysis phase during the class validation phase during class loading
  • In addition, we say that the Interpretation engine of the Java Virtual machine is a stack-based execution engine, where the stack refers to the operand stack
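A hedged sketch of the operand stack at work: for the trivial method below, the javap -c disassembly looks roughly like the comments show.

public class OperandStackDemo {
    public static int add(int a, int b) {
        return a + b;
        // approximate bytecode (javap -c):
        //   iload_0   // push a (slot 0 of the local variable table) onto the operand stack
        //   iload_1   // push b (slot 1)
        //   iadd      // pop two ints, add them, push the result
        //   ireturn   // pop the result and return it, pushing it onto the caller's operand stack
    }
}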
Top-of-stack caching

HotSpot's execution engine is not register-based, but that does not mean the HotSpot VM implementation makes no use of registers at all. A register is a component of the physical CPU and a very important high-speed storage resource. Generally speaking, register reads and writes are very fast, even dozens of times faster than memory access, but register resources are very limited, and the number of CPU registers varies across platforms. Registers are used to cache locally generated machine instructions, values, and the address of the next instruction to execute.

The zero-address instructions used by stack-based virtual machines are more compact, but more loading and unloading instructions are required to complete an operation, which means more instruction dispatch times and memory read/write times. Since operands are stored in memory, frequent in-memory read/write operations inevitably affect execution speed. In order to solve this problem, HotSpot JVM designers have proposed a stack top cache technology, which caches all the top of the stack elements in the physical CPU registers to reduce the number of reads/writes to memory and improve the execution engine efficiency

2.4.3. Dynamic linking (method references to runtime constant pools)

  • Each stack frame contains an internal reference to the method that the stack frame belongs to in the runtime constant pool. The purpose of including this reference is for the code supporting the current method to achieve Dynamic Linking.
  • When a Java source file is compiled into a bytecode file, all variable and method references are kept as Symbolic references in the Class file’s constant pool. For example, describing a method that calls another method is represented by symbolic references to the method in the constant pool, so dynamic linking is used to convert these symbolic references to direct references to the calling method

jvm-dynamic-linking

How does the JVM perform method calls

Method invocation is not the same as method execution. The only task of the invocation phase is to determine which version of the method to call; the actual execution inside the method is not yet involved. Compiling a Class file does not include the linking step of a traditional compiler; all method calls are stored in the Class file as symbolic references rather than as the entry address of the method in the runtime memory layout (a direct reference). The direct reference to the target method therefore has to be determined during class loading, or even at run time.

This section, in addition to method calls, also includes parsing and dispatching (static dispatching, dynamic dispatching, single dispatching and multiple dispatching), which is not covered here and will be dug up later.

In the JVM, the conversion of symbolic references to direct references to calling methods is related to the method binding mechanism

  • Static linking: when a bytecode file is loaded into the JVM, if the target method being called is known at compile time and does not change at run time, the process of converting its symbolic reference into a direct reference is called static linking
  • Dynamic linking: If the invoked method cannot be determined at compile time, that is, the symbolic reference of the invoked method can only be converted into a direct reference at program runtime. Because of the dynamic nature of this reference conversion process, it is also called dynamic linking

The Binding mechanism of the corresponding method is: Early Binding and Late Binding. Binding is the process by which a symbolic reference to a field, method, or class is replaced with a direct reference, which happens only once.

  • Early binding: if the target method is known at compile time and does not change at run time, the method can be bound to its owning type; since it is clear exactly which method is being called, the symbolic reference can be converted to a direct reference via static linking.
  • Late binding: If the called method cannot be determined by the compiler and only the related method can be bound at runtime based on the actual type, this method is called late binding.
Virtual methods and non-virtual methods
  • If the version of a method is determined at compile time and is immutable at run time, it is called a non-virtual method; examples are static methods, private methods, final methods, instance constructors, and superclass methods (illustrated in the sketch below)
  • Other methods are called virtual methods
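For illustration only (the class names are made up), this is how methods split into non-virtual and virtual ones:

class Animal {
    static void info() { }        // static: non-virtual, resolved at compile time (invokestatic)
    private void breathe() { }    // private: non-virtual (invokespecial)
    final void sleep() { }        // final: non-virtual, its version cannot change at run time
    void speak() { }              // overridable: virtual, dispatched at run time (invokevirtual)
}

class Dog extends Animal {
    @Override
    void speak() { }              // the actual target is chosen at run time via the virtual method table
}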
Virtual method table

In object-oriented programming, dynamic dispatch is frequently used, and it may affect the execution efficiency if each dynamic dispatch has to search for the appropriate target in the method metadata of the class. To improve performance, the JVM creates a virtual Method table in the method section of a class, using index tables instead of lookups. Non-virtual methods do not appear in the table.

Each class has a virtual method table that holds the actual entry to each method.

The virtual method table is created and initialized during the linking phase of class loading; the JVM initializes the method table for a class once the initial values of the class's variables are ready.

2.4.4. Return address

Used to hold the value of the PC register that called the method.

There are two ways to end a method

  • Normal Execution Completed
  • Unhandled exception, abnormal exit

Either way, after the method exits it returns to where it was called. When a method exits normally, the caller's PC counter value is used as the return address, i.e. the address of the instruction following the one that invoked this method. When a method exits because of an exception, the return address is determined through the exception table and is generally not stored in the stack frame.

When a method is executed, there are only two ways to exit the method:

  1. When the execution engine encounters any bytecode instruction for method return, a return value may be passed to the upper-level method caller; this is called the normal completion exit for short

    Which return instruction is used on a normal method return depends on the actual data type of the method's return value (see the sketch at the end of this subsection)

    In bytecode, the return instructions include ireturn (used when the return value is boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn; there is also a return instruction for methods declared void, instance initializer methods, and class and interface initializer methods.

  2. If an exception is encountered during the execution of a method and no matching exception handler is found in the method's exception table, the method exits; this is called the abnormal completion exit for short

    The handlers for exceptions thrown during a method's execution are stored in the exception table, which makes it easy to find the handling code when an exception occurs.

In essence, a method exit is the process of popping the current stack frame. At that point, the local variable table and operand stack of the caller's method must be restored, the return value (if any) pushed onto the caller's operand stack, the PC register value set, and so on, so that the caller's method can continue executing.

The difference between a normal completion exit and an exception completion exit is that an exception completion exit does not return any value to its upper callers
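A hedged sketch of which return instruction the compiler emits for different declared return types (the method names are made up):

public class ReturnDemo {
    static int sum(int a, int b)  { return a + b; }             // ireturn (also used for boolean, byte, char, short)
    static long nanos()           { return System.nanoTime(); } // lreturn
    static double half(double d)  { return d / 2; }             // dreturn
    static String name()          { return "jvm"; }             // areturn (reference types)
    static void log()             { }                           // return (void methods, initializers)
}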

2.4.5. Additional Information

Stack frames also allow you to carry additional information about the Java virtual machine implementation. For example, support information for program debugging is provided, but this information depends on the specific virtual machine implementation.


3. Native method stack

3.1 Native Method Interface

To put it simply, a Native Method is a Java interface that calls non-Java code. The Unsafe class, as we know, has many native methods.

Why use Native methods?

Java is very convenient to use, but some tasks are not easy to implement in Java, or we care about program efficiency; that is where native methods come in

  • Interaction with environments outside Java: sometimes Java applications need to interact with environments outside of Java, which is the main reason native methods exist.
  • Interaction with the operating system: The JVM supports the Java language itself and runtime libraries, but sometimes still relies on support from some underlying system. With the native approach, we can use Java to interact with the underlying system that implements the JRE, some parts of which are written in C.
  • Sun's Java: Sun's interpreter is implemented in C, which lets it interact with the outside world like ordinary C code. The JRE is mostly implemented in Java, but it also interacts with the outside world through some native methods. For example, java.lang.Thread's setPriority() is implemented in Java, but it calls the class's native method setPriority0(), which is implemented in C and built into the JVM; a minimal declaration sketch follows below.
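A minimal, hypothetical sketch of declaring a native method (the library name and method are made up; the implementation would live in C/C++ and be bound through JNI):

public class NativeDemo {
    // declared in Java, implemented outside Java and loaded at run time
    public native long currentCpuCycles();

    static {
        System.loadLibrary("nativedemo");   // hypothetical native library name
    }
}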

3.2 Native Method Stack

  • The Java virtual machine stack is used to manage the invocation of Java methods, and the local method stack is used to manage the invocation of local methods
  • The local method stack is also thread-private
  • Allows threads to fix or dynamically expand memory size
    • If a thread requests more stack capacity than the maximum allowed by the native method stack, the Java virtual machine throws a StackOverflowError
    • If the native method stack can be dynamically extended and there is not enough memory when trying to extend it, or if there is not enough memory to create the native method stack for a new thread, the Java virtual machine throws an OutOfMemoryError
  • The native methods are implemented using the C language
  • Its role is to register native methods in the native method stack and to load the native method libraries when the execution engine runs. When a thread calls a native method, it enters a whole new world that is no longer constrained by the virtual machine; it has the same privileges as the virtual machine itself.
  • Native methods can access the runtime data areas inside the virtual machine through the native method interface, and can even use registers in the local processor directly or allocate any amount of memory from the native heap
  • Not all JVMs support native methods, because the Java Virtual Machine specification does not mandate the language, implementation, or data structures of the native method stack. A JVM that does not intend to support native methods need not implement a native method stack
  • In the HotSpot JVM, the native method stack and the virtual machine stack are merged into one

The stack is the unit of runtime and the heap is the unit of storage.

The stack takes care of the execution of the program, that is, how the program executes, or how it processes data. The heap solves the problem of data storage, how and where data is stored.

4. Heap memory

4.1 Memory Division

For most applications, the Java heap is the largest chunk of memory managed by the Java Virtual machine and is shared by all threads. The sole purpose of this memory area is to hold object instances, and almost all object instances and data are allocated in memory here.

For efficient garbage collection, the virtual machine logically divides the heap memory into three regions (the only reason for generational partitioning is to optimize GC performance) :

  • Young generation: newly created objects and objects that have not yet reached a certain age live in the young generation
  • Old generation (tenured area): long-lived objects; the old generation is normally given more memory than the young generation
  • Metaspace (the permanent generation before JDK 1.8): before JDK 1.8 it used JVM memory, from JDK 1.8 onward it uses native (physical) memory directly

JDK7

The Java Virtual Machine specification states that the Java heap can live in physically discontiguous memory, as long as it is logically contiguous, much like disk space. An implementation can make it either fixed-size or extensible, and mainstream virtual machines are extensible (controlled by -Xmx and -Xms). If there is no memory in the heap to complete an instance allocation and the heap can no longer be extended, an OutOfMemoryError is thrown.

Young Generation

The young generation is where all the new objects are created. Garbage collection is performed when the young generation is populated. This garbage collection is called the Minor GC. The younger generation is divided into three parts — Eden Memory and two Survivor memories (called from/to or S0 / S1), with a default ratio of 8:1:1

  • Most of the newly created objects are in Eden memory space
  • When the Eden space is filled with objects, Minor GC is performed and all survivor objects are moved to a survivor space
  • The Minor GC examines the survivor objects and moves them to another survivor space. So every time, one survivor’s space is always empty
  • Objects that survive multiple GC cycles are moved to the old age. Typically, this is done by setting an age threshold for the younger generation objects before they are eligible for promotion to the older generation

The Old Generation

The old generation contains objects that survived many rounds of Minor GC. Typically, garbage collection of the old generation is performed when it becomes full. This is called a Major GC and usually takes longer.

Large objects go straight into the old age (large objects are objects that require a large amount of contiguous memory space). The goal is to avoid a large number of memory copies between the Eden region and two Survivor regions

Metaspace

Both persistent generations prior to JDK8 and meta-spaces after JDK8 can be considered implementations of method areas in the Java Virtual Machine specification.

Although the Java Virtual Machine specification describes the method area as a logical part of the Heap, it has a separate name, non-heap, which is supposed to be separate from the Java Heap.

Metaspace is covered in more detail later, in the method area section.

4.2 Setting the heap memory size and OOM

The Java heap stores Java object instances. The size of the heap is determined when the JVM starts and can be set with -Xms and -Xmx

  • -Xms sets the initial (starting) heap size, equivalent to -XX:InitialHeapSize
  • -Xmx sets the maximum heap size, equivalent to -XX:MaxHeapSize

If the heap needs more memory than the maximum set by -Xmx, an OutOfMemoryError is thrown.

We typically set -Xms and -Xmx to the same value, so the heap does not need to be resized after garbage collection, which improves performance

  • By default, the initial heap size is physical memory size / 64
  • By default, the maximum heap size is physical memory size / 4

You can read these settings from code (and of course you can simulate an OOM):

public static void main(String[] args) {

  // Current total heap size in MB (matches -Xms when -Xms and -Xmx are equal)
  long initialMemory = Runtime.getRuntime().totalMemory() / 1024 / 1024;
  // Maximum heap size in MB the JVM will attempt to use (matches -Xmx)
  long maxMemory = Runtime.getRuntime().maxMemory() / 1024 / 1024;

  System.out.println("-Xms : " + initialMemory + "M");
  System.out.println("-Xmx : " + maxMemory + "M");

  System.out.println("Estimated system memory (initial * 64): " + initialMemory * 64 / 1024 + "G");
  System.out.println("Estimated system memory (max * 4): " + maxMemory * 4 / 1024 + "G");
}

View JVM heap memory allocation

  1. When the JVM heap size is not configured by default, the JVM configures the current memory size based on the default value

  2. By default, the ratio of the young generation to the old generation is 1:2, which can be configured with -XX:NewRatio

    • Within the young generation, the default ratio of Eden : From Survivor : To Survivor is 8:1:1, which can be configured with -XX:SurvivorRatio
  3. If -XX:+UseAdaptiveSizePolicy is enabled (in JDK 7), the JVM dynamically adjusts the sizes of the various regions in the heap and the promotion age threshold

    In that case -XX:NewRatio and -XX:SurvivorRatio no longer take effect; JDK 8 enables -XX:+UseAdaptiveSizePolicy by default

    In JDK 8, do not turn off -XX:+UseAdaptiveSizePolicy unless you have a clear plan for how the heap should be divided

The Eden, From Survivor, and To Survivor sizes are recalculated after each GC

The calculation is based on statistics gathered during GC: GC time, throughput, and memory footprint

java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit                         = 0                                   {product}
    uintx HeapSizePerGCThread                       = 87241520                            {product}
    uintx InitialHeapSize                          := 134217728                           {product}
    uintx LargePageHeapSizeThreshold                = 134217728                           {product}
    uintx MaxHeapSize                              := 2147483648                          {product}
java version "1.8.0 comes with _211"
Java(TM) SE Runtime Environment (build 1.8. 0 _211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)$jmap -heap Process numberCopy the code

4.3 Life cycle of objects in the heap

  1. In the JVM memory model, the heap is divided into the young generation and the old generation
    • The young generation is further divided into the Eden region and the Survivor region, and the Survivor region consists of From Survivor and To Survivor
  2. When an object is created, it will be allocated to the Eden region of the new generation first
    • At this point the JVM gives the object an age counter (the promotion age threshold can be set with -XX:MaxTenuringThreshold)
  3. When Eden space is insufficient, the JVM performs a Minor GC on the young generation.
    • The JVM transfers surviving objects to Survivor and the object age is +1
    • An object in a Survivor also experiences a Minor GC, and each time it experiences a Minor GC, the object’s age increases by 1
  4. If the size of the object to be allocated exceeds -XX:PretenureSizeThreshold, the object is allocated directly in the old generation

4.4 Object Allocation Process

Allocating memory for objects is a precise and complicated task. The JVM designers not only have to consider how and where to allocate memory; because allocation and the memory-reclamation algorithms are closely related, they also have to consider the memory fragmentation the GC leaves behind after reclaiming memory.

  1. A newly created (new) object is first placed in the Eden space, which has a size limit
  2. When Eden fills up and the program needs to create more objects, the JVM's garbage collector performs a Minor GC on Eden, destroying the objects in Eden that are no longer referenced, and then places the new object in Eden
  3. The objects that survive in Eden are then moved to Survivor 0
  4. If garbage collection is triggered again, the objects that survived last time are checked: those still alive in Survivor 0 are moved to Survivor 1
  5. If collection happens yet again, the survivors are moved back to Survivor 0, then to Survivor 1, and so on
  6. When do objects get promoted to the old generation? By default, after surviving 15 collections
  7. Life in the old generation is relatively relaxed. When old-generation memory becomes insufficient, a Major GC is triggered to clean it up
  8. If the old generation still cannot hold the objects after a Major GC, an OOM exception is raised (see the sketch below)
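A hedged sketch of those last steps: run the class below with a deliberately small heap, for example -Xms20m -Xmx20m (values chosen only for illustration); once neither Minor GC nor Major GC can free enough space, the allocation fails with an OOM.

import java.util.ArrayList;
import java.util.List;

public class HeapOomDemo {
    public static void main(String[] args) {
        List<byte[]> hold = new ArrayList<>();
        while (true) {
            // keep every 1 MB block reachable so the GC cannot reclaim it;
            // eventually: java.lang.OutOfMemoryError: Java heap space
            hold.add(new byte[1024 * 1024]);
        }
    }
}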

4.5 Introduction to GC Garbage Collection

Minor GC, Major GC, Full GC

JVM GC does not always collect the entire heap (young generation plus old generation); most of the time it collects only the young generation.

In the implementation of HotSpot VM, the GC is divided into two main categories according to the collection region: Partial GC and Full GC.

  • Partial collection: garbage collection that does not collect the entire Java heap. It is further divided into:
    • Minor GC / Young GC: collects only the young generation
    • Major GC / Old GC: collects only the old generation
      • Currently only the CMS GC collects the old generation on its own
      • The term Major GC is often used interchangeably with Full GC, so be specific about whether it means the old generation only or the whole heap
    • Mixed GC: collects the entire young generation plus part of the old generation
      • Currently only the G1 GC has this behavior
  • Full GC: collects garbage from the entire Java heap and the method area

4.6 TLAB

What is TLAB (Thread Local Allocation Buffer)?

  • The Eden region continues to be partitioned from the perspective of the memory model rather than garbage collection, and the JVM allocates a private cache region for each thread, which is contained within Eden space
  • When multiple threads allocate memory simultaneously, TLAB can avoid a series of non-thread-safe problems and improve the throughput of memory allocation, so we can call this method of memory allocation fast allocation strategy
  • Most JVMs derived from OpenJDK provide a TLAB design

Why TLAB?

  • The heap is shared by threads, and any thread can access the shared data in the heap
  • Because object instances are created so frequently in the JVM, it is not thread-safe to partition memory from the heap in a concurrent environment
  • In order to avoid multiple threads operating on the same address, it is necessary to use mechanisms such as locking, which will affect the allocation speed

Although not all object instances can successfully allocate memory in TLAB, the JVM does use TLAB as the first choice for memory allocation.

You can control whether TLAB space is enabled with -XX:+/-UseTLAB.

By default, TLAB memory is very small, only about 1% of the Eden space; the percentage of Eden occupied by TLAB can be set with -XX:TLABWasteTargetPercent.

When an object fails to allocate memory in TLAB space, the JVM tries to allocate memory directly in Eden space by using locking mechanisms to ensure atomicity of data operations.

4.7 Is the heap the only place objects can be allocated?

With the development of JIT compilation and the maturing of escape analysis, stack allocation and scalar replacement will bring subtle changes, and the idea that all objects are allocated on the heap gradually becomes less "absolute." — Understanding the Java Virtual Machine

Escape analysis

Escape Analysis is one of the most advanced optimization techniques in Java virtual machines. It is a cross-function, global data-flow analysis algorithm that can effectively reduce synchronization load and heap allocation pressure in Java programs. Through escape analysis, the HotSpot compiler can analyze the scope over which a new object's reference is used and decide whether the object needs to be allocated on the heap.

The basic behavior of escape analysis is analyzing object dynamic scope:

  • When an object is defined in a method and is used only inside the method, no escape is considered to have occurred.
  • An object is considered to have escaped when it is defined in a method and referenced by an external method. For example, passing as a call parameter somewhere else is called method escape.

For example:

public static StringBuffer createStringBuffer(String s1, String s2) {
   StringBuffer sb = new StringBuffer();
   sb.append(s1);
   sb.append(s2);
   return sb;
}

The StringBuffer sb is a local variable of the method. In the code above, sb is returned directly, so the StringBuffer can be modified by other methods and its scope is no longer limited to the method: although it is a local variable, it escapes the method. It may even be accessed by other threads, for example when assigned to an instance variable or a class variable reachable from another thread; this is called thread escape.

If you want StringBuffer sb not to escape the method, you can write:

public static String createStringBuffer(String s1, String s2) {
   StringBuffer sb = new StringBuffer();
   sb.append(s1);
   sb.append(s2);
   return sb.toString();
}

If a StringBuffer is not returned directly, it will not escape the method.

Parameter Settings:

  • Escape analysis is enabled by default in HotSpot after JDK 6U23
  • If you use an earlier version, you can enable it explicitly with -XX:+DoEscapeAnalysis

So in development, if a variable is only needed inside a method, do not let it escape the method.

Using escape analysis, the compiler can optimize code:

  • On-stack allocation: converts heap allocation to stack allocation. If an object is allocated in a subroutine and pointers to it never escape, the object may be a candidate for stack allocation instead of heap allocation
  • Synchronous elision: If an object is found to be accessible only from one thread, operations on the object may be performed without regard to synchronization
  • Detached object or scalar substitution: Some objects may be accessible without needing to exist as a contiguous memory structure, so part (or all) of the object can be stored not in memory but in CPU registers

The JIT compiler, based on the results of escape analysis at compile time, finds that an object can be optimized for stack allocation if it does not escape the method. After allocation, execution continues in the call stack, and finally the thread terminates, the stack space is reclaimed, and the local variable object is reclaimed. This eliminates the need for garbage collection.

Common stack assignment scenarios: member variable assignment, method return value, instance reference passing

Code optimization: synchronization elision (lock elimination)
  • The cost of thread synchronization is quite high, and the consequence of synchronization is reduced concurrency and performance
  • When a synchronized block is dynamically compiled, the JIT compiler can use escape analysis to determine whether the lock object used by the synchronized block can be accessed by one thread without being published to another. If not, the JIT compiler unsynchronizes the code when it compiles the synchronized block. This can greatly improve concurrency and performance. This unsynchronization process is called synchronization elision, also known as lock elimination.
public void keep() {
  Object keeper = new Object();
  synchronized (keeper) {
    System.out.println(keeper);
  }
}

In the code above, the keeper object is locked, but its life cycle is confined to the keep() method and it is never accessed by other threads, so the lock is optimized away during JIT compilation. After optimization:

public void keep() {
  Object keeper = new Object();
  System.out.println(keeper);
}
Code optimization: scalar replacement

A Scalar is a quantity which cannot be broken down into smaller quantities. Primitive data types in Java are scalars.

In contrast, data that can still be decomposed is called aggregates. Objects in Java are aggregates because they can also be decomposed into other aggregates and scalars.

In the JIT phase, when escape analysis determines that the object cannot be accessed externally and can be further decomposed, the JVM does not create the object, but instead splits the object’s member variables into several that are used by this method. These substitute member variables allocate space on the stack frame or register. This process is called scalar substitution.

Scalar replacement can be enabled with -XX:+EliminateAllocations, and the result of scalar replacement can be checked with -XX:+PrintEliminateAllocations.

public static void main(String[] args) {
   alloc();
}

private static void alloc() {
   Point point = new Point(1, 2);
   System.out.println("point.x=" + point.x + "; point.y=" + point.y);
}

class Point {
    int x;
    int y;
    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

In the above code, the point object does not escape the alloc() method, and the point object can be disintegrated into scalars. Instead of creating a Point object directly, the JIT uses two scalars int x, int y instead.

private static void alloc() {
   int x = 1;
   int y = 2;
   System.out.println("point.x="+x+"; point.y="+y);
}
Code optimization: stack allocation

From the JVM memory allocation discussion we know that objects in Java are allocated on the heap. When an object is no longer referenced, the GC has to reclaim its memory; with a large number of objects this puts heavy pressure on the GC and indirectly hurts application performance. To reduce the number of temporary objects allocated in the heap, the JVM uses escape analysis to determine that an object will not be accessed externally; the object is then decomposed via scalar replacement and allocated on the stack, so the memory it occupies is released when the stack frame pops, reducing garbage collection pressure.
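A hedged sketch of observing this effect (the class name and flag values are only examples): run once with -XX:+DoEscapeAnalysis and once with -XX:-DoEscapeAnalysis, adding -Xms256m -Xmx256m -XX:+PrintGCDetails. With escape analysis enabled, the non-escaping objects can be scalar-replaced or stack-allocated, and typically far fewer GCs and a shorter run time are observed.

public class StackAllocationDemo {

    static class User {
        int id;
        String name;
    }

    private static void alloc() {
        User u = new User();   // u never escapes alloc(), so it is a candidate for stack allocation
        u.id = 1;
        u.name = "demo";
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 10_000_000; i++) {
            alloc();
        }
        System.out.println("elapsed: " + (System.currentTimeMillis() - start) + " ms");
    }
}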

Conclusion:

Papers on escape analysis were published in 1999, but it wasn’t implemented until JDK 1.6, and the technology isn’t fully developed yet.

The fundamental reason is that there is no guarantee that the performance gains from escape analysis will outweigh its cost. Scalar replacement, stack allocation, and lock elimination can be done after escape analysis, but escape analysis itself requires a series of complex analyses and is a relatively time-consuming process.

An extreme example: after performing escape analysis, it turns out that not a single object is non-escaping; in that case the whole escape-analysis pass was wasted effort.

Although this technique is not very mature, it is also a very important tool in real-time compiler optimization.


5. Method area

  • The Method Area, like the Java heap, is an Area of memory shared by all threads.
  • Although the Java Virtual Machine specification describes the method area as a logical part of the Heap, it has a separate name, non-heap, which is supposed to be separate from the Java Heap.
  • The Runtime Constant Pool is part of the method area. The Constant Pool Table in a Class file stores the literals and symbolic references generated at compile time; this table is placed into the runtime constant pool of the method area when the class is loaded. New constants can also be placed in the pool at run time, a feature developers use most often via String.intern(). When the constant pool can no longer be allocated memory because of method area limits, an OutOfMemoryError is thrown.
  • The size of the method area, like the heap space, can be fixed or extensible. The size of the method area determines how many classes the system can hold. If the system has too many classes and the method area overflows, the VIRTUAL machine will also throw an overflow error
  • The method area is freed when the JVM is shut down

5.1 Clearing up the confusion

Have you ever noticed that different references draw the memory layout differently: some diagrams show a method area, others a permanent generation or a metadata area? Confusing, right?

  • The method area is just a concept defined in the JVM specification for storing class information, constant pools, static variables, JIT-compiled code, and so on. The specification does not say how to implement it, and different vendors implement it differently. The permanent generation (PermGen) is a concept specific to the HotSpot virtual machine; in Java 8 it was replaced by Metaspace. Both PermGen and Metaspace can be understood as concrete implementations of the method area.
  • The permanent generation was physically part of the heap, contiguous with the young and old generations (and managed by the garbage collector), whereas Metaspace lives in native memory, is not limited by the JVM heap, and is less likely to run into OOM.
  • In Java 7 we set the permanent generation with -XX:PermSize and -XX:MaxPermSize; in Java 8, with the permanent generation removed, these parameters became invalid and were replaced by -XX:MetaspaceSize and -XX:MaxMetaspaceSize for Metaspace.
  • The contents stored also differ: Metaspace stores class metadata, while static variables and the string constant pool were merged into the heap; the data that used to live in the permanent generation is now split between the heap and Metaspace.
  • If the memory in the method area cannot satisfy an allocation request, the Java virtual machine throws an OutOfMemoryError.
  • The JVM specification states that the method area is logically part of the Heap, but is currently actually non-heap from the Java Heap.

So for the method area, what has changed since Java8:

  • PermGen is removed and Metaspace is replaced.
  • Class metadata in the permanent generation is moved to native memory (local memory, not virtual machine).
  • Interned Strings and Class Static variables from the permanent generation have been moved to the Java Heap;
  • Permanent generation parameter (PermSize MaxPermSize) -> MetaspaceSize MaxMetaspaceSize

5.2 Setting the method area size

JDK 8 and later:

  • The Metaspace size can be specified with -XX:MetaspaceSize and -XX:MaxMetaspaceSize, which replace the two previous permanent-generation parameters
  • The default values are platform dependent; on Windows, -XX:MetaspaceSize defaults to about 21 MB and -XX:MaxMetaspaceSize defaults to -1, i.e. unlimited
  • Unlike the permanent generation, if the size is not specified the virtual machine will by default use up all available system memory; if the metadata area overflows, the VM throws an OutOfMemoryError: Metaspace
  • -XX:MetaspaceSize sets the initial size of the Metaspace, which on 64-bit server JVMs also acts as the initial high-water mark. Once it is hit, a Full GC is triggered and unused classes (those whose class loaders are no longer alive) are unloaded; the high-water mark is then reset depending on how much space was freed. If too little space was freed, the value is raised appropriately (without exceeding MaxMetaspaceSize); if a lot of space was freed, it is lowered appropriately
  • If the initial high-water mark is set too low, this adjustment happens many times and you will see repeated Full GC entries in the garbage collection log. To avoid frequent GCs, it is recommended to set -XX:MetaspaceSize to a relatively high value.

5.3 Internal structure of method area

The method area is used to store type information that has been loaded by the virtual machine, constants, static variables, just-in-time compiler compiled code caches, and so on.

Type information

For each loaded type (class, interface, enumeration enum, annotation), the JVM must store the following type information in the method area

  • The fully qualified name of the type (fully qualified name = package name + class name)
  • The fully qualified name of the type's direct superclass (interfaces and java.lang.Object have none)
  • Modifiers of this type (some subset of public, abstract, final)
  • An ordered list of direct interfaces of this type

Field information

  • The JVM must keep information about all fields of the type and the order in which the fields are declared in the method area
  • Field information includes the field name, field type, and field modifiers (a subset of public, private, protected, static, final, volatile, and transient)

Method information

For each method, the JVM must store:

  • Method names
  • Method return type
  • The number and type of method parameters
  • Method modifiers (a subset of public, private, protected, static, final, synchronized, native, abstract)
  • Bytecodes, operand stack, local variable table and size of methods (except abstract and native methods)
  • Exception list (except for abstract and native methods)
    • The start and end location of each exception handler, the offset address of the code handler in the program counter, and the constant pool index of the caught exception class

Stack, heap, method area interaction

5.4 Runtime constant Pool

The Runtime Constant Pool is part of the method area. To understand the Runtime Constant Pool, let’s first talk about the Constant Pool table in the bytecode file.

Constant pool

A valid bytecode file contains not only the version information, fields, methods, and interfaces of the class, but also the Constant Pool Table, which contains various literals and symbolic references to types, fields, and methods.

Why do we need a constant pool?

The classes and interfaces of a Java source file are compiled into a bytecode file. The bytecode needs data to operate on; such data is often too large to store directly in the bytecode, so it is stored in the constant pool instead, and the bytecode holds references into the constant pool. The runtime constant pool is also used for dynamic linking.

Below, we look at a simple class with only a main method in jclasslib; the #2 in the bytecode refers to entry 2 of the constant pool

A constant pool can be thought of as a table from which virtual machine instructions find the class names, method names, parameter types, literals, and so on to execute.

Run-time constant pool

  • After the classes and structures are loaded into the virtual machine, the corresponding runtime constant pool is created
  • The Constant Pool Table is the part of the Class file that stores literal and symbolic references generated at compile time and is stored in the runtime Constant Pool in the method area after the Class is loaded
  • The JVM maintains a constant pool for each loaded type (class or interface). Data items in a pool, like array items, are accessed by index
  • The runtime constant pool contains a variety of constants, including numeric literals that are already explicit by the compiler, as well as method or field references that are not available until runtime parsing. So instead of having a symbolic address in the constant pool, we have a real address
    • Another important feature of the runtime constant pool, compared with the Class file constant pool, is that it is dynamic: the Java language does not require constants to be generated only at compile time, and new constants can also be placed in the pool at run time, which is exactly how String.intern() works (see the sketch after this list)
  • When creating a runtime constant pool for a class or interface, the JVM throws an OutOfMemoryError if the amount of memory required to construct the runtime constant pool exceeds the maximum that can be provided by the method area.
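As an illustration of that dynamic behaviour, a small sketch (the printed results noted in the comments are the typical HotSpot behaviour on JDK 7 and later, not something the specification guarantees):

public class InternDemo {
    public static void main(String[] args) {
        String s1 = new StringBuilder("ja").append("va").toString();
        // "java" is already in the string pool when the VM starts up,
        // so intern() returns that earlier reference rather than s1
        System.out.println(s1.intern() == s1);   // typically false

        String s2 = new StringBuilder("jvm").append("-notes").toString();
        // "jvm-notes" is not in the pool yet; since JDK 7 intern() can simply
        // record a reference to s2 itself instead of copying the string
        System.out.println(s2.intern() == s2);   // typically true
    }
}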

5.5 Evolution details of method areas in JDK6, 7, and 8

Only HotSpot has the concept of a permanent generation

  • JDK 1.6 and earlier: the permanent generation exists, and static variables are stored in it
  • JDK 1.7: the permanent generation still exists, but has been progressively "de-permanented"; the string constant pool and static variables were moved out and stored in the heap
  • JDK 1.8 and later: the permanent generation is removed; type information, fields, methods, and constants are kept in Metaspace in native memory, while the string constant pool and static variables remain in the heap

Why the permanent generation was removed

openjdk.java.net/jeps/122

  • It is hard to determine an appropriate size for the permanent generation.

    In some scenarios, dynamically loading too many classes produces an OOM in the Perm area. A realistic web project with many features may dynamically load a great many classes while it runs and frequently hit OOM. The biggest difference between Metaspace and the permanent generation is that Metaspace is not inside the VM but uses native memory, so by default its size is limited only by the available native memory

  • Tuning the permanent generation is difficult

5.6 Garbage collection in method area

The method area garbage collection collects two main parts: obsolete constants in the constant pool and types that are no longer used.

Let’s start with the two main types of constants stored in the method area constant pool: literals and symbolic references. Literals are close to the Java language level concepts of constants, such as text strings, constant values declared final, and so on. Symbolic references, on the other hand, are concepts related to compilation principles and include the following three constants:

  • Fully qualified names of classes and interfaces
  • The name and descriptor of the field
  • The name and descriptor of the method

The HotSpot virtual machine's policy for reclaiming constants from the constant pool is very clear: a constant can be reclaimed as long as it is not referenced anywhere

To determine whether a type is “no longer in use”, three conditions must be met:

  • All instances of the class are already recycled, that is, there are no instances of the class or any of its derived children in the Java heap
  • The classloader that loaded the class has already been recycled, a condition that is usually difficult to achieve unless there are carefully designed scenarios that can replace the classloader, such as OSGi, JSP reloads, and so on
  • The java.lang.Class object corresponding to this Class is not referenced anywhere, and the methods of this Class cannot be accessed anywhere through reflection

The Java virtual machine is allowed to reclaim classes that meet all three conditions. This is only "allowed"; unused classes are not necessarily reclaimed. The HotSpot virtual machine provides the -Xnoclassgc parameter to control whether classes are reclaimed. You can also use -verbose:class together with -XX:+TraceClassLoading and -XX:+TraceClassUnLoading to view class loading and unloading information.

In scenarios where reflection, dynamic proxies, ByteCode frameworks such as CGLib, dynamically generated JSPS, and frequent custom Classloaders such as OSGi are used extensively, virtual machines with the ability to unload classes are required to ensure that permanent generations do not overflow.

References and thanks

These are study notes; let's keep encouraging each other. Main sources:

Song Hongkang JVM tutorial

In-depth Understanding of the Java Virtual Machine Third Edition

Docs.oracle.com/javase/spec…

www.cnblogs.com/wicfhwffg/p…

www.cnblogs.com/hollischuan…