First, the composition of JVM

JVM = class loading system + runtime data area + bytecode execution engine

The diagram below:

Runtime data area (JVM memory model)

1. Method area:

In JDK8, the method area is also called permanent generation. After 8, it is changed to metacase, which mainly stores constants, static variables, class metacase information

2. The stack:

  • Stacks are thread private, one stack per thread
  • There are stack frames in the stack, one method for each stack frame
  • Stack frame also contains local variable table, operand stack, dynamic link, method exit

3. Program counter:

The program counter is also thread private, and the bytecode execution engine dynamically records the number of machine code lines to be executed next time during program execution

4. The stack:

Most objects are in the heap, and a few are on the stack (escape analysis occurs during just-in-time compilation)

Stack – An example

See how the following code works inside the JVM:

public class MyTest {

    public static void main(String[] args) {
        int result = math();
        System.err.println(result);
    }

    public static int math(){
        int a=1;
        int b=2;
        int c = (a+b)*10;
        returnc; }}Copy the code

Javap -c mytest.class to see what bytecode looks like after disassembly:

public class com.example.spring.jvmTest.MyTest {
  public com.example.spring.jvmTest.MyTest();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: invokestatic  #2                  // Method math:()I
       3: istore_1
       4: getstatic     #3                  // Field java/lang/System.err:Ljava/io/PrintStream;
       7: iload_1
       8: invokevirtual #4                  // Method java/io/PrintStream.println:(I)V
      11: return

  public static int math();
    Code:
       0: iconst_1
       1: istore_0
       2: iconst_2
       3: istore_1
       4: iload_0
       5: iload_1
       6: iadd
       7: bipush        10
       9: imul
      10: istore_2
      11: iload_2
      12: ireturn
}

Copy the code

Following the principle of one stack frame per method and following the stack frame FILO principle, let’s examine the Math () method

Note: JVM bytecode instructions stand for what, Baidu search (JVM bytecode instructions table)

public static int math();
    Code:
       0: iconst_1     // push int 1 to the operand stack
       1: istore_0     // Store the operand stack int value to the first local variable
       2: iconst_2     // push int 2 to the operand stack
       3: istore_1     // Store the operand stack int value into the second local variable
       4: iload_0      // Push the first int local variable to the operand stack
       5: iload_1      // Push the second int local variable to the operand stack
       6: iadd         // Add two int values in the operand stack and push the result into the operand stack
       7: bipush        10     // Push 10 to the operand stack
       9: imul                 // Multiply two int values on the operand stack and push the result onto the operand stack
      10: istore_2             // Store the operand stack int value into the third local variable
      11: iload_2              // Push the third int local variable to the operand stack
      12: ireturn              // Returns int from the current method
}
Copy the code

Therefore, it is concluded that operand stack is mainly used for data transfer to local variable table and for operation

Escape analysis

The JVM runs in three modes:

  1. Interpretation pattern: An act of machine code is compiled by executing a line of JVM bytecode
  2. Compilation mode: All JVM bytecode is compiled to machine code at once, and then all machine code is executed at once
  3. Mixed mode: Code is still executed in interpreted mode, but for some “hot” code that is executed in compiled mode, the JVM typically executes code in mixed mode

Explain mode starts quickly and is suitable for situations where only part of the code needs to be executed and most of the code is executed only once;

Compile mode is slow to start, but fast to execute later, and is more memory intensive because there is at least ten times more machine code than JVM bytecode. This mode is suitable for scenarios where code is likely to be executed repeatedly.

Mixed mode is the JVM’s default way of executing code. It is interpreted at first, but for a small portion of “hot” code, it is compiled. The machine code corresponding to the hot code is cached so that the next time it is executed, there is no need to compile. This is the just-in-time (JIT) compilation technique we commonly use.

The JVM may make some optimizations to our code during just-in-time compilation, such as object escape analysis

Take a look at this code:

public class MyTest {

    public static void main(String[] args) {
        math();
    }

    public static void math(){
        A a = newA(); }}Copy the code

In the Math () method, although an object is instantiated, it is not returned, no reference is made, and the object is put on the stack after escape analysis.

Second, GC algorithm

As we all know, GC mainly occurs in heap area (method area also occurs, but rarely), and heap is divided into Cenozoic and old age, and Cenozoic is divided into Eden area +Survivor0 area (From area)+Survivor1 area (To area), as shown below:

  • Minor GC: Occurs in the young generation. Compared with Full GC, Minor GC occurs more frequently and garbage collection is faster
  • Full GC: Occurs in the old generation, collects garbage from the old generation, young generation, and method area, and is generally 10 times slower than Minor GC

How do I determine if an object can be reclaimed?

1. Reference counting method:

Add a reference counter to the object, incrementing it every time it is referenced; When a reference is invalid, the counter is decayed by 1; Any time an object with a 0 counter is no longer usable.

This method is easy to implement and efficient, but the mainstream virtual machine does not choose this algorithm to manage memory, its main reason is that it is difficult to solve the problem of object reference cycle.

2. Accessibility analysis algorithm

From these nodes, “GC Roots” objects are used as a starting point, and the referenced objects are searched down. The objects found are marked as non-garbage and the remaining unmarked objects are garbage

GC Roots: thread stack local variables, static variables, local method stack variables, and so on

As shown in figure:

3. Determine the reference type of the object

Java reference types are generally divided into four types: strong reference, soft reference, weak reference, and virtual reference

Strong references: Plain variable references

public static User user = new User();
Copy the code

SoftReference: an object is wrapped with a SoftReference SoftReference. Normally, the object will not be reclaimed. After the GC is complete, if no space is available for storing new objects, the SoftReference objects are reclaimed. Soft references can be used for caching

public static SoftReference<User> user = new SoftReference<User>(new User());
Copy the code

Weak and virtual references, similar to no references, are collected directly during the GC phase.

Last resort before recycling

Even in the reachability analysis algorithm, unreachable objects are not necessarily dead. At this time, they are temporarily in the stage of “probation”. In order to truly declare an object dead, at least the process of re-marking must be experienced.

The premise of the tag is that the object is found to have no chain of references connected to GC Roots after reachability analysis.

1. Mark and filter for the first time.

Determine whether the object overrides the Finalize method, if not, it is directly recycled.

2. Second mark

If the object overrides the Finalize method, Finalize method is the last chance for the object to escape from death. If the object wants to save itself successfully in Finalize (), it only needs to re-associate with any object in the reference chain. If the object has not escaped by this time, it is indeed reclaimed.

Minor GC:

  1. After the program runs, objects enter Eden area continuously. When Eden area is filled, Minor GC will occur. Generally speaking, 90% of the garbage objects will be collected, and the remaining objects will be moved to S0 area with age +1
  2. When a Minor GC occurs after Eden is filled again, surviving objects and objects in S0 are moved to S1 with age +1
  3. When a Minor GC occurs after Eden is filled again, surviving objects and S1 objects are moved to S0 with age +1

. Again.

What kind of objects will go into the old age?

1. Big objects directly enter the old age:

Large objects that occupy a large number of continuous memory space objects (such as strings, arrays), we can configure the JVM parameter (- XX: PretenureSizeThreshold) specifies the size of large object, more than this size would enter old age.

2. Long-lived objects will enter the old age

When the object’s age increases to a certain age (15 by default), it is promoted to the old age. The age threshold for the object to be promoted to the old age cannot be greater than 15 years old, because the age of the object occupies only 4 bits in the composition of the object (the maximum is 1111=15). This threshold can be set by using -xx :MaxTenuringThreshold.

3. Object dynamic age judgment

For example, there is a batch of objects in S0 area now, and the sum of multiple age objects with age 1+ age 2+ age N exceeds 50% in S0 area, then all objects with age n and above will be put into the old age. This rule is actually to hope that those objects that are likely to be long-term survival, as early as possible into the old age. The object dynamic age determination mechanism is triggered after the Minor GC.

4. The Survivor zone of objects that survive the Minor GC cannot be saved

In this case, some of the surviving objects are moved to older ages, and some may still be in Survivor zones

5. Old chronospatial allocation guarantee mechanism

Before each Minor gc, the JVM calculates the free space of the old generation, and if the free space is less than the sum of all existing objects (including garbage objects) in the young generation, A parameter of “-xx :-HandlePromotionFailure” (which is set by default in JDK1.8) is checked to see if the available memory size of the old age is greater than the average size of objects that entered the old age after each previous minor gc. If the result of the previous step is less than or the previous parameter is not set, then a Full GC is triggered and garbage is collected for both old and young generations. If there is still not enough space to store new objects, “OOM” will occur.

As shown in figure:

Let’s draw a picture to summarize:

The GC algorithms commonly used by the JVM are:

1. Mark-clear algorithm:

The algorithm is divided into “mark” and “clean” stages: first, mark all objects to be recycled, after the completion of the mark all marked objects.

Disadvantages:

  • Efficiency problem (90% of the objects in the young generation need to be recycled, marking one by one is inefficient)
  • Space issues (large number of discontinuous fragments after marker clearing)

As shown in figure:

2. Replication algorithm

In order to solve the efficiency problem, the “copy” algorithm emerged. It divides memory into two equally sized pieces, one of which is used at a time. When this area of memory is used up, the surviving objects are copied to another area, and then the used space is cleaned up again. In this way, half of the memory range is reclaimed each time.

Problem: Memory consumption, need to keep a piece of the same size space at all times

As shown in figure:

3. Mark-tidy algorithm

The marking process is still the same as the “mark-clean” algorithm, but instead of directly collecting the recyclable objects, the next step is to move all the surviving objects towards a segment and then directly clean up the memory beyond the boundary

As shown in figure:

4. Generational collection algorithm

Currently, garbage collection of VMS adopts generational collection algorithm

The Java heap is divided into the new generation and the old generation so that we can choose the appropriate garbage collection algorithm based on the characteristics of each generation.

For example, in the new generation, a large number of objects (nearly 99%) die in each collection, so you can choose a replication algorithm that can complete each garbage collection for a small amount of object replication cost. Older objects have a higher chance of survival, and there is no extra space for allocation guarantees, so we must choose the “mark-clean” or “mark-clean” algorithm for garbage collection.

Note: “mark-clean” or “mark-tidy” algorithms can be more than 10 times slower than copy algorithms

Let’s draw a picture to summarize:

Garbage collector

1. Serial collector:

Serial is a single-threaded collector. It must pause other worker threads (“Stop The World”) while it is garbage collecting until it is finished.

The Serial collector uses the copy algorithm in the new generation and the mark-collation algorithm in the old generation.

As shown in figure:

2. ParNew collector

It is a multithreaded version of the Serial collector, which by default collects the same number of threads as the number of CPU cores

As shown in figure:

Parallel collector

The Parallel collector is similar to the ParNew collector and is the default collector for the JVM

The Parallel collector focuses on throughput. Garbage collectors such as CMS focus more on the pause times of user threads (improving user experience).

The new generation of Parallel collector adopts the copy algorithm, and the old era adopts the mark-collation algorithm.

4. CMS collector

The CMS (Concurrent Mark Sweep) collector seeks the shortest pause time in the collection process. With a strong focus on user experience, it is the HotSpot VIRTUAL machine’s first truly concurrent collector, enabling the garbage collector thread to work simultaneously with the user thread.

As you can see from the word Mark Sweep in its name, the CMS collector is implemented as a “mark-sweep” algorithm that operates in four steps:

  • Initial marker: Suspend all other threads and record objects that GC Roots can reference directly, which is fast
  • Concurrent marking: Start both the GC and user threads and record their references to the objects in the chain based on the objects found in the previous step. And marks the object on which the reference update occurred.
  • Relabeling: Fixes references to updated objects during concurrent marking, where the pause time is longer than the initial marking and shorter than the concurrent marking phase
  • Concurrent cleanup: The user thread is started and the GC thread begins to clean the unmarked area.

As shown in figure:

Advantages: Concurrent collection, low pauses.

Disadvantages:

  1. Sensitive to CPU resources (competing with services for resources)
  2. Unable to handle floating garbage (garbage is generated during the concurrent cleanup phase and can only be cleaned up by the next GC);
  3. Recovery algorithm, it USES “tag – clear algorithm can lead to” when it has a lot of space debris gather over produce, through the parameter – XX: + UseCMSCompactAtFullCollection allows the JVM the execution of the tag to remove do again after finishing
  4. The uncertainty in the execution process, there may be a garbage collection before the last garbage collection is completed, and then the garbage collection is triggered again, especially in the concurrent marking and concurrent cleaning phases. The system may run while the garbage collection is being completed, and then the full GC may be triggered again, i.e. “Concurrent mode failure”. Stop the World is entered and the Serial Old collector is used for collection

5. G1 collector

G1 is a server-based garbage collector for machines with large amounts of memory. It has high throughput performance characteristics while meeting the GC pause time requirements

G1 retains the concept of young and old, but instead of being physically separated, they are collections of (potentially discontinuous) regions. G1 divides the heap into independent regions of equal size, and the JVM can have up to 2048 regions. The size of Region is equal to the size of the heap divided by 2048. For example, if the heap size is 4096 MB, the size of Region is 2 MB.

As shown in figure:

By default, the young generation contributes 5% of the heap memory. While the system is running, the JVM keeps adding more regions to the young generation, but not more than 60%, which can be adjusted by “-xx :G1MaxNewSizePercent”

The Region function of a Region changes dynamically. A Region may have been a young generation before, but may become an old generation after garbage collection.

Rather than allowing large objects to go directly into older regions, G1 has a Region that allocates large objects called the Humongous Region (for short-term large objects). In G1, the rule for determining large objects is that a large object is more than 50% of the size of a Region. If a large object is too large, it may be stored across multiple regions.

G1 collector a GC operation process can be roughly divided into the following steps:

  • Initial marker: Suspend all other threads and record objects that GC Roots can reference directly, which is fast;
  • Concurrent tag: Concurrent tag with CMS
  • Final markup: re-markup with CMS
  • Filter recycling: The G1 collector maintains a priority list in the background, based on the expected GC pause times (specified by the JVM parameter -xx :MaxGCPauseMillis). Within the time range allowed by the user, the G1 collector preferentially selects regions with the highest collection value, allowing G1 to maximize collection efficiency within the limited time. No matter in the young generation or the old generation, the reclamation algorithm uses the replication algorithm to copy the surviving objects in one region to another region. This method will not complete the reclamation like CMS because there are many memory fragments that need to be collated once. G1 uses the replication algorithm to recycle almost no memory fragments.

As shown in figure:

It has the following characteristics:

  • Parallelism and concurrency: The G1 takes full advantage of The hardware of multi-core cpus to shorten stop-the-world pause times
  • Generational collection: Although G1 can manage the entire GC heap independently without the cooperation of other collectors, the concept of generational collection is retained.
  • Spatial consolidation: Different from CMS’s “mark-clean” algorithm, G1 is a collector based on the “mark-clean” algorithm as a whole; Locally, it is based on a “copy” algorithm.
  • Predictable pauses: This is another big advantage G1 has over CMS. Reducing pauses is a common concern of BOTH G1 and CMS, but G1 also models predictable pauses that allow users to specify that garbage collection is completed in M milliseconds. (Specified with “-xx :MaxGCPauseMillis”)

G1 Garbage collection and classification:

1. YoungGC

G1 will calculate how long it will take to reclaim Eden. If the reclaim time is far less than the value set by -xx :MaxGCPauseMills, add the region of the young generation and continue to store new objects. The Young GC will not be done immediately until the next Eden area is full and G1 calculates the collection time close to the value set by -xx :MaxGCPauseMills, then the Young GC will be triggered

2. MixedGC

Not FullGC, the heap of Old s share reach parameters (- XX: InitiatingHeapOccupancyPercen) set the value of the trigger, recycle all part of Young and Old (according to expect GC pause time determine the priority of the Old area garbage collection) and the large object area, Under normal circumstances, G1 garbage collection is MixedGC first, mainly using the replication algorithm, which needs to copy the surviving objects in each region to other regions. During the copying process, if there are not enough empty regions that can carry the copied objects, a Full GC will be triggered

3. Full GC

Stopping the system program and then marking, cleaning, and compressing it with a single thread to free up a bunch of regions for the next MixedGC is time consuming.

G1 garbage collector optimization recommendations:

Suppose the -xx :MaxGCPauseMills parameter is set to a large value, causing the system to run for a long time, and the young generation may have consumed 60% of the heap before the young GENERATION GC is triggered. In this case, there may be too many surviving objects, which will result in the Survivor zone not being able to accommodate so many objects, and it will enter the old age. Or after your young gc, there are so many surviving objects that you enter the Survivor zone and trigger a dynamic age rule that reaches 50% of the Survivor zone and quickly causes some objects to enter the old age. The key here is to adjust the value of the -xx :MaxGCPauseMills parameter. In addition to ensuring that the young generation gc is not too frequent, we also need to consider the number of live objects after each GC, so as to avoid too many live objects quickly entering the old age and frequently trigger mixed GC.

What scenarios are appropriate to use G1?

  1. More than 50% of the heap is occupied by live objects
  2. The speed of object assignment and promotion varies greatly
  3. Garbage collection times are particularly long, exceeding 1 second
  4. More than 8GB of heap memory (recommended)
  5. The pause time is within 500ms