1 Object’s life and death

Inside the heap stored in the Java world almost all the object instance, the garbage collector before recycling was carried out on the pile, the first thing is to determine which of these objects was still “alive”, which is “dead” (that is, may no longer be any way to use object), that is how to determine whether an object in the GC live or die?

1.1 Reference Counting Algorithm (Reference Counting)

Many textbook algorithms for determining whether an object is alive are as follows: add a reference counter to the object and increment it by one each time a reference is made to it; When a reference is invalid, the counter value is reduced by 1; An object whose counter is 0 at any point in time cannot be used again. This is the Reference Counting algorithm. The implementation of Reference Counting algorithm is simple and the judgment efficiency is high. However, Reference Counting algorithm is not used to manage memory in mainstream Java VIRTUAL machines. The main reason is that it is difficult to solve the problem of circular Reference between objects.

Circular references are references to object A holding object B, and references to object B holding object A.

1.2 Reachability Analysis

In Java, Reachability Analysis is used to determine whether an object is alive. The basic idea of this algorithm is to search down from a series of objects called “GC Roots” as the starting point, and the search path is called the Reference Chain. When an object is not connected to GC Roots by any Reference Chain, it is proved that the object is unavailable. As shown, object 5, Object 6, and Object 7 are related to each other, but they are not reachable to GC Roots, so they will be judged as recyclable objects.

In the Java language, objects that can be used as GC Roots include the following:

  1. The object referenced in the virtual machine stack (the local variable table in the stack frame).
  2. The object referenced by the class static property in the method area.
  3. The object referenced by the constant in the method area.
  4. Objects referenced by JNI (commonly referred to as Native methods) in the Native method stack.

1.3 Object Reference

Prior to JDK1.2, a reference in Java was defined as representing a reference if the value stored in the data of a reference type represented the starting address of another chunk of memory. This definition is too narrow; an object can only be referred to or not referred to by this definition. After JDK 1.2, Java expanded the concept of references, The references are divided into four types: ** Strong Reference, Soft Reference, Weak Reference and Phantom Reference. The strength of these four kinds of references gradually decreases.

  1. Strong references: References such as “Object obj=new Object ()” are common in program code. As long as strong references exist, the garbage collector will never reclaim the referenced Object.
  2. Soft references: is used to describe objects that are useful but not necessary. Objects associated with soft references are listed in the collection scope for a second collection before an out-of-memory exception occurs. An out-of-memory exception is thrown if there is not enough memory for this callback. Provided after JDK 1.2SoftReferenceClass to implement soft references.
  3. A weak reference: is also used to describe non-essential objects, but it is weaker than soft references, and objects associated with weak references only survive until the next garbage collection occurs. Provided after JDK 1.2WeakReferenceClass to implement weak references.
  4. Phantom reference: Also known as ghost reference or phantom reference, it is the weakest type of reference relationship. The existence of virtual references does not affect the lifetime of an object, nor can we obtain an object instance through virtual references.The only purpose of setting a virtual reference association for an object is to receive a system notification when the object is reclaimed by the collector. Provided after JDK 1.2PhantomReferenceClass to implement a virtual reference.

1.4 Death and Self-help

An object is declared dead at least twice: if an object is found to have no reference chain connected to GC Roots after reachabality analysis, it will be marked for the first time and filtered once, based on whether it is necessary to execute finalize () method. When objects do not overwrite the Finalize () method, or finalize () method has already been called by the virtual machine, objects will be reclaimed in both cases. If the object is judged to be necessary to execute finalize (), then the object will be placed ina Queue called f-queue and triggered later by a low-priority Finalizer thread automatically set up by the virtual machine. There is no promise to wait until the finalize () method is executed. If an object executes slowly in Finalize (), or an infinite loop occurs, other objects in the F-queue will be waiting forever, and even the whole memory reclamation system will crash.

package com.wtj.myjvm; /** * Created on 2019/4/11. ** @author wangtingjun * @since 1.0.0 */ ** * 1 Objects can save themselves when GC is performed. * 2. You only get one chance to save yourself. Because the finalize() method of an object will be called automatically by the system only once at most * * @author WTJ */ public class FinalizeEscapeGC {public static FinalizeEscapeGC SAVE_HOOK = null; public voidisAlive() {
        System.out.println("Yes, I am still alive :)");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("finalize mehtod executed!"); FinalizeEscapeGC.SAVE_HOOK = this; } public static void main(String[] args) throws Throwable { SAVE_HOOK = new FinalizeEscapeGC(); SAVE_HOOK = null; System.gc(); // Since the Finalize method has a low priority, we pause for 0.5 seconds to wait for thread.sleep (500);if(SAVE_HOOK ! = null) { SAVE_HOOK.isAlive(); }else {
            System.out.println("no,i am dead:("); SAVE_HOOK = null; SAVE_HOOK = null; System.gc(); // Since the Finalize method has a low priority, we pause for 0.5 seconds to wait for thread.sleep (500);if(SAVE_HOOK ! = null) { SAVE_HOOK.isAlive(); }else {
            System.out.println("no,i am dead:("); }}}Copy the code

Running results:

Finalize mehtod executed! Yes, I am still alive :) no, I am dead:Copy the code

As you can see from the code above, the object’s Finalize () method was indeed triggered by the GC collector and managed to escape before being collected. Object’s Finalize () method will only be called automatically by the system once at most.

2 garbage collection algorithm

2.1 Mark-clear algorithm

The mark-clear algorithm is one of the most basic collection algorithms, which is divided into two stages of “mark” and “clear” : first, mark all the objects to be recycled, after the completion of marking all the marked objects. It has two main shortcomings: one is the efficiency problem, marking and clearing the efficiency of the two processes are not high; The other is the space problem. After the mark is cleared, a large number of discontinuous memory fragments will be generated. Too much space fragment may cause that when the program needs to allocate large objects in the future, it cannot find enough contiguous memory and has to trigger another garbage collection action in advance.

2.2 Replication Algorithm

It divides the available memory by capacity into two equally sized pieces, one of which is used at a time. When this area of memory is used up, the surviving objects are copied to the other area, and the used memory space is cleaned up again. In this way, the memory is reclaimed for the whole half region every time, and there is no need to consider the complicated situation of memory fragmentation when allocating memory. As long as the heap top pointer is moved, the memory can be allocated in order, and the operation is efficient. But at the cost of reducing memory by half. The current commercial virtual machines all use this collection algorithm to recover the new generation. According to the research, 98% of the objects in the new generation are “live and die”, so it is not necessary to divide the memory space according to the 1:1 ratio, but to divide the memory into a large Eden space and two small Survivor Spaces. Use Eden and one of the pieces Survivor each time. When recycling is done, the surviving objects in Eden and Survivor are copied to another Survivor space once and for all, and Eden and the Survivor space that was just used are cleaned up. By default, the HotSpot VIRTUAL machine has an 8:1 ratio of Eden to Survivor, which means that each new generation has 90% (80%+10%) of its available memory, and only 10% of its memory is “wasted”. However, there is no way to guarantee that no more than 10% of objects survive each collection, and when Survivor space runs out, other memory is required for allocation guarantees (Handle Promotion). That is, if another Survivor space does not have enough space to store the surviving objects collected from the previous generation, these objects will enter the old age directly through the allocation guarantee mechanism.

2.3 Mark-Compact Algorithm

The mark-tidy algorithm still performs the same marking process as the Mark-clean algorithm, but instead of cleaning up the recyclable objects directly, the next step is to move all surviving objects to one end and then clean up memory directly beyond the end boundary.

2.4 Generational Collection algorithm

Currently, garbage collection of commercial VMS adopts generational collection, which divides the memory into several blocks based on the life cycle of objects. Typically, the Java heap is divided into the new generation and the old generation, so that the most appropriate collection algorithm can be used for each generation. In the new generation, a large number of objects are found dead and only a few survive in garbage collection, so the replication algorithm is selected, and only a small amount of the replication cost of the surviving objects can be collected. In the old days, because the object has a high survival rate and there is no extra space to allocate it, it has to use the “mark-clean” or “mark-tidy” algorithm for recycling.

3 HotSpot VM algorithm

3.1 Enumerate the root node OopMap

The HotSpot VIRTUAL machine uses a reacability analysis algorithm to determine the survivability of an object, and checking each GC Roots node individually would be time-consuming. And the analysis of the work should be done in a snapshot to ensure consistency, namely during the whole analysis of the whole execution system looks like frozen at some point in time, can’t appear analysis object references in the process of relationship is still in changing circumstances, which does not meet the results of the analysis accuracy cannot be guaranteed. This is one of The important reasons why you have to Stop all Java threads of execution during GC (something Sun calls “Stop The World”). In its implementation, HotSpot uses a set of data structures called OopMap to hold the object references. When the class is loaded, HotSpot calculates what type of data is in the object at what offsets. During JIT compilation, HotSpot also records which locations in the stack and registers are references. This allows the GC to know this information directly during the scan.

3.2 Safepoint

HotSpot can quickly do the enumeration of the root node through OopMap. There are so many instructions in the OopMap that generating an OopMap for each instruction would require a lot of extra space and the GC space cost would be very high. In fact, HotSpot only records this information at “specific locations” called safepoints, where program execution does not stop everywhere to start GC, but only when it reaches a Safepoint. The selection of safety point is basically based on the standard of “whether the program has the characteristics that make the program run for a long time”, such as method call, circular jump, exception jump, etc., so instructions with these functions can produce Safepoint.

4 GC and memory allocation recycling used by HotSpot

The JDK version I use is JDK1.8, and you can see the default garbage collector for the VIRTUAL machine by running the Java -xx :+PrintCommandLineFlags -version command.

-XX:+UseParallelGC
Be insane or Serial Old.

4.1 the Parallel Scavenge

The Parallel Collector is a new generation collector that uses replication algorithms and is multithreaded. The goal of the Parallel Avenge is to hit a controlled throughput, the ratio of the amount of time the CPU spends running user code to the total amount of time the CPU consumes. A Parallel Avenge collector is also known as a “through-first” collector. The Parallel Scavenge collector provides two parameters for precise throughput control, the -xx: MaxGCPauseMillis parameter, which controls the maximum garbage collection pause time, and the -xx: GCTimeRatio parameter, which directly sets the throughput size. You can also run -xx: +UseAdaptiveSizePolicy to enable the adaptive adjustment policy. VMS can dynamically allocate performance monitoring information based on the system running status.

4.2 Serial Old

Serial Old is an older version of the Serial collector, which is also a single-threaded collector using a mark-tidy algorithm.

4.3 Concurrent Mark Sweep (CMS)

The CMS (Concurrent Mark Sweep) collector is a collector whose goal is to obtain the shortest collection pause time. At present, a large part of Java applications are concentrated on the server side of Internet sites or B/S systems. These applications pay special attention to the response speed of services and hope to have the shortest system pause time to bring users a better experience. The CMS collector is a good fit for such applications. CMS collector is based on “mark – clear” algorithm, operation is divided into four steps: initial mark (CMS initial mark); CMS Concurrent Mark; Re-marking (CMS remark); CMS Concurrent sweep. Of these, concurrent tagging and concurrent cleanup are the most time-consuming, but can work with user threads.

4.4 Allocating and Reclaiming Memory

Java program at run time, will create a lot of object, the object of the life cycle is not the same, the main targets of short life cycle is assigned in the new generation of Eden, the garden of Eden) area, a few cases will be allocated in old age, is not absolutely fixed allocation rules, long object is assigned to life cycle in old age. Young generation: mainly stores newly created objects. Garbage collection of young generation is more frequent. In the young generation, there was one Eden region and two survior regions (from and to), with a ratio of 8:1. Objects in the young generation are short-lived and use the replication algorithm. At the beginning of GC, objects will only exist in the Eden zone and in the Survivor zone named “From”, where the Survivor zone “To” is empty. Following the GC, all surviving objects in Eden are copied To “To”, and surviving objects in “From” are moved based on their age. Objects whose age reaches a certain value (age threshold, which can be set by -xx :MaxTenuringThreshold) are moved To the aged generation, and objects that do not reach the threshold are copied To the “To” area. After this GC, the Eden and From areas have been emptied. At this point, “From” and “To” switch roles, so that the new “To” is the “From” before the last GC, and the new “From” is the “To” before the last GC. Either way, the Survivor region named To is guaranteed To be empty. The Minor GC repeats this process until the “To” section is filled, and when the “To” section is filled, all objects are moved To the aged generation.

Old generation
MaxTenuringThreshold
Full GCS can cause programs to temporarily stop, and a large number of Full GCS can slow down system response times and pose significant risks
Perm Gen(permanent generation) (replaced by meta space after JDK1.8)
It’s actually this method area

The ratio of young generation to old generation is 1:2

Reference: www.cnblogs.com/haitaofeiya…