Garbage collection – Overview

1.1 How to determine whether an object is garbage

  • Reference counting
  • Reachability analysis

1.2 How to collect garbage

  • Collection strategies
    • Mark-sweep algorithm
    • Copying algorithm
    • Mark-compact algorithm
    • Generational collection algorithm
  • Garbage collectors
    • Serial
    • ParNew
    • CMS
    • G1

1.3 When to collect

2 Garbage collection – Determining whether an object is alive

  • To display the garbage collection log, add the following VM arguments:
  • -verbose:gc -XX:+PrintGCDetails (Run Configurations -> VM arguments)

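2.1 Reference counting in detail

  • Add a reference counter to each object. Each time a new reference to the object is created, the counter is incremented by 1.

When a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 is no longer in use and can be reclaimed.

  • This algorithm is rarely used in current collectors, mainly because it cannot handle circular references.

Suppose the stack points to heap object 1, object 1 points to object 2, and object 2 points to object 3. If the stack no longer points to object 1 [the count for stack -> object 1 drops to 0], the counts for object 1 -> object 2 and object 2 -> object 3 are still non-zero, so object 2 and object 3 can only be reclaimed after the references held by object 1 have been released in turn. If the objects also refer back to each other (for example object 3 -> object 2), their counters never reach 0 and the reference counting algorithm never reclaims them.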
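The counting behaviour described above can be sketched with a toy model. This is only an illustration under stated assumptions: `RcNode` and `RefCountDemo` are hypothetical classes that keep the counters by hand, and they say nothing about how the JVM itself manages memory. The sketch shows that a plain chain is released step by step, while a cycle keeps its counters above 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy object with a manually maintained reference count (hypothetical, for illustration only). */
final class RcNode {
    int refCount = 0;
    boolean freed = false;
    final List<RcNode> fields = new ArrayList<>();

    /** Simulates "this.field = target": the target gains one reference. */
    void pointTo(RcNode target) {
        fields.add(target);
        target.refCount++;
    }

    /** Simulates dropping one reference; at 0 the object is freed and releases its own fields. */
    static void release(RcNode node) {
        node.refCount--;
        if (node.refCount == 0 && !node.freed) {
            node.freed = true;
            for (RcNode child : node.fields) {
                release(child);
            }
        }
    }
}

final class RefCountDemo {
    /** Chain: stack -> a -> b. Returns true if both are freed once the stack reference is dropped. */
    static boolean chainIsFreed() {
        RcNode a = new RcNode();
        RcNode b = new RcNode();
        a.refCount++;       // reference from the stack
        a.pointTo(b);
        RcNode.release(a);  // the stack reference goes away
        return a.freed && b.freed;
    }

    /** Cycle: stack -> a -> b -> a. Returns true if either object is freed. */
    static boolean cycleIsFreed() {
        RcNode a = new RcNode();
        RcNode b = new RcNode();
        a.refCount++;       // reference from the stack
        a.pointTo(b);
        b.pointTo(a);
        RcNode.release(a);  // a.refCount drops from 2 to 1 and never reaches 0
        return a.freed || b.freed;
    }

    public static void main(String[] args) {
        System.out.println("chain freed: " + chainIsFreed());
        System.out.println("cycle freed: " + cycleIsFreed());
    }
}
```

In the cycle case neither object is ever freed, which is the leak that reachability analysis avoids.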

2.2 Reachability analysis in detail

  • The starting points are a set of objects called GC Roots. The search walks the references downward from these nodes, and each path it follows is called a reference chain. When an object is not connected to the GC Roots by any reference chain, the object is unusable. [That is, an object that is unreachable from every GC Root (no reference chain) is garbage.]

  • Objects that can act as GC Roots:

    • Objects referenced from the virtual machine stack (the local variable table in a stack frame)
    • Objects referenced by static fields of classes in the method area
    • Objects referenced by constants in the method area
    • Objects referenced by JNI (native methods) in the native method stack
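A minimal sketch of the search itself, assuming the heap is modelled as a map from an object name to the names it references. The class `ReachabilityDemo`, the object names, and the sample graph are all hypothetical; a real collector walks actual object headers and OopMaps rather than strings.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {
    /** Returns every object reachable from the GC roots by following reference chains. */
    static Set<String> reachable(Map<String, List<String>> references, List<String> gcRoots) {
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (visited.add(current)) {
                for (String target : references.getOrDefault(current, Collections.<String>emptyList())) {
                    pending.push(target);
                }
            }
        }
        return visited;
    }

    /** Hypothetical heap: obj1 -> obj2 -> obj3, plus an unreachable cycle obj4 <-> obj5. */
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("obj1", Arrays.asList("obj2"));
        heap.put("obj2", Arrays.asList("obj3"));
        heap.put("obj3", Collections.<String>emptyList());
        heap.put("obj4", Arrays.asList("obj5"));
        heap.put("obj5", Arrays.asList("obj4"));
        return heap;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleHeap(), Arrays.asList("obj1"));
        System.out.println("live objects: " + live);
    }
}
```

With obj1 as the only root, obj4 and obj5 are never visited even though they reference each other, so they count as garbage.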

2.3 More on references

2.3.1 Strong references

Most of the references used so far are strong references, which are the most common kind. If an object has a strong reference, it is like an essential household item: the garbage collector will never reclaim it. When memory is insufficient, the Java virtual machine would rather throw an OutOfMemoryError and let the program terminate abnormally than reclaim strongly referenced objects to relieve the shortage.

String str = "abc";
List<String> list = new ArrayList<String>();
list.add(str);
// The data in the list is not freed, even if memory is insufficient

2.3.2 Soft references

If an object has only soft references, it is like a household item that is nice to have but dispensable. If there is enough memory, the garbage collector does not reclaim it; if memory is insufficient, the memory of these objects is reclaimed. As long as the garbage collector has not reclaimed the object, the program can keep using it. Soft references can be used to implement memory-sensitive caches. A soft reference can be used together with a reference queue (ReferenceQueue): if the object referenced by the soft reference is garbage collected, the Java virtual machine adds the soft reference to the associated queue.

import java.lang.ref.SoftReference;

public class Test {
    public static void main(String[] args) {
        System.out.println("start");
        A a = new A();
        SoftReference<A> sr = new SoftReference<A>(a);
        a = null; // only the soft reference remains

        // get() returns null once the collector has cleared the soft reference
        a = sr.get();
        if (a == null) {
            a = new A();
            sr = new SoftReference<A>(a);
        }
        System.out.println("end");
    }
}

class A {
    int[] a;

    public A() {
        a = new int[100000000]; // about 400 MB
    }
}

When memory is sufficient, the array stays available through the soft reference, so the data can be fetched from memory, which improves efficiency.

  • Usage scenarios

    Soft references are mainly used to implement cache-like functionality. When memory is sufficient, the value is obtained directly through the soft reference without querying the slow real data source, which improves speed. When memory runs short, the cached data is dropped automatically and is queried from the real source again.
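The cache scenario above could look roughly like the following sketch. `SoftCache` is a hypothetical class written for this illustration, not a library API, and the `realSource` function stands in for whatever slow lookup the application has.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Minimal memory-sensitive cache (hypothetical class, for illustration only). */
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> realSource;
    private int loads = 0;

    SoftCache(Function<K, V> realSource) {
        this.realSource = realSource;
    }

    /** Returns the cached value, or reloads it if it was never cached or has been reclaimed. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = realSource.apply(key); // slow path: query the real source
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of times the real source had to be queried. */
    int loadCount() {
        return loads;
    }
}

final class SoftCacheDemo {
    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>(key -> "value-" + key);
        String first = cache.get("a");
        String second = cache.get("a");
        System.out.println(first + " / " + second + ", loads = " + cache.loadCount());
    }
}
```

The caller never needs to know whether the value came from memory or from the real source. One design caveat: this sketch does not remove map entries whose referent has been cleared, which a ReferenceQueue could be used for.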

2.3.3 Weak references

If an object has only weak references, it is like a household item that is dispensable.

  • The difference between weak and soft references

Objects with only weak references have a shorter life cycle. When the garbage collector thread scans the memory it manages and finds an object that has only weak references, it reclaims the object regardless of whether memory is sufficient. However, because the garbage collector is a low-priority thread, such objects are not necessarily discovered quickly. Weak references can be used together with a reference queue (ReferenceQueue): if the object referenced by the weak reference is garbage collected, the Java virtual machine adds the weak reference to the associated queue. For example:

Car car = new Car(22000, "silver"); // as long as car still points to the object, it is not collected
WeakReference<Car> weakCar = new WeakReference<Car>(car);
// To use the object, call weakCar.get() and check the result first:
// it returns null once the object has been collected.

import java.lang.ref.WeakReference;

public class TestWeakReference {
    public static void main(String[] args) {

        Car car = new Car(22000, "silver");
        WeakReference<Car> weakCar = new WeakReference<Car>(car);

        int i = 0;

        while (true) {
            if (weakCar.get() != null) {
                i++;
                System.out.println("Object is alive for " + i + " loops - " + weakCar);
            } else {
                System.out.println("Object has been collected.");
                break;
            }
        }
    }
}


class Car {
    private double price;
    private String colour;

    public Car(double price, String colour) {
        this.price = price;
        this.colour = colour;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }

    public String getColour() {
        return colour;
    }

    public void setColour(String colour) {
        this.colour = colour;
    }

    public String toString() {
        return colour + " car costs $" + price;
    }

}

In the example above, after the program has run for some time it prints "Object has been collected.", which means the object that the weak reference points to has been collected.

To keep the program printing "Object is alive for " + i + " loops - " + weakCar indefinitely, add the following statement inside the loop:

System.out.println("car==== " + car);

The object is then never collected, because car still holds a strong reference to it.

Once an object is only weakly reachable, it is reclaimed at the next garbage collection: the data can be retrieved through the weak reference for a short time, and after that collection get() returns null. Weak references are used to monitor whether an object has been marked by the garbage collector as garbage to be collected. The isEnqueued method of a weak reference reports whether the reference has been added to its queue, that is, whether the collector has marked the object.
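Because the timing of a real collection is not predictable, the following sketch shows the queue mechanics deterministically. It is an assumption of this example that calling `clear()` and `enqueue()` by hand stands in for what the collector would do after reclaiming the referent; `WeakQueueDemo` is a hypothetical class name.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class WeakQueueDemo {
    /** Returns true if the reference is found in its queue after being cleared and enqueued. */
    static boolean enqueueAndPoll() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> weak = new WeakReference<>(referent, queue);

        // While a strong reference exists, get() returns the object and the queue is empty.
        boolean aliveBefore = weak.get() == referent && queue.poll() == null;

        // Assumption: these two calls simulate the collector, which clears the
        // reference and adds it to the queue after reclaiming the referent.
        weak.clear();
        weak.enqueue();

        Reference<? extends Object> polled = queue.poll();
        return aliveBefore && polled == weak && weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("reference seen in queue: " + enqueueAndPoll());
    }
}
```

A program that polls the queue therefore learns which references have lost their referents without having to call get() on every one of them.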

2.3.4 Phantom references

  • A phantom reference (also translated as "virtual reference") is, as the name suggests, a reference in name only. Unlike the other kinds, it does not affect the life cycle of an object. If an object is held only by phantom references, it can be garbage collected at any time, just as if it had no references at all.

  • One difference between phantom references and soft and weak references

Phantom references must be used together with a ReferenceQueue. When the garbage collector is about to reclaim an object and finds that it has a phantom reference, it adds the phantom reference to the associated reference queue before reclaiming the object's memory. A program can tell whether a referenced object is about to be garbage collected by checking whether the phantom reference has been added to the queue, and if so it can take any necessary action before the object's memory is reclaimed. Note that in actual program design weak and phantom references are rarely used, while soft references are used more often, because soft references let the JVM reclaim memory sooner under pressure, which keeps the system stable and helps prevent problems such as running out of memory (OutOfMemoryError).

  • Phantom references are mainly used to track the activity of an object being garbage collected. A phantom reference is eligible for collection at every garbage collection. Its get method always returns null, which is why it is also called a ghost reference. Phantom references are used to detect whether an object has been removed from memory.
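The "get always returns null" property can be checked without waiting for a collection. A small sketch, with `PhantomDemo` as a hypothetical class name:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

final class PhantomDemo {
    /** Returns true if get() yields null while the referent is still strongly reachable. */
    static boolean getAlwaysReturnsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        // get() returns null even though the referent is alive at this point.
        boolean getIsNull = phantom.get() == null;

        // Nothing has been enqueued, because the referent has not been collected.
        boolean queueIsEmpty = queue.poll() == null;

        return getIsNull && queueIsEmpty && referent != null;
    }

    public static void main(String[] args) {
        System.out.println("get() returned null: " + getAlwaysReturnsNull());
    }
}
```

Since the object can never be retrieved through the reference, the reference queue is the only way a program learns anything from a phantom reference.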

2.3.5 Summary

  • Strong references:

String str = "abc"; list.add(str); The object is never reclaimed while the strong reference exists.

  • Soft references:

If memory is still short after weakly referenced objects have been reclaimed, softly referenced objects are reclaimed next.

  • Weak references:

If memory is still short after phantom-referenced objects have been reclaimed, weakly referenced objects are reclaimed next.

  • Phantom references:

When the virtual machine runs low on memory, garbage collection starts (as if System.gc() had been called), and objects that have only phantom references are the first to be reclaimed.

Java has four types of references: strong, soft, weak, and phantom.

3 Garbage collection algorithms

3.1 Mark-sweep algorithm

3.1.1 Overview

  • The algorithm has two stages, "mark" and "sweep": first the objects to be reclaimed are marked, and after marking is complete all marked objects are reclaimed together.

Garbage is generally identified with the reachability analysis described in section 2.2 and then reclaimed.

  • The two stages:
  1. Mark phase: find all reachable objects and mark them
  2. Sweep phase: traverse the heap and reclaim the unmarked objects
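The two phases can be sketched on a toy heap. Everything here is an assumption made for illustration: the heap is an array of `Slot` objects, references are slot indexes, and `MarkSweepDemo` is a hypothetical class, not how HotSpot lays out memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MarkSweepDemo {
    /** A heap slot: the indexes of the slots it references, plus a mark bit. */
    static final class Slot {
        final int[] refs;
        boolean marked;

        Slot(int... refs) {
            this.refs = refs;
        }
    }

    /** Mark phase: mark every slot reachable from the root indexes. */
    static void mark(Slot[] heap, int[] roots) {
        Deque<Integer> pending = new ArrayDeque<>();
        for (int root : roots) {
            pending.push(root);
        }
        while (!pending.isEmpty()) {
            Slot slot = heap[pending.pop()];
            if (slot != null && !slot.marked) {
                slot.marked = true;
                for (int ref : slot.refs) {
                    pending.push(ref);
                }
            }
        }
    }

    /** Sweep phase: walk the whole heap, free unmarked slots, reset marks. Returns the freed count. */
    static int sweep(Slot[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false; // ready for the next cycle
            } else {
                heap[i] = null;         // reclaim the garbage object
                freed++;
            }
        }
        return freed;
    }

    /** Hypothetical heap: 0 -> 2 -> 4 are live; 1 and 3 only reference each other. */
    static Slot[] sampleHeap() {
        return new Slot[] { new Slot(2), new Slot(3), new Slot(4), new Slot(1), new Slot() };
    }

    public static void main(String[] args) {
        Slot[] heap = sampleHeap();
        mark(heap, new int[] {0});
        System.out.println("freed slots: " + sweep(heap));
    }
}
```

After the sweep the free slots are 1 and 3, which are not adjacent. The same effect at scale is the fragmentation that mark-sweep is known for.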

3.1.2 Application scenarios

The algorithm is generally applied to the old generation, because objects there have longer life cycles.

3.1.3 Pros and cons

3.1.3.1 Advantages

  • Solves the circular reference problem
  • Reclaims only when necessary (when memory is running out)

3.1.3.2 Disadvantages

  • During collection the application must be suspended, that is, stop-the-world.
  • Efficiency: marking and sweeping are both inefficient, especially when there are many objects to scan.
  • Space: sweeping leaves discontinuous memory fragments (free memory exists, but a slightly larger object cannot be allocated because the free space is not contiguous). With heavy fragmentation, a later large-object allocation may fail to find enough contiguous space and trigger another garbage collection early.

3.2 Copying algorithm

3.2.1 Overview

Memory partition:

  • Heap
    • Young generation
      • Eden
      • Survivor 1 / Survivor 2
    • Old generation (Tenured Gen)
  • Method area (the permanent generation in HotSpot)
  • Stack
    • Native method stack
    • Program counter

3.2.2 Working principle

  • Use two memory areas but allocate in only one of them. At collection time, mark the live objects in the area in use, then copy the surviving objects to the other area, where they are laid out contiguously.
  • Split the young generation into Eden (80%), Survivor1 (10%) and Survivor2 (10%). Objects are created in Eden and in the Survivor currently in use. At collection time, the surviving objects in Eden and Survivor1 are moved to Survivor2. New objects are then created in Eden and Survivor2, and at the next collection the survivors are moved back to Survivor1, and so on back and forth. Only 10% of the space is unavailable at any time. If the survivors exceed 10% after a collection (one Survivor is not large enough), the overflow is placed in Tenured Gen under the space allocation guarantee.
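A rough sketch of both points above: copying survivors contiguously into the other area, and the 8:1:1 split. `CopyingDemo` is hypothetical, the spaces are plain arrays, and the liveness flags are given as input rather than computed by a marking pass.

```java
final class CopyingDemo {
    /** Copies the live entries of fromSpace to the start of a fresh toSpace of the same size. */
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0; // survivors are packed one after another, leaving no gaps
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    /** Eden size for a young generation split Eden : Survivor : Survivor = ratio : 1 : 1. */
    static double edenMb(double youngMb, int ratio) {
        return youngMb * ratio / (ratio + 2);
    }

    public static void main(String[] args) {
        String[] to = copyLive(new String[] {"a", "b", "c", "d"}, new boolean[] {true, false, true, false});
        System.out.println(String.join(",", to[0], to[1]));
        System.out.println("Eden for a 10 MB young generation at 8:1:1 = " + edenMb(10, 8) + " MB");
    }
}
```

After the copy the whole from-space can be reused at once, which is why the algorithm produces no fragmentation but always keeps one area empty.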

3.2.3 Application scenarios

Mainly for the young generation; the old generation generally cannot use this algorithm directly.

3.2.4 Pros and cons

  • Advantages: efficient, and the surviving objects end up in contiguous memory.
  • Disadvantages: part of the space is always unused (half of it in the plain two-area scheme). When the object survival rate is high, more copying is needed and efficiency drops.

3.3 Mark-compact algorithm

3.3.1 Overview

  • Mark, compact, clear

Live objects and garbage objects are marked first; the live objects are then moved toward one end of the memory, and finally the memory beyond the end boundary is cleared.

3.3.2 Application scenarios

Mainly targets the old generation.

3.4 Generational collection algorithm

  • Choose the collection algorithm by region
    • Young generation: a high proportion of memory is reclaimed at each collection, so the copying algorithm is chosen
    • Old generation: a low proportion of memory is reclaimed, so the mark-compact algorithm is chosen

4 Garbage collectors

4.1 Serial collector

4.1.1 Overview

  • Serial /ˈsɪriəl/
  • Copying algorithm
  • Pros and cons: a single-threaded collector and the earliest one developed. With a single thread there is no thread-interaction overhead, so it is efficient, but all application threads are paused while it collects.

4.1.2 Application scenarios

For desktop applications: it is the default young generation collector in Client mode (in Client mode the JVM is given little memory, so collection is fast and pauses are hardly noticeable).

4.1.3 Setting parameters

  • -XX:+UseSerialGC enables the Serial collector
  • -Xms30m -Xmx30m -Xmn10m: -Xms30m -Xmx30m fixes the JVM heap size at 30 MB, and -Xmn10m sets the young generation to 10 MB.

4.2 ParNew Collector

  • Feature: parallel. It is essentially a multithreaded version of Serial and shares a lot of code with it.
  • Application scenario: the preferred young generation collector for VMs running in Server mode
  • Can be used together with the CMS collector (old generation)
  • -XX:+UseParNewGC enables the ParNew collector

4.3 Parallel Scavenge collector

4.3.1 Overview

  • Parallel /ˈpærəlel/
  • Copying algorithm
  • Multithreaded collector
  • Aims for a controllable throughput, which is why it is also called the "throughput first" collector

Throughput: the ratio of the CPU time spent running user code to the total CPU time consumed

Throughput = (user code execution time) / (user code execution time + garbage collection time)

4.3.2 Application Scenarios

  • Young generation collector
  • The shorter the pause, the better the user experience.
  • Suitable for servers that require high availability, high concurrency, and high throughput.

4.3.3 Setting parameters

  • -XX:MaxGCPauseMillis sets the maximum garbage collection pause time in milliseconds. If the value is too small, the young generation shrinks and collections become more frequent, so set it according to the actual scenario.
  • -XX:GCTimeRatio sets the throughput target. It is an integer in the open interval (0, 100), excluding 0 and 100. The default is 99, which allows 1% of the time for garbage collection.
  • -XX:+UseAdaptiveSizePolicy is a switch. When it is on, you do not have to specify details such as the young generation size (-Xmn), the Eden to Survivor ratio (-XX:SurvivorRatio), or the size threshold for objects allocated directly in the old generation (-XX:PretenureSizeThreshold). The virtual machine collects performance monitoring information from the running system and adjusts these parameters dynamically to provide the most suitable pause time or the maximum throughput. This approach is called GC Ergonomics. To use it, set the maximum heap (-Xmx) and either MaxGCPauseMillis or GCTimeRatio.
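The throughput formula and the meaning of GCTimeRatio can be worked through with two small functions. This is only arithmetic on the definitions: `ThroughputDemo` is a hypothetical helper, and the relation "GC share = 1 / (1 + GCTimeRatio)" is the one implied by the default of 99 giving 1%.

```java
final class ThroughputDemo {
    /** Throughput = user code time / (user code time + garbage collection time). */
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    /** Share of total time allowed for GC when -XX:GCTimeRatio is n, that is 1 / (1 + n). */
    static double gcShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println("99 ms user code + 1 ms GC -> throughput " + throughput(99, 1));
        System.out.println("GCTimeRatio=99 -> GC share " + gcShare(99));
        System.out.println("GCTimeRatio=19 -> GC share " + gcShare(19));
    }
}
```

For example, a ratio of 19 allows 1/20 of the time for collection, so a smaller ratio trades throughput for more collection time.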

4.4 The CMS collector in detail

4.4.1 Overview

CMS (Concurrent Mark-Sweep)

  • Concurrent Mark Sweep collector
  • A collector whose goal is the shortest collection pause time. Suited to Internet sites and B/S servers.
  • When CMS is used as the old generation collector, only Serial or ParNew can be used as the young generation collector.
  • Mark-sweep algorithm
  • Concurrent: reduces latency and improves responsiveness
  • The difference between parallelism and concurrency

Concurrency and parallelism are explained in several ways:

  1. Parallelism is two or more events occurring at the same instant; concurrency is two or more events occurring within the same time interval.
  2. Parallelism is multiple events on different entities; concurrency is multiple events on the same entity.
  3. Concurrency is handling multiple tasks on one processor; parallelism is handling multiple tasks on multiple processors, for example a Hadoop distributed cluster.

Concurrency is one processor handling more than one task over a period of time. Parallelism is multiple processors, or a multi-core processor, working on different tasks at the same time. Concurrency is logically simultaneous, while parallelism is physically simultaneous. An analogy: concurrency is one person eating three steamed buns by switching between them, while parallelism is three people each eating a steamed bun at the same time.

High concurrency: one CPU serving many threads, rather than many CPUs serving many threads.

4.4.2 Working process

  • Initial mark (STW): suspends the application

At this stage the virtual machine must stop the tasks being executed, which is officially called STW (Stop-The-World). Starting from the "root objects" of garbage collection, only the objects directly associated with the root objects are scanned and marked. This phase suspends the entire JVM, but it completes quickly.

  • Concurrent mark

This phase follows the initial mark and traces downward from the objects marked there. During concurrent marking, the application threads and the marking threads execute concurrently, so the user does not notice a pause.

  • Concurrent preclean

The preclean phase is still concurrent. In this phase the virtual machine looks for objects that entered the old generation while concurrent marking was running (objects promoted from the young generation, or objects allocated directly in the old generation). Rescanning them now reduces the work of the next phase, remark, which stops the world.

  • Remark (STW): suspends the application

This phase suspends the virtual machine while the collector threads scan the remaining objects in the CMS heap. The scan starts from the root objects and traces downward, processing object associations.

  • Concurrent sweep

Garbage objects are cleaned up; the collector threads and the application threads execute concurrently.

  • Concurrent reset

This phase resets the data structures of the CMS collector in preparation for the next garbage collection.

4.4.3 Pros and cons

4.4.3.1 Advantages
  • Concurrent collection
  • Low pause
4.4.3.2 Disadvantages
  • Consumes a lot of CPU resources (collectors that run alongside the application all consume CPU).

The specific reason: CMS is sensitive to CPU resources. In the concurrent phases CMS does not stop user threads, but it occupies some threads for garbage collection, which reduces total throughput. By default CMS starts (number of CPUs + 3) / 4 collection threads. With 4 or more CPUs, the collection threads take at least 25% of the CPU resources, and the share falls toward 25% as the number of CPUs grows. With fewer than 4 CPUs, the impact on user programs is greater (for example, 1 of 2 CPUs is 50%).

  • Cannot handle floating garbage: garbage produced while sweeping is under way (cleaning while others keep littering) has to wait for the next collection.

    User threads keep running during the concurrent sweep phase and generate new garbage that CMS cannot collect in the current cycle. This part is called floating garbage. For this reason CMS, unlike other collectors, cannot wait until the old generation is almost full before collecting; it must reserve some space for the user threads to use during concurrent collection. With the default setting described here, CMS is activated once 68% of the old generation is in use (the default differs between JDK versions); the threshold can be set with -XX:CMSInitiatingOccupancyFraction.

  • The memory reserved for objects created during collection is a trade-off: if it is too large, space is wasted; if it is too small, a Concurrent Mode Failure occurs.

    A "Concurrent Mode Failure" occurs when the memory reserved during a CMS cycle is not enough for the running program. In that case the virtual machine temporarily falls back to the Serial Old collector to redo the old generation collection. CMS is based on the mark-sweep algorithm, so sweeping leaves a lot of space fragmentation, and with too much fragmentation allocating a large object becomes difficult. CMS provides the parameter -XX:+UseCMSCompactAtFullCollection to append a defragmentation pass when a Full GC completes; defragmentation cannot run concurrently, so the pause becomes longer. The parameter -XX:CMSFullGCsBeforeCompaction sets how many Full GCs without compaction are performed before one with compaction.

  • Space fragmentation: CMS does not defragment or compact the heap during normal collection

This is caused by the use of the mark-sweep algorithm.
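The CPU cost listed among the disadvantages can be checked by evaluating the thread formula from the text for a few machine sizes. `CmsThreadsDemo` is a hypothetical helper that only does the arithmetic; it does not query a real JVM.

```java
final class CmsThreadsDemo {
    /** Default number of CMS collection threads as given in the text: (CPUs + 3) / 4, integer division. */
    static int cmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    /** Share of the CPUs occupied by the collection threads. */
    static double cpuShare(int cpus) {
        return (double) cmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 5, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + cmsThreads(cpus) + " thread(s), share " + cpuShare(cpus));
        }
    }
}
```

Because (CPUs + 3) / 4 rounds CPUs / 4 upward, the share never drops below one quarter, and on small machines it is much larger.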

4.4.4 Common Configurations

  • -XX:+UseConcMarkSweepGC enables the CMS collector
  • -XX:CMSInitiatingOccupancyFraction sets the old generation occupancy percentage at which CMS is activated
  • -XX:+UseCMSCompactAtFullCollection appends a defragmentation pass after a Full GC completes
  • -XX:CMSFullGCsBeforeCompaction sets how many Full GCs without compaction are performed before one with compaction

4.5 G1 Collector

4.5.1 Basic Concepts

  • Stop-the-world (STW): while the garbage collection algorithm runs, all other threads of the Java application are suspended (except the garbage collection helper threads). It is a global pause in Java: all Java code stops, and native code can keep executing but cannot interact with the JVM. Most such pauses are caused by GC.

  • Safepoint: at a safepoint the virtual machine has an OopMap that records where the references are. (This is why a thread cannot be stopped just anywhere: generating an OopMap for every instruction would be very inefficient and take up a lot of space.)

    Safepoints are generally placed at the following positions:

    • Method calls (before the method returns / after the call instruction)
    • Loop jumps (at the end of a loop iteration)
    • Exception jumps (where an exception may be thrown)

    So how does the JVM stop threads? When a GC is required, the JVM sets a flag. Running threads poll the flag, and when a thread sees that a GC is pending, it stops at the next safepoint and waits for the GC to finish.

    Safepoints alone are not enough. When a thread is sleeping or blocked it is not running at all, so it cannot reach a safepoint, and the other threads cannot be made to wait for it. The concept of a safe region is introduced for this case.

  • Safe region

    When a thread enters a safe region, for example when it sleeps or blocks, it marks itself as being in a safe region, and a GC that starts in the meantime does not need to wait for it. When the thread is about to leave the safe region, it first checks whether the JVM has finished the GC; if not, it waits until the GC is complete before leaving.

  • When safepoints are triggered

    Besides GC, other VM operations that trigger a safepoint include:

    1. JIT-related operations, such as code deoptimization and flushing the code cache
    2. Class redefinition (e.g. a javaagent, or bytecode instrumentation generated for AOP)
    3. Biased lock revocation
    4. Various debug operations (e.g. thread dump or deadlock check)
  • The metrics of interest for JVM performance

    • Throughput: the percentage of CPU time not spent in garbage collection, including memory allocation time.
    • Pauses: the periods during which the application cannot respond because garbage collection is running.
    • Footprint: the memory used by the program (the working set of a process, measured in pages and cache lines).
    • Promptness: the time between when an object becomes unreachable (dead) and when its memory becomes available again, which is only after a GC; an important consideration for distributed systems, including remote method invocation (RMI).

4.5.2 Overview

  • G1 = Garbage First: the regions containing the most garbage are collected first. It is the most advanced of the collectors covered here.
  • Targets server-side applications; G1 was designed with simple and achievable performance tuning in mind
  • Takes advantage of multiple CPUs and cores to reduce stop-the-world pause times.
  • Generational (incremental) collection: the heap is divided into regions instead of physically separate young and old generations, and memory is reclaimed region by region. The regions include any number of Eden, Survivor, Old, and Humongous regions.
  • Algorithm: "mark-compact" overall (CMS is "mark-sweep") and "copying" between two regions, so fragments are consolidated.
    • G1 is based on the mark-compact algorithm as a whole and on the copying algorithm locally (between two regions); neither generates memory fragmentation
  • Predictable pauses, which is much more powerful than CMS: you can specify that no more than N milliseconds are spent on garbage collection within any time slice of M milliseconds. Pauses are predictable because G1 systematically avoids collecting the entire Java heap at once.
  • The Remembered Set records references between regions; each region is scored and then compacted.

4.5.3 Advantages

HotSpot already includes collectors such as Serial, Parallel, and CMS, so why develop a new one, G1? The three performance indicators of garbage collection, footprint, max pause time, and throughput, are like CAP: they cannot all be satisfied at once. (The larger the heap, the longer the pause time.)

On the server side the focus is on short pause times, that is, the stop-the-world time; the total pause time over a period is also a measure.

Mark-sweep and mark-compact each require effort proportional to the size of the area being cleaned, while copying algorithms typically set aside about half the space so that live objects can be copied at each collection. The two STW phases of CMS, initial mark and remark, take longer as the heap grows, and compaction forced by memory fragmentation also lengthens pauses. What is needed is a collector with high throughput and short pauses regardless of heap size. That is G1.

4.5.4 Working Principle

In the figure, E is Eden, S is Survivor, H is Humongous, O is Old, and blank cells are free regions.

  • Region: divide the whole into parts.

The G1 algorithm divides the heap into a number of regions but is still a generational collector. Regions come in several types, Eden, Survivor, Old, and Humongous, and regions of the same type need not be contiguous.

The G1 collector divides heap memory into a series of equal-sized regions; the region size, from 1 MB to 32 MB, is determined at startup. G1 still uses a generational strategy and divides the heap into Eden, Survivor, Old, and so on, but only logically: each region logically belongs to one generation, and the regions of a generation are physically discontinuous. After an Old region is collected it becomes a free region and may later become an Eden region. When an object being allocated is larger than half the region size, it is placed in a Humongous region. An empty region is called a free (available) region.

    • Young generation:

Some regions make up the young generation. Collecting the young generation still suspends all application threads and copies the surviving objects to Survivor regions or to the old generation.

    • Old generation:

The old generation is also divided into regions, and the G1 collector cleans up by copying objects from one region to another. This means that during normal processing G1 compacts the heap (at least part of it), so the fragmentation problem of CMS does not arise.

    • Humongous /hjuːˈmʌŋɡəs/ regions:

If an object occupies more than 50% of a region's capacity, the G1 collector considers it a humongous object. By default such objects would be allocated directly in the old generation, but a short-lived humongous object there has a negative impact on the collector. To solve this, G1 has Humongous regions dedicated to these objects. If a single H region cannot hold a humongous object, G1 looks for contiguous H regions to store it, and finding contiguous H regions sometimes requires a Full GC.
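The size rule for humongous objects is easy to work through with numbers. The sketch below only encodes the "more than half a region" rule and the need for contiguous regions as stated in the text; `G1RegionDemo` and its method names are hypothetical, and object header and alignment details are ignored.

```java
final class G1RegionDemo {
    /** True if the object is larger than half a region, the humongous threshold given in the text. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of contiguous regions needed to hold an object of the given size. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1024 * 1024; // 1 MB regions, the smallest size mentioned in the text
        System.out.println("600 KB object humongous? " + isHumongous(600 * 1024, region));
        System.out.println("regions for a 3.5 MB object: " + regionsNeeded(3 * region + region / 2, region));
    }
}
```

With larger regions fewer objects cross the threshold, which is one reason the region size matters for applications that allocate many large arrays.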

  • The G1 young generation collector is a parallel stop-the-world collector; as with other HotSpot GCs, when a young generation GC occurs the entire young generation is collected. The G1 old generation collector is different: it does not need to collect the entire old generation, only some of its regions at a time.

  • When heap occupancy exceeds a threshold, that is, when the heap is running out, G1 starts the old generation collection. Its initial-mark phase is a parallel stop-the-world phase whose duration depends on the size of the old generation and of the overall Java heap.

    • The permanent generation has also been removed: its contents moved to ordinary heap memory and to the metaspace
  • Garbage value

G1 tracks the value of the garbage accumulated in each region (an empirical estimate of the space that would be reclaimed and the time needed to reclaim it) and maintains a priority list in the background. Within the allowed collection time, the region with the greatest value is collected first (hence the name Garbage-First), which improves collection efficiency.

  • Remembered Set

Objects in a region can be referenced not only from within the region but also by objects in other regions, and the same holds between young generation and old generation regions. Remembered Sets are used to avoid scanning the full heap. Each region has a Remembered Set. When the VM detects that the program is writing a reference-type field, a write barrier briefly interrupts the write and checks whether the referenced object is in a different region (in the generational case, whether an old generation object references a young generation object). If so, the reference information is recorded through the CardTable in the Remembered Set of the region that owns the referenced object (that is, "who references me"). During memory reclamation, the Remembered Set is added to the enumeration range of the GC roots, which ensures that nothing is missed even though the full heap is not scanned.

Only references from other regions need to be recorded in the RS; references within a region and null references do not.

  • CardTable: a CardTable is one kind of remembered set

Because G1 collects only some regions at a time, it needs to know which objects in other regions refer to objects in the regions being collected. And because the copying algorithm moves objects, references must be updated to the objects' new addresses. The same is true in ordinary generational collection, which needs to record references from the old generation to the young generation. Such a collection of records is usually called a remembered set (RS). A CardTable is one kind of remembered set. A card represents a range of memory; currently a card is 512 bytes. Maintaining a remembered set requires the mutator threads to notify the collector whenever a cross-region reference may have changed, which is usually done with a write barrier (not the same thing as a memory barrier). Each thread has its own remembered set log, which acts as its buffer of modified cards, and there is also a global buffer. When a mutator's own remembered set buffer is full, it is handed over to the global buffer and a new buffer is created.

  • Incremental Collection

To avoid long pause times, consider splitting the heap into multiple parts and collecting one part at a time. This method is called incremental collection. Generational collection can also be regarded as a special incremental collection.

  • SATB

The full name is snapshot-at-the-beginning: a snapshot of the objects that were alive at the beginning of the GC. It is obtained by root tracing and is used to maintain the correctness of concurrent GC. How does it do that? In the tri-color marking algorithm, an object is in one of three states:

    • White: the object has not been marked; if it is still white when the marking phase ends, it is collected as garbage.
    • Grey: the object has been marked, but its fields have not all been marked yet.
    • Black: the object has been marked and all of its fields have been marked.
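The three colors can be illustrated with a small non-concurrent marking loop. This is only a sketch of tri-color marking on a toy object graph; the Node class and the method names are hypothetical, and the SATB write barrier that makes concurrent marking correct is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Tri-color marking on a toy object graph (stop-the-world version, for illustration). */
class TriColorMarking {
    enum Color { WHITE, GREY, BLACK }

    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        Color color = Color.WHITE; // every object starts out white

        Node(String name) {
            this.name = name;
        }
    }

    /** Marks everything reachable from the roots; nodes left WHITE are garbage. */
    static void mark(List<Node> roots) {
        Deque<Node> grey = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GREY;
                grey.push(root);
            }
        }
        while (!grey.isEmpty()) {
            Node current = grey.pop();
            for (Node child : current.fields) {
                if (child.color == Color.WHITE) {
                    child.color = Color.GREY; // marked, fields not yet scanned
                    grey.push(child);
                }
            }
            current.color = Color.BLACK; // marked and all fields scanned
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.fields.add(b); // root -> a -> b; c is unreachable
        mark(Collections.singletonList(a));
        System.out.println(a.color + " " + b.color + " " + c.color); // BLACK BLACK WHITE
    }
}
```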

4.5.5 Object Allocation Policy

Before discussing the allocation of large objects, we need to look at the object allocation strategy. Allocation is attempted in three stages:

  1) TLAB (Thread Local Allocation Buffer)
  2) Eden zone
  3) Humongous zone

A TLAB is a thread-local allocation buffer whose purpose is to allocate objects as quickly as possible. If objects were allocated in a shared space, some synchronization mechanism would be needed to manage the free-space pointers in that space. Instead, each thread has its own fixed partition of Eden for allocating objects, namely its TLAB, so no synchronization between threads is needed when objects are allocated there.

For objects that cannot be allocated in a TLAB, the JVM tries to allocate them in the shared Eden space. If Eden cannot accommodate the object, it can only be allocated in the old generation.
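The fallback order above can be sketched as a bump-pointer buffer plus a placement decision. The class, the enum, and the `edenFree` parameter are hypothetical names for illustration; a real JVM also refills TLABs and handles humongous objects separately, which this sketch ignores.

```java
/** Sketch of TLAB-first allocation with fallback to Eden and then the old generation. */
class TlabSketch {
    /** A thread-local buffer carved out of Eden; allocating only bumps a pointer. */
    static final class Tlab {
        private long top;
        private final long end;

        Tlab(long start, long size) {
            this.top = start;
            this.end = start + size;
        }

        /** Returns the start address of the new object, or -1 if the buffer is too full. */
        long allocate(long size) {
            if (end - top < size) {
                return -1;
            }
            long address = top;
            top += size; // no locking needed: only the owning thread touches this buffer
            return address;
        }
    }

    enum Placement { TLAB, EDEN, OLD }

    /** Tries the TLAB first, then shared Eden, and finally the old generation. */
    static Placement place(Tlab tlab, long size, long edenFree) {
        if (tlab.allocate(size) >= 0) {
            return Placement.TLAB;
        }
        if (size <= edenFree) {
            return Placement.EDEN;
        }
        return Placement.OLD;
    }

    public static void main(String[] args) {
        Tlab tlab = new Tlab(0, 1024);
        System.out.println(place(tlab, 512, 4096));   // TLAB
        System.out.println(place(tlab, 1024, 4096));  // EDEN (only 512 bytes left in the TLAB)
        System.out.println(place(tlab, 8192, 4096));  // OLD (larger than free Eden)
    }
}
```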

4.5.6 Working process

Initial mark ==> Root region scan ==> Concurrent mark ==> Final mark ==> Filter reclaim

4.5.6.1 Overview of working process

  • Initial mark (STW)

At this stage, G1 marks the GC roots. This phase piggybacks on a regular (STW) young generation garbage collection. P86: The initial mark only marks the objects directly reachable from the GC roots and updates the value of TAMS (Next Top at Mark Start) so that user programs running concurrently in the next phase can create new objects in the correct available Region. This phase pauses the application threads, but only for a short time.

  • Root Region Scan

G1 scans the survivor regions produced by the initial mark for references into the old generation and marks the referenced objects. This phase runs concurrently with the application (non-STW), and the next STW young generation garbage collection can begin only after it completes.

  • Concurrent Marking

G1 looks for reachable (live) objects throughout the heap. This phase runs concurrently with the application and can be interrupted by STW young generation garbage collections. P86: The concurrent marking phase performs reachability analysis on the objects in the heap, starting from the GC roots, to find the live ones. This phase takes a long time but can be executed concurrently with the user program.

Liveness (reachability analysis) is computed concurrently with the application, without a stop-the-world pause. The lower a Region's liveness, the more efficient it is to reclaim and the higher its reclamation priority.

  • Final mark (Remark, STW)

This phase is STW and completes the marking cycle. G1 drains the SATB buffers, traces live objects that have not yet been visited, and performs reference processing. P86: The final mark corrects the part of the marking record that changed because the user program kept running during concurrent marking. The virtual machine records object changes during this period in the per-thread Remembered Set Logs, and the final mark phase merges the data from the Remembered Set Logs into the Remembered Set. This phase pauses the application threads, but the work can be executed in parallel.

  • Filter collection (Cleanup, STW)

In this final phase, G1 performs STW operations for accounting and RSet scrubbing. During accounting, G1 identifies regions that are completely free and regions that are candidates for mixed garbage collection. The cleanup phase is partially concurrent when it resets the empty regions and returns them to the free list. P86: The filter reclaim phase first sorts the regions by reclamation value and cost, then builds a collection plan according to the GC pause time the user expects. This phase could also run concurrently with the user program, but because only some regions are reclaimed the time is under the user's control, and pausing user threads greatly improves collection efficiency.

The young generation and the old generation are reclaimed together at this stage. The regions to be reclaimed are selected based on their liveness.
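The idea of sorting regions by value and cost and then fitting them into the expected pause can be sketched as a greedy selection. This is an assumption-laden illustration rather than G1's actual policy: the Region fields, the efficiency measure (garbage bytes per predicted millisecond), and the greedy loop are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Greedy sketch: pick the most "valuable" regions that fit into a pause budget. */
class CollectionSetSketch {
    static final class Region {
        final String name;
        final long garbageBytes;  // space that reclaiming this region would free
        final double costMillis;  // predicted time needed to evacuate its live objects

        Region(String name, long garbageBytes, double costMillis) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.costMillis = costMillis;
        }

        double efficiency() {
            return garbageBytes / costMillis;
        }
    }

    /** Sorts candidates by efficiency (highest first) and adds them while the budget allows. */
    static List<Region> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((x, y) -> Double.compare(y.efficiency(), x.efficiency()));
        List<Region> collectionSet = new ArrayList<>();
        double used = 0;
        for (Region region : sorted) {
            if (used + region.costMillis <= pauseBudgetMillis) {
                collectionSet.add(region);
                used += region.costMillis;
            }
        }
        return collectionSet;
    }

    public static void main(String[] args) {
        List<Region> candidates = new ArrayList<>();
        candidates.add(new Region("C", 800, 40)); // efficiency 20
        candidates.add(new Region("A", 900, 10)); // efficiency 90
        candidates.add(new Region("B", 500, 10)); // efficiency 50
        for (Region region : choose(candidates, 25)) {
            System.out.println(region.name); // prints A, then B; C does not fit the budget
        }
    }
}
```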

4.5.6.2 Detailed working process

  • Marking

The marking phase of the G1 collector is responsible for marking the objects that are alive at that point, calculating the liveness of each Region, and so on.

G1 uses a marking algorithm called snapshot-at-the-beginning, or SATB, which records a snapshot of the object graph at the beginning of marking. Objects newly allocated during concurrent collection are considered live. The marking phase begins once the heap usage ratio exceeds InitiatingHeapOccupancyPercent, at which point SATB records the snapshot of the object graph.

How does G1 know which positions have been marked? It uses bitmaps, in which one bit represents 8 bytes. Two marking bitmaps are used, previous and next. The previous marking bitmap indicates the part that has already been marked. When marking completes, previous and next are exchanged.
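A minimal sketch of the two bitmaps follows, assuming one bit per 8 bytes as stated above. The class and method names are hypothetical, addresses are simulated long values, and the concurrency of real marking is ignored.

```java
import java.util.BitSet;

/** Sketch of previous/next marking bitmaps with one bit per 8 bytes of heap. */
class MarkBitmapSketch {
    private static final int BYTES_PER_BIT_SHIFT = 3; // 2^3 = 8 bytes per bit

    private final long heapBase;
    private BitSet previous = new BitSet(); // result of the last completed marking
    private BitSet next = new BitSet();     // filled in by the marking in progress

    MarkBitmapSketch(long heapBase) {
        this.heapBase = heapBase;
    }

    static int bitIndex(long heapBase, long address) {
        return (int) ((address - heapBase) >>> BYTES_PER_BIT_SHIFT);
    }

    /** A marking cycle starts by clearing the next bitmap. */
    void startMarking() {
        next.clear();
    }

    void mark(long address) {
        next.set(bitIndex(heapBase, address));
    }

    /** When marking completes, the two bitmaps swap roles. */
    void finishMarking() {
        BitSet tmp = previous;
        previous = next;
        next = tmp;
    }

    boolean isMarkedInPrevious(long address) {
        return previous.get(bitIndex(heapBase, address));
    }

    public static void main(String[] args) {
        MarkBitmapSketch bitmaps = new MarkBitmapSketch(0x1000);
        bitmaps.startMarking();
        bitmaps.mark(0x1010);
        System.out.println(bitmaps.isMarkedInPrevious(0x1010)); // false: marking not finished
        bitmaps.finishMarking();
        System.out.println(bitmaps.isMarkedInPrevious(0x1010)); // true
        System.out.println(bitIndex(0x1000, 0x1010));           // 2
    }
}
```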

The marking phase is divided into several steps:

    • Initial Marking Phase

The marking cycle begins by clearing the next marking bitmap, which is done concurrently. The initial marking phase then begins: it suspends all threads and marks all objects directly reachable from the GC roots, and it is performed together with the pause of a Young GC.

    • Root Region Scan Phase

In this stage, G1 scans the Survivor Region marked in the Initial Marking stage.

    • Concurrent Marking Phase

This phase of G1 uses tracing to find all reachable objects throughout the heap. This phase is executed concurrently.

    • Remark Phase

Remark is a STW phase. G1 processes all SATB buffers.

    • Cleanup Phase

In the last stage of marking, G1 computes the liveness of each Region. Regions with no live objects are put directly onto the free region list, and the candidate list of regions for Mixed GC is then determined.

  • Collection process

In addition to regular Young GC, there is also Mixed GC in G1.

Young Garbage Collection: when Eden cannot satisfy an allocation request for a new object, a Young GC copies the live objects from the Eden and Survivor regions (together called the Collection Set, or CSet) to new regions (the new Survivor regions). When the GC age of an object reaches the threshold, the object is copied to an Old Region. Because a copying algorithm is used, memory fragmentation is avoided and no separate compaction is needed.

Mixed Garbage Collection: when the occupied part of the heap exceeds InitiatingHeapOccupancyPercent of the total heap, concurrent marking begins. After concurrent marking completes, G1 switches from Young GC to Mixed GC. In a Mixed GC, G1 can add several Old regions to the CSet. When G1 has reclaimed enough memory, it falls back to Young GC.

Full GC: as in CMS, some collection work in G1 runs concurrently with the application, so the application may need memory before the collection has completed. This is called Concurrent Mode Failure in CMS and Allocation Failure in G1, and in both cases the collector degrades to an STW Full GC.

Floating Garbage: G1 records live objects using the snapshot-at-the-beginning approach, that is, the object graph in memory at that point in time. Objects in the snapshot may become garbage afterwards; this is called floating garbage and can only be collected by the next collection.

4.5.7 Common Configurations

  • -XX:+UseG1GC: use the G1 garbage collector
  • -XX:+UseG1GC -Xmx32g -XX:MaxGCPauseMillis=200
  • -XX:+PrintGCApplicationStoppedTime: print the stop-the-world (STW) pause times in the GC log
  • -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1: print a timestamp, the reason for entering the safepoint (VM operation type), and a thread summary
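To check which collectors a running JVM actually uses after setting flags like these, the standard java.lang.management API can list them. The class name ActiveCollectors is made up for this example, and the names printed depend on the JVM and the flags in effect, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors registered in the running JVM. */
class ActiveCollectors {
    /** Returns the collector names reported by the platform MXBeans. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time (ms) since the JVM started
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseG1GC and once with another collector flag is a quick way to confirm that a flag took effect.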

Amazing G1 — Java’s new garbage collection mechanism

4.6 Collector Summary and comparison

4.6.1 Comparison between CMS and G1

4.6.1.1 Working process

How CMS works

CMS: Initial mark -> Concurrent mark -> Concurrent pre-clean -> Remark -> Concurrent sweep -> Concurrent reset

G1: Initial mark -> Concurrent mark -> Final mark -> Filter reclaim

  • Initial mark: marks all objects directly referenced by the root nodes. Requires STW.
  • Concurrent mark: traverses from the previously marked nodes and keeps marking all live nodes reachable from them. All objects whose references change during this time are recorded in the Remembered Set Logs.
  • Final mark: marks the new garbage produced during concurrent marking. Requires STW.
  • Filter reclaim: reclaims the regions with the highest reclamation value according to the expected collection time specified by the user (see “Principle” 2). Requires STW.

  • Conclusion

    • Judging by the processes above, G1 differs from CMS only in the final “filter reclaim” step (CMS uses a concurrent sweep); the more fundamental difference is that G1 partitions the heap differently from the other collectors.
    • CMS needs to cooperate with ParNew, whereas G1 can collect the entire heap on its own.

4.6.1.2 Memory Partition

CMS

    • Young generation
      • Eden
      • Survivor 1 / Survivor 2
    • Old generation
      • Tenured Gen

G1

  • Region
    • E is Eden
    • S is Survivor
    • H is Humongous
    • O is Old
    • Blank areas are free regions

4.6.1.3 Advantages, disadvantages and Features

CMS (Concurrent Mark Sweep):

  • A collector whose goal is the shortest collection pause time
  • Characteristics: generational (young generation, old generation)
  • Algorithm: mark-sweep
  • Advantages: concurrent collection, low pause
  • Disadvantages: memory fragmentation; enabling compaction causes longer pauses
  • Disadvantages: insufficient reserved space may cause a Concurrent Mode Failure

G1 (Garbage First):

  • Collects the regions with the most garbage first, with the goal of reducing stop-the-world pause times

  • Feature: predictable pauses (you can specify that within a time slice of M milliseconds no more than N milliseconds are spent in garbage collection). The reason: regions are collected according to their accumulated garbage value, and the Remembered Set is introduced to record object references so that each Region can be scored.

  • Feature: generations are not physically separate. Regions are introduced and only the concept of generations is retained. G1 can manage the entire GC heap on its own, without collaborating with other collectors.

  • Algorithm: mark-compact overall and copying locally (between two regions); neither algorithm produces memory fragmentation.

  • Advantages: parallelism and concurrency; G1 makes full use of multiple CPUs.

Jvm garbage collector CMS & G1 Garbage collector Serial, Parallel, CMS, G1

4.7 Brief summary

4.7.1 Summary of basic concepts

  • The two most basic Java reclamation algorithms are the copying algorithm and the mark-sweep algorithm
    • Copying algorithm: there are two areas, A and B. Objects are initially in A, and the surviving objects are moved to B. This is the most commonly used algorithm in the young generation
    • Mark-sweep: there is one area; the objects to be reclaimed are marked and then reclaimed, which leaves fragmentation and leads to the next algorithm
    • Mark-compact algorithm: adds defragmentation, so that larger objects can be stored in the larger contiguous spaces
  • Two concepts: the young generation and the old generation
    • Young generation: newly created objects with a short life cycle
      • Eden
      • Survivor 1 / Survivor 2
    • Old generation: objects that have existed for a long time
      • Tenured Gen

Java garbage collection as a whole is a collaboration between the young generation and the old generation, called generational collection.

4.7.2 Collector comparison

  • Serial collector: a young generation collector that uses the copying algorithm. It is single-threaded and must suspend all other worker threads during garbage collection until the collection is complete. Features: highest CPU utilization, long pause time. Application scenario: small applications. Enable it with the JVM parameter -XX:+UseSerialGC.

  • ParNew collector: the young generation uses the copying algorithm and the old generation uses mark-compact, with multiple threads scanning and compacting the heap. Features: short pause time, high collection efficiency, suited to high throughput requirements. Application scenarios: large-scale applications, scientific computing, large-scale data collection, etc. Enable it with the JVM parameter -XX:+UseParNewGC.

  • Parallel Scavenge collector: uses the copying algorithm.

  • Serial Old collector: the old generation counterpart of Serial; it uses mark-compact.

  • Parallel Old collector: for the old generation; uses mark-compact.

  • CMS collector: based on mark-sweep. Features: response time first, reduces garbage collection pause time. Suitable scenarios: servers, telecom, etc. Enable it with the JVM parameter -XX:+UseConcMarkSweepGC.

  • G1 collector: mark-compact overall, copying locally. In G1 the heap is divided into many contiguous regions, and the G1 algorithm, which absorbs characteristics of the CMS collector, is used for reclamation. Features: supports large heaps and high throughput; supports multiple CPUs and garbage collection threads; collects in parallel while the main thread is paused; collects concurrently while the main thread is running. Soft real-time goal: you can configure garbage collection to take at most N milliseconds within any window of M milliseconds. Enable it with the JVM parameter -XX:+UseG1GC.

To sum up: the young generation basically uses the copying algorithm, while the old generation uses mark-compact; CMS uses mark-sweep.

  • Scope of application

If the priority is short STW pauses: use ParNew/CMS when it works well enough. If the priority is throughput: use Parallel Scavenge/Parallel Old. G1 has no advantage in throughput.

4.7.3 Common collector combinations

Young generation      Old generation
Serial                Serial Old
Serial                CMS
ParNew                CMS
ParNew                Serial Old
Parallel Scavenge     Serial Old
Parallel Scavenge     Parallel Old
G1                    G1

4.7.4 JDK default garbage collector

JDK 1.7 default garbage collector: Parallel Scavenge (young generation) + Parallel Old (old generation)

JDK 1.8 default garbage collector: Parallel Scavenge (young generation) + Parallel Old (old generation)

JDK 1.9 default garbage collector: G1

5. Allocate memory

5.1 Memory Allocation – Overview

Memory allocation Strategy

  • Objects are allocated in Eden first
  • Large objects are allocated directly in the old generation
  • Long-lived objects are promoted to the old generation
  • Space allocation guarantee

When space in the young generation is short, space is borrowed from the old generation

  • Dynamic object age judgment

5.2 Memory Allocation – Allocates memory to Eden first

Generally, objects are allocated in the Eden zone of the young generation. If there is not enough space in Eden, the virtual machine initiates a Minor GC. If objects that survive the Minor GC do not fit in a Survivor space, they are moved to the old generation through the allocation guarantee mechanism.

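public class EdenAllocation {
    public static void main(String[] args) {
        // A 4 MB array; new objects like this are normally allocated in Eden first
        byte[] b1 = new byte[4 * 1024 * 1024];

        // System.gc();
    }
}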

5.3 Memory allocation – Large objects go straight into the old age

The JVM settings are as follows:

-Xms20M initial heap size
-Xmx20M maximum heap size
-Xmn10M young generation size
-XX:SurvivorRatio=8 (Eden 80% : S1 10% : S2 10%)

public class LargeObjectAllocation {
    public static void main(String[] args) {
        // 8M does not fit in the 8M Eden, so it goes straight to the old generation
        byte[] b1 = new byte[8 * 1024 * 1024];
        // byte[] b1 = new byte[7 * 1024 * 1024]; // 7M is still in the young generation
    }
}

-XX:PretenureSizeThreshold=6M: objects larger than this value are allocated directly in the old generation

5.4 Memory allocation – Long lived objects enter the old age

-XX:MaxTenuringThreshold (default 15): each time an object in Survivor survives a garbage collection (Minor GC), its age is incremented by 1. When the age reaches the configured value, the object enters the old generation. This held for JDK 6 and earlier; since JDK 7 and 8 it is less exact.

Dynamic age judgment: if the total size of all objects of the same age in Survivor is greater than half of the Survivor space, objects whose age is greater than or equal to that age enter the old generation directly, without waiting for MaxTenuringThreshold.
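The dynamic age rule can be sketched as a small function over a table of bytes per age. It follows the rule of "same-age objects exceeding half of the Survivor space"; HotSpot's own calculation accumulates the sizes from age 1 upward, so treat this as an illustration of the idea only. The class name, method name, and array layout (index = age, index 0 unused) are assumptions.

```java
/** Sketch of dynamic object age judgment (simplified; index of bytesByAge is the age). */
class DynamicAgeSketch {
    /**
     * Returns the age at which objects are promoted: the first age whose objects
     * occupy more than half of the Survivor space, or maxTenuringThreshold otherwise.
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorBytes, int maxTenuringThreshold) {
        long half = survivorBytes / 2;
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            if (bytesByAge[age] > half) {
                return age; // objects of this age or older are promoted
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long survivor = 1024 * 1024; // 1 MB Survivor space
        long[] bytesByAge = {0, 100_000, 600_000, 50_000};
        // Age 2 holds 600,000 bytes, more than half of 1 MB, so the threshold drops to 2
        System.out.println(tenuringThreshold(bytesByAge, survivor, 15)); // 2
    }
}
```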

5.5 Memory Allocation – Space allocation guarantee

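-XX:-HandlePromotionFailure. The check has two steps: 1. Whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation. 2. Whether the available space in the old generation is greater than the average size of objects promoted from the young generation to the old generation. If the available space is less than the average, the guarantee fails.

If an object fits neither in Eden nor in Survivor, so that the young generation cannot hold it, memory is borrowed from the generation that still has space (the old generation); this is the memory allocation guarantee.

The JVM settings are as follows:

-Xms20M initial heap size
-Xmx20M maximum heap size
-Xmn10M young generation size
-XX:SurvivorRatio=8 (Eden 80% : S1 10% : S2 10%)

The code is as follows (the array sizes are 2M and 4M so that they fit in the 20M heap and match the explanation below):

public class AllocationGuarantee {
    public static void main(String[] args) {
        byte[] b1 = new byte[2 * 1024 * 1024];
        byte[] b2 = new byte[2 * 1024 * 1024];
        byte[] b3 = new byte[2 * 1024 * 1024];
        byte[] b4 = new byte[4 * 1024 * 1024];
    }
}

b1, b2 and b3 are allocated in Eden. When b4 arrives, Eden does not have enough room, so a Minor GC tries to move the three objects to S1. S1 is not large enough either, so the three 2M objects are moved to the tenured generation through the space allocation guarantee, and then b4 is put into Eden.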

Before each Minor GC, the VM checks whether the maximum contiguous free space in the old generation is greater than the total size of all objects in the young generation.

  • 1) If it is greater, the Minor GC is safe;
  • 2) If it is less, check HandlePromotionFailure (after JDK 6u24 the HandlePromotionFailure flag is no longer consulted; the risk is always accepted)
    • 2.1) HandlePromotionFailure is true (the risk is allowed): compare the maximum contiguous free space in the old generation with the average size of the objects promoted to the old generation in previous collections
      • 2.1.1) If it is greater, perform the Minor GC (the risk being that the objects promoted this time exceed the historical average),
      • 2.1.2) If it is less, perform a Full GC;
    • 2.2) HandlePromotionFailure is false: perform a Full GC;
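The decision tree above can be written as one small function. This sketch reads the comparison in step 2.1 as "old-generation contiguous free space versus the average size promoted in earlier collections", which is an interpretation; the class, enum, and parameter names are hypothetical.

```java
/** Sketch of the check performed before a Minor GC (space allocation guarantee). */
class PromotionGuaranteeSketch {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision beforeMinorGc(long oldMaxContiguousFree,
                                  long youngUsed,
                                  long averagePromoted,
                                  boolean handlePromotionFailure) {
        if (oldMaxContiguousFree > youngUsed) {
            return Decision.MINOR_GC; // 1) even if everything survives, it fits
        }
        if (!handlePromotionFailure) {
            return Decision.FULL_GC;  // 2.2) risk not allowed
        }
        // 2.1) risk allowed: compare against the historical average promotion size
        return oldMaxContiguousFree > averagePromoted
                ? Decision.MINOR_GC   // 2.1.1) try the Minor GC, accepting the risk
                : Decision.FULL_GC;   // 2.1.2) not even the average fits
    }

    public static void main(String[] args) {
        System.out.println(beforeMinorGc(10, 8, 3, false)); // MINOR_GC
        System.out.println(beforeMinorGc(5, 8, 3, true));   // MINOR_GC
        System.out.println(beforeMinorGc(2, 8, 3, true));   // FULL_GC
        System.out.println(beforeMinorGc(5, 8, 3, false));  // FULL_GC
    }
}
```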

5.6 Memory allocation – Escape analysis and stack allocation

5.6.1 Escape analysis

Main objective: analyze the scope of an object:

  • If an object is accessed only within a method body, its scope is limited to that method body; once it is referenced by an external member, the object has escaped. Example: StackAllocation.getInstance().

  • If an object is valid only inside the method body, it is considered not to have escaped, and an object that does not escape can be allocated on the stack.

Summary: objects that do not escape can be allocated in stack memory. When the method ends, the stack frame is popped and the memory is reclaimed. Prefer local variables over member fields where possible.

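public class StackAllocation {
    public StackAllocation obj;

    /** Escapes: the object is returned to the caller. */
    public StackAllocation getInstance() {
        return obj == null ? new StackAllocation() : obj;
    }

    /** Escapes: the object is assigned to a member field. */
    public void setObj() {
        this.obj = new StackAllocation();
    }

    /** Does not escape: the object is only valid inside this method. */
    public void useStackAllocation() {
        StackAllocation s = new StackAllocation();
    }

    /** Escapes: the reference comes from getInstance(). */
    public void useStackAllocation2() {
        StackAllocation s = getInstance();
    }
}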

5.6.2 Allocation on the stack

Objects allocated on the stack are released as soon as the method finishes executing, which is more efficient than allocating them on the heap and waiting for GC.

6. Summary of garbage collection

6.0 review

Program counter: thread private. It is a small piece of memory that indicates the line number of the bytecode being executed by the current thread. It is the only area for which the Java Virtual Machine Specification does not specify any OOM (OutOfMemoryError) condition.

Java stack: thread private. Its life cycle is the same as the thread's. It is the memory model for Java method execution. Executing each method creates a stack frame that stores the local variables and the operand stack (including object references). The amount of memory required for local variables is determined at compile time, so the size of a stack frame does not change. There are two exceptions: if the depth requested by the thread is greater than the depth the stack allows, StackOverflowError is thrown; if the stack cannot obtain enough memory while dynamically expanding, OOM is thrown.
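The StackOverflowError case is easy to reproduce with unbounded recursion. The sketch below counts how deep the recursion gets before the error is thrown; the class name is made up, and the depth reached depends on the JVM and its stack size setting, so no specific number is assumed.

```java
/** Measures how deep recursion can go before StackOverflowError is thrown. */
class StackDepthProbe {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // each call pushes another stack frame
    }

    /** Returns the recursion depth reached when the stack overflowed. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The requested depth exceeded what the thread's stack allows
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed at depth " + measure());
    }
}
```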

Java heap: shared by all threads. It is created when the VM starts. It stores object instances and arrays and is the largest memory area. It is divided into the young generation (Young) and the old generation (Old). The young generation is divided into the Eden zone and the Survivor zone, and the Survivor zone is divided into the From space and the To space. The memory ratio between Eden and each Survivor space is 8:1. If the heap cannot be expanded enough to satisfy an allocation, OOM is thrown.

Method area: shared by all threads. It stores data such as class information, constants, and static variables loaded by the VM. It is also known as non-heap, and is also called the “permanent generation”. GC rarely takes place in this area, but that does not mean it is never collected. Reclamation here mainly targets the constant pool and the unloading of types. If the requested memory is larger than the available memory, OOM is thrown.

Native method stack: thread private. It is similar to the Java stack, but instead of serving Java methods (bytecode) it serves native, non-Java methods. StackOverflowError and OOM can also be thrown.

6.1 GC summary

6.1.1 GC mechanism

To understand the Java garbage collection mechanism accurately, it is necessary to analyze it from three aspects: “when”, “to what”, and “what is done”.

First: “when” is the condition that triggers GC. There are two such conditions.

(1) GC can be triggered when the program calls System.gc(); (2) the system itself determines when to trigger GC.

The system decides when to trigger GC based on the memory usage of Eden and From Space. When memory is insufficient, the GC thread is started and the application threads are stopped.

Second: it is not wrong to say that the “what” is Java objects. To be precise, however, GC operates both on objects that reachability analysis cannot reach and on objects that it can reach. The objects that cannot be reached are marked.

Third: the most superficial understanding of “what is done” is that objects are released. Looking at the underlying mechanism of GC, however, the objects that can be reached are copied, and the objects that cannot be reached are released after their finalize() method is called.

6.1.2 GC does not necessarily reclaim an object that cannot be searched by reachability analysis

GC does not necessarily reclaim an object that reachability analysis cannot reach. To be fully reclaimed, an object needs to go through at least two rounds of marking.

First marking: for an object that is no longer referenced, check whether it is necessary to execute the finalize() method on it. If not, it can be collected directly. (The check is based on whether the finalize() method has been overridden and whether it has already been executed; finalize() can only be executed once per object.)

Second marking: if the check decides that finalize() must be executed, the object is put into the F-Queue, and a low-priority Finalizer thread created automatically by the VM executes it. If the object becomes referenced by another object before it is released, it is removed from the F-Queue and survives.

Summary of GC details and trigger conditions for Minor GC and Full GC

6.2 MinorGC \ MajorGC \ FullGC

6.2.1 MinorGC

  • Minor GC is the process of cleaning up and compacting the young generation (Young Gen), including the Eden and Survivor zones.

  • Trigger condition

A Minor GC is triggered when the JVM is unable to allocate space for a new object (Allocation Failure), for example when the Eden zone is full or nearly full.

  • Characteristics

Because most new objects have a short life cycle, Minor GCs in the young generation are frequent and short. Since a Minor GC is triggered when the JVM cannot allocate space for a new object, the higher the allocation rate, the more frequently Minor GC is performed. A Minor GC triggers a stop-the-world pause.

  • Working principle

When the GC thread starts, it uses reachability analysis to copy the surviving objects in Eden and From Space to To Space, and then releases the objects in Eden and From Space. After an object has been copied between the Survivor spaces a certain number of times, it is copied to the old generation, and the space it occupied in To Space is released.

When Minor GC is performed, the permanent generation is not affected. References from the permanent generation to the young generation are treated as GC roots, and references from the young generation to the permanent generation are directly ignored during the marking phase
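The copy-and-age cycle of a Minor GC can be simulated with a few lists. This is a toy model under stated assumptions: reachability is given as a boolean flag instead of being computed from GC roots, space limits are ignored, and all class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy simulation of a Minor GC: Eden + From are copied to To, old enough objects are promoted. */
class MinorGcSketch {
    static final class Obj {
        final String name;
        boolean reachable; // stands in for the result of reachability analysis
        int age;

        Obj(String name, boolean reachable) {
            this.name = name;
            this.reachable = reachable;
        }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    MinorGcSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    void minorGc() {
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(from);
        for (Obj obj : young) {
            if (!obj.reachable) {
                continue; // unreachable objects are simply not copied
            }
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                old.add(obj); // survived enough collections: promote
            } else {
                to.add(obj);  // otherwise copy to the To space
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // From and To swap roles for the next collection
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(2);
        heap.eden.add(new Obj("a", true));
        heap.eden.add(new Obj("b", false));
        heap.minorGc();
        System.out.println(heap.from.size() + " " + heap.old.size()); // 1 0
        heap.minorGc();
        System.out.println(heap.from.size() + " " + heap.old.size()); // 0 1
    }
}
```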

6.2.2 MajorGC

  • Introduction

A Major GC cleans the Tenured area and is used to collect the old generation. A Major GC is usually accompanied by at least one Minor GC.

  • Trigger condition
  • Working principle

Major GC: cleans up and compacts the memory space of Old Gen

6.2.3 FullGC

  • Introduction

Full GC is a global GC over the whole young generation, old generation, and metaspace (which replaces the permanent generation from Java 8 onward). Full GC is not the same as Major GC, nor is it Minor GC + Major GC; what kind of collection it actually is depends on the combination of garbage collectors in use.

Full GC cleans up the entire heap space, both the young and the old generation.

  • The trigger condition
    • System.gc(): suggests a Full GC to the JVM, which is not obliged to run one
    • There is not enough space in the old generation
    • There is not enough space in the method area (permanent generation)
    • The average size of the objects promoted to the old generation by previous Minor GCs is greater than the available memory in the old generation (to avoid a promotion failure from the young generation)

    When G1 or CMS is used, a Full GC runs as Serial + Serial Old. When Parallel Old is used, a Full GC runs as Parallel Scavenge + Parallel Old.

    • When an object is copied from Eden or From Space to To Space and is larger than the available memory of To Space, it is moved to the old generation; if the available memory of the old generation is also smaller than the object, a Full GC is triggered
    • Promotion failed (an object could not be promoted to the old generation)
    • Concurrent mode failure in CMS. The CMS process is divided into four steps:

(1) CMS initial mark (2) CMS concurrent mark (3) CMS remark (4) CMS concurrent sweep

In step 2, the GC thread runs at the same time as the user threads, so the user threads can still generate garbage in the meantime. If there is too much to fit into the reserved space, a concurrent mode failure occurs and the collector switches to single-threaded Serial Old for mark-sweep-compact.

  • Characteristics

Full GC occurs infrequently and takes a long time.

  • Working principle

7. Reference articles

  • Serial, Parallel, CMS, G1: a summary of the features of the four GC collectors
  • Series articles: G1 and CMS comparison
  • In-depth understanding of the Java G1 garbage collector
  • In-depth understanding of the G1 garbage collector
  • G1 collector and CMS collector comparison
  • JVM series articles: Java Stop The World
  • JVM garbage collector and memory allocation strategy
  • A good series of articles, Part 1: JVM memory region partition and object information in the heap memory model
  • In-depth understanding of GC: MinorGC\MajorGC\FullGC