01 Preface

Last time we covered JVM memory, and today we continue the JVM topic.

02 JVM Garbage Collection Algorithms

2.1 What is Garbage Collection?

A running program has to request memory. If objects that are no longer needed are not cleaned up promptly, they keep occupying memory and eventually cause a memory overflow, so managing memory resources is essential.

2.2 What objects need to be reclaimed?

Reference counting method

Reference counting is one of the oldest algorithms, first proposed by George E. Collins in 1960, and still used by many programming languages 50 years later.

Principle

Suppose there is an object A. Whenever a new reference to A is created, A’s reference counter is incremented by 1; whenever a reference becomes invalid, the counter is decremented by 1. When the counter reaches 0, A is no longer referenced and can be reclaimed.

Advantages and disadvantages

Advantages:

  • Good real-time behavior: there is no need to wait until memory runs low. An object can be reclaimed at run time as soon as its counter reaches 0.

  • The application does not need to be suspended during garbage collection. If an allocation finds that memory is insufficient, an OutOfMemoryError is reported immediately.

  • Locality: updating an object’s counter affects only that object, so there is no need to scan all objects.

Disadvantages:

  • Every time an object is referenced, the counter needs to be updated, which has a little time overhead.

  • CPU resources are wasted: counters are maintained at run time even when memory is plentiful.

  • Unable to resolve the circular reference problem. (Biggest weakness)

    class TestA { public TestB b; }

    class TestB { public TestA a; }

    // Even though both a and b are set to null, under reference counting the two
    // objects are never reclaimed, because they still reference each other.
    public class Main {
        public static void main(String[] args) {
            TestA a = new TestA();
            TestB b = new TestB();
            a.b = b;
            b.a = a;
            a = null; // release the reference
            b = null; // release the reference
        }
    }
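The circular reference problem can be reproduced with a toy counter model. This is only a sketch of the counting rule described above, not how any JVM manages memory; RcObject, addRef, release, and link are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting (illustrative only).
class RcObject {
    final String name;
    int refCount = 0;                               // number of incoming references
    final List<RcObject> fields = new ArrayList<>(); // references this object holds
    boolean freed = false;

    RcObject(String name) { this.name = name; }
}

class RefCountDemo {
    // Creating a reference increments the target's counter.
    static void addRef(RcObject target) { target.refCount++; }

    // Dropping a reference decrements the counter; at zero the object is freed
    // and the references it holds are released in turn.
    static void release(RcObject target) {
        if (--target.refCount == 0) {
            target.freed = true;
            for (RcObject child : target.fields) release(child);
            target.fields.clear();
        }
    }

    // Stores a reference to 'to' in a field of 'from'.
    static void link(RcObject from, RcObject to) {
        from.fields.add(to);
        addRef(to);
    }

    public static void main(String[] args) {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        addRef(a);   // local variable a
        addRef(b);   // local variable b
        link(a, b);  // a.b = b
        link(b, a);  // b.a = a
        release(a);  // a = null
        release(b);  // b = null
        // Both counters are stuck at 1 because each object still references the other.
        System.out.println(a.refCount + " " + b.refCount + " " + a.freed + " " + b.freed);
        // prints "1 1 false false"
    }
}
```

Without the two link calls, releasing the local reference would drop the counter to 0 and free the object immediately, which is the real-time behavior listed among the advantages.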

Reachability analysis algorithm

A set of root objects called “GC Roots” serves as the starting node set. From these nodes, the collector searches downward along reference relationships; the path traversed during the search is called a “reference chain”. If an object cannot be reached from any GC Root, it can never be used again.
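The search from GC Roots is an ordinary graph traversal. The sketch below models objects as strings and references as an adjacency map. The graph and all names are made up for illustration, and it assumes Java 9 or later for List.of, Map.of, and Set.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every node reachable from the roots by following reference edges.
    static Set<String> reachable(Set<String> roots, Map<String, List<String>> refs) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : refs.getOrDefault(current, List.of())) {
                if (seen.add(next)) queue.add(next); // visit each object once
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical object graph: root -> a -> b, while c and d only reference each other.
        Map<String, List<String>> refs = Map.of(
                "root", List.of("a"),
                "a", List.of("b"),
                "c", List.of("d"),
                "d", List.of("c"));
        Set<String> live = reachable(Set.of("root"), refs);
        System.out.println(live.contains("b")); // true: b is on a reference chain from the root
        System.out.println(live.contains("c")); // false: the c <-> d cycle is unreachable
    }
}
```

Unlike reference counting, the c and d pair is correctly identified as garbage even though the two objects reference each other.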

In the JVM, objects that can serve as GC Roots include the following:

  • Objects referenced from the virtual machine stack (the local variable table in a stack frame), such as the parameters, local variables, and temporary variables used by the methods on each thread’s call stack.

  • Objects referenced by static fields of classes in the method area, such as reference-type static variables of a Java class.

  • Objects referenced by constants in the method area, such as references in the string constant pool (String Table).

  • Objects referenced by JNI (that is, native methods) in the native method stack.

  • References held internally by the Java virtual machine, such as the Class objects for primitive types, resident exception objects (NullPointerException, OutOfMemoryError), and the system class loader.

  • All objects held as locks by the synchronized keyword.

  • JMXBeans that reflect the internal state of the Java virtual machine, callbacks registered in JVMTI, the native code cache, and so on.

Object reference

In Java, object references are classified as Strong Reference, Soft Reference, Weak Reference, and Phantom Reference.

  • Strong references are the ordinary reference assignments found throughout program code, such as “Object obj = new Object()”. As long as the strong reference exists, the garbage collector will never reclaim the referenced object.

  • Soft references describe objects that are useful but not essential. Objects reachable only through soft references are included in a second round of collection just before an out-of-memory error would be thrown. If that collection still does not free enough memory, the out-of-memory error is thrown.

  • Weak references also describe non-essential objects, but they are weaker than soft references: an object reachable only through weak references survives only until the next garbage collection. When the collector runs, such objects are reclaimed regardless of whether memory is currently sufficient.

  • Phantom references are the weakest kind of reference. A phantom reference has no effect on an object’s lifetime, and the object cannot be obtained through it. The sole purpose of attaching a phantom reference to an object is to receive a system notification when the object is reclaimed by the collector.
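The four reference kinds map onto classes in java.lang.ref. The following sketch only shows the parts that are deterministic. What happens after the System.gc() call is deliberately not asserted, because that call is only a hint to the JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    public static void main(String[] args) {
        Object strong = new Object(); // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        // While the strong reference exists, soft and weak references still return the object.
        System.out.println(soft.get() == strong);  // true
        System.out.println(weak.get() == strong);  // true
        // A phantom reference never hands out its referent.
        System.out.println(phantom.get() == null); // true

        strong = null; // the object is no longer strongly reachable
        System.gc();   // only a hint; whether and when collection happens is up to the JVM
        System.out.println("weak after gc hint: " + weak.get());
    }
}
```

On many JVMs the last line prints null, but the specification does not guarantee it. A phantom reference is normally paired with its ReferenceQueue, which is where the notification mentioned above arrives.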

2.3 How Is Garbage Reclaimed?

To manage memory automatically, a garbage collection mechanism needs algorithms that determine which objects are still valid and which are not, and then reclaim the invalid ones. Common garbage collection algorithms include mark-sweep, mark-compact, copying, and generational collection.

(1) Mark-sweep algorithm

The mark-sweep algorithm divides garbage collection into two phases, mark and sweep. Mark: starting from the root nodes, mark every referenced object. Sweep: objects left unmarked are garbage and can be cleaned up.

Mark-sweep can be considered the most basic collection algorithm, because most later collection algorithms build on it and improve on its shortcomings.

Figures: the heap before marking, after marking, and after sweeping.

Advantages and disadvantages

As you can see, the mark-sweep algorithm solves the circular reference problem of reference counting: objects that are not reachable from the root nodes are reclaimed.

The mark-sweep algorithm has drawbacks of its own. It is inefficient: both the mark phase and the sweep phase must traverse all objects, and the application has to be stopped during GC, which makes for a poor experience in highly interactive applications. The memory it frees is also heavily fragmented, because the reclaimed objects may be scattered all over the heap, so the free space is not contiguous.

(2) Mark-compact algorithm

The mark-compact algorithm is an improvement on mark-sweep. In the cleanup phase, instead of simply clearing unmarked objects in place, it moves the surviving objects toward one end of memory and then clears everything beyond the boundary, which solves the fragmentation problem.

Principle

Advantages and disadvantages

Mark-compact shares the advantages and disadvantages of mark-sweep, except that it solves mark-sweep’s fragmentation problem. In exchange, it adds a step that moves objects to new memory locations, which costs some efficiency.
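The difference between sweeping and compacting shows up clearly on a tiny model heap. The sketch below treats the heap as an array of slots with the mark bits already computed. It is a simplified illustration under those assumptions, not a real collector, and all names are hypothetical.

```java
import java.util.Arrays;

class SweepVsCompactDemo {
    // Mark-sweep: free every unmarked slot in place (null = free slot).
    static String[] sweep(String[] heap, boolean[] marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!marked[i]) result[i] = null;
        }
        return result;
    }

    // Mark-compact: slide the marked objects to the start of the heap; the rest is free.
    static String[] compact(String[] heap, boolean[] marked) {
        String[] result = new String[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) result[next++] = heap[i];
        }
        return result;
    }

    // Length of the longest run of free slots, i.e. the largest object that still fits.
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] marked = {true, false, true, false, true, false}; // A, C, E are live
        String[] swept = sweep(heap, marked);
        String[] compacted = compact(heap, marked);
        System.out.println(Arrays.toString(swept) + " largest free run = " + largestFreeRun(swept));
        // [A, null, C, null, E, null] largest free run = 1
        System.out.println(Arrays.toString(compacted) + " largest free run = " + largestFreeRun(compacted));
        // [A, C, E, null, null, null] largest free run = 3
    }
}
```

Both variants free three slots, but only the compacted heap can satisfy an allocation that needs three contiguous slots. The price is that C and E had to be moved.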

(3) Copying algorithm

The core idea of the copying algorithm is to divide the memory space into two halves and use only one of them at a time. During garbage collection, the objects still in use are copied to the other half, the original half is cleared, and the roles of the two halves are swapped. If the memory holds many garbage objects, few objects need to be copied and the approach is efficient; otherwise it is a poor fit.

Principle

Advantages and disadvantages

Advantages: when there are many garbage objects, collection is efficient, and memory is not fragmented afterwards. Disadvantages: when there are few garbage objects, as is typical of the old generation, the approach is unsuitable because most objects must be copied. In addition, only one of the two halves can be used at any time, so memory utilization is low.
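A semispace collector can be sketched in a few lines. In this toy model the set of live objects is passed in directly rather than computed from roots, and the class and method names are hypothetical. It assumes Java 9 or later for List.of and Set.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SemispaceDemo {
    List<String> from = new ArrayList<>(); // half currently used for allocation
    List<String> to = new ArrayList<>();   // empty reserve half

    void allocate(String obj) { from.add(obj); }

    // Copies live objects into the reserve half, clears the old half, then swaps roles.
    // Returns the number of objects copied, which is the cost of the collection.
    int collect(Set<String> live) {
        for (String obj : from) {
            if (live.contains(obj)) to.add(obj);
        }
        int copied = to.size();
        from.clear();
        List<String> tmp = from;
        from = to;
        to = tmp;
        return copied;
    }

    public static void main(String[] args) {
        SemispaceDemo heap = new SemispaceDemo();
        for (String obj : List.of("A", "B", "C", "D", "E")) heap.allocate(obj);
        int copied = heap.collect(Set.of("B")); // only B is still live
        System.out.println(copied + " object copied, survivors: " + heap.from);
        // prints "1 object copied, survivors: [B]"
    }
}
```

The cost is proportional to the number of survivors, not to the amount of garbage, which is why the approach pays off when most objects die young.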

(4) Generational algorithm

In heap memory, some objects live only briefly while others live for a long time. The heap is therefore divided into generations, so that short-lived objects can be grouped together and collected frequently, while long-lived objects are grouped together and collected rarely. This is the essence of generational collection: choose the algorithm that suits the characteristics of the objects being reclaimed. In the JVM, the copying algorithm suits the young generation, and mark-sweep or mark-compact suits the old generation.

Partial GC, a collection that does not cover the whole heap, comes in several forms:

  • Minor GC / Young GC: a collection that targets only the young generation.

  • Major GC / Old GC: a collection that targets only the old generation.

  • Mixed GC: a collection that targets the entire young generation and part of the old generation.

Full GC (whole-heap collection): a collection of the entire Java heap.

03 Garbage Collector

The JVM implements a variety of garbage collectors, including the serial collector, the parallel collectors, the CMS (concurrent) collector, the G1 collector, and ZGC, which was introduced in JDK 11.
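To check which collectors a running JVM actually uses, the standard java.lang.management API can be queried. This is a small sketch; the names printed depend on the JVM and the flags it was started with (with G1 on HotSpot, for example, the beans are typically named "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class ListCollectors {
    public static void main(String[] args) {
        // Each bean represents one collector active in this JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same class with different collector flags is a quick way to confirm that a flag took effect.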

3.1 Serial garbage collector

The serial garbage collector uses a single thread for garbage collection. While it runs, only that one thread is working and all threads in the Java application are suspended until collection completes. This phenomenon is called STW (Stop-The-World). For highly interactive applications this collector is unacceptable, and it is not typically used in Java web applications.

3.2 Parallel garbage collector

The parallel garbage collector improves on the serial collector by performing collection with multiple threads instead of one, which shortens collection time. The parallel collector still pauses the application while it collects, just like the serial collector, but because the work is done in parallel, the pauses are shorter.

(1) ParNew garbage collector

The ParNew collector works on the young generation and is simply a parallel version of the serial collector. The -XX:+UseParNewGC parameter makes the young generation use the ParNew collector and the old generation use the serial collector.

Test

    -XX:+UseParNewGC -XX:+PrintGCDetails -Xms16m -Xmx16m

    # Printed information
    [GC (Allocation Failure) [ParNew: [Times: user=0.00 sys=0.00, real=0.00 secs] Times: user=0.00 sys=0.00, real=0.00 secs]

In the output above, “ParNew:” shows that the ParNew collector is in use. The rest of the information matches the serial collector.

(2) ParallelGC garbage collector

The ParallelGC collector works the same way as ParNew, but adds two parameters related to system throughput that make it more flexible and efficient to use. The relevant parameters are:

-XX:+UseParallelGC The young generation uses the ParallelGC collector and the old generation uses the serial collector.

-XX:+UseParallelOldGC The young generation uses the ParallelGC collector and the old generation uses the ParallelOldGC collector.

-XX:MaxGCPauseMillis Sets the maximum garbage collection pause time in milliseconds. Note that ParallelGC may adjust the heap size or other parameters to meet the pause target, which can in turn hurt performance. Use this parameter with caution.

-XX:GCTimeRatio Sets the share of program run time that may be spent in garbage collection, computed as 1/(1+n). The value n is a number between 0 and 100; the default is 99, meaning garbage collection may take no more than 1% of the time.

-XX:+UseAdaptiveSizePolicy Enables the adaptive GC mode. The collector automatically adjusts parameters such as the sizes of the young and old generations to balance throughput, heap size, and pause time. It is generally used where manual tuning is difficult, leaving the collector to adjust itself.
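The 1/(1+n) formula for -XX:GCTimeRatio is easy to check by hand. The helper below simply evaluates it; gcFraction is a hypothetical name, not a JVM API.

```java
class GcTimeRatioDemo {
    // Fraction of total run time that may be spent in GC for -XX:GCTimeRatio=n.
    static double gcFraction(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        System.out.println(gcFraction(99)); // 0.01, i.e. at most 1% of the time in GC
        System.out.println(gcFraction(19)); // 0.05, i.e. at most 5% of the time in GC
    }
}
```

A smaller n therefore allows the collector a larger share of the run time.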

Test

    -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -Xms16m -Xmx16m

    [PSYoungGen: 4096K->480K(4608K)] 4096K->1840K(15872K), 0.0034307 secs] [Times: sys=0.00, real=0.00 secs]
    [Full GC (Ergonomics) [PSYoungGen: 505K->0K(4608K)] [ParOldGen: 10332K->10751K(11264K)] 10837K->10751K(15872K), [Metaspace: 3491K->3491K(1056768K)], 0.0793622 secs] [Times: user=0.13 sys=0.00, real=0.08 secs]

In the output above, the young generation is collected by the ParallelGC collector (PSYoungGen) and the old generation by the ParallelOldGC collector (ParOldGen).

3.3 CMS garbage collector

CMS stands for Concurrent Mark Sweep. It is a concurrent garbage collector that uses the mark-sweep algorithm to collect the old generation, and it is enabled with the parameter -XX:+UseConcMarkSweepGC. A CMS cycle runs through the following phases:

  • Initial mark (CMS-initial-mark): marks the roots; causes an STW pause.

  • Concurrent mark (CMS-concurrent-mark): runs concurrently with user threads.

  • Preclean (CMS-concurrent-preclean): runs concurrently with user threads.

  • Remark (CMS-remark): causes an STW pause.

  • Concurrent sweep (CMS-concurrent-sweep): runs concurrently with user threads.

  • Heap adjustment: if CMS is configured to compact memory after sweeping, the fragments in memory are cleaned up at this point.

  • Concurrent reset (CMS-concurrent-reset): resets state and waits for the next CMS trigger; runs concurrently with user threads.

Test

    # Set startup parameters
    -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -Xms16m -Xmx16m

    # Run log
    [GC (Allocation Failure) [ParNew: Secs] [Times: Times: 0] [Times: 0] [Times: 0] [Times: 0]
    [GC (CMS Initial Mark) [1 CMS-initial-mark: 6224K(10944K)] 6824K(15872K), 0.0004209 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
    [CMS-concurrent-mark-start] 0.002/0.002 secs] [Times:
    [CMS-concurrent-preclean-start] 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
    178K (178K)][Rescan (parallel), 0.000211 secs][weak refs processing, 0.00211 secs][class class TBM] 0.0003671 secs][scrub symbol table, 0.0006813 secs][scrub string table, 0.0001216 secs][1 CMS-remark: Secs] [Times: 1024K] 1024K
    [CMS-concurrent-sweep-start]
    [CMS-concurrent-sweep: 0.004/0.004 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
    [CMS-concurrent-reset-start] 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

The log above shows the phases of a CMS cycle.

3.4 G1 garbage collector

The G1 garbage collector was officially introduced in JDK 1.7, and Oracle made it the default garbage collector in JDK 9 as the intended successor to CMS. G1 was designed to simplify JVM performance tuning; developers need only three simple steps:

  1. First, enable the G1 garbage collector.

  2. Second, set the maximum heap size.

  3. Third, set a maximum pause time target.

G1 provides three modes of garbage collection, Young GC, Mixed GC, and Full GC, which are triggered under different conditions.

Principle

The main difference between G1 and other collectors is that it abandons the physical division into young and old generations. Instead, it divides the heap into regions, each of which logically belongs to the young or the old generation. The advantage is that we no longer need to give each generation its own fixed space or worry about whether each generation has enough memory.

In G1, young generation collection still suspends all application threads and copies live objects to the old generation or to Survivor space. G1 completes the cleanup by copying objects from one region to another. This means that in normal operation G1 compacts the heap (at least part of it), so it does not suffer from the memory fragmentation problem that CMS has.

G1 also has a special kind of region called the Humongous region.

  • If an object occupies more than 50% of a region’s capacity, G1 treats it as a humongous object.

  • By default, such humongous objects would be allocated directly in the old generation, but a short-lived humongous object there has a negative impact on the garbage collector.

  • To solve this problem, G1 sets aside Humongous regions dedicated to storing humongous objects. If one H region cannot hold a humongous object, G1 looks for contiguous H regions to store it. Sometimes a Full GC has to be started in order to find contiguous H regions.
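The humongous rule is simple arithmetic. The sketch below applies the 50% threshold from the text and computes how many contiguous regions an object would need. The 1 MB region size and the method names are assumptions for illustration.

```java
class HumongousDemo {
    // Per the rule above: an object is humongous if it exceeds half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous regions needed to hold the object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1L << 20; // assume 1 MB regions
        System.out.println(isHumongous(600 * 1024L, region));      // true
        System.out.println(isHumongous(400 * 1024L, region));      // false
        System.out.println(regionsNeeded(2_500 * 1024L, region));  // 3
    }
}
```

An application that allocates many arrays just above half a region may benefit from a larger region size, since that moves those arrays back below the threshold.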

Young GC

The Young GC mainly collects the Eden regions and is triggered when Eden space runs out. Live data in Eden is moved to Survivor space; if Survivor space is insufficient, some of the data in Eden is promoted directly to the old generation. Data in the Survivor regions is moved to new Survivor regions, and some of it is also promoted to the old generation. Finally, Eden is empty, the GC finishes, and the application threads resume.

Remembered Set

When collecting young generation objects, how do we find their roots? A root may be in the young generation or in the old generation, so should every object in the old generation be treated as a root? Scanning the entire old generation would take a lot of time. G1 therefore introduced the RSet, short for Remembered Set, which keeps track of references pointing to objects in a given part of the heap.

Each Region gets an RSet when it is initialized. The RSet records references from other Regions to objects in this Region. Each Region is divided into multiple cards, 512 bytes each by default, so an RSet entry records which card of which Region holds a reference: “card xx of Region xx”.
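Mapping an address to its card is a single shift. The sketch below assumes 512-byte cards (the HotSpot card size) and works on offsets within a region. CARD_SHIFT and cardIndex are hypothetical names.

```java
class CardIndexDemo {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card

    // Index of the card that contains the given byte offset within a region.
    static long cardIndex(long offsetInRegion) {
        return offsetInRegion >>> CARD_SHIFT;
    }

    public static void main(String[] args) {
        System.out.println(cardIndex(0));    // 0
        System.out.println(cardIndex(511));  // 0
        System.out.println(cardIndex(512));  // 1
        System.out.println(cardIndex(5000)); // 9
    }
}
```

Because an RSet entry names a card rather than an exact field, the collector only has to scan the 512 bytes of that card to find the reference.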

Mixed GC

As more and more objects are promoted to old regions, the virtual machine triggers a mixed collection, the Mixed GC, to avoid running out of heap memory. A Mixed GC is not an old generation GC: it reclaims the whole young generation together with part of the old generation. Note that only selected old regions are collected rather than all of them, which keeps the collection time under control. Also note that a Mixed GC is not a Full GC.

When is a Mixed GC triggered? This is determined by the parameter -XX:InitiatingHeapOccupancyPercent=n (default: 45%). A Mixed GC is triggered when the old generation’s size as a percentage of the total heap reaches this threshold.

A Mixed GC consists of two steps:

  1. Global concurrent marking

  2. Copying live objects (evacuation)

Global concurrent marking

Global concurrent marking is performed in five steps:

  1. Initial mark (STW): marks the objects directly reachable from the roots. A young generation GC is performed at this stage, resulting in a global pause.

  2. Root Region Scan: G1 scans the Survivor regions left by the initial mark for references into the old generation and marks the referenced objects. This phase runs concurrently with the application (non-STW), and the next STW young generation collection can begin only after it completes.

  3. Concurrent Marking: G1 looks for reachable (live) objects across the whole heap. This phase runs concurrently with the application and can be interrupted by STW young generation collections.

  4. Remark (STW): because the application keeps running during concurrent marking, this STW phase corrects the results of the previous marking.

  5. Cleanup (STW): tallies and resets the marking state. This STW phase does not actually reclaim garbage; that is left to the evacuation phase.

Copying live objects

The Evacuation phase is fully STW. In this phase, the live objects in some Regions are copied to other Regions, and the vacated Regions are reclaimed.

G1 Collector parameters

  • -XX:+UseG1GC Uses the G1 garbage collector.

  • -XX:MaxGCPauseMillis Sets the target maximum GC pause time (the JVM tries to meet it but does not guarantee it). The default is 200 milliseconds.

  • -XX:G1HeapRegionSize=n Sets the size of a G1 region. The value is a power of 2 between 1 MB and 32 MB. The goal is to divide the heap into about 2048 regions based on the minimum Java heap size. The default is 1/2000 of the heap.

  • -XX:ParallelGCThreads=n Sets the number of STW worker threads. Set n to the number of logical processors, up to a maximum of 8.

  • -XX:ConcGCThreads=n Sets the number of concurrent marking threads. Set n to about 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).

  • -XX:InitiatingHeapOccupancyPercent=n Sets the Java heap occupancy threshold that triggers a Mixed GC. The default is 45% of the entire Java heap.

Test

    -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -Xmx256m

    # Log
    [GC pause (G1 Evacuation Pause) (young),
    [Parallel Time: 3.7 ms, GC Workers: 3]
    [GC Worker Start (ms): Min: 14763.7, Avg: 14763.8, Max: 14763.8, Diff: 0.1]
    # Root scanning
    [Ext Root Scanning (ms): Min: 0.2, Avg: 0.3, Max: 0.3, Diff: 0.1, Sum: 0.8]
    # Processed Buffers: Min: 1, Avg: 1.9, Max: 1.9, Diff: 0.2, Sum: 5.6 Max: 3, Diff: 2, Sum: 5]
    [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 5] 0.0]
    [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
    # Object Copy (ms): Min:
    [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 3.6] 0.2]
    [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 3]
    [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 2] 0.0, Diff: 0.0, Sum: 0.0]
    [GC Worker Total (ms): Min: 3.4, Avg: 3.4, Max: 3.5, Diff: 0.1, Sum: 0.0]
    [GC Worker End (ms): Min: 14767.2, Avg: 14767.2, Max: 14767.3, Diff: 0.1]
    [Code Root Purge: 0ms]
    # Clear the card table
    [Clear CT: 0ms]
    [Other: 0ms]
    [Ref Enq: 0.0 ms]
    [Humongous Reclaim: 2.0ms]
    [Free CSet: 2.0ms]
    [Eden: 7168.0K(7168.0K)->0.0B(13.0M) Survivors: 2048.0K->2048.0K Heap: 55.5M(192.0M)->48.5M(192.0M)]
    [Times: user=0.00 sys=0.00, real=0.00 secs]

Optimization recommendations for the G1 garbage collector

  • Young generation size: avoid setting the young generation size explicitly with the -Xmn option or related options such as -XX:NewRatio. Fixing the size of the young generation overrides the pause time target.

  • Pause time target: the throughput goal of G1 GC is 90% application time and 10% garbage collection time. When evaluating G1 throughput, do not set the pause time target too strictly. An overly strict target means you are willing to accept more garbage collection overhead, which directly affects throughput.

3.5 ZGC

ZGC is a low-latency garbage collector developed by Oracle and added to JDK 11 as an experimental feature. Its goal is to keep garbage collection pauses under 10 milliseconds at any heap size, with as little impact on throughput as possible.

Memory layout

Like G1, ZGC uses a region-based heap layout (ZGC calls its regions pages), but ZGC’s pages are dynamic: they are created and destroyed on demand, and their sizes vary. On the x64 hardware platform, ZGC pages come in small, medium, and large capacities:

  • Small Page: fixed capacity of 2 MB; holds small objects of less than 256 KB.

  • Medium Page: fixed capacity of 32 MB; holds objects of 256 KB or more but less than 4 MB.

  • Large Page: the capacity is not fixed and can change dynamically, but must be an integer multiple of 2 MB. It holds large objects of 4 MB or more. Each large page holds only one large object, which means that, despite the name, a “large page” may actually be smaller than a medium page, with a minimum capacity of 4 MB. Large pages are never relocated in the ZGC implementation (relocation is one of ZGC’s processing steps), because copying a large object is very expensive.
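The size classes above amount to a small decision function. The sketch below encodes the thresholds from the list and rounds large pages up to a multiple of 2 MB. Treating an object of exactly 256 KB as medium is an assumption about the boundary, and the names are hypothetical.

```java
class ZPageDemo {
    static final long KB = 1024, MB = 1024 * KB;

    // Page type for an object of the given size, using the thresholds listed above.
    // Assumption: an object of exactly 256 KB goes to a medium page.
    static String pageType(long objectBytes) {
        if (objectBytes < 256 * KB) return "small";
        if (objectBytes < 4 * MB) return "medium";
        return "large";
    }

    // A large page holds one object and is sized in multiples of 2 MB.
    static long largePageBytes(long objectBytes) {
        long granule = 2 * MB;
        return (objectBytes + granule - 1) / granule * granule;
    }

    public static void main(String[] args) {
        System.out.println(pageType(100 * KB));            // small
        System.out.println(pageType(1 * MB));              // medium
        System.out.println(pageType(5 * MB));              // large
        System.out.println(largePageBytes(5 * MB) / MB);   // 6
        System.out.println(largePageBytes(4 * MB) / MB);   // 4
    }
}
```

The last line shows the point made above: a large page for a 4 MB object is only 4 MB, well below the 32 MB of a medium page.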

Performance

Although ZGC is still experimental, with features incomplete and stability polishing and performance tuning still under way, its performance is already impressive. Judging from the official test results, calling ZGC “shocking and revolutionary” is no exaggeration.

The official results compare ZGC with the Parallel Scavenge and G1 collectors on SPECjbb 2015 (a benchmark for Java server workloads). Throughput is supposedly ZGC’s “weak spot”, yet ZGC, whose primary goal is low latency, reaches 99% of the throughput of Parallel Scavenge, whose primary goal is high throughput, and directly surpasses G1. When the test is switched to “Critical Throughput”, which models applications with Service Level Agreements (SLAs) by measuring throughput while the maximum latency stays under a set value (10 to 100 ms), ZGC even surpasses Parallel Scavenge.

ZGC is strongest in the pause time tests, where it is two orders of magnitude ahead of Parallel Scavenge and G1. Its average, 95th percentile, 99th percentile, 99.9th percentile, and maximum pause times all stay comfortably within ten milliseconds, so the ZGC bars (Figure A) are barely visible next to the two other collectors, which pause for hundreds of milliseconds. To see the ZGC results at all, the vertical axis has to be switched from a linear scale to a logarithmic scale (Figure B, where the vertical axis increases logarithmically).

Usage

Under JDK 11, ZGC can only be used on 64-bit Linux. To use ZGC on Windows, you need to upgrade to JDK 14.

    tar -xvf jdk-11.0.7_linux-x64_bin.tar.gz

    # First remove the existing Java version
    rpm -qa | grep java
    rpm -e --nodeps java-xxxx-xxxx.x86_64

    # Edit the profile and add the following content
    vim /etc/profile

    #set java environment
    JAVA_HOME=/usr/local/src/jdk-11.0.7
    CLASSPATH=.:$JAVA_HOME/lib/tools.jar
    PATH=$JAVA_HOME/bin:$PATH
    export JAVA_HOME CLASSPATH PATH

    # Make the changes take effect
    source /etc/profile

    # Execute the command
    java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx256m -Xlog:gc*=info TestGC

    # Parameter description
    # -XX:+UnlockExperimentalVMOptions  unlock experimental options
    # -XX:+UseZGC                       enable the ZGC garbage collector
    # -Xmx256m                          set the maximum heap size
    # -Xlog:gc*=info                    print GC log information

    # Log
    [1.141s][info][gc,start] GC(2) Garbage Collection (Warmup)
    [info][gc,phases] GC(2) Concurrent Mark 1.837ms
    [1.143s][info][gc,phases] GC(2) Pause Mark End 0.136ms
    [1.144s][info][gc,phases] GC(2) Concurrent Process Non-Strong References 0.308ms
    [info][gc,phases] GC(2) Concurrent Reset Relocation Set 0.001ms
    [1.142s][info][gc,phases] GC(2) Concurrent Destroy Detached Pages 0.000ms
    [info][gc,phases] GC(2) Concurrent Select Relocation Set 1.219ms
    [info][gc,phases] GC(2) Concurrent Prepare Relocation Set 0.009ms
    [1.145s][info][gc,phases] GC(2) Pause D/S
    [info][gc,phases] GC(2) Concurrent Relocate D/S
    [info][gc,load] GC(2) Load: 0.05/0.00/0.02
    [1.146s][info][gc,mmu] GC(2) MMU: 2ms/78.1%, 5ms/88.9%, 10ms/93.4%, 20ms/96.7%, 50ms/98.7%, 100ms/99.0%
    [1.146s][info][gc,marking] GC(2) Mark: 1 stripe(s), 1 proactive flush(es), 1 terminate flush(es), 0 completion(s), 0 continuation(s)
    [1.146s][info][gc,reloc] GC(2) Relocation: Successful, 1M relocated
    [1.146s][info][gc,nmethod] GC(2) NMethods: 59 registered, 0 unregistered
    [1.146s][info][gc,metaspace] GC(2) Metaspace: 4M used, 4M capacity, 5M committed, 8M reserved
    [1.146s][info][gc,ref] GC(2) Soft:
    [info][gc,ref] GC(2) Weak: Unencountered, 0 discovered, 0 enqueued
    [1.146s][info][gc,ref] GC(2) Weak:
    [info][gc,ref] GC(2) Final: 0 encountered, 0 discovered, 0 enqueued
    [1.145s][info][gc,ref] GC(2) Phantom: 1 encountered, 1 discovered, 0 enqueued
    [1.146s][info][gc,heap] GC(2) Mark Start Mark End Relocate Start Relocate End High Low
    [1.146s][info][gc,heap] GC(2) Capacity: 114M (45%) 114M (45%) 114M (45%) 114M (45%) 114M (45%) 114M (45%)
    [1.146s][info][gc,heap] GC(2) Reserve: 36M (14%) 36M (14%) 36M (14%) 36M (14%) 36M (14%) 36M (14%)
    [1.146s][info][gc,heap] GC(2) Free:
    [1.146s][info][gc,heap] GC(2) Used: 78M (30%) 78M (30%) 36M (14%) 36M (14%) 78M (30%) 36M (14%)
    [1.146s][info][gc,heap] GC(2) Live: 1M (1%) 1M (1%) 1M (1%) -
    [1.146s][info][gc,heap] GC(2) Allocated: 0M (0%) 0M (0%) 4M (2%) -
    [1.146s][info][gc,heap] GC(2) Garbage: --
    [1.145s][info][gc,heap] GC(2) Reclaimed: - - 42M (16%) 42M (16%) - -
    [1.146s][info][gc] GC(2) Garbage Collection (Warmup) 78M(30%)->36M(14%)

Colored pointer technique

To achieve its pause time goal, ZGC introduced the colored pointer technique.

A colored pointer stores a small amount of extra information directly in the pointer itself. On a 64-bit system, the theoretically addressable memory is as much as 16 EB (2 to the 64th power bytes). In practice, 64-bit Linux supports a 47-bit (128 TB) process virtual address space and a 46-bit (64 TB) physical address space, while 64-bit Windows supports only a 44-bit (16 TB) physical address space. So under Linux the high 18 bits of a 64-bit pointer cannot be used for addressing, but the 64 TB covered by the remaining 46 bits is still ample for today’s large servers. ZGC’s colored pointers take the top 4 of those remaining 46 bits to store four flags. Through these flag bits the virtual machine can tell, from the pointer alone, the tri-color marking state of the referenced object, whether it has entered the relocation set (that is, whether it has been moved), and whether it is reachable only through a finalize() method. Because the flag bits further shrink the already limited 46-bit address space, ZGC can manage no more than 4 TB of memory (2 to the 42nd power bytes).
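The bit layout can be sketched with ordinary long arithmetic. The example below assumes 42 address bits with four flag bits directly above them. The flag names and exact positions are illustrative constants, not taken from the ZGC source.

```java
class ColoredPointerDemo {
    static final int ADDRESS_BITS = 42;                        // 2^42 bytes = 4 TB
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    // Hypothetical flag positions in the four bits above the address.
    static final long MARKED0     = 1L << 42;
    static final long MARKED1     = 1L << 43;
    static final long REMAPPED    = 1L << 44;
    static final long FINALIZABLE = 1L << 45;

    // Combines an address with one flag to form a colored pointer.
    static long color(long address, long flag) { return (address & ADDRESS_MASK) | flag; }

    // Strips the flag bits to recover the plain address.
    static long address(long pointer) { return pointer & ADDRESS_MASK; }

    // Tests whether a flag bit is set in the pointer.
    static boolean has(long pointer, long flag) { return (pointer & flag) != 0; }

    public static void main(String[] args) {
        long p = color(0x1234_5678L, MARKED0);
        System.out.println(Long.toHexString(address(p))); // 12345678
        System.out.println(has(p, MARKED0));              // true
        System.out.println(has(p, REMAPPED));             // false
        System.out.println((ADDRESS_MASK + 1) >> 40);     // 4, the heap limit in TB
    }
}
```

Checking a flag is a single mask operation on the pointer value, with no need to touch the object itself.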

Benefits of colored pointers

  • Colored pointers allow a Region to be freed and reused as soon as its live objects have been moved out, without waiting for every reference into that Region across the heap to be corrected.

  • Colored pointers can significantly reduce the number of memory barriers used during garbage collection. Write barriers usually exist to record changes to object references; if that information is maintained directly in the pointer, some of those dedicated recording operations can be omitted. So far ZGC uses no write barriers at all, only read barriers.

  • Colored pointers can serve as an extensible storage structure for recording more data related to object marking and relocation, which leaves room for further performance improvements.

How ZGC works

ZGC’s operation can be roughly divided into four major phases, all of which run concurrently. Only short STW pauses occur between them, such as at Mark Start (the initial mark).

  • Concurrent Mark: as in G1, concurrent marking is the phase that traverses the object graph for reachability analysis, and it is bracketed by short pauses for the initial mark and the final mark. ZGC marks pointers rather than objects: the Marked 0 and Marked 1 flag bits in the colored pointers are updated during this phase.

  • Concurrent Prepare for Relocate: this phase determines, according to specific criteria, which Regions are to be cleaned up in this collection and groups them into the relocation set. ZGC scans all Regions in every collection, trading the cost of a wider scan for the savings of not maintaining remembered sets as G1 does. The relocation set only determines that the live objects in it will be copied to other Regions and that its Regions will be freed.

  • Concurrent Relocate: relocation is the core phase of ZGC. It copies the live objects in the relocation set to new Regions and maintains a forwarding table for each Region in the relocation set, recording the mapping from each old object to its new copy. Thanks to colored pointers, ZGC can tell from a reference alone whether an object is in the relocation set. If a user thread concurrently accesses an object in the relocation set, the access is intercepted by the preset memory barrier, which immediately forwards it to the new copy according to the Region’s forwarding table and also updates the reference to point directly at the new object. ZGC calls this behavior the pointer’s “self-healing” ability.

  • Concurrent Remap: remapping corrects all references in the heap that still point to old objects in the relocation set. It is not an “urgent” task, but once all pointers have been corrected, the forwarding tables that record the mapping between old and new objects can be released.
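The interplay of forwarding table and self-healing can be imitated with a map and a simulated barrier. This is a heavily simplified sketch: addresses are plain long values, a one-element array stands in for a reference field, and the table lookup replaces the color-bit check a real load barrier would perform. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class SelfHealingDemo {
    // Forwarding table: old address -> new address for relocated objects.
    final Map<Long, Long> forwarding = new HashMap<>();
    int slowPathHits = 0; // how often a stale reference had to be forwarded

    // Simulated load barrier: reads the address stored in a field.
    // If it points at a relocated object, the field itself is rewritten (healed).
    long load(long[] field) {
        Long moved = forwarding.get(field[0]);
        if (moved != null) {
            slowPathHits++;
            field[0] = moved; // heal the reference so later loads need no forwarding
        }
        return field[0];
    }

    public static void main(String[] args) {
        SelfHealingDemo heap = new SelfHealingDemo();
        heap.forwarding.put(100L, 900L); // the object at 100 was relocated to 900
        long[] field = {100L};           // a field still holding the old address

        System.out.println(heap.load(field));  // 900, forwarded on first access
        System.out.println(heap.load(field));  // 900, the field is already healed
        System.out.println(heap.slowPathHits); // 1
    }
}
```

Only the first access pays the forwarding cost, which is why remapping the remaining stale references is not urgent.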