Garbage collection is one of the most important parts of the Java architecture. It provides a fully automatic memory management solution. To master this management solution, you must understand how the garbage collector works. This article introduces the concept of garbage collection, algorithms, garbage collectors and some examples of GC optimization that I have encountered in my work.

First, take a look at the main components of the JVM:

Heap memory partition

The young generation

The young generation is divided into three areas: one Eden space and two Survivor spaces (From Survivor, S0, and To Survivor, S1). Most objects are allocated in Eden. When Eden fills up, the surviving objects are copied into one of the two Survivor spaces. When that Survivor space fills up, its surviving objects are copied into the other Survivor space. When that one fills up as well, objects that were copied from the first Survivor space and are still alive are promoted to the old generation ("Tenured").

Old generation

Objects that survive N garbage collections in the young generation (the default threshold for ParNew is 15) are promoted to the old generation. Large objects that do not fit in the young generation are allocated directly in the old generation.

Tip: The virtual machine does not always require an object's age to reach MaxTenuringThreshold (default: 15) before promoting it. If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older are promoted directly to the old generation without waiting for the age specified in MaxTenuringThreshold.
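The rule in the tip can be sketched as a small calculation. This is a toy model that follows the rule exactly as worded above, not HotSpot's actual code; the class name `TenuringSketch`, the method name, and the `sizeByAge` histogram are hypothetical.

```java
// Toy model of the dynamic age rule described in the tip; names are hypothetical.
final class TenuringSketch {

    /**
     * sizeByAge[i] is the total size in bytes of surviving objects of age i + 1.
     * Returns the age at or above which objects would be promoted.
     */
    static int tenuringThreshold(long[] sizeByAge, long survivorCapacity, int maxTenuringThreshold) {
        long half = survivorCapacity / 2;
        for (int age = 1; age <= sizeByAge.length && age <= maxTenuringThreshold; age++) {
            // Objects of one age alone fill more than half of the Survivor space.
            if (sizeByAge[age - 1] > half) {
                return age;
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long[] sizeByAge = {10, 60, 5}; // ages 1, 2, 3
        System.out.println(tenuringThreshold(sizeByAge, 100, 15)); // prints 2
    }
}
```

With a 100-byte Survivor space, the age-2 objects (60 bytes) exceed half of it, so everything aged 2 or more is promoted early.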

Permanent generation

In JDK 1.8 the permanent generation was removed from the Java heap, and strings are stored directly in the heap. Class metadata is stored in Metaspace, which uses native memory outside the heap rather than heap memory.

Second, GC collection algorithms

Mark-sweep algorithm

Mark-sweep has two stages: the mark stage (starting from the root nodes, mark every reachable object; whatever stays unmarked is unreferenced) and the sweep stage (reclaim the unmarked objects). Disadvantages: both stages are inefficient, and after collection the free memory is not contiguous. The resulting fragmentation can easily trigger the next GC early.
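The two stages can be illustrated with a toy object graph. This is a minimal sketch for intuition only; `MarkSweepSketch` and `Node` are hypothetical names, and a real collector works on raw memory rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy mark-sweep over an in-memory graph; names are hypothetical.
final class MarkSweepSketch {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark stage: collect every node reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit only, so cycles terminate
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    // Sweep stage: drop every unmarked node and return how many were reclaimed.
    static int sweep(List<Node> heap, Set<Node> marked) {
        int before = heap.size();
        heap.removeIf(n -> !marked.contains(n));
        return before - heap.size();
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(a);                    // cycle between reachable objects
        c.refs.add(d);                    // c and d are not reachable from the root
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        Set<Node> marked = mark(Collections.singletonList(a));
        System.out.println(sweep(heap, marked)); // prints 2
    }
}
```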

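Copying algorithm

Memory is divided into two halves, and only one is in use at a time. At each collection the surviving objects are copied from the half in use into the other half, and the two halves swap roles. Disadvantage: low memory utilization.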

Mark-compact algorithm

Mark the live objects first, then move them toward one end of the space, and finally clear all memory beyond the boundary of the last live object. Advantage: no memory fragmentation. Disadvantage: moving objects is time-consuming.
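The "move the live objects to one end, then clear the rest" step can be sketched on an array that stands in for the heap. This is an illustrative sketch; `CompactSketch` and its slot encoding (0 means a free slot) are assumptions made for the example.

```java
import java.util.Arrays;

// Toy sliding compaction; names and the slot encoding are hypothetical.
final class CompactSketch {

    /**
     * heap[i] holds an object id, live[i] says whether slot i was marked.
     * Slides live objects to the front, zeroes the tail, and returns the
     * index of the first free slot.
     */
    static int compact(int[] heap, boolean[] live) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[free++] = heap[i];
            }
        }
        Arrays.fill(heap, free, heap.length, 0); // clear everything past the live end
        return free;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5};
        boolean[] live = {true, false, true, false, true};
        int end = compact(heap, live);
        System.out.println(end + " " + Arrays.toString(heap)); // prints 3 [1, 3, 5, 0, 0]
    }
}
```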

Generational algorithm

The young generation uses the copying algorithm; the old generation uses mark-compact or mark-sweep. A card table (a data structure made up of a set of bits) records whether an object in the old generation holds a reference to an object in the young generation. A young-generation collection then does not have to scan the whole old generation to find such references, which speeds it up.
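A card table can be pictured as one bit per fixed-size chunk ("card") of the old generation. The sketch below is a simplified model: `CardTableSketch` is a hypothetical name, the 512-byte card size is an assumption for the example, and the "write barrier" is just a method the caller invokes by hand.

```java
import java.util.BitSet;

// Toy card table; class name and card size are assumptions for illustration.
final class CardTableSketch {
    static final int CARD_SHIFT = 9;           // 2^9 = 512 bytes per card (assumed)
    private final BitSet dirty;
    private final int cardCount;

    CardTableSketch(int oldGenBytes) {
        this.cardCount = (oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        this.dirty = new BitSet(cardCount);
    }

    // Called when a field at this old-generation offset is set to point at a young object.
    void markDirty(int oldGenOffset) {
        dirty.set(oldGenOffset >>> CARD_SHIFT);
    }

    boolean isDirty(int card) {
        return dirty.get(card);
    }

    // A young collection only needs to scan this many cards instead of cardCount.
    int dirtyCards() {
        return dirty.cardinality();
    }

    int cardCount() {
        return cardCount;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(4096);
        table.markDirty(600);
        table.markDirty(700);                  // same card as offset 600
        System.out.println(table.dirtyCards() + " of " + table.cardCount()); // prints 1 of 8
    }
}
```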

Partitioning algorithm

The whole heap is divided into many small contiguous regions, each managed and collected independently.

References and reachability strength

There are four levels of object reference, in decreasing order of reachability strength.

**Strong references:** Never reclaimed by the system while they are held, so they may cause an OOM.

StringBuffer str = new StringBuffer("juejin");

**Soft references:** Not necessarily collected by a GC, but collected when heap space is insufficient. They are always reclaimed before an OOM is thrown, so soft references do not cause an OOM. Created using SoftReference.

SoftReference<User> userSoftReference = new SoftReference<User>(u);

**Weak references:** Reclaimed as soon as the GC discovers them. Created using WeakReference.

WeakReference<User> userWeakReference = new WeakReference<User>(u);

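**Phantom references:** Can be reclaimed at any time; get() always returns null. Created using PhantomReference, whose constructor also takes a ReferenceQueue.

PhantomReference<User> userPhantomReference = new PhantomReference<User>(u, new ReferenceQueue<User>());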
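The four one-liners above can be put together into a runnable sketch. It only checks behaviour that does not depend on GC timing: while the object is still strongly reachable, soft and weak references return it, and a phantom reference never returns its referent. `ReferenceSketch` and its nested `User` class are hypothetical stand-ins for the `User` type used above.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Sketch of the four reference strengths; class names are hypothetical.
final class ReferenceSketch {

    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // Returns {soft still resolves, weak still resolves, phantom get() is null}.
    static boolean[] observe(User u) {
        SoftReference<User> soft = new SoftReference<>(u);
        WeakReference<User> weak = new WeakReference<>(u);
        ReferenceQueue<User> queue = new ReferenceQueue<>();
        PhantomReference<User> phantom = new PhantomReference<>(u, queue);
        // u is still strongly reachable here, so soft and weak both return it.
        return new boolean[] { soft.get() == u, weak.get() == u, phantom.get() == null };
    }

    public static void main(String[] args) {
        User u = new User("juejin");       // strong reference
        boolean[] r = observe(u);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints true true true
    }
}
```

What happens after the strong reference is dropped depends on when the collector runs, so the sketch deliberately does not assert on it.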

Third, generational garbage collection

The basic question in garbage collection is how to determine whether an object is reachable. The mark-sweep algorithm can scan for objects that are no longer held from any root node, but an unreachable object may resurrect itself at some point.

Three states of object reachability:

  1. Reachable
  2. Resurrectable (the object may revive itself in its finalize() method)
  3. Unreachable (finalize() is called at most once, so the object cannot revive itself again)

The young generation uses the copying algorithm

The old generation uses mark-sweep or mark-compact

Tip: Objects are allocated in Eden first. Large objects go directly into the old generation, and long-lived objects are promoted to the old generation.

Fourth, garbage collectors

Serial collector

Single-threaded GC that pauses the application while it runs. It is largely obsolete and suits only small servers (1C2G).

Parallel collector PS (throughput first)

The default collector in JDK 1.6 through 1.8. GC threads run in parallel, and the application waits while a collection is in progress (STW, stop-the-world).

Why stop the world? 1. So that the garbage collector can work correctly and efficiently. 2. To guarantee a consistent view of the system at a single moment. 3. To help the garbage collector mark garbage objects more reliably.

There are two young-generation parallel collectors:

  1. ParNew collector: multiple threads perform garbage collection. The number of GC threads can be set with -XX:ParallelGCThreads. By default, ParallelGCThreads = CPU_count if CPU_count < 8; otherwise ParallelGCThreads = 3 + ((5 * CPU_count) / 8). It suits scenarios with little user interaction. (Deprecated in releases after JDK 1.8 and later removed.)
  2. Parallel collector: like ParNew, it is multithreaded and exclusive (the application is paused). Enabled with -XX:+UseParallelGC (for the old generation, -XX:+UseParallelOldGC).
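The default thread count described for ParallelGCThreads is simple integer arithmetic. The helper below is a sketch of that formula only; `GcThreadsSketch` and `defaultParallelGcThreads` are hypothetical names, not JVM APIs.

```java
// Sketch of the default ParallelGCThreads formula; names are hypothetical.
final class GcThreadsSketch {

    static int defaultParallelGcThreads(int cpus) {
        // Below 8 CPUs: one GC thread per CPU. Otherwise: 3 + 5/8 of the CPUs.
        return cpus < 8 ? cpus : 3 + (5 * cpus) / 8;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println(cpus + " CPUs -> " + defaultParallelGcThreads(cpus) + " GC threads");
    }
}
```

For example, 4 CPUs give 4 threads, 16 CPUs give 13, and 32 CPUs give 23. An explicit -XX:ParallelGCThreads value overrides the default.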

Concurrent collector (response time first)

Unlike the parallel collector, the concurrent collector is non-exclusive: the application keeps running while garbage collection is in progress. It also triggers an additional GC before the parallel GC.

Concurrent Mark Sweep (CMS)

Procedure: initial mark (mark root objects) -> concurrent mark -> pre-clean (prepare and control pause time) -> remark -> concurrent sweep -> concurrent reset

**Advantages:** concurrent collection, low pauses.

Disadvantages:

  1. CMS is sensitive to CPU resources. The concurrent phases do not pause user threads, but they take up some of the threads, which slows the application down and reduces overall throughput.
  2. CMS cannot handle floating garbage, which may lead to a "Concurrent Mode Failure" and a resulting Full GC.
  3. CMS is prone to heavy fragmentation. With too much fragmentation, allocating large objects becomes difficult: the old generation often has plenty of free space left but no contiguous block large enough for the object, so a Full GC has to be triggered early.
  4. During old-generation collection, if resources are insufficient, CMS is forced to fall back to a serial old-generation collection, which pauses the application longer and has a greater impact.

**G1 (Garbage-First)**

G1 was officially released in JDK 1.7 and uses a new algorithm; it looks likely to replace CMS. G1 keeps the concept of generations, but the generations are not contiguous in the heap layout. As shown in the figure:

Building on parallelism and concurrency, G1 handles both the young and the old generation, and it also compacts space: each GC defragments automatically, which reduces fragmentation. Finally, it is predictable: G1 can choose which regions to reclaim.

Process: 1) initial mark (mark root objects; Eden is emptied) 2) root region scan 3) concurrent mark 4) remark 5) exclusive cleanup (compute the proportion of live objects and reclaimable space in each region) 6) concurrent cleanup

Mixed collection: a young collection is triggered when the young generation is full. As old-generation memory grows and reaches the IHOP threshold -XX:InitiatingHeapOccupancyPercent (the old generation's share of the whole heap, 45% by default), G1 starts preparing to collect old-generation space. First, a concurrent marking cycle identifies the old-generation regions with a high percentage of garbage. G1 does not start a mixed collection immediately, though. It lets the application run for a while and waits for a young collection to be triggered. During that STW pause, G1 plans the mixed collection cycle. The application threads then run again, and over the next few young collections, old-generation regions are added to the CSet (collection set), which makes them mixed collections. Together these mixed collections are called a mixed collection cycle.

**Features:**

  1. Parallelism and concurrency: G1 takes full advantage of multiple CPU cores, using multiple CPUs to shorten stop-the-world pauses.
  2. Generational collection: although G1 can manage the entire GC heap on its own without the cooperation of other collectors, it keeps the concept of generations.
  3. Space compaction: unlike CMS, which is based on the "mark-sweep" algorithm, G1 as a whole is based on the "mark-compact" algorithm, and locally (between two regions) on the "copying" algorithm.
  4. Predictable pauses: this is another big advantage of G1 over CMS. Reducing pause times is a concern of both G1 and CMS, but G1 also builds a predictable pause-time model, which lets users explicitly specify a pause goal for a time slice M milliseconds long.
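One way to picture "garbage first" under a pause goal is a greedy selection: prefer the regions that free the most memory per millisecond of predicted work, and stop when the pause budget is used up. The sketch below illustrates that idea only; it is not G1's actual selection logic, and `CollectionSetSketch`, `Region`, and the predicted costs are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy collection-set selection under a pause budget; not G1's real algorithm.
final class CollectionSetSketch {

    static final class Region {
        final int id;
        final long garbageBytes;   // reclaimable space in the region
        final double predictedMs;  // predicted cost of evacuating the region
        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    // Greedy choice: highest garbage-per-millisecond first, within the pause budget.
    static List<Region> choose(List<Region> candidates, double maxPauseMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((Region r) -> r.garbageBytes / r.predictedMs)
                .reversed());
        List<Region> collectionSet = new ArrayList<>();
        double budget = maxPauseMs;
        for (Region r : sorted) {
            if (r.predictedMs <= budget) {
                collectionSet.add(r);
                budget -= r.predictedMs;
            }
        }
        return collectionSet;
    }

    public static void main(String[] args) {
        List<Region> picked = choose(Arrays.asList(
                new Region(1, 900, 10.0),
                new Region(2, 100, 10.0),
                new Region(3, 800, 5.0)), 15.0);
        for (Region r : picked) {
            System.out.println("region " + r.id); // prints region 3, then region 1
        }
    }
}
```

Region 2 is skipped because it would exceed the 15 ms budget while freeing very little, which is the trade-off a pause-time goal forces.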

Fifth, tuning ideas

Things to know before tuning

  1. Try several garbage collectors; G1 is not always the best choice.
  2. Concurrency is not the same as parallelism. Garbage collection actually has two steps, starting the GC cycle and running the GC itself, and they are different things. Concurrency applies to the GC cycle, while parallelism applies to the GC algorithm itself.
  3. The average transaction time is not the most important metric to watch. A user may have just hit a long GC pause, and that experience is devastating even when the average looks fine.
  4. GC tuning is not the answer to everything. If the application itself needs heavy changes, optimize the architecture and code first.
  5. GC logging has no significant impact on performance, and it should be enabled before any GC tuning starts.
  6. Reducing the allocation rate of new objects improves GC health. Objects in the system fall roughly into three categories: long-lived objects, about which we generally cannot do much; medium-lived objects, which is probably where the biggest problems are; and short-lived objects, which are usually released quickly and disappear with the next GC cycle.

Tuning ideas

  1. Understand application requirements and problems.
  2. Understand the current state of the GC.
  3. Consider whether the selected GC fits our application profile.
  4. Analyze and confirm the parameters to be adjusted.
  5. Verify tuning.

What reasonable GC performance generally looks like

If a Full GC takes less than 0.1 to 0.3 seconds, there is generally no need to spend extra time on GC tuning. However, if a Full GC takes 1 to 3 seconds, or even more than 10 seconds, you need to tune the GC immediately.

  1. Minor GC executes quickly (less than 50 milliseconds)
  2. Minor GC executes infrequently (every 10 seconds or so)
  3. Full GC executes fast (less than 1 second)
  4. Full GC executes infrequently (every 10 minutes or so)
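To compare a running JVM against targets like these, the collection counts and accumulated times are available through the standard `java.lang.management` API. The sketch below prints an average time per collection for each collector; the class name `GcStatsSketch` and the helper `averageMillis` are hypothetical, and the average is only a rough indicator, since it hides individual long pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read GC counts and times from the running JVM; names are hypothetical.
final class GcStatsSketch {

    // Average time per collection in ms; 0 when nothing ran or the value is undefined (-1).
    static double averageMillis(long count, long totalMillis) {
        return (count <= 0 || totalMillis < 0) ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            long time = gc.getCollectionTime();
            System.out.printf("%s: count=%d, total=%d ms, avg=%.1f ms%n",
                    gc.getName(), count, time, averageMillis(count, time));
        }
    }
}
```

The collector names in the output depend on which GC is in use, so they are a quick way to confirm which collector the JVM actually selected.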

Sixth, parameter tuning

PS collector

-Xms (initial heap size) and -Xmx (maximum heap size)

If you know how much heap your application needs to work properly, you can set -Xms and -Xmx to the same value. If not, the JVM starts with the initial heap size and then grows the heap automatically until it finds a balance between heap usage and performance.

It is recommended to set **-Xms to the same value as -Xmx**, because the JVM also consumes resources when it calculates how to grow or shrink the heap.

-XX:GCTimeRatio=N

Throughput goal: the fraction of total time spent in garbage collection is 1/(1+N). The default is N=99, which allows 1% of the time for garbage collection.
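The relationship between the flag value and the GC time budget is a one-line formula. The helper below is a sketch of that arithmetic; `GcTimeRatioSketch` and `gcTimeFraction` are hypothetical names.

```java
// Sketch of the GCTimeRatio arithmetic; names are hypothetical.
final class GcTimeRatioSketch {

    // Fraction of total time the collector is allowed to use: 1 / (1 + N).
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.printf("N=99 -> %.1f%% GC time%n", 100 * gcTimeFraction(99));
        System.out.printf("N=19 -> %.1f%% GC time%n", 100 * gcTimeFraction(19));
    }
}
```

So -XX:GCTimeRatio=99 targets 1% GC time, and -XX:GCTimeRatio=19 targets 5%.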

-XX:MaxGCPauseMillis

Sets the maximum garbage collection pause time goal

-XX:+UseAdaptiveSizePolicy

Adaptive mode: the size of the young generation, the ratio of Eden to Survivor, and the age at which objects are promoted to the old generation are adjusted automatically, and the SurvivorRatio parameter no longer takes effect.

Priority order: pause time > throughput > heap space. If the initial and maximum heap sizes are not set, the initial heap size is 1/64 of physical memory, the maximum heap size is 1/4 of physical memory, and the young generation is 1/3 of the heap.
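The default sizes quoted above are plain fractions, so they are easy to estimate for a given machine. The sketch below encodes only the fractions stated in the text; `DefaultHeapSketch` and its method names are hypothetical, and actual defaults can differ by JVM version and platform.

```java
// Sketch of the default sizing fractions quoted above; names are hypothetical.
final class DefaultHeapSketch {

    static long initialHeap(long physicalBytes) { return physicalBytes / 64; }

    static long maxHeap(long physicalBytes) { return physicalBytes / 4; }

    static long youngGen(long heapBytes) { return heapBytes / 3; }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        long physical = 16 * gb;            // example machine with 16 GB of RAM
        System.out.println("initial heap MB: " + initialHeap(physical) / (1024 * 1024)); // 256
        System.out.println("max heap MB: " + maxHeap(physical) / (1024 * 1024));         // 4096
        // What this JVM actually reports as its maximum heap, for comparison:
        System.out.println("maxMemory MB: " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```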

CMS collector

-XX:+UseConcMarkSweepGC enables CMS

Number of concurrent threads: (ParallelGCThreads + 3) / 4 by default. It can also be set manually with **-XX:ConcGCThreads** or **-XX:ParallelCMSThreads**.
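The default above uses integer division, which is worth seeing with concrete numbers. This is a sketch of the stated formula only; `CmsThreadsSketch` and `defaultConcurrentThreads` are hypothetical names.

```java
// Sketch of the CMS concurrent thread default quoted above; names are hypothetical.
final class CmsThreadsSketch {

    // (ParallelGCThreads + 3) / 4 with integer division.
    static int defaultConcurrentThreads(int parallelGcThreads) {
        return (parallelGcThreads + 3) / 4;
    }

    public static void main(String[] args) {
        for (int p : new int[] {1, 4, 8, 13}) {
            System.out.println(p + " parallel -> " + defaultConcurrentThreads(p) + " concurrent");
        }
    }
}
```

The "+ 3" rounds up to at least one concurrent thread: 1 and 4 parallel threads both give 1, 8 gives 2, and 13 gives 4.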

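-XX:CMSInitiatingOccupancyFraction

Because it runs concurrently with the application, CMS cannot wait until the heap is saturated before it starts collecting. The default value is 68% in early JDK versions (later versions default to about 92%). Set this parameter to change the occupancy at which a CMS cycle starts.

-XX:CMSFullGCsBeforeCompaction

Memory compaction: sets how many Full GCs run before memory is compacted. Default: 0

-XX:+CMSClassUnloadingEnabled

Enables class unloading with CMS so that the permanent generation is collected too. A Full GC is also triggered when Perm is full.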

G1 collector

**-XX:+UseG1GC** enables G1

-XX:NewSize (minimum young generation size) and -XX:MaxNewSize (maximum young generation size)

-XX:MaxGCPauseMillis= (default: 200 ms) Maximum GC pause time goal

If **-Xmn** is set, MaxGCPauseMillis is disabled

**-XX:MinHeapFreeRatio=40** and **-XX:MaxHeapFreeRatio=70** Free heap ratio

After a GC, if free heap memory is below 40% of the currently estimated heap size, the estimated heap size is increased, but never beyond the fixed maximum. If free heap memory is above 70%, the estimated heap size is reduced.
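The two ratios act as a lower and an upper bound on the free share of the heap after a GC. The decision can be sketched as follows; `HeapResizeSketch`, `Action`, and `decide` are hypothetical names, and the real JVM also decides by how much to resize, which this sketch leaves out.

```java
// Sketch of the MinHeapFreeRatio / MaxHeapFreeRatio decision; names are hypothetical.
final class HeapResizeSketch {

    enum Action { EXPAND, SHRINK, KEEP }

    static Action decide(long freeBytes, long committedBytes, int minFreeRatio, int maxFreeRatio) {
        double freePercent = 100.0 * freeBytes / committedBytes;
        if (freePercent < minFreeRatio) {
            return Action.EXPAND;   // too little free space: grow, up to the fixed maximum
        }
        if (freePercent > maxFreeRatio) {
            return Action.SHRINK;   // too much free space: give memory back
        }
        return Action.KEEP;
    }

    public static void main(String[] args) {
        System.out.println(decide(300, 1000, 40, 70)); // prints EXPAND
        System.out.println(decide(500, 1000, 40, 70)); // prints KEEP
        System.out.println(decide(800, 1000, 40, 70)); // prints SHRINK
    }
}
```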

-XX:ParallelGCThreads=

Number of parallel GC threads used during GC pauses. The default depends on the number of CPU threads available on the host where the VM runs: with fewer than 8 CPUs the value is the number of CPUs; otherwise it is roughly the number of CPUs x 5/8. The maximum number of GC threads at the start of each pause is also limited by the amount of heap G1 assigns to each thread, which is set by -XX:HeapSizePerGCThread and defaults to 8M.

-XX:ConcGCThreads=

Number of GC threads that run concurrently with the application. The default is -XX:ParallelGCThreads / 4.

-XX:G1HeapRegionSize=

Region size: by default the heap is divided into about 2048 regions. The region size ranges from 1 MB to 32 MB and must be a power of 2. Changing it affects the size of objects that can be allocated in a region and the pause time.
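The constraints above (about 2048 regions, 1 MB to 32 MB, power of 2) are enough to estimate the region size G1 would pick for a given heap. The sketch below applies just those three constraints and rounds down to a power of 2; `RegionSizeSketch` is a hypothetical name, and the real JVM heuristic may differ in detail.

```java
// Sketch: estimate a G1 region size from the constraints in the text; names are hypothetical.
final class RegionSizeSketch {
    static final long MB = 1024L * 1024L;

    static long regionSize(long heapBytes) {
        long target = Math.max(heapBytes / 2048, MB);   // aim for about 2048 regions, at least 1 MB
        long powerOfTwo = Long.highestOneBit(target);   // round down to a power of 2
        return Math.min(powerOfTwo, 32 * MB);           // cap at 32 MB
    }

    public static void main(String[] args) {
        long gb = 1024 * MB;
        System.out.println(regionSize(1 * gb) / MB);    // prints 1
        System.out.println(regionSize(8 * gb) / MB);    // prints 4
        System.out.println(regionSize(12 * gb) / MB);   // prints 4
        System.out.println(regionSize(100 * gb) / MB);  // prints 32
    }
}
```

Setting -XX:G1HeapRegionSize explicitly bypasses this kind of estimate.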

-XX:G1MaxNewSizePercent

The maximum size of the young generation, as a percentage of the heap. It must be used together with -XX:+UnlockExperimentalVMOptions, and it can only be added after that option on the command line.

Seven, the tools

  1. Run jstat [-option] [vmid] [interval in milliseconds] [number of samples] to view heap memory usage.
  2. GC monitoring can be done with the JDK's bundled JVisualVM or JConsole.
  3. GC log (gc.log) analysis can be done with the free online analysis tool gceasy: blog.gceasy.io/
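When none of these tools is at hand, similar per-pool numbers can be read from inside the application through the standard `java.lang.management` API. This is a minimal sketch; `HeapPoolsSketch` and `describe` are hypothetical names, and the pool names printed depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Sketch: print memory pool usage from inside the JVM; names are hypothetical.
final class HeapPoolsSketch {

    // Formats one pool; a negative max means the pool has no defined limit.
    static String describe(String pool, long usedBytes, long maxBytes) {
        String max = maxBytes < 0 ? "unbounded" : (maxBytes / 1024) + " KB";
        return pool + ": used " + (usedBytes / 1024) + " KB, max " + max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) {              // null when the pool is no longer valid
                System.out.println(describe(pool.getName(), usage.getUsed(), usage.getMax()));
            }
        }
    }
}
```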

Eight, project experience

By comparison, the G1 collector is recommended as long as the JDK version is 1.7u4 or later. For containerized projects, note that the memory size configured for the JVM must not be larger than the container's memory size; otherwise the parameter configuration is invalid.

Tuning instance

Load test symptoms: throughput did not rise as load increased. The servers were not fully loaded, yet increasing the number of concurrent requests did not make fuller use of server resources. In addition, TPS dropped sharply under prolonged load, and both TPS and response time were very unstable.

TPS and response time:

Resource monitoring:

The picture is not particularly clear: the app cluster's CPU usage is 46% and its memory usage is 37%

Analyzing heap memory usage:

              

(-Xmx7g -Xms7g -XX:NewSize=3g -XX:MaxNewSize=3g)

              

TPS and response time: Performance improved by more than 100% and response time reduced by about 30%

Resource monitoring:

The APP cluster consumes 65% CPU and 50% memory

**PS:** JDK 11 introduced a new garbage collector, ZGC, which can only be described as "beyond imagination". Our company still standardizes on JDK 8 and has not upgraded to JDK 11 for the time being, so I will study it and share what I find when there is a chance.

More JVM articles are planned:

Key JVM parameters, OOM, class loading, Codecache, and more.

Finally, the official GC documentation: docs.oracle.com/en/java/jav…