preface

In previous blog posts, we have provided an overview of the common garbage collection algorithms and the sorting collection algorithms common in the JVM. These are methodological analyses of garbage collection in Java from an algorithmic and normative perspective. In the JVM, the implementation of Garbage collection is the responsibility of the Garbage Collector.

The body of the

An overview of the

Before we look at garbage collector, we need to know a few terms for garbage collector.

1. The throughput

The ratio of the time the CPU spends running user code to the total CPU consumption. For example, if the total running time of the virtual machine is 100 minutes, user code time is 99 minutes, and garbage collection time is 1 minute, then the throughput is 99%.

Throughput = run user code time/(Run user code time + garbage collection time)

Pause time

Pause time refers to the pause time of an application while the garbage collector is running. Pause times can be long for exclusive collectors. With concurrent garbage collector, program pause times are shorter because the garbage collector and the application run alternately, but system throughput may be lower because it is likely to be less efficient than a sovereign garbage collector.

3. GC noun

3.1. Minor GC

Garbage collection occurs in the new generation, and because Java objects tend to be ephemeral, Minor GC is usually very frequent and generally fast.

3.2. Old GC (Major GC)

Garbage collection refers to a garbage collection that occurred in the old days, with a Major GC, often followed by at least one Minor GC (in which case the entire heap is GC, often referred to as Full GC). Major GC is typically 10 times slower than Minor GC.

4. Concurrency and parallelism

4.1. Parallel

A single thread does garbage collection, but the user thread is still waiting.

4.2. Concurrent

The parallel user thread here is executed alternately with the garbage collection thread.

4.3 Parallel

Parallelism here refers to user threads and multiple garbage collection threads working simultaneously on different cpus.

Garbage collection algorithm

1. Root search algorithm

The ROOT search algorithm is introduced from the graph theory in discrete mathematics. The program regards all reference relations as a graph, starts from a node GC ROOT, and searches for the corresponding reference node. After finding this node, it continues to search for the reference node of this node. When all reference nodes are searched, the remaining nodes are considered as unreferenced nodes, that is, useless nodes.

Red in the figure above are useless nodes that can be reclaimed. Currently, objects that can be used as GC ROOT in Java are:

  1. Objects referenced in the virtual machine stack (local variable table);

  2. Objects referenced by static variables in the method area;

  3. The object referenced by the constant in the method area;

  4. Objects referenced in the Native method stack (Native objects).

Almost all GC algorithms refer to the concept of the root search algorithm.

2. Mark-clear algorithm

The mark-sweep algorithm scans from the root collection, marking the surviving objects. After marking, scan unmarked objects in the whole space for direct recycling, as shown in the figure below:

The mark-clear algorithm does not need to move objects and only deals with non-viable objects, which is extremely efficient in the case of a large number of viable objects. However, because the mark-sweep algorithm directly reclaims the non-viable objects without defragmenting the surviving objects, memory fragmentation can result.

3. Replication algorithm

The replication algorithm divides memory into two segments. With this algorithm, all dynamically allocated objects can only be allocated in one of these segments (the active segment) and the other segment (the spatial segment) is free.

The replication algorithm also scans from the root collection, copying surviving objects into the free interval. After scanning the active range, the active range will be recycled at a time. At this point, the original idle interval becomes the active interval. The next GC repeats the same operation, and so on.

The copy algorithm is extremely efficient when there are few living objects, but comes with the cost of sacrificing half of the memory space for moving objects. So the replication algorithm must be used in situations where the survival rate of objects is very low. Most importantly, we need to overcome the 50% memory waste.

4. Mark-tidy algorithm

The mark-collation algorithm marks objects in the same way as the mark-clear algorithm, but after reclaiming the space occupied by non-viable objects, it moves all viable objects to the left free space and updates the corresponding Pointers.

Mark-collation is a move sort collation of objects on top of mark-clear, so it is more expensive, but it solves the problem of memory fragmentation.

To optimize memory collection, the JVM uses generational collection. For the new generation memory collection (Minor GC), the replication algorithm is mainly used. For older memory collection (Major GC), mark-collation algorithm is mostly used.

Garbage collector

1. Garbage collector classification criteria

2. Overview of seven garbage collectors

The JVM is implemented as Serial, ParNew, Parallel Insane, CMS, Serial Old (MSC), Parallel Old, G1, etc. In the figure below, you can see that different garbage collectors fit into different areas of memory, and if there is a wire between two garbage collectors, they can be used together.

If the garbage collector is doing garbage collection, all other worker threads must be suspended until it is fully collected. We call this strategy of suspending the worker thread in order to clean up stop-the-world. The above recyclers, Serial, ParNew, Parallel Insane, Serial Old, and Parallel Old, all use a stop-the-world strategy.

There are seven different garbage collectors for different generations of garbage collection.

  • Cenozoic recyclers: Serial, ParNew, Parallel Insane

  • Old collector: Serial Old, Parallel Old, CMS

  • Whole heap collector: G1

A wire between the two garbage collectors indicates that they can be used together. The alternative collocation schemes are as follows:

The new generation The old s
Serial Serial Old
Serial CMS
ParNew Serial Old
ParNew CMS
Parallel Scavenge Serial Old
Parallel Scavenge Parallel Old
G1 G1

3. Single-threaded garbage collector

3.1. Serial (-xx :+UseSerialGC)

Serial collector is the most basic new-generation garbage collector, which is a single-thread garbage collector. Because Serial collector does not switch between threads during garbage collection, it is efficient, especially in a single-CPU environment. For programs running in Client mode, the Serial collector is a good choice.

Serial next-generation recyclers use a replication algorithm.

3.2. Serial Old (-xx :+UseSerialGC)

The Serial Old collector is an older version of the Serial collector, a single-threaded collector that uses a mark-collation algorithm. For Server Mode VMS, it is often used in conjunction with the Parallel recycle insane for better throughput prior to JDK1.5 and is the backup solution to the CMS recycle when Concurrent Mode Failure occurs.

Serial collector and Serial Old collector execute as follows:

The Serial Old collector uses a mark-collation algorithm.

4. Multithreaded garbage collector (throughput first)

4.1. ParNew (-xx :+UseParNewGC)

The ParNew collector is a multi-threaded version of the Serial collector, which also runs in the new generation region. In implementation, they share a lot of code. In different operating environments, different threads are opened according to the number of CPU cores to achieve the optimal garbage collection effect. For Server mode applications, the ParNew collector is a good choice if you are considering CMS as a legacy collector.

ParNew New generation recycler adopts replication algorithm.

Insane (-xx :+UseParallelGC)

Like the ParNew recycle, the Parallel Recycle is a multithreaded recycle that operates in a new generation of applications. Unlike the ParNew collector, which adjusts parameters by controlling the number of garbage collection threads, the Parallel Collector is more concerned with the throughput of the application. The percentage of user code run time over time.

The Parallel Exploiter uses a replication algorithm.

4.3. ParallelOld (-xx :+UseParallelOldGC)

The Parallel Old collector is an older version of the Parallel Exploiter. It is a multi-threaded collector and uses a mark-collation algorithm. The Parallel Old recycler and the Parallel Recycle recycle also take into account throughput first, making them ideal for applications where throughput and CPU resource are important.

The Parallel Old collector uses a mark-collation algorithm.

5. Other recyclers (pause time is preferred)

5.1. CMS (-xx :+UseConcMarkSweepGC)

Concurrent Mark Sweep (CMS) collector is a collector based on the premise of the shortest recovery pause time. It belongs to multi-thread collector and adopts mark-sweep algorithm.

Compared with the previous collector, the operation process of CMS collector is more complex, which is divided into four steps:

  1. CMS Initial Mark

Initial tags simply mark objects directly associated within GC Roots. This stage is very fast and needs to Stop the World.

  1. CMS Concurrent Mark

Concurrent markers are GC Tracing, which starts with GC Roots to analyze the heap accessibility and find out the living objects.

  1. Re-marking (CMS Remark)

The re-mark phase corrects the mark record of the part of the object whose mark changes due to user actions during concurrency. The pause time in this phase is generally slightly longer than in The initial tagging phase, but much shorter than in The concurrent tagging phase, and also requires Stop The World.

  1. CMS Concurrent sweep

The concurrent cleanup phase removes garbage objects.

CMS Initial mark and CMS remark can cause user threads to stall and Stop the World.

During the entire process, the CMS collector’s memory collection is basically executed concurrently with the user thread, as shown below:

Because the CMS Collector collects concurrently and has Low pauses, some places become Concurrent Low Pause collectors.

Disadvantages of CMS collector:

  1. The CMS collector is very CPU dependent

The CMS collector is overly dependent on the multi-threaded environment. By default, the number of open threads is (number of cpus + 3) / 4. When the number of cpus is less than 4, the CMS has a significant impact on users’ queries because they have to devote half of their computing power to executing the collector thread.

  1. The CMS collector cannot remove floating garbage

New garbage is generated because the user thread is still running when the CMS collector clears the marked garbage (in its final phase). But this part of the garbage is not marked until the next GC, so it is called floating garbage.

Since memory reclamation and user threads occur simultaneously, memory is allocated as well as reclaimed. When the old generation memory usage exceeds a certain percentage, the system will carry out garbage collection; When the remaining memory cannot meet the program running requirements, a Concurrent Mode Failure occurs and the system uses the Serial Old algorithm to temporarily remove the Failure, which degrades the system performance.

  1. A large amount of space debris remains after garbage collection

The mark clearing algorithm adopted by CMS collector has the disadvantage of large amount of residual space debris after garbage collection. CMS can solve this problem to some extent with appropriate memory defragment strategy.

5.2. G1 Collector (Garbage Region priority)

G1 is the compression collector officially launched in JDK 1.7 to replace CMS. Although it does not physically separate the new generation from the old generation, it is still a generational garbage collector. G1 still distinguishes the young generation from the old generation, and the young generation still divides into Eden zone and Survivor zone.

G1 first divides the heap into equal-sized regions to avoid region-wide garbage collection. Then, the value of garbage accumulation in each Region is tracked, and a priority list is maintained in the background. The Region with the highest value is reclaimed first based on the allowed collection time. In addition, G1 uses Remembered Set to store references between regions, and references between new generations and old generations in other recyclers to avoid full heap scanning. An example of G1 partitioning is shown below:

This method of using regions to divide memory space and prioritized Region collection ensures that the G1 collector can achieve the highest collection efficiency in a limited time.

There are many similarities between G1 and CMS, and the whole process is divided into four steps:

  1. CMS Initial Mark

Initial tags simply mark objects directly associated within GC Roots. This stage is very fast and needs to Stop the World.

  1. CMS Concurrent Mark

Concurrent markers are GC Tracing, which starts with GC Roots to analyze the heap accessibility and find out the living objects.

  1. Re-marking (CMS Remark)

The re-mark phase corrects the mark record of the part of the object whose mark changes due to user actions during concurrency. The pause time in this phase is generally slightly longer than in The initial tagging phase, but much shorter than in The concurrent tagging phase, and also requires Stop The World.

  1. Screening of recycling

First, the collection value and cost of each Region are sorted, and a collection plan is made based on the expected GC pause time of the user. This phase can be executed concurrently with the user program, but because only a portion of the Region is reclaimed, the time is user-controlled, and halting the user thread greatly improves collection efficiency.

Compared with other GC collections, G1 has the following four characteristics:

  • Parallelism and concurrency

Using multiple cpus to shorten stop-the-world pause times, while other collectors need to pause GC actions performed by Java threads, the G1 collector can still allow Java programs to continue executing concurrently.

  • Generational recycling

As with other recyclers, the generational concept remains in G1. Although G1 can manage the entire GC heap independently without the cooperation of other collectors, it can use different strategies to handle newly created objects and old objects that have been around for a while and have survived multiple GC’s for better collection results. Cenozoic and old age are no longer physically isolated, but independent regions of equal size.

  • Spatial integration

Unlike CMS’s mark-clean algorithm, G1 is a collector based on the mark-clean algorithm as a whole. Locally (between two regions) it is based on the replication algorithm.

However, both algorithms mean that G1 does not generate memory fragmentation during operation and can be recycled to provide clean free memory. This feature helps programs run for a long time and allocate large objects without triggering the next GC prematurely because contiguity memory space cannot be found.

  • Predictable pauses

This is another big advantage G1 has over CMS, as reducing pause times is a common focus for both G1 and CMS. In addition to pursuing low pauses, G1 also models predictable pause times, allowing users to explicitly specify that no more than N milliseconds should be spent on garbage collection within a time segment of M milliseconds. (Priority list for background maintenance. Regions with high value are reclaimed first.)

reference

Zhou, zhiming, understanding the Java Virtual Machine in depth: Advanced JVM features and Best Practices, China Machine Press


Welcome to pay attention to the technical public number: Zero one Technology Stack

This account will continue to share learning materials and articles on back-end technologies, including virtual machine basics, multithreaded programming, high-performance frameworks, asynchronous, caching and messaging middleware, distributed and microservices, architecture learning and progression.