If collection algorithms are the methodology of memory reclamation, then garbage collectors are its concrete implementation. When we compare collectors, we are not trying to pick the best one: so far there is no best garbage collector, let alone a universal one, so all we can do is choose the collector that suits the specific application scenario. Think about it: if there were a universal, perfect collector for all scenarios, the Java Virtual Machine would not implement so many different garbage collectors.

Serial collector (-XX:+UseSerialGC, -XX:+UseSerialOldGC)

The Serial collector is the most basic and oldest garbage collector. As the name suggests, it is single-threaded. “Single-threaded” means not only that it uses a single garbage collection thread, but also that it must pause all other worker threads (Stop The World) while it collects, until the collection is complete. The young generation uses the copying algorithm, and the old generation uses the mark-compact algorithm.

The virtual machine designers were of course aware of the poor user experience caused by Stop The World, so pause times have been shortened in each subsequent collector design (pauses still exist, and the search for a better garbage collector continues). Does the Serial collector have any advantages over other collectors, then? It does: it is simple and efficient (compared with other collectors running on a single thread). Because it has no thread-interaction overhead, the Serial collector naturally achieves high single-threaded collection efficiency. The Serial Old collector is the old-generation version of the Serial collector and is also single-threaded. It has two main uses: pairing with the Parallel Scavenge collector in JDK 1.5 and earlier, and serving as the fallback for the CMS collector.
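
To check which collectors a given flag actually selected, the standard java.lang.management API can list them at run time. The following is a minimal sketch (the class name GcInspector is only for illustration). The names it prints depend on the JVM and its flags; on HotSpot, for example, starting with the serial collector typically reports "Copy" and "MarkSweepCompact".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcInspector {
    /** Returns the names of the collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start (-1 if undefined).
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```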

ParNew collector (-XX:+UseParNewGC)

The ParNew collector is essentially a multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior (control parameters, collection algorithms, collection policies, and so on) is exactly the same as the Serial collector's. By default, the number of collection threads equals the number of CPU cores. You can set it with the -XX:ParallelGCThreads parameter, but changing it is generally not recommended. The young generation uses the copying algorithm, and the old generation uses the mark-compact algorithm.

It is the first choice for many virtual machines running in Server mode, and apart from the Serial collector it is the only collector that works with the CMS collector.

Parallel Scavenge collector (-XX:+UseParallelGC, -XX:+UseParallelOldGC)

The Parallel Scavenge collector is similar to the ParNew collector. It is the default collector in Server mode.

So what’s so special about it?

The Parallel Scavenge collector focuses on throughput, that is, the ratio of CPU time spent running user code to total CPU time consumed. It provides a number of parameters for tuning toward the most suitable pause time or the maximum throughput. If you are not familiar with how the collector works, handing memory management tuning over to the virtual machine is a good option. The young generation uses the copying algorithm, and the old generation uses the mark-compact algorithm.
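
The throughput definition is simple enough to write down directly. A small sketch (the class and method names are made up for illustration) that computes it from the time spent in user code and the time spent in GC:

```java
final class Throughput {
    /** Throughput = user-code time / (user-code time + GC time). */
    static double of(double userMillis, double gcMillis) {
        if (userMillis < 0 || gcMillis < 0 || userMillis + gcMillis == 0) {
            throw new IllegalArgumentException("times must be non-negative and not both zero");
        }
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // 99 minutes of user code and 1 minute of GC out of 100 minutes in total.
        System.out.println(Throughput.of(99, 1)); // 0.99
    }
}
```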

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm. Where throughput and CPU resources matter, Parallel Scavenge combined with Parallel Old is the preferred choice.

CMS collector (-XX:+UseConcMarkSweepGC, old generation)

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time, which makes it well suited to applications that care about user experience. It is the first truly concurrent collector in the HotSpot virtual machine: for the first time, garbage collection threads and user threads could work (basically) at the same time.

As the words Mark Sweep in the name indicate, the CMS collector implements the “mark-sweep” algorithm, and its operation is somewhat more complicated than that of the previous collectors.

The whole process is divided into four steps:

  1. Initial mark: Pause all other threads and record the objects that GC Roots reference directly. This is very fast.
  2. Concurrent mark: GC threads and user threads run at the same time, and a closure structure records the reachable objects. At the end of this phase, however, the closure is not guaranteed to contain every currently reachable object: user threads may keep updating reference fields, so the GC threads cannot keep the reachability analysis up to date in real time. The algorithm therefore keeps track of where references were updated.
  3. Remark: Corrects the marking records of objects whose marks changed because the user program kept running during concurrent marking. The pause in this phase is usually slightly longer than in the initial mark phase and much shorter than the concurrent mark phase.
  4. Concurrent sweep: User threads keep running while the GC threads sweep the unmarked areas.
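
The four steps can be sketched on a toy object graph. This is a single-threaded illustration, not how CMS is implemented: there are no user threads here, so concurrent mark and remark collapse into one trace, and the heap is just a hypothetical map from object names to the names they reference.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MarkSweepSketch {
    /** Step 1: mark only the objects that the roots reference directly. */
    static Set<String> initialMark(List<String> roots) {
        return new HashSet<>(roots);
    }

    /** Steps 2 and 3: trace the graph from the objects marked so far. */
    static void trace(Map<String, List<String>> heap, Set<String> marked) {
        Deque<String> pending = new ArrayDeque<>(marked);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            for (String ref : heap.getOrDefault(obj, List.of())) {
                if (marked.add(ref)) {   // true only the first time we see ref
                    pending.push(ref);
                }
            }
        }
    }

    /** Step 4: everything left unmarked is garbage. */
    static Set<String> sweep(Map<String, List<String>> heap, Set<String> marked) {
        Set<String> garbage = new TreeSet<>(heap.keySet());
        garbage.removeAll(marked);
        return garbage;
    }

    static Set<String> collect(Map<String, List<String>> heap, List<String> roots) {
        Set<String> marked = initialMark(roots);
        trace(heap, marked);
        return sweep(heap, marked);
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = Map.of(
                "a", List.of("b"),
                "b", List.of(),
                "c", List.of("d"),   // c points to d, but no root reaches either
                "d", List.of());
        System.out.println(collect(heap, List.of("a"))); // [c, d]
    }
}
```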

It has the following disadvantages:

  • The CMS collector is very sensitive to processor resources (it competes with the application for them).
  • It cannot handle floating garbage.
  • Its mark-sweep algorithm can produce a large amount of memory fragmentation. You can enable compaction with the -XX:+UseCMSCompactAtFullCollection switch.
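
To see why fragmentation matters, here is a toy model (an assumption for illustration, not JVM code) in which the heap is an array of equal slots. After a sweep, half the heap is free, yet no allocation larger than one slot fits until the live objects are compacted.

```java
final class FragmentationSketch {
    /** Longest run of free slots (false = free, true = live object). */
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean live : heap) {
            run = live ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Compaction: slide every live slot to the front of the heap. */
    static boolean[] compact(boolean[] heap) {
        boolean[] result = new boolean[heap.length];
        int next = 0;
        for (boolean live : heap) {
            if (live) {
                result[next++] = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Heap after a sweep: live and free slots alternate.
        boolean[] swept = {true, false, true, false, true, false, true, false};
        System.out.println(largestFreeRun(swept));          // 1
        System.out.println(largestFreeRun(compact(swept))); // 4
    }
}
```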

CMS parameters:

  • -XX:+UseConcMarkSweepGC: enables CMS
  • -XX:ConcGCThreads: the number of concurrent GC threads
  • -XX:+UseCMSCompactAtFullCollection: compact the heap after a full GC (to reduce fragmentation)
  • -XX:CMSFullGCsBeforeCompaction: how many full GCs to run before compacting once; the default is 0, meaning every full GC is followed by compaction
  • -XX:CMSInitiatingOccupancyFraction: triggers a full GC when old-generation usage reaches this proportion (a percentage; the default is 92)
  • -XX:+UseCMSInitiatingOccupancyOnly: use only the configured collection threshold (the value of -XX:CMSInitiatingOccupancyFraction); if this is not specified, the JVM uses the configured value only the first time and adjusts it automatically afterwards
  • -XX:+CMSScavengeBeforeRemark: runs a minor GC before the CMS GC. The purpose is to reduce references from the old generation to the young generation and so lower the cost of the CMS marking phase. Generally, 80% of CMS GC time is spent in the remark phase
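
As a rough model of how the two threshold parameters behave, the sketch below encodes the descriptions above as plain predicates. The class, method names and the exact comparison operators are assumptions made for illustration; the real decisions inside the JVM involve more inputs.

```java
final class CmsTriggerSketch {
    /** Occupancy check in the spirit of CMSInitiatingOccupancyFraction. */
    static boolean shouldStartCollection(long usedBytes, long capacityBytes, int fractionPercent) {
        return usedBytes * 100 >= capacityBytes * (long) fractionPercent;
    }

    /** Compaction check in the spirit of CMSFullGCsBeforeCompaction (0 = compact every time). */
    static boolean shouldCompact(int fullGcsSinceLastCompaction, int fullGcsBeforeCompaction) {
        return fullGcsSinceLastCompaction >= fullGcsBeforeCompaction;
    }

    public static void main(String[] args) {
        long capacity = 1000;
        System.out.println(shouldStartCollection(919, capacity, 92)); // false
        System.out.println(shouldStartCollection(920, capacity, 92)); // true
        System.out.println(shouldCompact(0, 0)); // true
        System.out.println(shouldCompact(1, 2)); // false
    }
}
```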

G1 (-XX:+UseG1GC)

The G1 collector is a server-oriented garbage collector aimed at machines with multiple processors and large amounts of memory. It meets its GC pause time target with high probability while still delivering high throughput.

G1 divides the Java heap into independent regions of equal size, and the JVM aims for no more than about 2048 of them. G1 keeps the concepts of young and old generations, but they are no longer physically separated: each is a set of (possibly non-contiguous) Regions. A Region may start out in the young generation and, after it has been collected, be reused as part of the old generation. In other words, the role of a Region can change dynamically.

G1 moves objects into the old generation under the same rules as the other collectors, except for large objects. G1 has a dedicated kind of Region, called Humongous, for allocating large objects, instead of placing them directly into old-generation Regions.
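
A simplified model of the region layout, under stated assumptions: region sizes are powers of two between 1MB and 32MB, chosen from the heap size with a target of about 2048 regions, and an object of at least half a region is treated as humongous. The constants, the rounding direction and the names below are assumptions for illustration rather than the JVM's exact logic.

```java
final class G1RegionSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048;

    /** Picks a power-of-two region size, targeting about 2048 regions. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        size = Long.highestOneBit(size);   // round down to a power of two
        return Math.min(size, MAX_REGION);
    }

    /** Assumed threshold: an object of at least half a region is humongous. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long region = regionSize(4096 * MB);
        System.out.println(region / MB);                  // 2
        System.out.println(isHumongous(1 * MB, region));  // true
        System.out.println(isHumongous(512 * 1024, region)); // false
    }
}
```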

A G1 collection goes through the following steps:

  1. Initial mark: Marks the objects that GC Roots reference directly.
  2. Concurrent mark: Performs reachability analysis on the heap starting from the GC Roots, recursively scanning the whole object graph.
  3. Final mark: Briefly pauses the user threads to process the SATB (snapshot-at-the-beginning) records that remain after the concurrent phase ends.
  4. Screening and reclamation: Updates the statistics of each Region and builds a reclamation plan based on the desired pause time. Reclamation mainly uses the copying algorithm.

Characteristics:

  • Parallelism and concurrency: G1 takes full advantage of multi-CPU, multi-core hardware, using multiple CPUs to shorten Stop-The-World pauses.
  • Spatial integration: Unlike the “mark-sweep” algorithm of CMS, G1 as a whole is based on the “mark-compact” algorithm, while locally it is based on the “copying” algorithm.
  • Predictable pauses: The -XX:MaxGCPauseMillis parameter asks for garbage collection to finish within M milliseconds (G1 maintains a priority list in the background and, within the allowed collection time, collects the Regions with the highest reclamation value first).
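
The priority-list idea behind predictable pauses can be sketched as a greedy selection. The sketch assumes each Region comes with an estimate of its reclaimable space and its collection cost; both estimates, and all names here, are hypothetical inputs rather than values read from a real JVM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class G1SelectionSketch {
    /** A candidate Region with hypothetical estimates supplied by the caller. */
    static final class Region {
        final String name;
        final long reclaimableBytes;   // space that collecting it would free
        final double costMillis;       // predicted time to collect it

        Region(String name, long reclaimableBytes, double costMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMillis = costMillis;
        }

        /** Reclamation value: bytes freed per millisecond of pause. */
        double value() {
            return reclaimableBytes / costMillis;
        }
    }

    /** Greedily picks the most valuable Regions that still fit in the pause budget. */
    static List<String> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        Comparator<Region> byValue = Comparator.comparingDouble(Region::value);
        sorted.sort(byValue.reversed());

        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("A", 80, 4),   // value 20
                new Region("B", 90, 9),   // value 10
                new Region("C", 30, 1),   // value 30
                new Region("D", 10, 6));  // value about 1.7
        System.out.println(choose(regions, 10)); // [C, A]
    }
}
```

With a 10 ms budget, C and A are taken first because they free the most space per millisecond; B and D no longer fit in the time that remains.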

ZGC (JDK 11)

Its goal is low latency: keeping garbage collection pauses under 10 ms at any heap size, with little impact on throughput. It has the following characteristics:

  • A dynamic Region-based memory layout. ZGC Regions are dynamic: they are created and destroyed on demand. On x64 they come in three size classes, small, medium and large:

    1. Small: the capacity is fixed at 2MB and holds small objects under 256KB.
    2. Medium: the capacity is fixed at 32MB and holds objects of at least 256KB but less than 4MB.
    3. Large: the capacity is not fixed and can change dynamically, but it must be an integer multiple of 2MB. It holds large objects of 4MB or more, and each large Region stores only one large object.
  • There are no generations.

  • A concurrent mark-compact algorithm, implemented with colored pointers, read barriers and memory multi-mapping.

    The colored pointer is the signature design of ZGC: a technique that stores a small amount of extra information directly in the pointer (the 4 high bits are taken to hold four flags). From these flag bits the virtual machine can read, straight from the pointer, the tri-color marking state of the referenced object, whether it has entered the relocation set (that is, has been moved), and whether it is reachable only through a finalize() method.

    Three advantages of colored pointers:

    1. Colored pointers allow a Region to be released and reused as soon as its live objects have been moved out, rather than waiting until every reference into that Region across the heap has been fixed.
    2. Colored pointers significantly reduce the number of memory barriers needed during garbage collection. Memory barriers, write barriers in particular, exist to record changes to object references. If that information is kept directly in the pointers, some dedicated recording operations obviously become unnecessary.
    3. Colored pointers can serve as an extensible storage structure for recording more data about object marking and relocation, leaving room for further performance improvements in the future.
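
Two of the ideas above can be made concrete with a little bit arithmetic. The sketch first classifies an object by the size limits listed for the three Region classes, then packs flags into the upper bits of a 64-bit value. The bit layout (42 address bits with four flag bits directly above them) and every name in the class are assumptions for illustration; a plain long stands in for a pointer.

```java
final class ZgcSketch {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    /** Region class for an object, using the size limits listed above. */
    static String regionClassFor(long objectBytes) {
        if (objectBytes < 256 * KB) return "small";
        if (objectBytes < 4 * MB) return "medium";
        return "large";
    }

    /** Capacity of a large Region: the object size rounded up to a multiple of 2MB. */
    static long largeRegionCapacity(long objectBytes) {
        return (objectBytes + 2 * MB - 1) / (2 * MB) * (2 * MB);
    }

    // Assumed layout: 42 address bits, then four flag bits above them.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long MARKED0 = 1L << ADDRESS_BITS;
    static final long MARKED1 = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);

    /** Builds a colored pointer from a raw address and one flag. */
    static long color(long address, long flag) {
        return (address & ADDRESS_MASK) | flag;
    }

    /** Strips the flag bits to recover the raw address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    static boolean has(long pointer, long flag) {
        return (pointer & flag) != 0;
    }

    public static void main(String[] args) {
        System.out.println(regionClassFor(100 * KB));          // small
        System.out.println(largeRegionCapacity(5 * MB) / MB);  // 6
        long p = color(0x1234L, REMAPPED);
        System.out.println(Long.toHexString(address(p)));      // 1234
        System.out.println(has(p, REMAPPED));                  // true
    }
}
```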

The procedure is as follows:

  1. Concurrent mark: As in G1 and Shenandoah, this phase traverses the object graph for reachability analysis, and it also goes through short pauses similar to the initial mark and final mark of G1 and Shenandoah.
  2. Concurrent prepare for relocation: Based on specific criteria, works out which Regions this collection will clear and groups them into the relocation set.
  3. Concurrent relocation: Copies the live objects in the relocation set to new Regions and maintains a forwarding table for each Region in the relocation set, recording the mapping from each old object to its new copy.
  4. Concurrent remap: Fixes all references in the heap that still point to old objects in the relocation set.
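
The forwarding table used in the last two steps is essentially a map from old locations to new ones. A minimal single-threaded sketch, with addresses modeled as plain long values and all names chosen for illustration:

```java
import java.util.HashMap;
import java.util.Map;

final class ForwardingSketch {
    /** Forwarding table of one Region: old address -> new address. */
    private final Map<Long, Long> forwarding = new HashMap<>();

    /** Relocation: after copying a live object, remember where it went. */
    void relocate(long oldAddress, long newAddress) {
        forwarding.put(oldAddress, newAddress);
    }

    /** Remap: translate a possibly stale reference; unmoved objects keep their address. */
    long remap(long reference) {
        return forwarding.getOrDefault(reference, reference);
    }

    public static void main(String[] args) {
        ForwardingSketch table = new ForwardingSketch();
        table.relocate(100L, 900L);

        long[] references = {100L, 200L, 100L};
        for (int i = 0; i < references.length; i++) {
            references[i] = table.remap(references[i]);
        }
        System.out.println(java.util.Arrays.toString(references)); // [900, 200, 900]
    }
}
```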

Shenandoah (JDK 12)

Its goal is a garbage collector that keeps collection pauses under 10 ms at any heap size.

Shenandoah resembles G1: it has a similar memory layout, handles phases such as initial mark and concurrent mark in a highly consistent way, and even shares some implementation code. It makes the following improvements over the G1 collector:

  • It supports a concurrent compaction algorithm.
  • By default it does not use generational collection (not because generations have no value for Shenandoah, but as a cost-benefit tradeoff).
  • It drops G1's remembered set, which costs a lot of memory and computation to maintain, and instead records cross-Region references in a global data structure called the connection matrix. (The connection matrix can be thought of as a two-dimensional table: if Region N has an object pointing to Region M, the cell at row N, column M is marked.)
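
The connection matrix described above maps naturally onto a two-dimensional boolean array. A minimal sketch (class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

final class ConnectionMatrix {
    private final boolean[][] connected;

    ConnectionMatrix(int regionCount) {
        connected = new boolean[regionCount][regionCount];
    }

    /** Records that some object in Region `from` points into Region `to`. */
    void recordReference(int from, int to) {
        connected[from][to] = true;
    }

    boolean pointsInto(int from, int to) {
        return connected[from][to];
    }

    /** Scans one column to find every Region holding references into `target`. */
    List<Integer> regionsPointingInto(int target) {
        List<Integer> sources = new ArrayList<>();
        for (int from = 0; from < connected.length; from++) {
            if (connected[from][target]) {
                sources.add(from);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        ConnectionMatrix matrix = new ConnectionMatrix(4);
        matrix.recordReference(1, 3);
        matrix.recordReference(2, 3);
        System.out.println(matrix.regionsPointingInto(3)); // [1, 2]
    }
}
```

When Region 3 is collected, only Regions 1 and 2 need to be scanned for references into it.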

The procedure is as follows:

  1. Initial mark: As in G1, first marks the objects directly referenced by GC Roots.
  2. Concurrent mark: As in G1, traverses the object graph and marks all reachable objects.
  3. Final mark: As in G1, processes the remaining SATB scans.
  4. Concurrent cleanup: Cleans up Regions in which not a single live object was found.
  5. Concurrent evacuation: This is the core difference from earlier HotSpot collectors. Objects are copied while user threads keep running, and the resulting problem of objects moving during the copy is solved with read barriers and forwarding pointers known as “Brooks Pointers”.
  6. Initial update references: Establishes a thread rendezvous point to make sure that all collector threads in the concurrent evacuation phase have finished moving their objects.
  7. Final update references: Fixes the references held in GC Roots.
  8. Concurrent cleanup: Reclaims memory. After concurrent evacuation and reference updating, no live objects remain in any Region of the collection set, so these Regions become Immediate Garbage Regions. The concurrent cleanup process is run once more to reclaim their memory.
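
The forwarding-pointer idea from step 5 can be sketched with ordinary Java objects. Each object carries an extra reference that points to itself until the object is moved; every access goes through that reference first. This is a single-threaded illustration with hypothetical names; a real collector has to install the forwarding pointer atomically because user threads are running at the same time.

```java
final class BrooksPointerSketch {
    /** A heap object with an extra forwarding slot in front of its fields. */
    static final class Obj {
        Obj forwardee = this;   // points to itself until the object is moved
        int payload;

        Obj(int payload) {
            this.payload = payload;
        }
    }

    /** Barrier: always follow the forwarding pointer before touching the object. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Evacuation: copy the object, then redirect the old copy to the new one. */
    static Obj evacuate(Obj old) {
        Obj copy = new Obj(old.payload);
        old.forwardee = copy;
        return copy;
    }

    public static void main(String[] args) {
        Obj original = new Obj(7);
        Obj staleRef = original;            // a reference taken before the move
        Obj copy = evacuate(original);

        resolve(staleRef).payload = 8;      // the write lands on the new copy
        System.out.println(copy.payload);   // 8
        System.out.println(resolve(staleRef) == copy); // true
    }
}
```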

How to choose a collector

The pairing relationships between the garbage collectors are as follows. G1 is the officially recommended choice because of its high performance.