Abstract: The G1 garbage collector is a garbage collector aimed primarily at server-side applications.

This article is from the Huawei Cloud Community: “JVM interview points: understanding the G1 Garbage Collector, from the simple to the deep!!”, by Code Mantis Shrimp.

Introduction to the G1 garbage collector

The G1 garbage collector is aimed primarily at server-side applications. It is a milestone in the evolution of garbage collection technology and differs from earlier collectors, first of all in its design thinking, as shown below:

G1 partitioning of the Java heap



The figure above may be hard to follow at first if you are new to G1. The explanation below should make it clear.

G1 partitions the Java heap differently from the layout used by earlier collectors.

Previously the Java heap was divided as follows: the young generation is split into an Eden space and a Survivor space, and the Survivor space is further split into a from space and a to space.

G1 no longer insists on generational areas of fixed size and fixed number. Instead, it divides the contiguous Java heap into many independent Regions of equal size, each of which can serve as an Eden space, a Survivor space, or an old generation space as needed.

This change in thinking and design lets G1 build a collection set from any part of the heap. The selection criterion is no longer which generation a Region belongs to, but which Regions hold the most garbage and therefore yield the greatest benefit when collected. This is the Mixed GC mode of the G1 collector.

Regions also include a special class, Humongous regions, for storing large objects. G1 treats any object larger than half of a Region’s capacity as a large object. Very large objects that exceed the capacity of a whole Region are placed in N contiguous Humongous regions.
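The rule above can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not HotSpot code: the class name `HumongousCheck`, both helper methods, and the 4 MB Region size are assumptions made for the example.

```java
class HumongousCheck {
    // Hypothetical helper: an object counts as humongous when it is
    // larger than half of a Region's capacity.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous Regions needed to hold the object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 4L << 20; // assumed Region size: 4 MB
        long[] sizes = {1L << 20, 3L << 20, 9L << 20}; // 1 MB, 3 MB, 9 MB objects
        for (long size : sizes) {
            System.out.println((size >> 20) + " MB object: humongous="
                    + isHumongous(size, region)
                    + ", regions=" + regionsNeeded(size, region));
        }
    }
}
```

With 4 MB Regions, the 1 MB object is an ordinary object, the 3 MB object is humongous but fits in one Region, and the 9 MB object needs three contiguous Regions.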

The Region size can range from 1 MB to 32 MB and must be a power of two.

By default G1 targets about 2048 Regions.

The Region size can be set explicitly with:

-XX:G1HeapRegionSize=N
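To see how the size limits and the Region count interact, here is a simplified sketch of how a Region size could be derived from the heap size when the flag is not set. It assumes the heap is divided by a target of 2048 Regions, rounded down to a power of two, and clamped to the 1 MB to 32 MB range. The class and method names are hypothetical, and the real JVM heuristic may differ in its details.

```java
class RegionSizeSketch {
    static final long MB = 1024L * 1024;
    static final long MIN_REGION = 1 * MB;    // lower bound on Region size
    static final long MAX_REGION = 32 * MB;   // upper bound on Region size
    static final long TARGET_REGIONS = 2048;  // assumed target Region count

    // Simplified heuristic, not the exact HotSpot calculation.
    static long regionSize(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION);
    }

    public static void main(String[] args) {
        long[] heaps = {1024 * MB, 8 * 1024 * MB, 128 * 1024 * MB};
        for (long heap : heaps) {
            long size = regionSize(heap);
            System.out.println((heap / MB) + " MB heap -> " + (size / MB)
                    + " MB Regions, " + (heap / size) + " Regions");
        }
    }
}
```

Under these assumptions a 1 GB heap ends up with 1 MB Regions (1024 of them), an 8 GB heap with 4 MB Regions (2048), and a 128 GB heap hits the 32 MB cap and therefore has 4096 Regions, which shows that 2048 is a target rather than a fixed number.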

G1’s design looks fresh, but careful readers may have spotted a problem: objects in one Region can reference objects in another. How are these cross-Region references handled?

  1. In G1, as in other generational collectors, the JVM uses a Remembered Set to avoid scanning the whole heap.
  2. Each Region has its own Remembered Set.
  3. Every write to a reference-type field triggers a write barrier that temporarily interrupts the operation.
  4. The barrier checks whether the reference being written points to an object in a different Region from the one holding the reference-type field.
  5. If it does, the reference information is recorded, via the card table, in the Remembered Set of the Region that contains the referenced object.
  6. During garbage collection, the Remembered Sets are added to the GC Roots enumeration range, which guarantees that no global scan is needed.

G1’s Remembered Set can be thought of as a hash table in which each key is the start address of another Region and each value is a set of card table index numbers.
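The barrier steps and the hash-table view described above can be modelled roughly as follows. This is a toy model rather than the JVM’s implementation: addresses are plain `long` values, the 1 MB Region size and 512-byte card size are assumptions, and every name in the class is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RememberedSetSketch {
    static final long REGION_BYTES = 1L << 20; // assumed Region size: 1 MB
    static final long CARD_BYTES = 512;        // assumed card size

    // One Remembered Set per Region, keyed by that Region's start address.
    // Inner map: start address of a referencing Region -> card indices
    // that contain references into this Region.
    static final Map<Long, Map<Long, Set<Integer>>> rememberedSets = new HashMap<>();

    static long regionStart(long address) {
        return address - (address % REGION_BYTES);
    }

    static int cardIndex(long address) {
        return (int) (address / CARD_BYTES);
    }

    // Simulated write barrier: called when the reference field at
    // fieldAddress is set to point at the object at targetAddress.
    static void postWriteBarrier(long fieldAddress, long targetAddress) {
        long fromRegion = regionStart(fieldAddress);
        long toRegion = regionStart(targetAddress);
        if (fromRegion == toRegion) {
            return; // same Region: nothing to remember
        }
        rememberedSets
                .computeIfAbsent(toRegion, k -> new HashMap<>())
                .computeIfAbsent(fromRegion, k -> new TreeSet<>())
                .add(cardIndex(fieldAddress));
    }

    public static void main(String[] args) {
        postWriteBarrier(0x000100L, 0x000200L); // same Region, ignored
        postWriteBarrier(0x000400L, 0x100010L); // Region 0 -> Region 1
        postWriteBarrier(0x200000L, 0x100020L); // Region 2 -> Region 1
        System.out.println(rememberedSets);
    }
}
```

When Region 1 is collected in this model, only the cards listed in its Remembered Set need to be scanned as extra roots, instead of every other Region in the heap.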

Because G1 partitions the Java heap into Regions, and there are far more Regions than there are generations in a traditional collector, G1 needs extra memory, roughly 10% to 20% of the Java heap capacity, to maintain its bookkeeping compared with traditional collectors.

G1 Garbage Collector workflow

  • Initial Marking: This phase only marks the objects directly reachable from the GC Roots and updates the TAMS (Top at Mark Start) pointer so that, while user threads run concurrently in the next phase, new objects are allocated correctly in the available Regions. It requires pausing user threads, but only very briefly, and it piggybacks on a Minor GC, so G1 incurs no additional pause for this phase.
  • Concurrent Marking: Starting from the GC Roots, this phase performs reachability analysis on the objects in the heap, recursively scanning the whole object graph to find the live objects. It takes a long time but runs concurrently with the user program. Once the object graph scan completes, the objects recorded by SATB (snapshot-at-the-beginning) whose references changed during concurrent marking are reprocessed.
  • Final Marking: Another brief pause of user threads to process the few SATB records left over after the concurrent phase ends.
  • Screening (Live Data Counting and Evacuation): Updates the statistics for each Region, ranks the Regions by reclamation value and cost, and draws up a collection plan based on the user’s expected pause time. Any number of Regions can be chosen to form the collection set; the live objects in those Regions are then copied into empty Regions, and the old Regions are cleared entirely.

All phases except Concurrent Marking require a stop-the-world (STW) pause.
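The screening phase’s planning step, ranking Regions by value and cost and then fitting the plan into the expected pause time, can be illustrated with a simple greedy selection. This is a sketch under assumptions: the `RegionStats` fields, the efficiency measure (reclaimable space per millisecond of evacuation), and all the numbers are made up for the example and are not G1’s actual prediction model.

```java
import java.util.ArrayList;
import java.util.List;

class CollectionSetSketch {
    // Hypothetical per-Region statistics gathered during marking.
    static final class RegionStats {
        final String name;
        final long reclaimableMb;      // estimated garbage in the Region
        final double evacuationMillis; // estimated cost of evacuating it

        RegionStats(String name, long reclaimableMb, double evacuationMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.evacuationMillis = evacuationMillis;
        }

        double efficiency() {
            return reclaimableMb / evacuationMillis;
        }
    }

    // Greedy choice: most efficient Regions first, while the pause budget allows.
    static List<String> choose(List<RegionStats> candidates, double pauseBudgetMillis) {
        List<RegionStats> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (RegionStats r : sorted) {
            if (spent + r.evacuationMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.evacuationMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<RegionStats> regions = List.of(
                new RegionStats("A", 8, 10),
                new RegionStats("B", 2, 8),
                new RegionStats("C", 6, 5),
                new RegionStats("D", 1, 5));
        System.out.println(choose(regions, 16)); // prints [C, A]
    }
}
```

With a 16 ms budget, Regions C and A are selected (15 ms in total), while B and D are left for a later collection. In the real collector the pause target is set by the user with `-XX:MaxGCPauseMillis`.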

The difference between G1 and CMS

  • G1 is based on a mark-compact algorithm overall and on a copying algorithm locally (between two Regions), whereas CMS is based on mark-sweep. As a result, G1 does not produce memory fragmentation, while CMS does.
  • CMS uses a post-write barrier to maintain the card table. G1 also uses a post-write barrier to maintain the card table, and in addition uses a pre-write barrier to track pointer changes during concurrent marking (to implement snapshot-at-the-beginning).
  • CMS uses the traditional young/old generational layout of the Java heap, whereas G1 uses the Region-based layout.
  • The CMS collector only collects the old generation and is paired with a young-generation collector, Serial or ParNew. G1 collects both the young and old generations and does not need to be combined with another collector.
  • CMS uses incremental update to handle objects wrongly judged dead during concurrent marking, while G1 uses snapshot-at-the-beginning (SATB).
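The last point is easier to see on a tiny example. The sketch below replays the classic scenario in which a live object can be missed: the marker has finished scanning A and still has B queued, then the user program stores a reference to C in A and removes B’s reference to C. It is a toy model with hypothetical names, one reference field per object, and barriers reduced to single lines; the real collectors record these events in queues and process them later.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ConcurrentMarkingSketch {
    static final class Node {
        final String name;
        Node ref; // a single reference field, for brevity
        Node(String name) { this.name = name; }
    }

    enum Barrier { NONE, INCREMENTAL_UPDATE, SATB }

    // Replays the scenario and returns the names of the objects marked live.
    static Set<String> mark(Barrier barrier) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        b.ref = c; // initial graph: roots -> A, roots -> B, B -> C

        Set<Node> marked = new HashSet<>();
        Deque<Node> worklist = new ArrayDeque<>();
        marked.add(a);    // A is already fully scanned
        marked.add(b);    // B is marked but not yet scanned
        worklist.push(b);

        // The user program runs concurrently: A.ref = C, then B.ref = null.
        a.ref = c;
        if (barrier == Barrier.INCREMENTAL_UPDATE) {
            worklist.push(a); // rescan the scanned object that gained a reference
        }
        Node old = b.ref;
        if (barrier == Barrier.SATB && old != null && marked.add(old)) {
            worklist.push(old); // keep the overwritten referent alive
        }
        b.ref = null;

        // The marker resumes and drains its worklist.
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (n.ref != null && marked.add(n.ref)) {
                worklist.push(n.ref);
            }
        }
        Set<String> names = new TreeSet<>();
        for (Node n : marked) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        for (Barrier barrier : Barrier.values()) {
            System.out.println(barrier + " -> " + mark(barrier));
        }
    }
}
```

Without a barrier the result is [A, B], so the still-reachable C would be reclaimed by mistake. Both incremental update and SATB yield [A, B, C]; they differ in which side of the pointer change they record, the newly written reference or the overwritten one.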
