ZGC introduction

ZGC (the Z Garbage Collector) is a low-latency garbage collector that was introduced as an experimental feature in JDK 11. Its design goals were:

  • Pause times shall not exceed 10ms;
  • Pause times do not increase with the size of the heap or with the size of the set of active objects;
  • Support heaps from 8MB to 4TB (16TB will be supported in the future).

When these goals were first put forward, many people thought the designers were bragging.

But today, it seems that these boasts are being realized one by one.

As of the latest release, JDK 15, the goals of “no more than 10ms of pause time” and “16TB of heap support” have been met, and it has been made clear that ZGC in JDK 15 is no longer an experimental garbage collector and is ready for production use.

Now that ZGC is becoming familiar, can interview questions about it be far behind?

This article starts from the design ideas behind ZGC and explains why it performs so well in low-latency scenarios.

The core technology

Multiple mapping

To better understand ZGC memory management, let’s take a look at this example:

You are a son in the eyes of your parents and a boyfriend in the eyes of your girlfriend, and to the whole world you are the most handsome person alive. You also have a name, but your name is just a label for you, not you yourself. Written out, the relationships are:

  • In your father’s eyes, you are a son;
  • In your girlfriend’s eyes, you are a boyfriend;
  • In the eyes of the whole world, you are the most handsome person in the world.

If your name is unique in the world, then “your name”, “your father’s son”, “your girlfriend’s boyfriend” and “the most handsome person in the world” all identify the same person: you.

Now let’s look at the memory management of the ZGC.

In order to manage memory efficiently and flexibly, ZGC implements two levels of memory management: virtual memory and physical memory, and realizes the mapping relationship between physical memory and virtual memory. This is basically the same idea as virtual and physical addresses in the operating system.

When an application creates an object, it first requests a virtual address in the heap space. ZGC also reserves a virtual address for the object in each of the Marked0, Marked1 and Remapped view spaces. These three virtual addresses correspond to the same physical address.

What do the Marked0, Marked1 and Remapped views mean?

In the example above, these three views correspond to “in the eyes of your dad”, “in the eyes of your girlfriend”, and “in the eyes of the world”.

The addresses inside each of the three views are virtual addresses; they correspond to “the son in your father’s eyes”, “the boyfriend in your girlfriend’s eyes”, and so on.

Finally, each of these virtual addresses maps to the same physical address, which corresponds to “yourself” in the example above.

This relationship is represented in a simple piece of Java code:
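A minimal sketch of what such code might look like. The `Person` class, the `viewsOf` helper and the view names are hypothetical, chosen only to mirror the analogy: plain Java references stand in for the three virtual addresses, and the single object stands in for the physical address.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: Java references stand in for ZGC's virtual addresses,
// and the single Person object stands in for the physical memory.
final class MultiMappingDemo {

    // Hypothetical class representing "yourself" (the physical address).
    static final class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    // Builds three differently named views that all refer to the same object,
    // like the Marked0, Marked1 and Remapped views of one physical address.
    static Map<String, Person> viewsOf(Person self) {
        Map<String, Person> views = new LinkedHashMap<>();
        views.put("your father's son", self);
        views.put("your girlfriend's boyfriend", self);
        views.put("the most handsome person in the world", self);
        return views;
    }

    public static void main(String[] args) {
        Person you = new Person("your name");
        for (Map.Entry<String, Person> view : viewsOf(you).entrySet()) {
            // Every view resolves to the very same object.
            System.out.println(view.getKey() + " is you: " + (view.getValue() == you));
        }
    }
}
```

However many names are added to the map, there is still only one `Person` behind them, which is the point of the analogy.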

In ZGC, only one of these three view spaces is valid at any given time.

Why is that? This is the clever part of ZGC: it trades space for time. Different stages of garbage collection trigger a switch between the three views, and because exactly one of the three view spaces is valid at any moment, ZGC can carry out its concurrent GC work effectively and efficiently. The specific implementation is described later, in the section on the ZGC concurrent processing algorithm.

Colored pointers

Before we talk about the ZGC concurrent processing algorithm, we need one more piece of background knowledge: colored pointers.

As we all know, earlier garbage collectors store GC information (mark state, GC generational age, …) in the Mark Word of the object header. Here’s an analogy:

If someone is trash, put a “trash” stamp on their forehead; if the person is not trash, wash the “trash” stamp off.

Here’s what ZGC does instead:

If someone is trash, record that in the information on their ID card. From then on, wherever this person swipes the ID card, everyone knows they are trash. If one day this person wakes up and is no longer trash, the “trash” mark on the ID card is removed.

In this example, the “person” is an object, and the “ID card” is a pointer to the object.

ZGC stores this information in pointers, a technique with a fancy name: Colored Pointers.

On a 64-bit machine, the object pointer is 64-bit.

  • ZGC uses bits 0 to 43 of the 64-bit pointer to store the object address; 2^44 bytes = 16TB, so ZGC supports a heap of at most 16TB.
  • Bits 44 to 47 are the color flags: Marked0, Marked1 and Remapped are the three view flags, and Finalizable means that the object can only be reached through a finalizer.
  • Bits 48 to 63 are fixed at 0 and are not used.
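The layout above can be sketched with a few bit masks. This is only an illustration in Java, not HotSpot code: the class and method names are hypothetical, and the order of the four flags inside bits 44 to 47 is an assumption, since the text only says which bits hold flags.

```java
// Illustrative bit layout for a 64-bit colored pointer as described above.
final class ColoredPointer {
    static final int ADDRESS_BITS = 44;                        // bits 0..43
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    // Assumption: the order of the flags within bits 44..47 is chosen for illustration.
    static final long MARKED0 = 1L << 44;
    static final long MARKED1 = 1L << 45;
    static final long REMAPPED = 1L << 46;
    static final long FINALIZABLE = 1L << 47;
    static final long COLOR_MASK = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    // Strips the color flags and returns the plain object address.
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    // Returns the same address with its color replaced by the given flag.
    static long withColor(long pointer, long color) {
        return address(pointer) | (color & COLOR_MASK);
    }

    // Tests whether the pointer carries the given color flag.
    static boolean hasColor(long pointer, long color) {
        return (pointer & color) != 0;
    }

    public static void main(String[] args) {
        // 2^44 bytes expressed in TB (2^40 bytes each).
        System.out.println("Addressable heap: " + ((1L << ADDRESS_BITS) >> 40) + " TB");

        long remapped = withColor(0x1234L, REMAPPED);
        long marked = withColor(remapped, MARKED0);
        // Recoloring changes only the flag bits, never the address bits.
        System.out.println(address(remapped) == address(marked));
    }
}
```

Note that changing the color touches only the pointer value itself; the object it points to is never read or written.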

Read barrier

A read barrier is a technique by which the JVM inserts a small piece of code into the application code. This code is executed whenever an application thread reads an object reference from the heap. This read barrier should not be confused with the read barrier in the Java memory model; they are completely different things. The read barrier in ZGC is more like an AOP technique that adds extra processing to read operations at the bytecode level or compiled-code level.

Read barrier example:

  Object o = obj.fieldA   // an object reference is read from the heap
  <load barrier needed here>
  Object p = o            // no read barrier needed: this is not a read from the heap
  int i = obj.fieldB      // no need to add a read barrier because it is not an object reference

Read barrier code in ZGC:

GC threads and application threads run concurrently, so an application thread may hold a reference to an object B at the very moment a GC thread is moving B or performing some other operation on it. With the read barrier in place, the application thread can detect that B is being operated on by a GC thread and make sure that operation has completed before it reads the object, which guarantees that it sees correct data. The specific detection and operation steps are as follows:
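The check-then-fix idea can be sketched in Java as follows. This is a single-threaded simulation under stated assumptions, not the actual HotSpot barrier: the `Slot` class, the `forwarding` map, the `goodColor` variable and the flag positions are all hypothetical, and a real barrier would update the slot atomically.

```java
import java.util.HashMap;
import java.util.Map;

// Single-threaded simulation of a read barrier on a colored pointer.
final class LoadBarrierSketch {
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long MARKED0 = 1L << 44;   // assumed flag position
    static final long REMAPPED = 1L << 46;  // assumed flag position

    // Hypothetical global: the color a pointer must carry in the current phase.
    static long goodColor = REMAPPED;

    // Hypothetical table: old address -> new address for objects the GC has moved.
    static final Map<Long, Long> forwarding = new HashMap<>();

    // One reference field in the heap, holding a colored pointer.
    static final class Slot {
        long pointer;

        Slot(long pointer) {
            this.pointer = pointer;
        }
    }

    // Called on every load of an object reference from the heap.
    static long loadReference(Slot slot) {
        long pointer = slot.pointer;
        if ((pointer & goodColor) != 0) {
            return pointer; // fast path: the pointer is already valid for this phase
        }
        // Slow path: the GC has touched this object since the pointer was written.
        long oldAddress = pointer & ADDRESS_MASK;
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        long fixed = newAddress | goodColor;
        slot.pointer = fixed; // write the corrected pointer back so later loads are fast
        return fixed;
    }

    public static void main(String[] args) {
        forwarding.put(0x1000L, 0x2000L);          // the GC moved the object
        Slot field = new Slot(0x1000L | MARKED0);  // stale pointer left in the heap
        long loaded = loadReference(field);
        System.out.println(Long.toHexString(loaded & ADDRESS_MASK));
        System.out.println(field.pointer == loaded);
    }
}
```

The fast path is a single mask-and-compare on the pointer, which is why the cost per load stays small; only pointers with a stale color pay for the slow path.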

Does this affect the performance of the program?

Yes. According to tests, the performance loss is up to 4 percent. But the read barrier is the foundation of concurrent transfer in ZGC, and the designers consider this sacrifice acceptable in order to reduce STW pauses.

ZGC concurrent processing algorithm

The ZGC concurrent processing algorithm uses a global view switch and a per-object address view switch, combined with the SATB algorithm, to achieve efficient concurrency.

First, a clarification about the ZGC concurrent processing algorithm. Some blogs say that colored pointers and read barriers are the core of ZGC, but they do not make clear how the two are used inside the algorithm. In my view, the concurrent processing algorithm is the core of ZGC, and colored pointers and read barriers merely serve that algorithm.

The global view switches across the three stages of the ZGC concurrent processing algorithm as follows:

  • Initialization: after ZGC is initialized, the address view of the entire memory space is set to Remapped.
  • Marking phase: on entering the marking phase, the view changes to either Marked0 (M0) or Marked1 (M1).
  • Transfer phase: when the marking phase ends and the transfer phase begins, the view is set to Remapped again.
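The three switches can be sketched as a tiny state machine. The class and method names are hypothetical, and the strict alternation between M0 and M1 from one cycle to the next is an assumption taken from the later discussion of why both views exist.

```java
// Illustrative state machine for the global address view.
final class GlobalViewSketch {
    enum View { MARKED0, MARKED1, REMAPPED }

    // After initialization the whole address space is viewed as Remapped.
    private View current = View.REMAPPED;

    // Starts at MARKED1 so that the first marking phase uses M0.
    private View lastMarkView = View.MARKED1;

    View current() {
        return current;
    }

    // Entering the marking phase: switch to the mark view not used last time.
    void startMarking() {
        lastMarkView = (lastMarkView == View.MARKED0) ? View.MARKED1 : View.MARKED0;
        current = lastMarkView;
    }

    // Entering the transfer phase: switch back to Remapped.
    void startTransfer() {
        current = View.REMAPPED;
    }

    public static void main(String[] args) {
        GlobalViewSketch gc = new GlobalViewSketch();
        System.out.println("initialized:    " + gc.current());
        for (int cycle = 1; cycle <= 2; cycle++) {
            gc.startMarking();
            System.out.println("cycle " + cycle + " mark:     " + gc.current());
            gc.startTransfer();
            System.out.println("cycle " + cycle + " transfer: " + gc.current());
        }
    }
}
```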

Mark phase

In the mark phase, the global view switches to the M0 view. Because the application threads and the marking threads execute concurrently, an object may be accessed by either a marking thread or an application thread.

After the marking phase, the object’s address view is either M0 or Remapped.

  • If the object’s address view is M0, the object is active.
  • If the address view of the object is Remapped, the object is inactive. The memory used by the object can be reclaimed.

When the marking phase is over, ZGC stores the addresses of all the active objects in the object active information table. The address view of the active objects is M0.

Transfer stage

In the transfer stage, the global view switches to the Remapped view. Because the application threads and the transfer threads also execute concurrently, an object may be accessed by either a transfer thread or an application thread.

At this point, concurrent marking and concurrent transfer are complete for one ZGC garbage collection cycle.

Why design M0 and M1

We mentioned that there are two address views, M0 and M1, in the marking phase, yet the procedure above uses only one. Why two? Simply to distinguish marks made in the previous cycle from marks made in the current one.

ZGC collects memory partially, page by page: when the page containing an object needs to be collected, the objects in that page need to be moved; if the page does not need to be collected, the objects in it do not need to be moved.

As shown in the figure, the address view of such an unmoved object is still M0 at the beginning of the second GC cycle. If the marking phase of the second cycle also switched to the M0 view, it would be impossible to tell whether the object was marked as active in this cycle or in the previous one. By switching to the M1 view instead, the marking phase of the second cycle can tell them apart. The three address views then mean:

  • M1: an object identified as active in this garbage collection.
  • M0: an object that was marked as active during the marking phase of the previous garbage collection and was not moved during the transfer phase, but has been identified as inactive in this garbage collection.
  • Remapped: an object that was transferred during the transfer phase of the previous garbage collection, or that was accessed by an application thread, but has been identified as inactive in this garbage collection.
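The three cases can be written down as a small classification function. This is a sketch only: the enum and method names are hypothetical, and it assumes the check is made once the marking phase of the current cycle has finished.

```java
// Illustrative classification of an object's address view after marking.
final class ViewClassifier {
    enum View { MARKED0, MARKED1, REMAPPED }

    enum Status {
        ACTIVE_IN_THIS_CYCLE,       // carries the current mark view
        STALE_MARK_NOT_MOVED,       // still carries the previous cycle's mark view
        REMAPPED_BUT_NOT_REACHED    // transferred or accessed earlier, not marked now
    }

    // currentMarkView is the view the global switch selected for this cycle (M0 or M1).
    static Status classify(View objectView, View currentMarkView) {
        if (currentMarkView == View.REMAPPED) {
            throw new IllegalArgumentException("the mark view must be MARKED0 or MARKED1");
        }
        if (objectView == currentMarkView) {
            return Status.ACTIVE_IN_THIS_CYCLE;
        }
        if (objectView == View.REMAPPED) {
            return Status.REMAPPED_BUT_NOT_REACHED;
        }
        return Status.STALE_MARK_NOT_MOVED;
    }

    public static void main(String[] args) {
        // Second cycle: the global mark view is M1.
        for (View view : View.values()) {
            System.out.println(view + " -> " + classify(view, View.MARKED1));
        }
    }
}
```

With a single mark view, the first and second cases would be indistinguishable, which is exactly the ambiguity the second view removes.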

Now we can answer the question “What are the benefits of using address views and colored pointers?”

Address views and colored pointers speed up marking and transfer. Earlier garbage collectors record GC information by modifying mark bits in the object header, which requires a memory access to the object. ZGC, using address views and colored pointers, does not need to access the object at all; it only sets the corresponding mark bits in the address. This is why ZGC is faster in the marking and transfer phases.

Because the GC information is kept in the reference pointer rather than in the object header, once an object is determined to be useless its memory can be reused immediately. This is not possible when the GC information lives in the object header.

ZGC steps

ZGC uses a mark-copy algorithm in which the marking, transfer and relocation stages are almost entirely concurrent. The garbage collection cycle of ZGC is shown in the figure below:

ZGC has only three STW phases: initial marking, re-marking, and initial transfer.

Of these, initial marking and initial transfer each only need to scan all GC Roots, so their processing time is proportional to the number of GC Roots. This is generally very short.

The STW time of the re-marking stage is also very short, at most 1ms; if it would exceed 1ms, ZGC enters the concurrent marking stage again. In other words, almost all ZGC pauses depend only on the size of the GC Roots set, and the pause time does not increase with the size of the heap or the size of the set of active objects. By contrast, the transfer phase of G1 is entirely STW, and its pause time grows with the size of the set of live objects.

The development of ZGC

ZGC was born in JDK11, and over time, ZGC in JDK15 is no longer experimental.

It has gone from supporting only Linux/x64 to supporting multiple platforms, and from not supporting pointer compression at all to supporting compressed class pointers, and so on.

In JDK 16, ZGC will support Concurrent Thread Stack Scanning. According to SPECjbb2015 test results, once Concurrent Thread Stack Scanning is implemented, ZGC’s STW time can be reduced by another order of magnitude, and the pause time will enter the millisecond era.

ZGC is already a good garbage collector that borrows from the Pauseless GC and seems to be moving in the same direction as the C4 GC by introducing generational thinking.

Oracle’s efforts have made it possible for us ordinary developers to get our hands on a commercial-grade GC. As the JDK progresses, I believe that in the future JVM tuning will no longer be such an inhumane chore, and the underlying GC will optimize itself automatically for any situation.

ZGC is indeed the cutting edge of Java technology, but it seems premature to talk about ZGC today when G1 is not widespread. But maybe we’re not talking about ZGC, but the design ideas behind it.

I hope you got something out of this!

Final words

To stand behind the accuracy of every article I publish, I usually consult official documents and authoritative books in the field. Sometimes I also need to read papers and source code. The official documents and papers are usually in English, which is very hard for someone who scored only 456 on the CET-4. Google Translate and the Youdao dictionary kept me company throughout. Because the translation of some technical terms is not accurate enough, I also have to compare the English with the translation and work through it slowly.

But even then, there will be mistakes. If you find any, you are welcome to point them out and I’ll fix them.

Your positive feedback is very important to me: a like, a share, or a follow is the biggest support you can give me!

Thanks for reading and we’ll see you next time!