This is the 19th day of my participation in the August Wenwen Challenge.

CMS collector

Now that we are familiar with how the young generation garbage collector works, let's look at the core of the old generation garbage collector.

Most of us don't think about garbage collection at all when writing code. We write code like crazy and deploy it without considering how it affects memory or garbage collection. Then, after the system has been running for a while, it starts to stutter, and it turns out that Full GC is being triggered frequently.

There are many cases like this. We should not be so idealistic as to expect that Full GC will never happen. Instead, we should understand how the old generation garbage collector works, so that we can tune it better.

The garbage collector of choice for the old generation is the CMS (Concurrent Mark Sweep) collector, whose goal is the shortest possible collection pause time. Most Internet websites, and the server side of browser-based B/S systems, care most about response speed and want pauses to be as short as possible, so that users get a good interactive experience. The CMS collector is a good fit for such applications.
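To check which collectors a running JVM is actually using, the standard java.lang.management API can list them. The sketch below is only an illustration, and the class name GcInfo is made up for this example. On a JDK 8 JVM started with -XX:+UseConcMarkSweepGC, the old generation collector is typically reported as ConcurrentMarkSweep. Note that this flag was deprecated in JDK 9 and removed in JDK 14, so newer JVMs will report other collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// GcInfo is a hypothetical helper name used only for this example.
final class GcInfo {
    // Returns the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints each collector with its collection count and accumulated time so far.
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName()
                    + ": count=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```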

As the name (including “Mark Sweep”) suggests, the CMS collector is based on a mark-sweep algorithm. Its operation is more complex than the previous collectors, and the whole process is divided into four steps:

1) CMS initial mark
2) CMS concurrent mark
3) CMS remark
4) CMS concurrent sweep

Next, we will analyze the operation logic and process of the whole CMS step by step, and try to master its core ideas.

1) CMS Initial Mark

First, following the "reachability analysis algorithm" mentioned earlier, the collector determines which objects are referenced by GC Roots. Objects that are referenced are alive; the rest are garbage. The initial mark phase marks the objects that GC Roots reference directly, as shown below:

Note: The initial mark phase pauses the application and puts it into the "Stop the World" state, but it is quick because it marks only the objects directly referenced by GC Roots. (Recall that GC Roots include static variables of classes and local variables of methods; instance variables of classes are not GC Roots.)

Suppose I have this code in my system:

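// Test.java: the static field "company" is a GC Root.
public class Test {
    private static Company company = new Company();
}

// Company.java: the instance field "employee" is not a GC Root.
public class Company {
    private Employee employee = new Employee();
}

// Employee.java: reachable only through a Company instance.
public class Employee {}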

For this code, the initial mark phase marks only the object referenced directly by a GC Root, namely the Company object. The Employee object is referenced only through an instance variable of Company, so it is not marked in this phase. The memory diagram is as follows:

Note: The Employee object is referenced only by an instance variable, not directly by a GC Root, so the initial mark phase does not mark it.
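The behaviour described above can be sketched with a toy object graph. This is only a single-threaded model, not how the JVM represents objects: the Obj class, its marked flag, and the initialMark method are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class InitialMarkSketch {
    // Hypothetical stand-in for a heap object; not a real JVM structure.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references (instance fields)
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    // Marks only the objects that a GC Root references directly.
    // It deliberately does not follow the fields of those objects.
    static void initialMark(List<Obj> directRootTargets) {
        for (Obj o : directRootTargets) {
            o.marked = true;
        }
    }

    public static void main(String[] args) {
        Obj company = new Obj("Company");
        Obj employee = new Obj("Employee");
        company.refs.add(employee);      // Company.employee instance field

        List<Obj> rootTargets = new ArrayList<>();
        rootTargets.add(company);        // Test.company static field is a GC Root

        initialMark(rootTargets);
        System.out.println("Company marked: " + company.marked);
        System.out.println("Employee marked: " + employee.marked);
    }
}
```

In this model, Employee stays unmarked after the initial mark phase because nothing follows the reference from Company.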

2) CMS Concurrent Mark

In the concurrent mark phase the application resumes normal operation and can create objects freely, while the concurrent marking thread works alongside it. Because marking and object creation happen at the same time, new objects keep appearing, and some existing objects may lose their references and become garbage.

So which objects does concurrent marking mark? Take the Employee object in the figure above. The garbage collector thread works out what references it, in this case the Company object, and then what references the Company object. The initial mark phase already established that Company is referenced directly by a GC Root, so Employee is indirectly reachable from GC Roots and is marked as a live object.

In short, this phase traces through the objects in the old generation, including newly arriving ones, while the application threads keep creating objects, so it is the most time-consuming phase. Although it takes a long time, it runs concurrently with the application, so it does not pause application threads, although it does compete with them for CPU.
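The tracing step can be sketched as a worklist traversal that starts from the objects already marked in the initial mark phase. This is a single-threaded model only; in the real collector the work runs alongside application threads. The names Obj and trace are hypothetical and exist only in this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ConcurrentMarkSketch {
    // Hypothetical stand-in for a heap object; not a real JVM structure.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    // Follows references from the initially marked objects and marks
    // everything reachable from them. Returns how many objects were newly marked.
    static int trace(List<Obj> initiallyMarked) {
        Deque<Obj> work = new ArrayDeque<>(initiallyMarked);
        int newlyMarked = 0;
        while (!work.isEmpty()) {
            Obj current = work.pop();
            for (Obj ref : current.refs) {
                if (!ref.marked) {
                    ref.marked = true;
                    newlyMarked++;
                    work.push(ref);
                }
            }
        }
        return newlyMarked;
    }

    public static void main(String[] args) {
        Obj company = new Obj("Company");
        Obj employee = new Obj("Employee");
        Obj orphan = new Obj("Orphan");   // nothing references this object
        company.refs.add(employee);
        company.marked = true;            // result of the initial mark phase

        List<Obj> initiallyMarked = new ArrayList<>();
        initiallyMarked.add(company);
        trace(initiallyMarked);

        System.out.println("Employee marked: " + employee.marked);
        System.out.println("Orphan marked: " + orphan.marked);
    }
}
```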

3) CMS Remark

Because the second phase runs concurrently with the application, its marks can be out of date by the time it ends: some objects will have lost their references and become garbage, and some newly created objects will not have been marked, as shown in the following figure:

So phase 3, re-marking, suspends the application threads again and re-marks the objects affected by those changes, as shown below:

This phase is fast, because it only has to handle the small number of objects that the application changed during the second phase.
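One way to picture why this phase is short is to record which objects had their references changed during concurrent marking and rescan only those. The sketch below assumes a simplified change log; the names Obj, writeRef, and remark are hypothetical, and the real collector's bookkeeping is more involved than this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RemarkSketch {
    // Hypothetical stand-in for a heap object; not a real JVM structure.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    // Objects whose references were changed while concurrent marking was running.
    private final Set<Obj> dirty = new HashSet<>();

    // Hypothetical hook: the application stores a reference and the change is recorded.
    void writeRef(Obj holder, Obj target) {
        holder.refs.add(target);
        dirty.add(holder);
    }

    // Stop-the-world step in this model: rescan only the changed objects that
    // are already known to be live, and mark whatever they now reach.
    int remark() {
        Deque<Obj> work = new ArrayDeque<>();
        for (Obj o : dirty) {
            if (o.marked) {
                work.push(o);
            }
        }
        dirty.clear();

        int fixed = 0;
        while (!work.isEmpty()) {
            Obj current = work.pop();
            for (Obj ref : current.refs) {
                if (!ref.marked) {
                    ref.marked = true;
                    fixed++;
                    work.push(ref);
                }
            }
        }
        return fixed;
    }

    public static void main(String[] args) {
        RemarkSketch gc = new RemarkSketch();
        Obj company = new Obj("Company");
        company.marked = true;                     // marked in an earlier phase

        Obj newEmployee = new Obj("NewEmployee");  // created during concurrent marking
        gc.writeRef(company, newEmployee);

        System.out.println("Objects fixed by remark: " + gc.remark());
        System.out.println("NewEmployee marked: " + newEmployee.marked);
    }
}
```

Because only the recorded objects are rescanned, the pause in this model grows with the number of changes rather than with the size of the heap.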

Then the system threads resume work and the fourth stage begins: concurrent cleanup.

4) CMS Concurrent Sweep

Finally, the concurrent sweep phase cleans up and deletes the objects that the marking phases judged to be dead. Since live objects do not need to be moved, this phase can also run concurrently with the application threads.
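A toy model of sweeping, assuming the heap is a fixed array of slots: unmarked objects are released and survivors stay exactly where they are. The names Obj and sweep are hypothetical, and the model is single-threaded.

```java
final class SweepSketch {
    // Hypothetical stand-in for a heap object; not a real JVM structure.
    static final class Obj {
        final String name;
        boolean marked;

        Obj(String name, boolean marked) {
            this.name = name;
            this.marked = marked;
        }
    }

    // Frees every unmarked object and clears the mark on survivors so the next
    // cycle starts clean. Survivors keep their slot; nothing is moved or compacted.
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) {
                continue;            // slot is already free
            }
            if (o.marked) {
                o.marked = false;    // survivor stays in place
            } else {
                heap[i] = null;      // dead object: release the slot
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Obj[] heap = {
            new Obj("Company", true),
            new Obj("Garbage", false),
            new Obj("Employee", true)
        };
        System.out.println("Freed: " + sweep(heap));
        System.out.println("Slot 1 is free: " + (heap[1] == null));
    }
}
```

The freed slot in the middle is left as a gap between the two survivors, which is what not moving live objects looks like in this model.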

5) Summary

Through the whole process of the above CMS work, we can summarize as follows:

  • Concurrent mark and concurrent sweep are the most time-consuming phases, but they run concurrently with the application threads and do not pause the system.
  • Initial mark and remark require Stop the World and pause the system, but both phases are fast and have little impact.
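The summary above can be written down as a small lookup of the four phases covered in this article and whether each one pauses the application. This is a simplified model for reference only; the class and enum names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CmsPhases {
    // The four phases discussed in this article, with a flag for Stop the World.
    enum Phase {
        INITIAL_MARK(true),
        CONCURRENT_MARK(false),
        REMARK(true),
        CONCURRENT_SWEEP(false);

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    // Returns the phases that pause application threads, in execution order.
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (p.stopTheWorld) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Phase p : Phase.values()) {
            System.out.println(p + " -> "
                    + (p.stopTheWorld ? "Stop the World" : "concurrent with application threads"));
        }
    }
}
```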

The complete flow chart below shows the working logic of CMS: