
We’ve already described how the CMS garbage collector works and the phases it goes through. In this article, we’ll look more closely at the shortcomings of the CMS garbage collector and at how some of the problems it causes can be addressed. Let’s start with a complete diagram to review how CMS works:

Analysis of the shortcomings of CMS

CMS is an excellent collector, and its main advantages are already reflected in its name: concurrent collection and low pauses. Some official documentation also calls it the Concurrent Low Pause Collector. CMS was the HotSpot virtual machine's first successful attempt at low pause times, but it is far from perfect and has at least three obvious drawbacks:

1. CPU resources are strained due to concurrency

First, the CMS collector is very sensitive to processor resources, as is any program designed for concurrency. During the concurrent phases it does not pause user threads, but it does slow the application down and reduce overall throughput, because it takes up some of the threads (that is, some of the processor's computing power).

The default number of collection threads started by CMS is (number of processor cores + 3) / 4. With four or more cores, the collector threads therefore take roughly a quarter of the processor's computing resources during concurrent collection, and that share moves closer to 25% as the number of cores grows. With fewer than four cores, however, the impact of CMS on user programs can become significant. If the application's processor load is already high and half of the computing power has to be handed to the collector threads, the user program can suddenly slow down drastically.

For example, on a typical 2-core, 4 GB machine, the number of collection threads allocated to CMS is (2 + 3) / 4 = 1 (integer division), and that single thread occupies half of the CPU resources.

So the first problem with CMS is its impact on CPU usage, especially when the number of CPU cores is small.
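The arithmetic above can be checked with a minimal sketch. It uses only the formula quoted in this article; the class and method names are made up for illustration, and the real JVM may derive the value from other settings as well.

```java
final class CmsThreadShare {
    // Formula quoted in the text: (cores + 3) / 4, with integer division.
    static int concurrentGcThreads(int cores) {
        return (cores + 3) / 4;
    }

    // Fraction of all cores taken by the collector threads during concurrent phases.
    static double cpuShare(int cores) {
        return (double) concurrentGcThreads(cores) / cores;
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("%2d cores -> %d GC thread(s), %.0f%% of CPU%n",
                    cores, concurrentGcThreads(cores), 100 * cpuShare(cores));
        }
    }
}
```

Running it for a few core counts shows why small machines suffer most: the single collector thread on a 2-core machine is half of the available computing power.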

2. Concurrent Mode Failure

Because the CMS collector cannot deal with “floating garbage”, a “Concurrent Mode Failure” can occur, which leads to another Full GC that completely stops the world.

During the concurrent marking and concurrent sweeping phases of CMS, user threads keep running, and a running program naturally keeps producing new garbage objects. Garbage that appears after the marking process has passed cannot be reclaimed in the current collection, so it has to be left for the next garbage collection. This garbage is called “floating garbage”.

Because user threads must keep running during garbage collection, enough memory has to be set aside for them. The CMS collector therefore cannot wait until the old generation is almost completely full before collecting, as other collectors do; it must reserve part of the space for the program to use while concurrent collection is under way.

Under the JDK 5 default settings, the CMS collector is activated when 68% of the old generation is in use. This is a conservative setting. If the old generation does not grow too fast in practice, you can raise the value of -XX:CMSInitiatingOccupancyFraction to increase the trigger percentage, reduce the frequency of memory reclamation, and get better performance. In JDK 6, the default start threshold for the CMS collector was raised to 92%.

However, this exposes CMS to another risk: if the memory set aside while CMS is running cannot satisfy the application's need to allocate new objects, a Concurrent Mode Failure occurs. The virtual machine then has to fall back on a backup plan: it freezes the user threads and temporarily brings in the Serial Old collector to redo the old-generation collection, which means a long pause. Setting -XX:CMSInitiatingOccupancyFraction too high therefore easily causes many concurrent mode failures and lowers performance, so in production you should balance this setting against how the application actually behaves.
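To make the trade-off concrete, here is a rough back-of-the-envelope model. It is only a sketch, not how HotSpot decides anything: it assumes the old generation fills at a constant rate while a concurrent cycle of known length runs, and the class name, method names, and sample figures (1 GiB old generation, 50 MiB/s, 3 seconds) are all hypothetical.

```java
final class CmsHeadroomSketch {
    // Free old-generation space left at the moment a concurrent cycle starts
    // (the threshold plays the role of -XX:CMSInitiatingOccupancyFraction).
    static long headroomBytes(long oldGenCapacityBytes, int initiatingOccupancyPercent) {
        return oldGenCapacityBytes * (100 - initiatingOccupancyPercent) / 100;
    }

    // Simplified assumption: objects arrive in the old generation at a constant
    // rate for the whole cycle. If they need more than the headroom, the model
    // predicts a Concurrent Mode Failure.
    static boolean wouldFailConcurrentMode(long oldGenCapacityBytes,
                                           int initiatingOccupancyPercent,
                                           long fillBytesPerSecond,
                                           double cycleSeconds) {
        long needed = (long) Math.ceil(fillBytesPerSecond * cycleSeconds);
        return needed > headroomBytes(oldGenCapacityBytes, initiatingOccupancyPercent);
    }

    public static void main(String[] args) {
        long oldGen = 1024L * 1024 * 1024;   // hypothetical 1 GiB old generation
        long rate = 50L * 1024 * 1024;       // hypothetical 50 MiB/s fill rate
        double cycle = 3.0;                  // hypothetical 3-second concurrent cycle
        for (int threshold : new int[] {68, 92}) {
            System.out.printf("threshold %d%% -> predicted failure: %b%n",
                    threshold, wouldFailConcurrentMode(oldGen, threshold, rate, cycle));
        }
    }
}
```

With these made-up numbers the application needs 150 MiB during the cycle: the 68% threshold leaves enough room, while the 92% threshold does not, which is the risk described above.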

3. Memory fragmentation problem

The final drawback comes from the fact that CMS is a collector based on the mark-sweep algorithm, which, as you may remember from the previous article, leaves a lot of memory fragmentation behind at the end of a collection (see the red circle in the figure below).

When fragmentation is severe, allocating large objects becomes troublesome. It is often the case that the old generation still has plenty of free space, yet no contiguous region is large enough for the object being allocated, so a Full GC has to be triggered early.
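A toy model shows how total free space and usable free space can differ. This is only an illustration: the old generation is reduced to an array of equally sized slots, and the class and method names are invented for the example.

```java
final class FragmentationSketch {
    // Number of free slots in the toy heap (true = slot in use).
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // An object needs 'size' contiguous slots; without compaction,
    // scattered free slots do not help.
    static boolean canAllocate(boolean[] used, int size) {
        return largestFreeRun(used) >= size;
    }

    public static void main(String[] args) {
        // Live objects and holes alternate, as they might after a sweep.
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(heap));
        System.out.println("largest contiguous run: " + largestFreeRun(heap));
        System.out.println("can allocate 3 slots: " + canAllocate(heap, 3));
    }
}
```

In this layout half of the heap is free, yet an object needing three adjacent slots cannot be placed, which is the situation that forces an early Full GC.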

To solve this problem, the CMS collector provides the switch -XX:+UseCMSCompactAtFullCollection (on by default, deprecated since JDK 9), which enables compaction of memory fragments whenever the CMS collector has to perform a Full GC. Because this compaction must move live objects, it cannot run concurrently (at least, that was the case before Shenandoah and ZGC). The fragmentation problem is solved, but pause times get longer, so the virtual machine designers provide another parameter, -XX:CMSFullGCsBeforeCompaction (also deprecated since JDK 9). It tells the CMS collector to perform the given number of Full GCs without compaction and then to compact at the next Full GC. The default is 0, which means compaction happens every time a Full GC is entered.
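The counting rule described for -XX:CMSFullGCsBeforeCompaction can be sketched as a small counter. This models only the behavior as described in this article, not HotSpot's actual implementation, and the class and method names are hypothetical.

```java
final class CmsCompactionSketch {
    private final int fullGcsBeforeCompaction;   // stands in for the flag's value
    private int uncompactedSinceLastCompaction = 0;

    CmsCompactionSketch(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    // Called once per Full GC; returns true if this Full GC compacts the heap.
    boolean onFullGc() {
        if (uncompactedSinceLastCompaction >= fullGcsBeforeCompaction) {
            uncompactedSinceLastCompaction = 0;
            return true;
        }
        uncompactedSinceLastCompaction++;
        return false;
    }

    public static void main(String[] args) {
        CmsCompactionSketch everyTime = new CmsCompactionSketch(0);
        CmsCompactionSketch everyThird = new CmsCompactionSketch(2);
        for (int i = 1; i <= 6; i++) {
            System.out.printf("Full GC #%d: value 0 compacts=%b, value 2 compacts=%b%n",
                    i, everyTime.onFullGc(), everyThird.onFullGc());
        }
    }
}
```

With a value of 0 every Full GC compacts; with a value of 2, two Full GCs skip compaction and the third one compacts, trading some fragmentation for shorter pauses on the Full GCs in between.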