This is the 13th day of my participation in the More text Challenge. For details, see more text Challenge

The JVM solves the first garbage collection problem — what objects are garbage objects and should be collected — with an reachability analysis algorithm. Once the object has been determined that it must be collected, the next consideration is how to collect the garbage objects.

There are three main methods for garbage collection: mark-clean algorithm, mark-copy algorithm, and mark-collate algorithm.

Import from an algorithm problem

Removing elements is a typical algorithm problem. Given an array nums and a value val, you need to remove all elements equal to val in place and return the new length of the removed array.

This is a simple algorithm problem that iterates through an array and, when it encounters an element with a value of val, deletes that element and moves the following elements forward. By scanning the array with two Pointers, we can remove all elements equal to val in place. This is the basic idea of the tag-tidy algorithm.

class Solution {
    public int removeElement(int[] nums, int val) {
    	int result = 0;
        for(int i = 0; i<nums.length; i++){if(nums[i] != val){
        		nums[result] = nums[i];
        		result++;
        	}
        }
    	returnresult; }}Copy the code

If we make the condition more relaxed, we can use another array to remove an element and assign an element that is not val to the new array. As a result, the new array is returned. That’s the basic idea of the mark-copy algorithm.

If we make the condition more relaxed, we can remove elements regardless of whether the array is fragmented or not, and assign the original element of val to the initial value. This is the basic idea of the mark-clear algorithm.

2 Mark-clear algorithm

Tag-clear algorithm As its name suggests, the entire collection process is divided into two phases: tag-clear and cleanup. All objects to be reclaimed are marked first, and then the marked objects are cleared in place. Although the mark-clear algorithm is simple, it has two problems:

  • Time problem: the efficiency of marking and clearing is unstable. If a large number of objects need to be collected in memory, the JVM needs to perform a large number of marking and clearing actions. As a result, the efficiency of marking and clearing decreases with the increase of the number of objects.
  • Space problem: If the objects to be collected are directly cleared in place, memory fragmentation will be generated, which increases the frequency of garbage collection (if a large object cannot find a suitable debris space, the next garbage collection will be triggered).

Mark-copy algorithm

The mark-copy algorithm directly copies objects that do not need to be reclaimed to another space, and then clears all objects in the old memory space. In the JVM, the heap memory is divided into two equally sized areas, and only one area is used at a time. When the area is full, the living objects are copied to the other area, and the data in the used area is cleared until the next reclamation. The mark-copy algorithm can avoid the overhead of marking and removing a large number of garbage objects. At the same time, since the pointer is always incrementing during replication, there is no space fragmentation after the object is reclaimed. However, the other obvious problem is that the space is too expensive, and only half of the heap memory is used at a time.

4 Mark-collation algorithm

Each time a garbage object is collected, the mark-collation algorithm moves the remaining objects to the empty space. The mark-tidy algorithm can avoid the space debris or space waste problems caused by the first two algorithms. But there is another big overhead, the overhead of moving objects. At the same time, the mark-collation algorithm also causes the entire main program to stop running when the memory space is reclaimed, which can have a significant impact on the main program if the garbage collection process takes too long.

5 GC algorithm selection

Due to the advantages and disadvantages of the above three algorithms, the algorithm should be selected after weighing the advantages and disadvantages in practical application to better design garbage collector. Based on the generational collection theory, different collection algorithms are adopted for the new generation and the old generation.

  • The new generation
    • The new generation is characterized by a short lifetime of objects, with 80% of objects not surviving their first garbage collection.
    • Therefore, the optimized mark-copy algorithm can be used for garbage collection, and a small space can be opened up to store the surviving objects during the replication to avoid excessive space waste.
    • “Appel recycle” : The new generation of memory space is divided into a larger Eden and two smaller Survivor Spaces. Eden and one Survivor space are used each time. When the two Spaces are full, the surviving objects are copied to the other Survivor space, and the two used Spaces are freed.
  • The old generation
    • Objects in the old generation generally live for a long time, and the space released by each collection of objects is not too much.
    • When considering the time cost of garbage collection, we should look at the time cost of new object space reclamation as well as memory space reclamation. While the mark-clean algorithm avoids the time cost of moving objects, the fragmentation space needs to be carefully used when allocating new memory space, resulting in more time cost. The selective mark-tidy algorithm is better than the mark-clean algorithm when combined with the collection and new allocation.
    • There is another lazy collation strategy, in which the mark-clean algorithm is preferred for garbage collection; When you can’t stand the space fragmentation, use the mark-collation algorithm to break the space fragmentation into a whole block of memory.