Garbage is useless things, not only useless, and take up space, do not throw to stay why. The JVM also wants to get rid of its garbage, because JVM memory space is expensive.

So there are two questions:

First, what is garbage, how to determine whether an object is garbage?

Two, how to throw garbage, what posture throw garbage elegant (algorithm)?

One, how to judge whether an object is garbage

Garbage is a worthless object to the JVM, and an object that is not used is a worthless garbage object. There are so many objects in the memory world of the JVM that there are rules that govern whether each object is garbage or not. There are two main methods, one is reference counting and the other is reachability analysis algorithm also called root search algorithm.

1. Reference Counting

Reference-counting is simple, it’s literal. Add a counter to the object, which increases by 1 each time the object is referenced. Conversely, every time an object reference is invalidated, the counter is -1, and an object counter of 0 is considered garbage.

The reference counting algorithm is simple and efficient, but it is not used in Java because it cannot handle circular references. According to the logic of this algorithm, only counters that are non-zero are not garbage. If two objects, A and B, refer to each other (counters +1 each), but are not referenced by other objects, they are not actually used, but they are not recycled, just like two connected boats floating in the sea of memory. As the number of objects referred to each other increases, memory can no longer handle OOM exceptions.

2. Accessibility Analysis Algorithm (GC Roots Tracing)

In the mainstream commercial programming languages (Java and C#), GC Roots Tracing is used to determine whether an object is alive or not. The basic idea of this algorithm is to search down from a series of objects named “GC Roots” as the starting point, and the search path is called the Reference Chain. When an object is not connected to GC Roots by any Reference Chain, it is proved that the object is unavailable. The diagram below:

GC Roots objects include:

  • Objects (thread stack variables) referenced in the virtual machine stack (local variables in the frame)
  • The object referenced by the class static property in the method area
  • The object referenced by a constant in the method area
  • Objects referenced by JNI(commonly referred to as Native methods) in the Native method stack

Two, what posture to throw garbage elegant (algorithm)?

In order to determine whether an object is garbage or not, there are several different garbage collection approaches in the JVM: mark-clear, copy, mark-collation, and generational collection. Here’s a look at each algorithm and its pros and cons.

1. Mark-sweep

The algorithm is the most basic algorithm, divided into “mark” and “clear” two stages: first, mark all the objects to be recycled, after the completion of the mark all the marked objects are recycled. It is the most basic algorithm because subsequent collection algorithms are based on this idea and improve on its shortcomings.

It has two main disadvantages:

  • One is efficiency. The labeling and cleaning processes are inefficient.
  • The other is the space problem, which can result in a large number of marks being clearedDiscontinuous memory fragmentationMemory fragmentation is so high that the program cannot find enough contiguous memory to allocate large objects and has to trigger another garbage collection operation ahead of time.

The execution process is as follows:

2. Copying algorithms

To solve the efficiency problem, the “copy” collection algorithm emerged, which divides the available memory into two equally sized pieces by capacity and uses only one piece at a time. When this block of memory is used up, the surviving objects are copied to another block, and the used memory space is cleaned up once and for all. In this way, each time a piece of memory is reclaimed, memory allocation does not consider the problem of memory fragmentation, just need to move the heap top pointer, in order to allocate memory, simple implementation, efficient operation. But the cost of this algorithm is to reduce the memory to half of the original, half of the memory is idle, too wasteful.

The execution process of the replication algorithm is shown as follows:

At present, all commercial virtual machines use this algorithm to recover the new generation. A special study by IBM shows that 98% of the objects in the new generation are short-lived and their life cycle is very short. Therefore, it is not necessary to divide the memory space according to the 1:1 ratio, but to divide the memory into a large Eden space and two small Survivor Spaces. Use Eden and one of the Survivor Spaces each time. When recycling is done, the surviving objects in Eden and Survivor are copied to another Survivor space at a time, and Eden and the Survivor space that was just used are cleaned up. By default, the HotSpot VIRTUAL machine has an 8:2 ratio of Eden and Survivor space, which means that each new generation has 90%(80%+10%) available memory of the entire new generation, and only 10% of the new generation memory is "wasted". Of course, 98% of the recyclable objects are only data in the general scenario, but there is no way to guarantee that no more than 10% of the objects survive each collection. When Survivor space is insufficient, other memory (old age) is required for allocation guarantee.

3. Mark-compact algorithm

The replication algorithm will perform more replication operations when the object survival rate is high, and the efficiency will be lower. If you do not want to waste 50% of the space, you need to have extra space to guarantee, in order to cope with the extreme situation that all objects in the memory used are 100% alive, so this algorithm generally cannot be directly selected in the old age. In accordance with the characteristics of the old days, the “mark-clean” algorithm was proposed, the marking process is still the same as the “mark-clean” algorithm, but the next step is not to clean the recyclable objects directly, but to move all the living objects to one end, and then directly clean the memory beyond the end boundary.

The execution diagram of “mark-tidy” algorithm is as follows:

4. Generational collection algorithm

Current garbage Collection on commercial virtual machines uses a “Generational Collection” algorithm, which divides memory into chunks based on the life cycle of an object. The Java heap is generally divided into the new generation and the old generation, so that the most appropriate collection algorithm can be adopted according to the characteristics of each generation.

  • In the new generation, when a large number of objects die and only a small number of objects survive each garbage collection, the copy-collection algorithm is selected, and the collection can be completed by paying only a small amount of the replication cost of the surviving objects.
  • In the old age, because the object has a high survival rate and there is no extra space to allocate it, it must use “mark-clean” or “mark-tidy” algorithm for recycling.

This article refers to “in-depth understanding of Java VIRTUAL machine” ===== questions welcome to raise, discuss, learn