Questions to be explored in this chapter:

When GC collects memory:

  1. How do I determine which memory needs to be reclaimed?
  2. When is it recycled?

Several runtime data areas are private to each thread:

  • The virtual machine stack
  • The program counter
  • The native method stack

Memory allocation and reclamation in these areas are largely deterministic: the areas are created together with the thread and released when the thread terminates. The amount of memory each stack frame needs is essentially known once the structure of the class is determined.

The Java heap and the method area are different:

For example, the implementation classes of one interface may need different amounts of memory (their class information is stored in the method area), so the memory required cannot be known before the program runs. Only at run time is it known which objects are created and how large they are, which is why memory in these two areas is allocated and reclaimed dynamically.


One, which memory needs to be reclaimed?

Before we can know which memory needs to be reclaimed, we need a way to determine whether an object is still alive, so that it can be reclaimed once it is not. Two algorithms are commonly described for this: reference counting and reachability analysis.

1. Reference Counting algorithm

Algorithm description: add a reference counter to each object. Whenever a new reference to the object is created, the counter is incremented by 1; whenever a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 at any point can no longer be used.

Reference counting is simple, works well in most cases, and has seen wide use in practice. However, the JVM does not use it, for one main reason: it cannot handle objects that refer to each other.

public class Person {
    Object instance = null;

    public static void main(String[] args) {
        Person a = new Person();
        Person b = new Person();
        
        a.instance = b;
        b.instance = a;
        
        a = null;
        b = null; // Normally, the GC will reclaim both objects
    }
}

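Under normal circumstances, the JVM’s GC can reclaim both objects once a = null and b = null (lines 11-12 of the code) have run, but under the reference counting algorithm:

  • After a = null, the counter of the object that a pointed to is still 1, because the other object’s instance field references it.
  • After b = null, the counter of the object that b pointed to is still 1, for the same reason.

Neither counter ever reaches 0, so reference counting alone would never reclaim the two objects.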
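The counter values above can be reproduced with a small simulation. This is only a sketch of the idea, not how the JVM manages memory: the Node class and the retain and release helpers are hypothetical names invented for the illustration.

```java
// Toy model of reference counting; hypothetical helper names, not a JVM mechanism.
class RefCountDemo {
    // A simulated heap object that carries its own reference counter.
    static class Node {
        final String name;
        int refCount = 0;
        Node instance; // mirrors the `instance` field of Person

        Node(String name) { this.name = name; }
    }

    // Record one more reference to the target and hand it back.
    static Node retain(Node target) {
        if (target != null) target.refCount++;
        return target;
    }

    // Record that one reference to the target has become invalid.
    static void release(Node target) {
        if (target != null) target.refCount--;
    }

    // Replays the Person scenario and returns the final counters {A, B}.
    static int[] runCycleScenario() {
        Node objA = new Node("A");
        Node objB = new Node("B");
        Node a = retain(objA);      // Person a = new Person();
        Node b = retain(objB);      // Person b = new Person();

        a.instance = retain(objB);  // a.instance = b;
        b.instance = retain(objA);  // b.instance = a;

        release(a); a = null;       // a = null;
        release(b); b = null;       // b = null;

        return new int[] { objA.refCount, objB.refCount };
    }

    public static void main(String[] args) {
        int[] counts = runCycleScenario();
        System.out.println("A=" + counts[0] + ", B=" + counts[1]); // prints "A=1, B=1"
    }
}
```

Both counters stay at 1 even though the program can no longer reach either object, which is exactly the cycle problem described above.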

2. Reachability Analysis algorithm

In the Java language, reachability analysis is used to determine whether an object is alive or not. Algorithm description: a set of objects called GC Roots serve as starting points, and the search proceeds downward from them along references. The path traversed by the search is called a Reference Chain. An object that the search can reach is still in use and will not be reclaimed by the GC. Conversely, an object with no reference chain connecting it to any GC Root is unusable and is judged to be reclaimable.

As shown in the figure: although the objects in region 1 reference each other, they cannot be reached from the GC Roots, so they will be reclaimed, while the objects in region 2 are reachable from the GC Roots, so they will not be reclaimed.
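The search described above is an ordinary graph traversal. The following is a minimal sketch under simplifying assumptions: Obj and its refs list are hypothetical stand-ins for heap objects and their reference fields, and the roots are passed in as a plain list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a hand-built object graph (illustrative only).
class ReachabilityDemo {
    // A simulated object holding outgoing references.
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns every object that can be reached by following references from the roots.
    static Set<Obj> reachable(List<Obj> gcRoots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {       // first visit only
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Builds root -> x -> y plus an unreachable cycle p <-> q.
    // Returns {y is reachable, p is reachable}.
    static boolean[] demo() {
        Obj root = new Obj("root"), x = new Obj("x"), y = new Obj("y");
        Obj p = new Obj("p"), q = new Obj("q");
        root.refs.add(x);
        x.refs.add(y);
        p.refs.add(q);
        q.refs.add(p);

        Set<Obj> live = reachable(Arrays.asList(root));
        return new boolean[] { live.contains(y), live.contains(p) };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("y reachable: " + result[0]); // prints "y reachable: true"
        System.out.println("p reachable: " + result[1]); // prints "p reachable: false"
    }
}
```

The cycle p <-> q is not marked because no chain leads to it from a root, so mutual references do not keep objects alive under this scheme.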

What are GC Roots?

  • Objects referenced from the virtual machine stack (the local variable table in a stack frame)
  • Objects referenced by static fields of classes in the method area
  • Objects referenced by constants in the method area
  • Objects referenced by JNI (native methods) in the native method stack

All of these can serve as GC Roots.

3. What is a reference?

Both the reference counting algorithm and reachability analysis above rely on reference relationships between objects.

Before JDK 1.2, the definition of a reference was:

If the value stored in a reference-type datum represents the starting address of another chunk of memory, that datum is said to represent a reference.

Since JDK 1.2, the concept has been divided into four kinds: strong reference, soft reference, weak reference, and phantom reference, listed in order of decreasing strength.

  • Strong Reference:

Example:

Object o = new Object();

As long as a strong reference exists, the GC will never reclaim the referenced object.

  • Soft Reference:

Describes objects that are useful but not required. Softly referenced objects are reclaimed when the system is about to run out of memory. If memory is still insufficient after that, an OutOfMemoryError (OOM) is thrown.

  • Weak Reference:

Describes non-essential objects. An object that is reachable only through weak references is reclaimed whenever a garbage collection occurs, regardless of whether there is sufficient memory at the time.

  • Phantom Reference:
    • The weakest reference relationship
    • An object instance cannot be obtained through a phantom reference.
    • Its only purpose is to deliver a notification when the object associated with the phantom reference is reclaimed by the GC.
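The three weaker kinds correspond to classes in java.lang.ref. The sketch below shows how each is created and what get() returns. The helper method names are made up for the illustration, and because System.gc() is only a hint, the lines that depend on an actual collection make no promise about their output.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Illustrative use of the java.lang.ref reference classes.
class ReferenceKindsDemo {
    // While a strong reference is still held, a weak reference returns the referent.
    static boolean weakSeesStronglyHeldObject() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    // get() on a phantom reference always returns null; the object
    // cannot be obtained through it.
    static boolean phantomGetIsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        // Soft reference: a typical use is a memory-sensitive cache entry.
        SoftReference<byte[]> cache = new SoftReference<>(new byte[1024]);
        byte[] cached = cache.get(); // may be null once memory runs low
        System.out.println("soft referent present: " + (cached != null));

        System.out.println("weak sees referent: " + weakSeesStronglyHeldObject());
        System.out.println("phantom get() is null: " + phantomGetIsNull());

        // With no strong reference left, a collection may clear the weak reference.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // a request only; the JVM is free to ignore it
        System.out.println("weak cleared after gc: " + (weak.get() == null));
    }
}
```

A phantom reference is always registered with a ReferenceQueue; the notification mentioned above arrives as the reference object being enqueued there.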

4. How to tell if an object is recyclable (dead)?

Finding that an object is not connected to the GC Roots is not, by itself, enough to declare it collectable.

An object goes through the following process before it is decided whether it should be reclaimed:

5. Method area recycling

So far we have discussed the collection of objects in the Java heap, but the method area also contains things to reclaim:

① Reclaiming discarded constants

If the constant pool contains the String “ABC” but no String object in the system refers to it, the constant is unreferenced, and the GC will reclaim the literal when a collection occurs.

② Reclaiming discarded classes (useless classes)

A class counts as useless only when all of the following conditions hold:

  1. All instances of the class have been collected, so no instance of it remains in the Java heap.
  2. The ClassLoader that loaded the class has been reclaimed.
  3. The java.lang.Class object of the class is not referenced anywhere, so the class cannot be accessed through reflection.

③ Reclamation strategy for the method area

The GC can reclaim the method area with either of two algorithms:

  • Mark-compact (mark-tidy)
  • Mark-sweep (mark-clear)



Two, how to recycle?

There are a number of garbage collection algorithms that GC uses to reclaim memory, each of which has its pros and cons.

1. Mark-sweep algorithm

This is the most basic and oldest garbage collection algorithm. It works in two stages.

① Algorithm Description

  1. Mark stage: once it has been determined which objects can be reclaimed, mark them.
  2. Sweep stage: reclaim all marked objects in one pass.

② Algorithm defects

  1. Efficiency: neither the mark stage nor the sweep stage is very efficient.
  2. Fragmentation: sweeping leaves behind a large number of non-contiguous memory fragments, so a later allocation of a large object may not find a big enough contiguous block.
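Both stages and the fragmentation defect can be seen in a toy model. In this sketch the heap is an array of slots, heap[i] lists the slots that slot i references, and null means the slot is free; all of these representations are assumptions made for the illustration. The sketch marks the live objects and sweeps everything else, which is equivalent to marking the reclaimable ones.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy mark-sweep over an array of slots (illustrative model, not JVM code).
class MarkSweepDemo {
    // Mark stage: flag every occupied slot reachable from the roots.
    static boolean[] mark(int[][] heap, int[] roots) {
        boolean[] marked = new boolean[heap.length];
        Deque<Integer> pending = new ArrayDeque<>();
        for (int root : roots) pending.push(root);
        while (!pending.isEmpty()) {
            int slot = pending.pop();
            if (heap[slot] == null || marked[slot]) continue;
            marked[slot] = true;
            for (int ref : heap[slot]) pending.push(ref);
        }
        return marked;
    }

    // Sweep stage: free every occupied slot that was not marked; returns the number freed.
    static int sweep(int[][] heap, boolean[] marked) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked[i]) {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[][] heap) {
        int best = 0, run = 0;
        for (int[] slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // Returns {slots freed, largest contiguous free run}.
    static int[] demo() {
        // Slots 0 -> 2 -> 4 -> 6 form a live chain; slots 1, 3, 5, 7 are garbage.
        int[][] heap = { {2}, {}, {4}, {}, {6}, {}, {}, {} };
        int[] roots = { 0 };
        int freed = sweep(heap, mark(heap, roots));
        return new int[] { freed, largestFreeRun(heap) };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("freed=" + result[0] + ", largest run=" + result[1]); // prints "freed=4, largest run=1"
    }
}
```

Four slots are free after the sweep, yet no two of them are adjacent, so a request for two contiguous slots would fail in this model.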

2. Copying algorithm

The copying algorithm addresses the efficiency problem by dividing memory into two equal areas, only one of which is in use at a time:

  • Active area
  • Free area

① Algorithm Description

As shown in the figure:

  • Before reclamation: memory is divided into a left and a right area. The left area is active; the right area is free and temporarily unused.
  • During reclamation: the dead objects in the left area (black) are reclaimed and the 4 live objects (light gray) are moved to the free area on the right. Two things happen in this step:
    • Live objects moved to the free area are laid out in memory address order.
    • References that pointed to a live object’s old address are updated to point to its new address.
  • After reclamation: the free area on the right becomes the active area, and the active area on the left becomes the free area.

The left and right areas swap roles in this way after every collection.
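The copy step and the address update can be sketched with two arrays standing in for the two areas. This is a simplified model under assumptions: the live flags are given as input instead of being computed from GC Roots, and the forwarding table is a hypothetical helper structure that maps each old index to its new one.

```java
import java.util.Arrays;

// Toy semi-space copying collector over two arrays (illustrative only).
class CopyingDemo {
    // Copies live slots from the active area to the free area in address order,
    // clears the active area, and returns a forwarding table: old index -> new
    // index, or -1 for slots that were reclaimed.
    static int[] copyLive(String[] active, boolean[] live, String[] free) {
        int[] forwarding = new int[active.length];
        Arrays.fill(forwarding, -1);
        int next = 0;
        for (int i = 0; i < active.length; i++) {
            if (active[i] != null && live[i]) {
                free[next] = active[i];
                forwarding[i] = next++;
            }
            active[i] = null; // the whole active area is released at once
        }
        return forwarding;
    }

    public static void main(String[] args) {
        String[] left = { "a", "b", "c", "d", "e", "f" };
        String[] right = new String[left.length];
        boolean[] live = { true, false, true, false, false, true };

        int rootRef = 5; // a root that refers to "f" at its old address
        int[] forwarding = copyLive(left, live, right);
        rootRef = forwarding[rootRef]; // update the reference to the new address

        System.out.println(Arrays.toString(right));      // prints "[a, c, f, null, null, null]"
        System.out.println(Arrays.toString(forwarding)); // prints "[0, -1, 1, -1, -1, 2]"
        System.out.println("root now points at index " + rootRef); // prints "root now points at index 2"
        // For the next cycle the roles swap: right is active, left is free.
    }
}
```

The survivors end up packed at the start of the new active area, so no fragmentation is left behind, at the cost of keeping one area idle.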

② Algorithm defects

  1. Obviously, this algorithm wastes half of the memory.
  2. When 100% of the objects in the active area are still alive, every one of them has to be copied to the free area during reclamation, which is inefficient.

③ Algorithm application

IBM research shows that 98% of the objects in the young generation of the Java heap are short-lived, such as temporary variables with a small scope. The virtual machine therefore does not need to divide the two areas in a 1:1 ratio.

In current JVMs, the young generation is divided into one larger Eden space and two smaller Survivor spaces (from and to). Eden and one Survivor space (from) together form the active area. When memory is reclaimed, the surviving objects in these two spaces are copied to the other Survivor space (to).

In the HotSpot virtual machine, the default size ratio of Eden to one Survivor space is 8:1, so the young generation is split 8:1:1. The active area therefore accounts for (8+1)/10 × 100% = 90% of the young generation, and only 10% of the memory is held in reserve.
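The 8:1:1 arithmetic generalizes to any ratio. The sketch below computes the split for a given young generation size; YoungGenLayout and its methods are hypothetical names, and the real HotSpot sizing (controlled by the -XX:SurvivorRatio flag) also applies alignment and adaptive policies that this calculation ignores.

```java
// Back-of-the-envelope Eden/Survivor sizing (ignores HotSpot alignment and adaptive sizing).
class YoungGenLayout {
    // Splits the young generation into {eden, from, to} for Eden:Survivor = ratio:1.
    static long[] split(long youngSize, int survivorRatio) {
        long survivor = youngSize / (survivorRatio + 2);
        long eden = youngSize - 2 * survivor;
        return new long[] { eden, survivor, survivor };
    }

    // Fraction of the young generation usable at any time (Eden plus one Survivor).
    static double usableFraction(int survivorRatio) {
        return (survivorRatio + 1) / (double) (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long[] parts = split(100, 8); // a 100 MB young generation with ratio 8
        System.out.println("eden=" + parts[0] + " from=" + parts[1] + " to=" + parts[2]); // prints "eden=80 from=10 to=10"
        System.out.println("usable=" + usableFraction(8)); // prints "usable=0.9"
    }
}
```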

Old generation: when live objects are copied from the active area (Eden and from) to the to space and the to space does not have enough room, the remaining live objects are placed in the old generation.
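The overflow rule can be sketched as follows. This is a deliberately simplified model: every object is assumed to occupy one unit of space, minorGc is a hypothetical helper, and real collectors also promote objects for other reasons (for example, object age) that are not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of survivors overflowing from the to space into the old generation.
class MinorGcDemo {
    // Copies survivors into the to space until it is full; the rest go to the old generation.
    static void minorGc(List<String> survivors, int toCapacity,
                        List<String> toSpace, List<String> oldGen) {
        for (String obj : survivors) {
            if (toSpace.size() < toCapacity) {
                toSpace.add(obj);
            } else {
                oldGen.add(obj); // the to space is full
            }
        }
    }

    public static void main(String[] args) {
        List<String> survivors = Arrays.asList("o1", "o2", "o3", "o4", "o5");
        List<String> toSpace = new ArrayList<>();
        List<String> oldGen = new ArrayList<>();

        minorGc(survivors, 3, toSpace, oldGen);

        System.out.println("to space: " + toSpace); // prints "to space: [o1, o2, o3]"
        System.out.println("old gen:  " + oldGen);  // prints "old gen:  [o4, o5]"
    }
}
```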

3. Mark-compact (mark-tidy) algorithm

Mark-compact is mainly used in the old generation.

① Algorithm Description

This algorithm is similar to the mark-sweep algorithm and also goes through two stages:

  1. Mark stage: identical to the mark stage of mark-sweep; it marks the objects to be reclaimed.
  2. Compact stage: all surviving objects are moved toward one end of the memory area, keeping their memory address order, and everything beyond the end boundary is reclaimed.
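The compact stage can be sketched as a sliding pass over an array of slots. As before, this is a toy model under assumptions: the marks for surviving objects are supplied as input, and the reference fix-up that a real collector performs after moving objects is left out.

```java
import java.util.Arrays;

// Toy sliding compaction over an array of slots (illustrative only).
class MarkCompactDemo {
    // Slides surviving slots toward index 0 in address order, reclaims everything
    // past the boundary, and returns the boundary (the number of survivors).
    static int compact(String[] heap, boolean[] survives) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && survives[i]) {
                heap[next++] = heap[i];
            }
        }
        for (int i = next; i < heap.length; i++) {
            heap[i] = null; // the area beyond the boundary is reclaimed
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", "f" };
        boolean[] survives = { true, false, true, false, false, true };

        int boundary = compact(heap, survives);

        System.out.println(Arrays.toString(heap)); // prints "[a, c, f, null, null, null]"
        System.out.println("boundary=" + boundary); // prints "boundary=3"
    }
}
```

Unlike the mark-sweep result, the free slots here form one contiguous block, and unlike the copying algorithm, no second area has to be kept in reserve.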

② Algorithm application

Objects in the old generation have a high survival rate, and there is no extra free area to fall back on, so the old generation uses the mark-sweep and mark-compact algorithms.

Reference: In-Depth Understanding of the Java Virtual Machine