Determines whether the object is alive

Reference counting

Each object has a reference counter that is incremented by one when a reference is referenced and decayed by one when a reference is invalid. A counter of 0 indicates that the object can be reclaimed.

Advantages: simple counting, high efficiency

Disadvantages: It is easy for two objects to reference each other and cannot be reclaimed

Accessibility analysis

GC Root

Available as objects for GC Roots:

  1. The object referenced in the virtual machine stack (the local variable table in the stack frame)
  2. The object referenced by the class static property in the method area
  3. The object referenced by the constant in the method area
  4. Objects referenced by JNI (Native methods) in the Native method stack

You start with a root object and iterate through it, and anything that you don’t iterate through is the recyclable object. Simply put, the root object is the vertex of the tree, traverses the tree, and is recyclable if it’s not in the tree. Figure F below is recyclable.

graph TB

A((GC Root))
B((B))
C((C))
D((D))
E((E))
A-->B
A-->C
A-->D
B-->E
F((F))
Tricolor marking

Each object, can be divided into black and white ash three color White: hasn’t been garbage collector tag object grey: itself has been marked, but it has a member variable has not been marked black: itself has been marked, and all the members of the object itself variables has also been marked Finally the white object is can be recycled.

More than theDuring the marking process, objects that have been marked as active become objects that should be reclaimed. C loses the reference to B during the marking process and becomes floating garbage.

Missing mark: as shown in the figure below, in the marking process, B was marked, but the member variable of B had not been marked. At this time, the reference between B and C was broken. A and C established A reference. Solution: When A adds A reference, gray it out.

Three basic garbage collection algorithms

Mark-clear, mark-compress and copy algorithms are the basic algorithms, while other algorithms are basically combinations or improvements of these algorithms.

  1. Mark-clean: Starting from GC Root, all live objects are marked, and unmarked objects are reclaimed. Disadvantages: Large memory fragmentation can occur, resulting in sufficient memory space but large object memory may fail to allocate contiguous memory space.

  1. Mark-compress: Starting from GC Root, mark all live objects, then move the mark objects to the other end. Advantages: No fragmentation problems Disadvantages: The compression/collation phase requires extra time compared to mark clearing

  1. Copy algorithm: divide the memory into two pieces, use one piece at a time, when one piece is not fit, copy the remaining piece to the other, and then clear the space of the first piece. Two pieces alternate.

    Advantages: High efficiency and no memory fragmentation

    Disadvantages: Only half of the memory space can be used at a time, and the space utilization is low

Generational collection algorithms in Java

Young generation: replication algorithm is adopted. It is composed of Eden region and two Survivor regions, with a ratio of 8:1:1. The two Survivor regions are also called from region and to region.

The newly generated objects apply for space from Eden. When Eden space is insufficient, Minor GC is triggered to copy Eden and Survivable objects in Survivor0 to Survivor1. If Survivor1 space is insufficient, survivable objects are directly placed in the old age. And then empty Eden and Survivor0, and swap Survivor0 and Survivor1. The age of the GC object is incremented by 1 each time the age reaches a certain value (15 by default).

In the old days: use mark-clean or mark-tidy algorithms

Various garbage collectors

STW: Stop The World, which means all other worker threads are paused while The garbage is collected. When there is no STW, there may be multiple marks and missing marks.


Serial

For young generation, single thread, copy algorithm, STW required

Serial Old

Used with Serial for older ages


Parallel Scavenge

For the young generation, multi-threading, the main concern is throughput, using the replication algorithm, requires STW throughput = run user code time /(run user code time + garbage collection time

Parallel Old

To apply to the older age


ParNew

For younger generations, a multithreaded version of Serial, using a copy algorithm, requires STW

CMS (Concurrent Mark Sweep)

Concurrent token cleanup, where concurrency is the worker thread and garbage collection thread running at the same time, CMS was running in the old days. Use with ParNew. The main steps are as follows: 1. Initial tag: STW, mainly marked GC Root, because there are not many GC Root, so this stage is not time-consuming. 2. Concurrent marking: The worker thread and the garbage collector thread run at the same time. There is no STW, so there will be missed or wrong marking. 3. Relabelling: STW, which is mainly used to correct the mismarked objects in the process of concurrent marking. 4. Concurrent cleanup: The worker thread and the garbage collector thread run simultaneously.

Incremental Update: To solve the missing mark problem, when A new reference to an object A, turn A gray, and iterate again. The following figure

This solution can lead to the problem that during the concurrent tagging phase, when multiple garbage collection threads are executing, flags can be missed, so flags need to be traversed from the GC Root during the re-tagging phase.

  1. Garbage collection thread 1: B is marked and C is started
  2. Worker thread 1: points D to B
  3. Garbage collector Thread 2: Grays B
  4. Garbage collection thread 1: marks C, turns A black, and misses D


G1

  1. The initial tag is STW
  2. Concurrent tags
  3. Relabel STW
  4. Filter collection STW, multiple garbage collection threads concurrently

As shown in the figure above, memory is divided into regions, and the gray part is the unallocated memory. If the size of an object exceeds half of the size of a region, it is allocated to the old region. Each region has only one large object. The young generation contains Eden Region and Survivor Region. Different from CMS, the space of the young generation will change dynamically. When the Eden Region space is insufficient, a new Eden Region will be allocated from the free memory space.

Remembered Sets (RSet)

A region in each region records other region objects that reference the current region

Collection Set (CSet)

The collection of regions to be cleared by the GC

Yonug GC (Young generation GC)

The replication algorithm is triggered when the Eden space is insufficient to store new objects. After the YONG GC is executed, the surviving objects will be copied survivor or old age.

Mixed GC

GC is carried out for the young generation, and the old generation with high recovery returns is selected for recycling, and the replication algorithm is adopted for the old generation.

Snapshot-At-The-Beginning(SATB)

Solve missing label problems in CMS

When a reference is deleted, the state of the parent object is not changed. Instead, a connection is made from GC Root to the deleted objects, which are marked the next time we remark.


All see here, wechat search [order member said] pay attention to the public number, continue to get the latest article