Take a look at my notes on In-depth Understanding of the Java Virtual Machine

  • In-depth Understanding of the Java Virtual Machine Notes: Chapter2
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter3 (Garbage Collector)
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter3 (Memory Allocation Policy)
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter4
  • In-depth Understanding of the Java Virtual Machine Notes: Supplement (Common JVM Parameter Settings)
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter7
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter8
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter11
  • In-depth Understanding of the Java Virtual Machine Notes: Chapter12

How is the memory in the JVM divided? (see Chapter2)

Memory in the JVM is divided into five main areas: the method area, the heap, the program counter, the virtual machine stack, and the native method stack.

  • Method area: The method area is shared between threads. It stores information about classes that have been loaded by the VM, together with constants, static variables, and JIT-compiled code. In HotSpot before JDK 8 it was implemented as the permanent generation, so it is often called by that name. Garbage collection in this area reclaims relatively little.
  • Heap memory: The heap is the main area managed by the garbage collector and is also shared between threads. It mainly stores object instances; -Xmx and -Xms control its maximum and initial size.
  • Virtual machine stack (stack memory): The stack mainly holds local variables: values of primitive types and references to objects in the heap. Each method invocation creates a stack frame that stores the local variable table, operand stack, dynamic link, method return address, and so on. Frames are pushed onto and popped off the stack as methods are entered and exited.
  • Program counter: The program counter indicates the position of the bytecode being executed by the current thread. The bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. It is the only memory area for which the virtual machine specification does not define any OutOfMemoryError condition.
  • Native method stack: Serves the native methods that the JVM invokes, in the same way the virtual machine stack serves Java methods.
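As a rough way to see some of these areas on a running JVM, the sketch below lists the memory pools exposed through the standard java.lang.management API. The class name MemoryAreasDemo and the helper countPools are made up for illustration, and the pool names printed are implementation-specific, so they will not map one-to-one onto the five areas above (thread stacks and program counters, for instance, are not reported as pools).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class MemoryAreasDemo {
    /** Counts the pools the running JVM reports for the given type (HEAP or NON_HEAP). */
    static int countPools(MemoryType type) {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
        for (MemoryPoolMXBean pool : pools) {
            // Pool names depend on the JVM and the collector in use.
            System.out.println(pool.getType() + "  " + pool.getName());
        }
    }
}
```

Heap pools correspond to the generations of the heap; non-heap pools cover areas such as the class metadata and the compiled-code cache.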

Memory allocation during object creation (see Chapter12)

In general, we use the new instruction to create objects. When the virtual machine encounters a new instruction, it first checks whether the operand of the instruction resolves to a symbolic reference to a class in the constant pool, and whether the class that reference represents has already been loaded, resolved, and initialized. If not, the class loading process is performed first.

Once the class has been loaded, verified, prepared, resolved, and initialized, memory is allocated for the object: a block of the required size is carved out of the Java heap, and the object is created in it.

Object memory can be allocated in two ways: pointer collision (also known as bump-the-pointer) and free list.

Pointer collision mode:

Assume the memory in the Java heap is perfectly regular: used memory sits on one side, free memory on the other, and a pointer in the middle marks the boundary. Allocating memory then just means moving that pointer toward the free side by a distance equal to the size of the object.
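A minimal sketch of this idea, assuming a simulated heap in which integer offsets stand in for addresses. The class BumpPointerHeap is hypothetical and only models the bookkeeping, not how a real VM lays out memory.

```java
// Hypothetical model of pointer-collision allocation; offsets stand in for addresses.
final class BumpPointerHeap {
    private final int capacity; // total bytes in the simulated heap
    private int top = 0;        // boundary between used (below) and free (above) memory

    BumpPointerHeap(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new object, or -1 if it does not fit. */
    int allocate(int size) {
        if (size <= 0 || size > capacity - top) {
            return -1; // a real VM would trigger a garbage collection here
        }
        int start = top;
        top += size; // moving the pointer is the whole allocation
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpPointerHeap heap = new BumpPointerHeap(64);
        System.out.println(heap.allocate(16)); // 0
        System.out.println(heap.allocate(24)); // 16
        System.out.println(heap.allocate(32)); // -1, only 24 bytes remain
    }
}
```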

Free list mode:

If the Java heap is not regular, and used and free memory are interleaved, the virtual machine must maintain a list recording which memory blocks are available. At allocation time it finds a block large enough for the object instance and then updates the list.
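The free-list approach can be sketched in the same simulated style. FreeListHeap is a hypothetical class, and first-fit is just one possible search policy chosen here for brevity; the notes do not say which policy a VM uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical first-fit free list; each entry is {startOffset, length}.
final class FreeListHeap {
    private final List<int[]> freeBlocks = new ArrayList<>();

    FreeListHeap(int[][] initialFreeBlocks) {
        for (int[] block : initialFreeBlocks) {
            freeBlocks.add(new int[] {block[0], block[1]});
        }
    }

    /** Returns the start offset of the allocated space, or -1 if no block is large enough. */
    int allocate(int size) {
        for (int i = 0; i < freeBlocks.size(); i++) {
            int[] block = freeBlocks.get(i);
            if (block[1] >= size) {
                int start = block[0];
                if (block[1] == size) {
                    freeBlocks.remove(i); // block fully consumed
                } else {
                    block[0] += size;     // shrink the block from the front
                    block[1] -= size;
                }
                return start;
            }
        }
        return -1;
    }

    int totalFree() {
        int total = 0;
        for (int[] block : freeBlocks) {
            total += block[1];
        }
        return total;
    }

    public static void main(String[] args) {
        FreeListHeap heap = new FreeListHeap(new int[][] {{0, 8}, {32, 16}, {100, 8}});
        System.out.println(heap.allocate(12)); // 32, the first block that is large enough
        System.out.println(heap.allocate(8));  // 0
        System.out.println(heap.totalFree());  // 12
        System.out.println(heap.allocate(10)); // -1, free space is split into 4 + 8
    }
}
```

The last call fails even though 12 bytes are free in total, which is the fragmentation cost of a heap that is never compacted.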

Note that whether the Java heap is regular depends on whether the garbage collector in use is able to compact memory, which we will focus on in a later section.

So how can memory allocation be thread-safe?

  • Synchronize the allocation itself: the virtual machine uses CAS plus retry on failure to make the pointer update atomic.
  • Partition allocation by thread: each thread is pre-allocated a small chunk of heap memory called a thread-local allocation buffer (TLAB) and allocates from it without contention. Synchronization is needed only when a TLAB is used up and a new one must be allocated. Whether the VM uses TLABs is set with -XX:+/-UseTLAB.
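The first option can be sketched with java.util.concurrent.atomic.AtomicInteger, whose compareAndSet is a CAS operation. CasBumpAllocator and runConcurrently are hypothetical names, and the sketch covers only the CAS-plus-retry idea, not TLABs.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical thread-safe pointer-collision allocator using CAS + retry.
final class CasBumpAllocator {
    private final int capacity;
    private final AtomicInteger top = new AtomicInteger(0);

    CasBumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new object, or -1 if it does not fit. */
    int allocate(int size) {
        while (true) {
            int current = top.get();
            if (size > capacity - current) {
                return -1;
            }
            if (top.compareAndSet(current, current + size)) {
                return current; // this thread won the race for [current, current + size)
            }
            // Another thread moved the pointer first: loop and retry with the new value.
        }
    }

    int used() {
        return top.get();
    }

    /** Runs the given number of threads against one allocator and returns the bytes used. */
    static int runConcurrently(int threadCount, int allocationsPerThread, int size) {
        CasBumpAllocator heap = new CasBumpAllocator(threadCount * allocationsPerThread * size);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < allocationsPerThread; n++) {
                    heap.allocate(size);
                }
            });
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return heap.used();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 1000, 8)); // 32000: no allocation was lost
    }
}
```

A TLAB avoids even this retry loop for most allocations, because each thread bumps a private pointer and touches shared state only when it needs a new buffer.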

How objects are found when accessed (see Chapter2)

When an object is created, there is a reference variable in stack memory that refers to a specific object instance in heap memory.

The Java Virtual Machine specification does not specify how this reference variable should locate and access specific objects in heap memory. At present, there are two common object access modes, namely handle access mode and direct pointer access mode, which are described as follows.

Handle access:

The reference variable stores the address of the object's handle, and the handle holds the addresses of the instance data and of the type data. When garbage collection moves an object, only the pointer inside the handle changes, so the reference itself stays stable. The cost is an extra level of indirection on every access, which makes it slower.

Direct pointer access:

The reference variable stores the address of the object directly, so the object is reached with a single pointer dereference. This saves the cost of one pointer lookup and is faster. Sun HotSpot uses direct pointers to access objects.
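The trade-off of handle access can be illustrated with a toy handle table. HandleAccessSketch is a hypothetical model in which integers stand in for both handles and addresses; it is not how any particular VM implements handles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical handle table: a reference holds a handle id, the table holds the address.
final class HandleAccessSketch {
    private final Map<Integer, Integer> handleTable = new HashMap<>();
    private int nextHandle = 1;

    /** Registers an object at the given address and returns its stable handle. */
    int register(int address) {
        int handle = nextHandle++;
        handleTable.put(handle, address);
        return handle;
    }

    /** Every access pays this extra lookup before reaching the object. */
    int resolve(int handle) {
        return handleTable.get(handle);
    }

    /** Called when the collector moves the object: only the table entry changes. */
    void moveObject(int handle, int newAddress) {
        handleTable.put(handle, newAddress);
    }

    public static void main(String[] args) {
        HandleAccessSketch heap = new HandleAccessSketch();
        int reference = heap.register(100);
        heap.moveObject(reference, 40);              // the object is relocated by GC
        System.out.println(heap.resolve(reference)); // 40, the reference value itself is unchanged
    }
}
```

With direct pointers there is no table: the lookup disappears, but every reference to a moved object has to be updated by the collector.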

Memory allocation and garbage collection (see Chapter3)

JVM memory can be divided into heap and non-heap memory. The heap is divided into the young generation and the old generation. The young generation is further divided into one Eden region and two Survivor regions.

JVM heap memory allocation:

The initial heap size is specified by -Xms and defaults to 1/64 of physical memory. The maximum heap size is specified by -Xmx and defaults to 1/4 of physical memory.

By default, when free heap memory falls below 40% (-XX:MinHeapFreeRatio), the JVM grows the heap, up to the -Xmx limit. When free heap memory rises above 70% (-XX:MaxHeapFreeRatio), the JVM shrinks the heap, down to the -Xms limit. For this reason -Xms is usually set equal to -Xmx, which avoids resizing the heap after each GC.

The parameter -Xmn2g sets the size of the young generation to 2 GB. The parameter -XX:SurvivorRatio sets the ratio of the Eden region to one Survivor region in the young generation. If it is set to 8, Eden is 8 times the size of one Survivor region. Because the young generation contains two Survivor regions, the split is Eden : Survivor : Survivor = 8 : 1 : 1.
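The arithmetic behind these two parameters can be worked through in a few lines. YoungGenSizing is a hypothetical helper that applies the ratio exactly; a real JVM also rounds the regions to alignment boundaries, so actual sizes may differ slightly.

```java
// Hypothetical helper: splits the young generation into Eden and two Survivor regions.
final class YoungGenSizing {
    /** One Survivor region: the young generation is (ratio + 2) equal parts, Eden takes ratio of them. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden gets whatever the two Survivor regions leave over. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 2L * 1024 * 1024 * 1024; // -Xmn2g
        int ratio = 8;                         // -XX:SurvivorRatio=8
        long mb = 1024 * 1024;
        System.out.println("survivor (each): " + survivorBytes(young, ratio) / mb + " MB");
        System.out.println("eden:            " + edenBytes(young, ratio) / mb + " MB");
    }
}
```

With -Xmn2g and a ratio of 8, each Survivor region is about 205 MB and Eden about 1638 MB.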

JVM non-heap memory allocation:

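The JVM uses -XX:PermSize to set the initial size of non-heap memory (the permanent generation) and -XX:MaxPermSize to set its maximum size. The defaults are often quoted as 1/64 and 1/4 of physical memory, but the actual values depend on the platform and JVM version, so it is safer to check them with -XX:+PrintFlagsFinal. Both flags apply to JDK 7 and earlier; JDK 8 replaced the permanent generation with Metaspace, which is sized with -XX:MetaspaceSize and -XX:MaxMetaspaceSize.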

Allocation and collection of objects in heap memory:

Newly created objects are allocated in Eden first. Large objects (for example long strings and arrays) go straight to the old generation. The virtual machine provides the -XX:PretenureSizeThreshold parameter: objects larger than this value are allocated directly in the old generation, which avoids copying large amounts of memory between Eden and the two Survivor regions.

In addition, long-lived objects are promoted to the old generation. Each time an object survives a Minor GC (young generation GC), its age increases by one. By default an object is promoted to the old generation when its age reaches 15; the promotion age can be set with -XX:MaxTenuringThreshold.

Collection of objects on heap memory is also called garbage collection. When does garbage collection begin?

Garbage collection mainly means reclaiming dead objects and tidying up memory. As mentioned above, GC mostly happens in the heap, which is subdivided into the young generation and the old generation, with the young generation further divided into one Eden region and two Survivor regions. Garbage collection is divided into Minor GC, which happens in the young generation, and Full GC, which happens in the old generation, as described below.

Minor GC:

Objects are allocated in Eden first, and when Eden does not have enough space, the virtual machine triggers a Minor GC. Since most Java objects are short-lived, Minor GC is very frequent and usually very fast.

Full GC:

A Full GC is a GC that occurs in the old generation. It is triggered when the old generation does not have enough space, and it is usually accompanied by at least one Minor GC.

Next, let’s look at two important concepts related to memory allocation and reclamation.

Dynamic object age determination:

If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, then objects whose age is greater than or equal to that age can be promoted to the old generation directly, without waiting to reach the age set by -XX:MaxTenuringThreshold.
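The rule as stated above can be sketched as a small function. TenuringThreshold is a hypothetical helper that implements exactly the wording of these notes; the threshold calculation inside a real collector may differ in detail.

```java
// Hypothetical helper implementing the rule as worded in these notes.
final class TenuringThreshold {
    /**
     * bytesByAge[a] is the total size of Survivor objects of age a (index 0 is unused).
     * Returns the age from which objects are promoted.
     */
    static int compute(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            if (bytesByAge[age] * 2 > survivorCapacity) {
                return age; // this age alone fills more than half of the Survivor space
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long survivorCapacity = 100;
        // Age 2 occupies 60 of 100 units, so objects of age >= 2 are promoted early.
        System.out.println(compute(new long[] {0, 10, 60, 5}, survivorCapacity, 15)); // 2
        // No single age exceeds half the space, so the configured maximum applies.
        System.out.println(compute(new long[] {0, 10, 20, 5}, survivorCapacity, 15)); // 15
    }
}
```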

Space allocation guarantee:

When a Minor GC occurs, the virtual machine checks whether the average size of objects promoted to the old generation in previous collections is greater than the space remaining in the old generation. If it is greater, a Full GC (old-generation GC) is performed instead. If it is smaller, the VM checks the HandlePromotionFailure setting to see whether a guarantee failure is allowed: if it is, only the Minor GC is performed; if it is not, a Full GC is performed instead.
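The decision described above is a two-step check, sketched below. PromotionGuarantee and its Decision enum are hypothetical names, and the logic mirrors the rule as these notes state it rather than the source code of any specific JDK release.

```java
// Hypothetical model of the space allocation guarantee check.
final class PromotionGuarantee {
    enum Decision { MINOR_GC, FULL_GC }

    static Decision decide(long averagePromotedBytes, long oldGenFreeBytes,
                           boolean handlePromotionFailure) {
        if (averagePromotedBytes > oldGenFreeBytes) {
            return Decision.FULL_GC; // the old generation probably cannot absorb the survivors
        }
        // Enough room on average: risk a Minor GC only if guarantee failure is allowed.
        return handlePromotionFailure ? Decision.MINOR_GC : Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(50, 40, true));  // FULL_GC
        System.out.println(decide(30, 40, true));  // MINOR_GC
        System.out.println(decide(30, 40, false)); // FULL_GC
    }
}
```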

How does the JVM determine whether an object should be reclaimed? (see Chapter3)

Reference counting method:

Reference counting is a relatively old collection algorithm. Each object keeps a counter: the counter is incremented whenever a new reference to the object is created and decremented whenever a reference is removed. During garbage collection, only objects whose counter is 0 are collected. The fatal weakness of this algorithm is that it cannot handle circular references.
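The circular-reference problem is easy to reproduce in a toy model. RefCountCycleDemo is hypothetical: it maintains the counters by hand, since the JVM itself does not expose reference counts.

```java
// Hypothetical hand-rolled reference counting, to show why cycles are never freed.
final class RefCountCycleDemo {
    static final class Node {
        int refCount; // number of references currently pointing at this node
        Node field;   // the node's single outgoing reference
    }

    /** Points holder.field at target and keeps both counters up to date. */
    static void setField(Node holder, Node target) {
        if (target != null) {
            target.refCount++;
        }
        if (holder.field != null) {
            holder.field.refCount--;
        }
        holder.field = target;
    }

    /** Builds a <-> b, drops the two external references, and returns the leftover counts. */
    static int[] countsAfterDroppingRoots() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++; // external reference, such as a local variable
        b.refCount++;
        setField(a, b);
        setField(b, a);
        a.refCount--; // the local variables go out of scope
        b.refCount--;
        return new int[] {a.refCount, b.refCount};
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingRoots();
        // Both print 1: neither counter reaches 0, although nothing outside can reach a or b.
        System.out.println(counts[0]);
        System.out.println(counts[1]);
    }
}
```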

Root search method (reachability analysis):

The basic idea of the root search method is to start from a set of objects that serve as roots and search downward along their references. An object that is not connected to any root by a chain of references is proven to be reclaimable.

The following objects are considered root objects (important):

  • Objects referenced from stack memory
  • Objects referenced by static fields and constants in the method area
  • Classes loaded by the bootstrap class loader, and the objects they create
  • Objects referenced through JNI in native methods
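A minimal sketch of the search, assuming the object graph is given as a map from object name to the names it references. Reachability and demoGraph are hypothetical, and a breadth-first traversal is used here only because it is short to write.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical object graph traversal: everything not returned is reclaimable.
final class Reachability {
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String object = pending.poll();
            if (marked.add(object)) { // visit each object once, so cycles terminate
                pending.addAll(refs.getOrDefault(object, Collections.emptyList()));
            }
        }
        return marked;
    }

    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("stackLocal", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D")); // C and D reference only each other
        refs.put("D", Arrays.asList("C"));
        return refs;
    }

    public static void main(String[] args) {
        // Prints [stackLocal, A, B]; the C <-> D cycle is unreachable and can be reclaimed.
        System.out.println(reachable(demoGraph(), Arrays.asList("stackLocal")));
    }
}
```

Unlike reference counting, the cycle between C and D causes no trouble, because liveness is decided by reachability from the roots rather than by a per-object counter.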

Object reference

If the value stored in a variable of reference type is the starting address of another chunk of memory, that variable is said to hold a reference to the chunk. Since JDK 1.2, references are divided into four types: strong, soft, weak, and phantom.

  • Strong references: The ordinary kind, as in P p = new P(). As long as a strong reference exists, the garbage collector will never reclaim the referenced object.
  • Soft references: Implemented by the SoftReference class. Softly referenced objects are reclaimed when memory is about to run out.
  • Weak references: Implemented by the WeakReference class. A weakly referenced object survives only until the next garbage collection, regardless of whether memory is sufficient.
  • Phantom references: Also known as ghost references, implemented by the PhantomReference class. The only purpose of a phantom reference is to receive a system notification when the object is reclaimed.
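The three reference classes live in java.lang.ref and can be tried directly. In the sketch below, ReferenceKinds and whileStronglyHeld are hypothetical names; the part after System.gc() is deliberately left without a guaranteed result, because a GC request is only a hint.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo of the java.lang.ref reference classes.
final class ReferenceKinds {
    /** While a strong reference exists, soft and weak references still return the object. */
    static boolean[] whileStronglyHeld() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        PhantomReference<Object> phantom = new PhantomReference<>(strong, new ReferenceQueue<>());
        boolean softSees = soft.get() == strong;
        boolean weakSees = weak.get() == strong;
        boolean phantomHidden = phantom.get() == null; // a phantom reference never returns its referent
        return new boolean[] {softSees, weakSees, phantomHidden};
    }

    public static void main(String[] args) {
        boolean[] result = whileStronglyHeld();
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // true true true

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // drop the only strong reference
        System.gc();   // a request, not a guarantee
        System.out.println(weak.get()); // typically null after a collection, but not guaranteed
    }
}
```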

What are the JVM garbage collection algorithms? (see Chapter3)

Basis: mark-sweep algorithm

  • Algorithm description:
    • Mark all objects that need to be reclaimed.
    • After marking, reclaim all marked objects in one pass, which leaves scattered areas of free memory behind.
  • Shortcomings:
    • Efficiency problem: Both the marking and the sweeping process are inefficient.
    • Space fragmentation problem: Sweeping can leave a large number of non-contiguous memory fragments, so that later, when a larger object needs to be allocated, no contiguous block of sufficient size can be found and another GC is triggered prematurely.
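The fragmentation problem shows up even in a tiny simulation. In the sketch below, MarkSweepDemo is hypothetical: the heap is an array of slots holding object ids (0 means free), and the set passed to sweep holds the ids that survive, which is simply the complement of the marked garbage.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical slot-based heap: heap[i] is the id of the object in slot i, 0 means free.
final class MarkSweepDemo {
    /** Sweep phase: frees every occupied slot whose object is not in the live set. */
    static int[] sweep(int[] heap, Set<Integer> live) {
        int[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != 0 && !live.contains(after[i])) {
                after[i] = 0;
            }
        }
        return after;
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int slot : heap) {
            if (slot == 0) {
                free++;
            }
        }
        return free;
    }

    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 3, 3, 4, 5, 5};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5));
        int[] after = sweep(heap, live);
        System.out.println(Arrays.toString(after)); // [1, 1, 0, 3, 3, 0, 5, 5]
        System.out.println(totalFree(after));       // 2
        System.out.println(largestFreeRun(after));  // 1: a 2-slot object no longer fits
    }
}
```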

Solving the efficiency problem: copying algorithm

  • Algorithm description:
    • Divide the available memory into two equal halves and use only one at a time.
    • When one half is used up, copy the surviving objects to the other half and then clean up the used half in one pass.
  • Shortcoming: Available memory is cut in half. The algorithm suits the young generation, where only a few objects survive each GC.
  • Ways to save memory:
    • 98% of objects in the young generation die soon after they are created, so there is no need to divide the memory in a 1:1 ratio;
    • Partition memory into:
      • A relatively large Eden region;
      • 2 smaller Survivor regions;
    • Use Eden and 1 Survivor region at a time;
    • When collecting, copy the surviving objects in those two regions to the other Survivor region, then empty both of them.
    • JVM parameter setting: -XX:SurvivorRatio=8 means that the size of Eden divided by the size of one Survivor region equals 8.
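A rough sketch of one young-generation collection under this layout. CopyingDemo is hypothetical: lists of names stand in for the regions, and the liveness test is passed in as a predicate rather than computed from roots.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: lists of object names stand in for Eden and the Survivor regions.
final class CopyingDemo {
    /** Copies survivors of Eden and the in-use Survivor region into the empty one. */
    static List<String> minorGc(List<String> eden, List<String> from, Predicate<String> isLive) {
        List<String> to = new ArrayList<>();
        for (String object : eden) {
            if (isLive.test(object)) {
                to.add(object);
            }
        }
        for (String object : from) {
            if (isLive.test(object)) {
                to.add(object);
            }
        }
        eden.clear(); // both source regions are emptied in one step
        from.clear();
        return to;
    }

    /** Fraction of the young generation usable at any time: Eden plus one Survivor region. */
    static double usableFraction(int survivorRatio) {
        return (survivorRatio + 1.0) / (survivorRatio + 2.0);
    }

    public static void main(String[] args) {
        List<String> eden = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> from = new ArrayList<>(Arrays.asList("d"));
        List<String> to = minorGc(eden, from, o -> o.equals("b") || o.equals("d"));
        System.out.println(to);                // [b, d]
        System.out.println(usableFraction(8)); // about 0.9, versus 0.5 for a 1:1 split
    }
}
```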

Solving the space fragmentation problem: mark-compact algorithm

  • Algorithm description:
    • The marking phase is the same as in the mark-sweep algorithm;
    • After marking, all live objects are moved toward one end, and the memory beyond the boundary is then cleaned up.
  • Shortcoming: It has efficiency problems of its own because objects have to be moved. It suits the old generation.
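Compaction can be sketched on a slot array in which each slot holds an object id and 0 means free. MarkCompactDemo is a hypothetical class, and the sliding shown here is the simplest possible variant.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical slot-based heap: heap[i] is the id of the object in slot i, 0 means free.
final class MarkCompactDemo {
    /** Slides every live object toward the start; everything past the boundary is free. */
    static int[] compact(int[] heap, Set<Integer> live) {
        int[] result = new int[heap.length]; // all zeros, meaning free
        int next = 0;
        for (int id : heap) {
            if (id != 0 && live.contains(id)) {
                result[next++] = id;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 3, 3, 4, 5, 5};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5));
        // Prints [1, 1, 3, 3, 5, 5, 0, 0]: the two free slots are now contiguous.
        System.out.println(Arrays.toString(compact(heap, live)));
    }
}
```

The price of the contiguous free space is that every surviving object may have to be moved and every reference to it updated.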

Evolution: Generational collection algorithms

  • Young generation: Only a few objects survive each GC, so the copying algorithm is used.
  • Old generation: Objects have a high survival rate after GC, so the mark-compact algorithm is used.

Garbage collector (see Chapter3, with emphasis on G1 and CMS)

Seven garbage collectors

A garbage collector is a concrete implementation of memory reclamation, and HotSpot has seven of them. There are so many because each has its own application scenarios. Some are young-generation collectors and some are old-generation collectors, so they are generally used in combination (except for the all-purpose G1). For a brief introduction and classification, see the following figure.

Serial collector

  • Single-threaded collector
  • All other worker threads must be suspended while garbage collection is taking place (just as nobody may keep producing garbage in a room while it is being cleaned)
  • Simple and efficient, because it does nothing but collect garbage
  • The default young-generation collector for virtual machines running in Client mode

ParNew collector

  • The ParNew collector is essentially a multithreaded version of the Serial collector and is the preferred young-generation collector for virtual machines running in Server mode.

  • Parallel: Multiple garbage collection threads work in parallel while the user threads are still waiting.

  • Concurrent: User threads and garbage collection threads execute at the same time (not necessarily in parallel; they may alternate). The user program keeps running while the garbage collector runs on another CPU.

Parallel Scavenge collector

The Parallel Scavenge collector is a parallel, multithreaded young-generation collector that uses the copying algorithm and focuses on throughput. In addition, its adaptive adjustment strategy is an important difference between the Parallel Scavenge collector and the ParNew collector.

Serial Old collector

The old-generation version of the Serial collector.

Parallel Old collector

The old-generation version of the Parallel Scavenge collector.

Here is how the garbage collectors work together

Serial/ParNew is paired with the Serial Old collector

  • The Serial collector is the default young-generation collector for virtual machines in Client mode. It is simple and efficient, and it works well on a single CPU.

  • The ParNew collector is a multithreaded version of the Serial collector. Although it offers little other innovation, it is the preferred young-generation collector for many virtual machines running in Server mode, because apart from the Serial collector it is the only one that works with the CMS collector.

Parallel Scavenge is paired with the Parallel Old collector

These two collectors are designed to be used together. Their focus also differs from that of the other collectors: collectors such as CMS try to minimize the pause time of user threads during garbage collection, whereas the Parallel Scavenge collector aims for a controllable throughput.

Throughput = time running user code / (time running user code + garbage collection time)

Both the young-generation collector (Parallel Scavenge) and the old-generation collector (Parallel Old) use multiple threads to collect garbage in parallel, which makes the pair well suited to applications that prioritize throughput and are sensitive to CPU resources.

Adjustable VM parameters:

  • -XX:MaxGCPauseMillis: Maximum GC pause time, in milliseconds;
  • -XX:GCTimeRatio: Sets the throughput target; an integer greater than 0 and less than 100. The maximum ratio of GC time to total time is 1 / (1 + GCTimeRatio);
  • -XX:+UseAdaptiveSizePolicy: A switch parameter. When it is on, there is no need to specify -Xmn, -XX:SurvivorRatio, and similar details by hand; the VM collects performance monitoring information from the running system and adjusts these parameters dynamically.
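The throughput formula and the GCTimeRatio relation above are simple enough to check numerically. ThroughputMath is a hypothetical helper, and the sample values are chosen only to make the arithmetic easy to follow.

```java
// Hypothetical helper for the two formulas quoted above.
final class ThroughputMath {
    /** Throughput = user time / (user time + GC time). */
    static double throughput(double userTimeMs, double gcTimeMs) {
        return userTimeMs / (userTimeMs + gcTimeMs);
    }

    /** Largest share of total time that GC may take for a given -XX:GCTimeRatio. */
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 minutes of user code and 1 minute of GC: throughput of about 0.99.
        System.out.println(throughput(99 * 60_000.0, 60_000.0));
        // GCTimeRatio=19 allows GC to take at most 1/20 of total time, about 0.05.
        System.out.println(maxGcFraction(19));
    }
}
```

A larger GCTimeRatio therefore asks for a smaller share of time spent in GC, that is, higher throughput.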

CMS collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time. It is an old-generation collector based on the mark-sweep algorithm and is commonly used together with ParNew.

The CMS garbage collection process is divided into four steps:

  • Initial mark: Requires “Stop the World”. It only marks the objects directly reachable from the GC Roots, so it is fast.
  • Concurrent mark: The main marking phase, which runs concurrently with the user threads.
  • Remark: Requires “Stop the World”. It corrects the marks of objects whose state changed because the user program kept running during concurrent marking. The pause is somewhat longer than the initial mark but much shorter than concurrent marking.
  • Concurrent sweep: Reclaims the dead objects concurrently with the user threads, based on the marking result.

Advantages: concurrent collection and low pauses.

Disadvantages: very sensitive to CPU resources, unable to handle floating garbage, and prone to heavy space fragmentation after collection, which makes it hard to allocate space for large objects.

More complete explanation of the shortcomings:

  • CMS is very sensitive to CPU resources: the concurrent mark and concurrent sweep phases run alongside user threads, so application performance suffers noticeably when few CPUs are available.
  • Floating garbage is generated during collection, so CMS cannot wait until the old generation is almost full before collecting; it has to start early. The parameter -XX:CMSInitiatingOccupancyFraction controls the percentage of old-generation occupancy that triggers a collection. If this value is set too high, the memory held in reserve while CMS runs may not be enough for the program, which causes a Concurrent Mode Failure. The VM then temporarily falls back to the Serial Old collector for the old generation, resulting in longer pauses.
  • Mark-sweep produces memory fragmentation. The parameter -XX:+UseCMSCompactAtFullCollection controls whether memory is compacted before a Full GC (compaction cannot run concurrently; it is enabled by default). The parameter -XX:CMSFullGCsBeforeCompaction sets how many Full GCs without compaction are performed before one with compaction (the default is 0).

Floating garbage:

Because garbage collection runs at the same time as the application, new garbage can be produced before a collection cycle finishes. This is “floating garbage”, and it has to wait for the next collection cycle. As a result, concurrent collectors typically need to keep about 20% of their space in reserve for floating garbage.

  • Parameter settings:
    • -XX:+UseCMSCompactAtFullCollection: Defragment memory when CMS is about to perform a Full GC (on by default)
    • -XX:CMSFullGCsBeforeCompaction: How many Full GCs to run between compactions (default is 0, that is, every Full GC is followed by a compaction)

G1 collector

The G1 (Garbage First) collector is the latest result of collector development covered in these notes. It has two significant improvements over the CMS collector:

  • It is based on the mark-compact algorithm, so it does not produce space fragmentation
  • It can control pauses very precisely

G1 features:

  • Parallelism and concurrency: G1 takes full advantage of multi-CPU, multi-core hardware to shorten Stop the World pauses, and it can run collection work concurrently with the Java program.
  • Generational collection: G1 manages the entire GC heap by itself, without cooperating with other collectors, and handles newly created objects, objects that have survived for a while, and objects that have survived multiple GCs in different ways.
  • Space integration: G1 is based on the mark-compact algorithm as a whole and on the copying algorithm locally (between two regions). It produces no memory fragmentation while it runs.
  • Predictable pauses: G1 builds a predictable pause-time model, so the expected pause time can be specified in advance.

The G1 collector also has four stages of garbage collection:

  • Initial marking
  • Concurrent marking
  • Final marking
  • Screening and reclamation

In the screening and reclamation phase, the reclamation value and cost of each Region are calculated first, and the collection plan is then drawn up according to the GC pause time the user expects.

G1 divides the entire Java heap into independent regions of fixed size and tracks how much garbage has accumulated in each. It maintains a priority list in the background and, each time, within the allowed collection time, collects first the regions that yield the most garbage (hence the name Garbage First). In summary, region partitioning and prioritized region collection let the G1 collector achieve the highest possible collection efficiency in a limited amount of time.
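The prioritized selection can be sketched as a greedy choice over regions. G1SelectionSketch, its Region class, and the sample numbers are all hypothetical, and ranking by reclaimable space per millisecond of cost is an assumed reading of "value and cost"; the real G1 heuristics are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical greedy selection of regions under a pause-time budget.
final class G1SelectionSketch {
    static final class Region {
        final String name;
        final long garbageBytes; // estimated reclaimable space
        final double collectMs;  // estimated cost of collecting this region

        Region(String name, long garbageBytes, double collectMs) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.collectMs = collectMs;
        }

        double efficiency() {
            return garbageBytes / collectMs;
        }
    }

    /** Picks the most valuable regions first until the pause budget is used up. */
    static List<String> chooseCollectionSet(List<Region> regions, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<String> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region region : byValue) {
            if (spentMs + region.collectMs <= pauseBudgetMs) {
                chosen.add(region.name);
                spentMs += region.collectMs;
            }
        }
        return chosen;
    }

    static List<Region> demoRegions() {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 80, 10.0)); // efficiency 8
        regions.add(new Region("B", 30, 2.0));  // efficiency 15
        regions.add(new Region("C", 50, 20.0)); // efficiency 2.5
        regions.add(new Region("D", 10, 5.0));  // efficiency 2
        return regions;
    }

    public static void main(String[] args) {
        // With a 15 ms budget: B (2 ms) and A (10 ms) fit, C and D would exceed it.
        System.out.println(chooseCollectionSet(demoRegions(), 15.0)); // [B, A]
    }
}
```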

Subsequent updates

  • Class loading mechanism (see Chapter7)
  • Parent delegation model (see Chapter7)
  • JVM tuning