Class loading

– Loading

Loads a class into memory from its binary class file, located by the class's fully qualified name.

– Class loaders

  • Bootstrap: the bootstrap class loader. Loads the core classes in lib/rt.jar, charset.jar, and so on. It is implemented in C++.
  • Extension: the extension class loader. Loads the extension JAR packages in jre/lib/ext/*.jar.
  • Application/System: the system class loader. Loads the content specified by the classpath. It is the default class loader in Java and loads all the classes we write.
  • Custom ClassLoader: a user-defined class loader, written by the developer.

The relationship between loaders is not inheritance; it is a parent/child relationship in which each loader holds a reference to its superior.
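The hierarchy can be inspected from Java itself. The following is a minimal sketch; the class name LoaderChain and the helper chain are made up for illustration. It follows getParent() upward from a class's loader. Note that on JDK 9 and later the extension loader is replaced by a platform loader, so the printed names depend on the JDK in use.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    // Follow getParent() from a class's loader to the top of the hierarchy.
    static List<String> chain(Class<?> c) {
        List<String> names = new ArrayList<>();
        ClassLoader cl = c.getClassLoader();
        while (cl != null) {
            names.add(cl.getClass().getName());
            cl = cl.getParent();
        }
        // The Bootstrap loader is native code, so Java represents it as null.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(chain(LoaderChain.class)); // a class we wrote
        System.out.println(chain(String.class));      // a core class
    }
}
```

The loaders are linked through a parent field rather than through subclassing, which is what the walk over getParent() relies on.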

– Parent delegation

When a class is used, the JVM first checks whether the class has already been loaded. If it has, the class is used directly; otherwise, the appropriate class loader is found to load it. The work is therefore divided into a checking process and a loading process.

The checking process is bottom-up: it starts at the lowest loader and asks each parent in turn. This makes sense, because the lower loaders are the ones developers control, while only the most fundamental core classes belong to the top-level loader.

If even the top-level Bootstrap loader has not loaded the class, the class is considered not yet loaded.

The loading process is top-down. Each class loader has its own area of responsibility; if the class does not fall within it, the loader passes the request down to its child.
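The delegation order can be observed with a small custom loader. This is a sketch under assumptions: RecordingLoader and loadedByParent are hypothetical names, and the loader deliberately cannot load anything itself. It relies only on the standard behaviour of ClassLoader.loadClass, which asks the parent first and calls findClass only when every parent has failed.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    // A loader that can load nothing itself; it only records what reaches it.
    static class RecordingLoader extends ClassLoader {
        final List<String> reachedMe = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            reachedMe.add(name); // called only after every parent has failed
            throw new ClassNotFoundException(name);
        }
    }

    // Returns true if some parent loader supplied the class.
    static boolean loadedByParent(String name) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            loader.loadClass(name);
            return loader.reachedMe.isEmpty();
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(loadedByParent("java.lang.String")); // true
        System.out.println(loadedByParent("no.such.Missing"));  // false
    }
}
```

java.lang.String never reaches the custom loader because a parent answers first; the missing class travels all the way up, comes back down, and only then arrives at findClass.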

– The significance of parent delegation

  1. Decoupling

– Linking

– Verification

Verifies that the binary stream of the class file meets the requirements of the JVM. There are four main kinds of verification: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

– Preparation

Static variables are assigned their default values. For static int a = 1; this phase assigns 0 to a, not 1; the value 1 is assigned during initialization. Object creation follows the same default-first pattern, which is why volatile is needed in the DCL problem.

– Resolution

Converts symbolic references to direct references.

– Initialization

Assigns the initial values written in the code to the variables.
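The gap between the default value and the initial value can be made visible. In this small sketch (InitOrder, seen, and peek are illustrative names), a static initializer reads a through a method before the line that assigns 1 has run, so it observes the default value.

```java
public class InitOrder {
    // Static initializers run in textual order during initialization.
    static int seen = peek(); // runs before "a = 1", so it observes the default 0
    static int a = 1;

    static int peek() {
        return a;
    }

    public static void main(String[] args) {
        System.out.println("seen = " + seen + ", a = " + a); // seen = 0, a = 1
    }
}
```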

Taking JDK1.8 as an example, the components of the JVM are shown below

Instruction reordering problem

We know that the CPU executes much faster than memory can be read or written, and our code is essentially a sequence of instructions.

When we debug, the code appears to execute line by line. But does the CPU execute the instructions in that same order?

The answer is not necessarily!

int a = 1;
int b = 2;

In this example, two variables are defined and the two lines do not depend on each other, so the CPU may execute them out of order for speed.

No matter how the instructions are reordered, the result must stay the same as long as the code runs in a single thread.

– as-if-serial

No matter how much reordering the compiler and processor do to improve parallelism, the result of a single-threaded program must not change. The compiler, runtime, and processor must all comply with as-if-serial semantics.

– Whether the DCL singleton needs to be volatile


public class Instance {

    private int a = 1;
    // Whether to add the volatile keyword
    private static volatile Instance ins = null;

    public static Instance getInstance() {
        if (ins == null) {                       // first check, without the lock
            synchronized (Instance.class) {
                if (ins == null) {               // second check, with the lock
                    ins = new Instance();
                }
            }
        }
        return ins;
    }
}

The answer is yes!

When we create an object, we go through three steps.

  1. Allocate memory. Member variables are assigned their default values
  2. Call the constructor. Member variables are assigned their initial values
  3. Establish the link. The reference ins is pointed at the allocated memory

Without volatile, steps 2 and 3 may be reordered. When thread 1 calls the DCL singleton to create the instance, the reference can be published right after step 1, when the member variables only hold their default values. Another thread can then obtain the instance object.

In other words, thread 2 sees that ins is not null at the outermost check of the DCL, so it returns ins directly.

However, this ins has only completed the default assignment a = 0.

If a is an order number that should start at 1000 and the current a is 0, this can cause a lot of problems.

With volatile, visibility between threads is guaranteed and instruction reordering is prohibited. For those interested, search for the keyword "memory barriers". Volatile prevents reordering through a combination of LoadLoad, LoadStore, StoreStore, and StoreLoad memory barriers. I'm going to leave it at that.
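As a usage sketch, the volatile version can be exercised from several threads. The names DclDemo, OrderCounter, and distinctInstances are hypothetical, and the starting value 1000 mirrors the order-number example above. A test like this cannot prove that volatile is required, since the reordering is rare and hardware-dependent; it only shows that the pattern hands every thread the same, fully constructed instance.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DclDemo {
    static class OrderCounter {
        private int a = 1000; // hypothetical order number that must start at 1000
        private static volatile OrderCounter ins;

        private OrderCounter() {}

        static OrderCounter getInstance() {
            if (ins == null) {
                synchronized (OrderCounter.class) {
                    if (ins == null) {
                        ins = new OrderCounter();
                    }
                }
            }
            return ins;
        }

        int start() {
            return a;
        }
    }

    // Calls getInstance() from several threads and counts distinct instances.
    static int distinctInstances(int threads) {
        Set<OrderCounter> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<OrderCounter, Boolean>()));
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(OrderCounter.getInstance()));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8));               // 1
        System.out.println(OrderCounter.getInstance().start()); // 1000
    }
}
```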

A description of the components of the JVM

The heap

Used to hold object instances, including arrays (since JDK 1.7, the string constant pool has been moved out of the permanent generation and into the heap)

Default memory allocation for heap space:

  • The old generation accounts for two-thirds
  • The young generation accounts for one-third
    • The Eden space accounts for eight tenths of the young generation
    • The From and To survivor spaces each account for one tenth

Developers can configure the partition sizes with parameters

-XX:SurvivorRatio=8 (young generation ratio Eden : From : To = 8:1:1)
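The ratios above turn into concrete sizes with simple arithmetic. The sketch below is an idealized calculation, not what the JVM does internally: it assumes the two-thirds / one-third split corresponds to -XX:NewRatio=2, and it ignores alignment and adaptive sizing. HeapSplit and split are illustrative names.

```java
public class HeapSplit {
    // Returns {old, young, eden, oneSurvivor} in MB for an idealized split.
    static long[] split(long heapMb, int newRatio, int survivorRatio) {
        long young = heapMb / (newRatio + 1);        // old : young = newRatio : 1
        long old = heapMb - young;
        long survivor = young / (survivorRatio + 2); // Eden : From : To = ratio : 1 : 1
        long eden = young - 2 * survivor;
        return new long[] {old, young, eden, survivor};
    }

    public static void main(String[] args) {
        long[] s = split(900, 2, 8);
        // old=600 young=300 eden=240 survivor=30
        System.out.println("old=" + s[0] + " young=" + s[1] + " eden=" + s[2] + " survivor=" + s[3]);
    }
}
```

For a 900 MB heap this gives 600 MB of old generation, 300 MB of young generation, 240 MB of Eden, and 30 MB for each survivor space.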

The heap is also where garbage collection (GC) takes place. A YGC (young GC) occurs when the young generation is full, and an FGC (full GC) occurs when the old generation is full. The algorithms of each garbage collector are described below.

– String constant pool

Before JDK 1.7, it lived in the method area.

In JDK 1.7 and later, it was moved to the heap.

Before 1.7, the string pool held the string constants themselves. From 1.7 on, the string pool holds references to String objects.

String s1 = "abc";
String s2 = "abc";

The JVM looks in the string constant pool for a reference to the object "abc" and returns that address if one exists; otherwise it creates the String "abc" itself and records it in the pool.
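The effect of the pool shows up when references are compared with ==. In this short sketch (StringPoolDemo and its method names are illustrative), two literals share one pooled object, new String creates a separate heap object, and intern() returns the pooled reference.

```java
public class StringPoolDemo {
    // Two identical literals resolve to the same pooled object.
    static boolean literalsShared() {
        String s1 = "abc";
        String s2 = "abc";
        return s1 == s2;
    }

    // new String always creates a separate object outside the pool.
    static boolean newStringShared() {
        String s1 = "abc";
        String s3 = new String("abc");
        return s1 == s3;
    }

    // intern() hands back the pooled reference for equal content.
    static boolean internShared() {
        return new String("abc").intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());  // true
        System.out.println(newStringShared()); // false
        System.out.println(internShared());    // true
    }
}
```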

– Static constant pool

  • The static constant pool, also known as the class constant pool, stores information about the class file, such as version, field, method, interface, and other descriptive information
  • Each class file has a class constant pool

– Runtime constant pool

  • The runtime constant pool exists in memory, where symbolic references can be resolved to direct references
  • When a class is loaded, the data from its class constant pool is stored in the runtime constant pool

– The young generation

Holds newly created objects and objects that have not yet been through many GC cycles. YGC occurs frequently in the young generation. When a YGC occurs, the surviving objects in Eden are copied to the From space by the copying algorithm. On the next YGC, the surviving objects in Eden and From are copied to the To space. The YGC after that copies Eden and To back to From, and the process repeats. An object's age is incremented by 1 each time it survives, and by default, when the age reaches 15 (configurable), the object is promoted to the old generation.
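The aging process can be mimicked with a toy model. Everything here is a simplification for illustration: YoungGenSim is a made-up class, each object is represented only by its age, and every object is assumed to stay alive. The threshold of 15 is the default named in the text.

```java
import java.util.ArrayList;
import java.util.List;

public class YoungGenSim {
    static final int MAX_AGE = 15; // default tenuring threshold named in the text

    List<int[]> eden = new ArrayList<>(); // each object is modelled as {age}
    List<int[]> from = new ArrayList<>();
    List<int[]> to = new ArrayList<>();
    List<int[]> old = new ArrayList<>();

    void allocate() {
        eden.add(new int[] {0});
    }

    // One young GC; every object is assumed to survive in this toy model.
    void youngGc() {
        List<int[]> survivors = new ArrayList<>(eden);
        survivors.addAll(from);
        for (int[] obj : survivors) {
            obj[0]++; // age + 1
            if (obj[0] >= MAX_AGE) {
                old.add(obj); // promoted to the old generation
            } else {
                to.add(obj);  // copied to the empty survivor space
            }
        }
        eden.clear();
        from.clear();
        List<int[]> tmp = from; // swap the roles of From and To
        from = to;
        to = tmp;
    }

    // Number of young GCs a single long-lived object survives before promotion.
    static int gcsUntilPromoted() {
        YoungGenSim heap = new YoungGenSim();
        heap.allocate();
        int gcs = 0;
        while (heap.old.isEmpty()) {
            heap.youngGc();
            gcs++;
        }
        return gcs;
    }

    public static void main(String[] args) {
        System.out.println(gcsUntilPromoted()); // 15
    }
}
```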

– The old generation

The old generation holds objects that have survived multiple GCs. Its objects are either promoted from the young generation or are large objects allocated directly in the old generation. When the old generation is full, an FGC is triggered, which does not happen very often.

Native method stack

The native method stack serves the native methods used by the virtual machine, whereas the virtual machine stack serves the Java methods it executes.

– Native method

A native method is a Java interface for calling non-Java code: a Java method whose implementation is written in a non-Java language, such as C.

The virtual machine stack

A virtual machine stack is essentially a stack structure. Each stack frame in it corresponds to one method call. A stack frame has four main components: the local variable table, the operand stack, the dynamic link, and the method exit information.

– Local variable table

Holds values of the eight primitive types (written in lowercase in Java): boolean, byte, char, short, int, float, long, double, as well as object references.

– Object reference

Object references are divided into direct pointer references and indirect (handle) references

– Direct pointer reference

The reference is a pointer straight to the object's starting address

Advantages:

  • Fast access, since the object is reached directly

Disadvantages:

  • The reference pointer has to be changed whenever the object is moved, for example during GC

– Handle reference

The reference points to a data structure (the handle) that contains pointers to the object's instance data and type data

Advantages:

  • When an object is moved, for example during GC, only the instance data pointer in the handle has to be changed

Disadvantages:

  • Extra space is needed to store the structure

– Operand stack

The operand stack is also called the operation stack. For example, to execute a simple function that adds two parameters, the two values are popped from the operand stack, and the result is pushed back onto the stack when the operation is complete
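The addition described above can be imitated with an ordinary stack. This is only a model of the idea, using a Deque in place of the real operand stack; the bytecode names in the comments are the instructions javac typically emits for a static method that returns a + b.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Mimics the bytecode of "return a + b": iload_0, iload_1, iadd, ireturn.
    static int add(int a, int b) {
        Deque<Integer> operands = new ArrayDeque<>();
        operands.push(a);            // iload_0: push the first local variable
        operands.push(b);            // iload_1: push the second local variable
        int right = operands.pop();  // iadd pops two values...
        int left = operands.pop();
        operands.push(left + right); // ...and pushes their sum
        return operands.pop();       // ireturn: pop the result for the caller
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // 5
    }
}
```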

– Dynamic link

Each stack frame contains a reference into the runtime constant pool to the method that the frame belongs to. This reference lets the code of the current method be linked dynamically. Example: the invokedynamic instruction (added in 1.7)

When a Java source file is compiled into a bytecode file, all variable and method references are kept as symbolic references in the class file's constant pool. For example, a call from one method to another is represented by a symbolic reference to the called method in the constant pool, and dynamic linking converts these symbolic references into direct references to the called method.

– Method exit information

Each time a Java method is called, a stack frame is created and pushed onto the stack. Once the call is done, the frame is popped automatically, so the stack needs no GC, whereas the heap does

Program counter

Its memory footprint in the JVM is negligible, and it is also the fastest area to access

Stores the address of the next instruction, that is, the instruction code to be executed.

Direct memory (off-heap memory)

It belongs to off-heap memory and can be allocated through native methods

Direct memory I/O reads perform better than regular heap memory

It is not affected by GC and requires manual collection

Direct memory is implemented through the Unsafe class
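From application code, the usual public route to off-heap memory is java.nio.ByteBuffer.allocateDirect rather than Unsafe itself; in HotSpot the direct buffer is understood to be built on Unsafe internally. A minimal sketch follows, in which DirectMemoryDemo and roundTrip are illustrative names.

```java
import java.nio.ByteBuffer;

public class DirectMemoryDemo {
    // Allocates a buffer whose storage lives outside the Java heap.
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Writes an int into off-heap memory and reads it back.
    static int roundTrip(int value) {
        ByteBuffer buf = allocate(16);
        buf.putInt(0, value);
        return buf.getInt(0);
    }

    public static void main(String[] args) {
        System.out.println(allocate(16).isDirect()); // true
        System.out.println(roundTrip(42));           // 42
    }
}
```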

Method area

Both the metaspace and the permanent generation are essentially implementations of the method area, a logical abstraction that each JVM version implements differently. The metaspace is the HotSpot implementation since Java 1.8

– Permanent generation

In Java 1.7, the method area in HotSpot lives in the permanent generation. The permanent generation and the heap are isolated from each other, but the physical memory they use is contiguous

Related JVM parameters:

  • -XX:PermSize: the initial size of the non-heap memory allocation. "Perm" is short for permanent
  • -XX:MaxPermSize: the maximum amount of memory that can be allocated to the non-heap area

Java 1.8 uses native memory to implement the method area, and the string constant pool that used to sit in the method area lives in the heap

Reasons for abandoning the permanent generation: maintenance and the merge

  • If the size of the permanent generation is set by hand, too small a value causes an out-of-memory error (OOM), while too large a value wastes space, so it is hard to maintain. With the metaspace, the amount of class data that can be loaded is limited by the memory actually available on the system
  • The deeper reason is that HotSpot merged in code from JRockit, which has no permanent generation at all but still performs very well

– Metaspace

Used to store:

  • Class method code
  • Variable names
  • Method names
  • Access permissions

That is, fairly surface-level descriptive information about classes

The method area now lives in the metaspace, which is no longer contiguous with the heap; it sits in native memory.

Garbage collection

What exactly is garbage? Why do we need a garbage collector? To understand the garbage collection mechanism, we first need to understand how the JVM defines garbage.

Reference counting

Intuitively, an object is garbage if no other object references it. So we keep a count for each object that goes up whenever a new reference to it is made and down whenever a reference goes away; an object whose count reaches zero is garbage.

Drawback: reference counting cannot handle circular references.
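The circular-reference problem is easy to reproduce in a toy model. The refCount field below is a hypothetical counter of the kind a reference-counting collector would maintain; the JVM itself does not keep one. Two objects that point at each other keep a count of 1 even after the program has dropped every outside reference to them.

```java
public class RefCountDemo {
    static class Node {
        int refCount; // hypothetical counter a reference-counting collector would keep
        Node next;

        void pointTo(Node target) {
            next = target;
            target.refCount++;
        }
    }

    // Builds A <-> B, drops the external references, and returns the remaining counts.
    static int[] countsAfterDroppingExternalRefs() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++; // external reference held by the variable a
        b.refCount++; // external reference held by the variable b
        a.pointTo(b);
        b.pointTo(a);
        a.refCount--; // the program forgets a
        b.refCount--; // the program forgets b
        return new int[] {a.refCount, b.refCount};
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingExternalRefs();
        System.out.println(counts[0] + ", " + counts[1]); // 1, 1
    }
}
```

Neither count reaches zero, so a pure reference-counting collector would never reclaim the pair.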

Root searching algorithm

The root reachability algorithm checks whether there is still a path, that is, a chain of references, from a GC root to the object. If there is, the object is still referenced; if not, the object is unreferenced and will be collected in the next garbage collection

Types of GC root

1. Virtual machine stack: objects referenced from the local variable table of a stack frame

2. Objects referenced by native methods

3. Objects referenced by static variables and constants in the method area
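Reachability from the roots is a plain graph traversal. The sketch below models objects as names in a map; the object names and the helper reachable are invented for the example. Objects C and D reference each other but cannot be reached from the root, so unlike with reference counting they are correctly identified as garbage.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilityDemo {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    static Set<String> demo() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D")); // C and D reference each other...
        refs.put("D", Arrays.asList("C")); // ...but nothing reachable points at them
        return reachable(refs, Arrays.asList("root"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // contains root, A, B; C and D are garbage
    }
}
```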

Collection algorithms

– Copying algorithm

The algorithm of the young generation. In its basic form it divides the available memory into two equal halves and uses only one at a time; when that half runs out, the surviving objects are copied to the other half and the used half is cleaned. In the young generation, the first pass scans the objects in Eden, copies the survivors to survivor1, and releases Eden. The second pass scans Eden and survivor1, copies the survivors to survivor2, and releases survivor1 and Eden. The next pass scans Eden and survivor2 and copies the survivors to survivor1, and so on, alternating.

– Mark-Sweep algorithm

Mark first, then reclaim everything unmarked in one go. This produces a large number of discontinuous memory fragments

– Mark-Compact algorithm

Mark first, move all surviving objects to one end, then clean up the memory beyond the boundary. This leaves no fragmentation
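The difference in fragmentation between the last two algorithms can be seen on a toy heap drawn as a string. The notation is made up for this sketch: '#' is a live object, 'x' a dead one, and '.' a free slot.

```java
public class CompactDemo {
    // Mark-sweep: dead slots are freed in place, so free space stays scattered.
    static String sweep(String heap) {
        return heap.replace('x', '.');
    }

    // Mark-compact: live objects slide to one end, leaving a single free block.
    static String compact(String heap) {
        StringBuilder out = new StringBuilder();
        for (char c : heap.toCharArray()) {
            if (c == '#') {
                out.append(c);
            }
        }
        while (out.length() < heap.length()) {
            out.append('.');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String heap = "#x#xx#x#";
        System.out.println(sweep(heap));   // #.#..#.#
        System.out.println(compact(heap)); // ####....
    }
}
```

After the sweep the largest free run is two slots, although four are free in total; after compaction all four are contiguous.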

Garbage collectors

Business is what drives technology; talking about technology divorced from the business is just fooling around.

As the business grows, so does the garbage generated in the JVM. At first it may be only a few megabytes to a few tens of megabytes, and later tens of gigabytes or more. Garbage collectors are used in pairs, one for the young generation and one for the old generation, such as Serial + Serial Old

– Serial

Features: single-threaded, STW. During garbage collection, a single thread collects the garbage while the application is stopped (STW). Usually used with Serial Old.

If there is too much garbage, the STW pause is long, so it is suitable only for lightweight systems

STW (Stop the World)

Stop the World, STW for short, refers to the application pause that occurs during a GC event. During the pause, all application threads are suspended and nothing responds, a bit like being frozen. This pause is called STW.

– Parallel

Features: multi-threaded, STW

The bottleneck of the Serial collector is that it is single-threaded. Parallel is multi-threaded, but it still has an STW phase.

Parallel Scavenge (PS) is usually paired with Parallel Old (PO)

JDK 1.8 uses PS + PO by default

Even with multiple threads there is still a bottleneck, and collection still affects the business
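Which pair of collectors a running JVM actually uses can be checked through the standard management API. In this small sketch, ActiveCollectors and names are illustrative. On a JDK 1.8 VM with default settings the names are typically reported as PS Scavenge and PS MarkSweep, but the exact strings depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {
    // Lists the garbage collectors registered in the running JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(names());
    }
}
```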

– CMS

CMS (Concurrent Mark Sweep) is an old-generation collector

It can be used with ParNew, a variant of the Parallel collector adapted to work with CMS

Collection can run concurrently, without stopping the service

  • Initial mark: an STW phase. However, this pause does not scan the whole of memory; it only starts from the roots, so it takes little time.
  • Concurrent mark: mislabeling can happen here. For example, an object is marked as garbage but another object then points to it, or an object that was not garbage becomes garbage.
  • Remark: an STW phase that corrects the mislabeling from the concurrent mark phase
  • Concurrent sweep: uses the mark-sweep algorithm

– Tricolor marking

  • Black: the object is marked, and all the objects it references are marked
  • Gray: the object is marked, but the objects it references have not all been marked yet
  • White: the object is not marked

– Incremental update

Question: can tricolor marking guarantee that no live object is missed?

First, GC scans object A and the object B it references, so A is black. GC has reached B but has not yet scanned the object C that B references, so B is gray. C has not been scanned yet, so it is white.

At this point, due to business requirements, B's reference to C is removed, but A establishes a reference to C.

Problem: because A is marked black, GC no longer scans A or its references. C is still white, so it will be reclaimed even though A references it, and A will fail when it later uses C.

Solution: remarking solves this problem. When a new reference is created from A, A is marked gray again so that it gets rescanned.
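The A, B, C scenario can be replayed in a few lines. This is a toy model, not CMS code: TricolorDemo and colorOfC are invented names, the object graph is a map of sets, and the incrementalUpdate flag stands in for the write barrier that turns A gray again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TricolorDemo {
    enum Color { WHITE, GRAY, BLACK }

    // Replays the scenario and returns the final color of C.
    static Color colorOfC(boolean incrementalUpdate) {
        Map<String, Color> color = new HashMap<>();
        Map<String, Set<String>> refs = new HashMap<>();
        for (String name : new String[] {"A", "B", "C"}) {
            color.put(name, Color.WHITE);
            refs.put(name, new HashSet<String>());
        }
        refs.get("A").add("B");
        refs.get("B").add("C");

        // Concurrent mark so far: A fully scanned, B found but not yet scanned.
        color.put("A", Color.BLACK);
        color.put("B", Color.GRAY);

        // The application runs concurrently: B drops C, A picks it up.
        refs.get("B").remove("C");
        refs.get("A").add("C");
        if (incrementalUpdate) {
            color.put("A", Color.GRAY); // the write barrier turns A gray again
        }

        // Finish marking: scan gray objects until none remain.
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String name : new ArrayList<>(color.keySet())) {
                if (color.get(name) != Color.GRAY) {
                    continue;
                }
                for (String child : refs.get(name)) {
                    if (color.get(child) == Color.WHITE) {
                        color.put(child, Color.GRAY);
                    }
                }
                color.put(name, Color.BLACK);
                progress = true;
            }
        }
        return color.get("C");
    }

    public static void main(String[] args) {
        System.out.println(colorOfC(false)); // WHITE: C would be reclaimed by mistake
        System.out.println(colorOfC(true));  // BLACK: C is kept alive
    }
}
```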

Disadvantages of CMS: memory fragmentation, caused by the mark-sweep algorithm

– G1

The G1 garbage collector abandons the earlier physical generational layout and adopts Regions. Although it is physically non-generational, it still divides generations logically, and the names Eden and Old are retained.

Partition algorithm (Region)

Each Region can be a young-generation or an old-generation Region, but it belongs to only one generation at any given time

Partitioning means that when some Regions hold a lot of garbage and others only a little, G1 collects the Regions with the most garbage first, which reduces the time spent waiting for collection
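The "most garbage first" idea amounts to sorting Regions by how much garbage they hold. The sketch below is a deliberate simplification with made-up names (RegionPicker, pick): it takes a fixed number of Regions, whereas the real G1 chooses how many to collect from its pause-time predictions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RegionPicker {
    // garbagePercent[i] is the share of Region i that is garbage.
    // Returns the indices of the `budget` Regions holding the most garbage.
    static List<Integer> pick(int[] garbagePercent, int budget) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < garbagePercent.length; i++) {
            order.add(i);
        }
        // Sort Region indices by garbage share, largest first.
        order.sort(Comparator.comparingInt((Integer i) -> -garbagePercent[i]));
        return new ArrayList<>(order.subList(0, Math.min(budget, order.size())));
    }

    public static void main(String[] args) {
        System.out.println(pick(new int[] {10, 90, 40, 75}, 2)); // [1, 3]
    }
}
```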

The reasons:

  1. As memory gets bigger and bigger, the young generation gets bigger and bigger, and scanning gets slower and slower
  2. The fatal flaw of CMS (memory fragmentation)

G1:

  • The young generation: every young collection (YGC) collects the whole young generation
  • The old generation: G1 has built-in compaction. When reclaiming old Regions, it copies the surviving objects from one Region to another, which compacts memory locally. Each Region is between 1 MB and 32 MB in size, always a power of two.

RSet

Remembered set

A structure kept inside each Region

Records which other Regions reference the current Region

When a reference disappears, the reference is logged to the RSet

CSet

Collection set

Records the Regions to be collected by the current GC

YGC

When a YGC occurs, the surviving objects from the E (Eden) and S (From) Regions are merged and copied into a new S (To) Region

MixGC

G1 has no separate old-generation GC (OGC), so the old generation is cleaned together with a YGC, which is called MixGC (mixed GC)

  • Initial mark: an STW phase
  • Concurrent mark: same as CMS, but with a smaller traversal scope, since only the Regions recorded in the RSet are traversed
  • Remark: same as CMS, but uses SATB. It is also an STW phase
  • Cleanup: only the Regions with the most garbage are selected for cleanup

SATB

Snapshot at the beginning

A snapshot is taken when concurrent marking starts, and the references in the RSets are processed at remark

What happens when a reference from a gray object to a white object disappears?

When the reference B->D disappears, the reference is pushed onto the GC's stack to ensure that D can still be scanned by the GC

With the RSet, we only need to scan the Regions that refer to D's Region

When the GC comes back, it finds the recorded reference, which tells it that a reference to the object has disappeared, so it scans D to see whether it is garbage

Related parameters

-XX:+UseG1GC: enables the G1 garbage collector

-XX:G1HeapRegionSize: Region size

-XX:InitiatingHeapOccupancyPercent=30: percentage of heap occupancy that triggers a G1 collection cycle

-XX:MaxGCPauseMillis: maximum pause time target

-XX:GCPauseIntervalMillis: GC interval