Memory area

While executing a Java program, the Java Virtual Machine divides the memory it manages into several distinct data areas. Each area has its own purpose and its own creation and destruction time. Some areas exist for as long as the virtual machine process runs, while others are created and destroyed along with user threads. The memory managed by the Java Virtual Machine includes the following runtime data areas, as shown in the figure below.

Figure: Java runtime data areas

Program Counter Register

The program counter is a small memory space that can be seen as an indicator of the line number of the bytecode being executed by the current thread. Each thread has its own independent program counter. If the thread is executing a Java method, the counter holds the address of the virtual machine bytecode instruction being executed. If it is executing a native method, the counter value is undefined. This is the only memory area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.

Java Virtual Machine Stack

The Java Virtual Machine stack is thread-private and has the same lifetime as its thread. It describes the memory model of Java method execution: each method invocation creates a stack frame that stores the local variable table, operand stack, dynamic link, method exit (return) information, and so on. The local variable table holds the primitive types known at compile time (boolean, byte, char, short, int, float, long, double), object references, and the returnAddress type (which points to the address of a bytecode instruction). The 64-bit long and double types occupy two slots each; every other type occupies one. The size of the local variable table is determined at compile time: when a method is entered, the amount of local variable space its frame needs is already fixed, and it does not change while the method runs. A StackOverflowError is thrown if the thread requests a stack depth greater than the virtual machine allows. If the stack can be expanded dynamically but not enough memory can be obtained, an OutOfMemoryError is thrown.
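The StackOverflowError case can be provoked with unbounded recursion. The following is a minimal sketch; the class and method names are made up for illustration, and the depth it reports depends on the VM, its stack size setting, and the frame size of the method.

```java
class StackDepthDemo {
    static int depth = 0;

    // Each call pushes one more stack frame; there is deliberately no base case.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before the VM refused to go deeper.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time control arrives here the frames have been unwound.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed at depth " + measureDepth());
    }
}
```

Catching the error is done here only to read the counter; ordinary code would normally fix the recursion instead.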

Native Method Stack

The native method stack plays a role very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods (bytecode), while the native method stack serves the native methods used by the virtual machine.

Heap

The Java heap is shared by all threads and is created when the virtual machine starts. Its sole purpose is to hold object instances, and almost all object instances are allocated here. The Java heap is the main area managed by the garbage collector, so it is often called the “GC heap.” Because most collectors are based on generational collection algorithms, the Java heap can be subdivided into the young generation and the old generation; in finer detail, the young generation consists of the Eden space, the From Survivor space, and the To Survivor space. The heap can be implemented as either fixed-size or expandable, but current mainstream virtual machines implement it as expandable (controlled by -Xmx and -Xms).
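The effect of the heap size options can be observed from inside a program through java.lang.Runtime. This is a small sketch with a made-up class name; the mapping of maxMemory() to -Xmx and of the initial totalMemory() to -Xms is approximate and can vary between virtual machines and collectors.

```java
class HeapSizeDemo {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the VM will try to use for the heap (roughly what -Xmx sets).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the VM (starts near what -Xms sets).
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap:       " + maxHeapBytes() / MIB + " MiB");
        System.out.println("committed heap: " + committedHeapBytes() / MIB + " MiB");
    }
}
```

Running it with different settings, for example java -Xms64m -Xmx256m HeapSizeDemo.java on a JDK that supports launching source files, should change the reported numbers accordingly.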

Method Area

The method area is shared by all threads. It stores data for classes that the virtual machine has loaded, such as class information, constants, static variables, and code compiled by the just-in-time (JIT) compiler. Memory reclamation in this area mainly targets the constant pool and the unloading of types. The method area contains a special region called the constant pool.

The constant pool holds data that is determined at compile time and stored in the compiled .class file. It includes constants related to classes, methods, interfaces, and so on, as well as string constants. As a rough simplification, Java memory is often described as heap memory plus stack memory: the former stores objects, and the latter stores variables of primitive types and references to objects.
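String literals are the constant pool entries that are easiest to observe from Java code, because identical literals resolve to one shared String instance. The sketch below uses made-up names and relies only on the language rules for string literals and String.intern().

```java
class ConstantPoolDemo {
    // Two identical literals refer to the same pooled String instance.
    static boolean literalsShareOneInstance() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // "new String" always creates a separate object on the heap.
    static boolean newStringIsDistinct() {
        String a = "jvm";
        String b = new String("jvm");
        return a != b;
    }

    // intern() returns the pooled instance with the same contents.
    static boolean internReturnsPooledInstance() {
        String a = "jvm";
        String b = new String("jvm").intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneInstance());    // true
        System.out.println(newStringIsDistinct());         // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```

The identity comparisons with == are used here on purpose to expose sharing; normal code should compare strings with equals().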

The GC mechanism

Garbage collection has to answer three questions:

  1. Which memory needs to be reclaimed?
  2. When should it be reclaimed?
  3. How should it be reclaimed?

Three areas, the program counter, the virtual machine stack, and the native method stack, are created and destroyed with their thread. The stack frames in a stack are pushed and popped in an orderly way as methods are entered and exited. The amount of memory allocated in each stack frame is essentially known once the class structure is determined, so memory allocation and reclamation in these areas are deterministic. There is little need to think about reclamation here, because the memory is reclaimed naturally when the method or the thread ends. The Java heap and the method area are different: the implementation classes of one interface may need different amounts of memory, and the branches of one method may need different amounts of memory as well. Only at run time do we know which objects are created, so allocation and reclamation of this memory are dynamic, and this is the memory that the garbage collector focuses on.

Object reference

Garbage collection in Java mostly takes place in the Java heap, because the heap holds almost all object instances. Talking about garbage collection in the heap naturally leads to references. Before JDK 1.2, the definition of a reference in Java was very narrow: if the value stored in a reference-type variable was the starting address of another piece of memory, that value was said to be a reference to that memory. Since JDK 1.2, Java has expanded the concept and divides references into four types: strong reference, soft reference, weak reference, and phantom reference. Their strength decreases in that order.

  • Strong reference: for example “Object obj = new Object()”. These references are the most common in Java programs. As long as a strong reference exists, the garbage collector will never reclaim the referenced object.
  • Soft reference: describes objects that are useful but not essential. Objects reachable only through soft references are reclaimed by the garbage collector when the system is about to run out of memory. Since JDK 1.2, the SoftReference class implements soft references.
  • Weak reference: also describes non-essential objects, but is weaker than a soft reference. An object reachable only through weak references survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed regardless of whether memory is sufficient. Since JDK 1.2, the WeakReference class implements weak references.
  • Phantom reference: the weakest type of reference. It has no effect on the lifetime of the object and cannot be used to obtain the object instance.
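The four reference types map onto plain variables plus the classes in java.lang.ref. The following sketch uses a made-up class name and checks only behavior that is guaranteed; whether a weakly referenced object has actually been cleared after System.gc() depends on the collector, so main() prints that result without assuming it.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // While a strong reference is held, soft and weak references still see the object.
    static boolean softAndWeakSeeLiveObject() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        return soft.get() == strong && weak.get() == strong;
    }

    // A phantom reference never hands out its referent: get() always returns null.
    static boolean phantomNeverReturnsObject() {
        Object strong = new Object();
        PhantomReference<Object> phantom =
                new PhantomReference<>(strong, new ReferenceQueue<Object>());
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(softAndWeakSeeLiveObject());
        System.out.println(phantomNeverReturnsObject());

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null;   // drop the only strong reference
        System.gc();     // only a request; the VM may ignore it
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```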

Determining whether an object is alive

Reference counting algorithm

Add a reference counter to each object. The counter is incremented by 1 whenever a new reference to the object is created and decremented by 1 whenever a reference becomes invalid. An object whose counter is 0 can no longer be used. The reference counting algorithm is simple to implement and makes the decision efficiently, so it is a good algorithm in most cases. However, mainstream Java virtual machines do not use it to manage memory, mainly because it has difficulty handling circular references between objects.
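The circular reference problem can be shown with a toy simulation. This sketch is not how any JVM works; the Node class and its refCount field are hypothetical and the counts are maintained by hand to imitate what a reference-counting collector would do.

```java
class RefCountDemo {
    static class Node {
        int refCount = 0;
        Node next;
    }

    // Simulates the assignment "from.next = to" under reference counting.
    static void setNext(Node from, Node to) {
        if (from.next != null) from.next.refCount--;
        from.next = to;
        if (to != null) to.refCount++;
    }

    // Builds a two-object cycle, drops the outside references, returns both counts.
    static int[] countsAfterDroppingRoots() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;    // simulated local variable pointing at a
        b.refCount++;    // simulated local variable pointing at b
        setNext(a, b);   // a -> b
        setNext(b, a);   // b -> a, closing the cycle
        a.refCount--;    // the simulated locals go out of scope
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingRoots();
        System.out.println(counts[0] + " " + counts[1]); // 1 1
    }
}
```

Both counters stay at 1 although nothing outside the cycle refers to either object, so a pure reference-counting collector would never reclaim them.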

Reachability analysis algorithm

Figure: Reachability analysis

Java uses reachability analysis to determine whether an object is alive. A set of objects called “GC Roots” serves as the starting points. The search proceeds from these nodes along references, and each path it follows is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable and therefore eligible for collection.
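Reachability analysis is essentially a graph traversal. The sketch below models objects as strings and references as an adjacency map; these names and the sample graph are assumptions made for illustration, not a description of real collector internals.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object that some reference chain connects to a root.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {   // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    // root -> a -> b is reachable; c and d only reference each other.
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
                "root", List.of("a"),
                "a", List.of("b"),
                "c", List.of("d"),
                "d", List.of("c"));
    }

    public static void main(String[] args) {
        System.out.println(reachable(sampleGraph(), List.of("root")));
    }
}
```

In the sample graph, c and d form a cycle but are not connected to the root, so they are not in the result. This is the case that reference counting cannot handle.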

GC collection algorithm

Mark-sweep algorithm

The algorithm has two phases, marking and sweeping. First, all objects to be reclaimed are marked; after marking completes, all marked objects are reclaimed together. Marking relies on the reachability analysis described above.

Figure: Mark-sweep algorithm

It has two main disadvantages:

  • Efficiency: both the marking and the sweeping phases are inefficient.
  • Space: sweeping leaves behind a large number of discontiguous memory fragments.
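The space problem can be made concrete with a toy heap. In this sketch the heap is a string in which each letter is one object and '.' is a free slot; the representation and all names are hypothetical and chosen only to keep the example short.

```java
class MarkSweepDemo {
    // Sweep phase: every slot whose object was marked for reclamation becomes free.
    static String sweep(String heap, String marked) {
        StringBuilder swept = new StringBuilder();
        for (char slot : heap.toCharArray()) {
            swept.append(marked.indexOf(slot) >= 0 ? '.' : slot);
        }
        return swept.toString();
    }

    // Total number of free slots in the heap.
    static int freeSlots(String heap) {
        int free = 0;
        for (char slot : heap.toCharArray()) {
            if (slot == '.') free++;
        }
        return free;
    }

    // Size of the largest contiguous run of free slots.
    static int largestFreeRun(String heap) {
        int best = 0;
        int run = 0;
        for (char slot : heap.toCharArray()) {
            run = (slot == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String after = sweep("ABCDEFGH", "BDFH");
        System.out.println(after);                 // A.C.E.G.
        System.out.println(freeSlots(after));      // 4
        System.out.println(largestFreeRun(after)); // 1
    }
}
```

Half of the toy heap is free after the sweep, yet no object larger than one slot can be allocated, because the free slots are not contiguous.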

Copying algorithm

The copying algorithm divides the available memory into two halves of equal size and uses only one of them at a time. When the half in use runs out, the surviving objects are copied to the other half, and the used half is then cleaned up in one pass. Allocation does not have to deal with fragmentation: memory is allocated sequentially by moving a pointer at the top of the heap, which is simple to implement and efficient to run. The cost is that only half of the memory is usable at any time.

Figure: Copying algorithm
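A toy version of the copy step might look as follows. As an assumption for illustration, each half of the memory is modeled as a string of one-character objects with '.' for free slots, and the set of survivors is passed in rather than computed.

```java
import java.util.Arrays;

class CopyingDemo {
    // Copies the surviving objects into a fresh to-space of the same size.
    static String copySurvivors(String fromSpace, String survivors) {
        char[] toSpace = new char[fromSpace.length()];
        Arrays.fill(toSpace, '.');
        int top = 0;   // bump pointer: next free slot in the to-space
        for (char slot : fromSpace.toCharArray()) {
            if (survivors.indexOf(slot) >= 0) {
                toSpace[top++] = slot;   // allocate sequentially, so no gaps appear
            }
        }
        return new String(toSpace);
    }

    public static void main(String[] args) {
        System.out.println(copySurvivors("ABCDEFGH", "ACEG")); // ACEG....
    }
}
```

After the copy, the survivors sit contiguously at the start of the new half, and the old half can be discarded as a whole.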

Mark-compact algorithm

In the mark-compact algorithm, the marking phase is the same as in mark-sweep. Instead of reclaiming the dead objects directly, however, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleared.

Figure: Mark-compact algorithm

Generational collection algorithm

Memory is divided into regions according to object lifetime. Generally, the Java heap is split into the young generation and the old generation so that each generation can use the collection algorithm that best fits its characteristics. In the young generation, most objects die in each collection and only a few survive, so the copying algorithm is used. In the old generation, objects have a high survival rate, so the mark-sweep or mark-compact algorithm is used.

Class loading mechanism

Overview

The virtual machine loads the data that describes a class from the Class file into memory, verifies it, converts and resolves it, and initializes it, finally producing a Java type that the virtual machine can use directly. This is the class loading mechanism of the virtual machine.

Class loading process

Figure: Class loading process

From the time a class is loaded into virtual machine memory until it is unloaded, its life cycle consists of loading, verification, preparation, resolution, initialization, use, and unloading.

The class loading process consists of loading, verification, preparation, resolution, and initialization. Of these five phases, loading, verification, preparation, and initialization begin in a fixed order. Resolution is the exception: in some cases it may begin after initialization, in order to support runtime binding (also known as dynamic binding or late binding) in the Java language. Note also that the phases begin in this order but do not necessarily proceed or complete in this order, because they are usually interleaved, with one phase invoking or activating another while it is still running.

Binding is the association of a method call with the class (method body) that contains the method. In Java, binding is divided into static binding and dynamic binding:

  • Static binding: also called early binding. The method is bound before the program runs, by the compiler or a linker. In Java, this can simply be understood as compile-time binding. Only final, static, and private methods and constructors are bound early.
  • Dynamic binding: also called late binding or runtime binding. The method is bound at run time according to the actual type of the object. In Java, almost all methods are late-bound.

The following describes what happens at each stage of the class loading process.

  • Loading: the first phase of class loading. During this phase, the virtual machine does the following:
  1. Obtains the binary byte stream that defines a class through the class’s fully qualified name.
  2. Converts the static storage structure represented by this byte stream into a runtime data structure in the method area.
  3. Generates a java.lang.Class object representing the class in the Java heap, as the access point to the data in the method area.

  • Verification: the first step of the linking phase. Its purpose is to ensure that the information in the byte stream of the Class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine.

  • Preparation: allocates memory for the class’s static variables, in the method area, and sets them to their default values. This phase does not allocate memory for instance variables; those are allocated in the Java heap along with the object when it is instantiated.

  • Resolution: the virtual machine replaces symbolic references in the constant pool with direct references.

  • Initialization: the last step of the class loading process. In the earlier phases, the user application can take part only in loading, through a custom class loader; everything else is driven and controlled entirely by the virtual machine. Only in the initialization phase does the virtual machine actually start executing the Java code defined in the class.
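The difference between preparation (default values) and initialization (running the class’s own code) can be observed from Java. In this sketch, with made-up class names, a static initializer reads a static field before that field’s own initializer has run, so it sees the default value assigned during preparation. The log also shows that the nested class is initialized only on first use.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Lazy {
        static int early = peek();   // runs first, before value's initializer
        static int value = 42;       // assigned during initialization, in textual order

        static {
            LOG.add("Lazy initialized");
        }

        // Reads value; during the call above it still holds the default 0.
        static int peek() {
            return value;
        }
    }

    public static void main(String[] args) {
        LOG.add("main started");
        System.out.println("early = " + Lazy.early); // early = 0
        System.out.println("value = " + Lazy.value); // value = 42
        System.out.println(LOG);                     // [main started, Lazy initialized]
    }
}
```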

Class loader

  1. Bootstrap class loader: loads the class libraries that are stored in the JAVA_HOME\lib directory, or in the path specified by the -Xbootclasspath parameter, and that are recognized by the virtual machine. Libraries are recognized by file name only (for example rt.jar); a library with an unrecognized name is not loaded even if it is placed in the lib directory. The bootstrap class loader cannot be referenced directly by a Java program.
  2. Extension class loader: loads all class libraries in the JAVA_HOME\lib\ext directory, or in the path specified by the java.ext.dirs system property. Developers can use this class loader directly.
  3. Application class loader: loads the class libraries specified on the user’s classpath. Developers can use this class loader directly, and it is the default class loader of a program.

The relationship between the loaders is: bootstrap class loader -> extension class loader -> application class loader -> custom class loader.
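The loader chain of a class can be inspected with getClassLoader() and getParent(). The sketch below uses a made-up class name. The bootstrap class loader is represented by null in the API, and the concrete loader class names printed depend on the JDK version (newer JDKs, for instance, report a platform class loader where the list above has the extension class loader).

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the loader of the given class up to the bootstrap loader.
    static List<String> chain(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap");   // represented by null in the API
        return names;
    }

    public static void main(String[] args) {
        System.out.println(chain(LoaderChainDemo.class)); // application class and its parents
        System.out.println(chain(String.class));          // [bootstrap]
    }
}
```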

Parent delegation model

As the list of class loaders above shows, the loaders form a hierarchy.

Figure: Parent delegation model

This hierarchy of class loaders is called the parent delegation model. It requires every class loader except the bootstrap class loader to have a parent class loader. The parent-child relationship between class loaders is generally implemented not by inheritance but by composition, so that a child reuses the code of its parent loader.

The parent delegation model works as follows. When a class loader receives a class loading request, it first delegates the request to its parent class loader. This holds at every level, so all loading requests eventually reach the top-level bootstrap class loader. Only when the parent reports that it cannot complete the request (it did not find the class in its search scope) does the child loader try to load the class itself.

Benefit: Java classes acquire a priority hierarchy along with their class loaders. For example, the class java.lang.Object is stored in rt.jar. Whichever class loader is asked to load it, the request is eventually delegated to the bootstrap class loader, so Object is the same class in every class loader environment of the program. Conversely, without the parent delegation model, if a user wrote a class named java.lang.Object and placed it on the program’s classpath, the system would end up with several different Object classes, the most basic behavior of the Java type system could not be guaranteed, and the application would become a mess.

Implementation: the loadClass() method of java.lang.ClassLoader first checks whether the class has already been loaded. If not, it calls the loadClass() method of the parent class loader; if the parent is null, it uses the bootstrap class loader as the parent. If the parent fails to load the class, it throws a ClassNotFoundException; loadClass() catches that exception and then calls its own findClass() method to load the class.
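The delegation order can be observed with a custom loader that overrides only findClass() and records when it is called. This is a sketch: RecordingLoader, tryLoad, and the class name com.example.DoesNotExist are hypothetical, and this findClass() never defines a class, it only logs the request.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingLoader extends ClassLoader {
    final List<String> findClassCalls = new ArrayList<>();

    RecordingLoader(ClassLoader parent) {
        super(parent);
    }

    // Reached only after the parent chain has failed to load the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls.add(name);
        throw new ClassNotFoundException(name);
    }

    // Convenience wrapper around the inherited loadClass(): null instead of an exception.
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(loader.tryLoad("java.lang.String") == String.class); // true
        System.out.println(loader.findClassCalls);                              // []
        System.out.println(loader.tryLoad("com.example.DoesNotExist"));         // null
        System.out.println(loader.findClassCalls); // [com.example.DoesNotExist]
    }
}
```

The request for java.lang.String is satisfied by the parent chain, so findClass() is never consulted. Only the unknown class name falls through to the custom loader.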

References:

Understanding the Java Virtual Machine