
This article covers the HotSpot virtual machine's object memory layout, pointer compression, using JOL to view and calculate an object's memory usage, and the ways a reference locates an object.

1 Memory layout of the object

In the HotSpot virtual machine, the layout of an object in memory can be divided into three areas: the object header, the instance data, and alignment padding.

1.1 Object header

HotSpot stores the object header in three words if the object is an array, or in two words otherwise. On a 32-bit VM a word is 4 bytes (32 bits); on a 64-bit VM a word is 8 bytes (64 bits). With pointer compression on a 64-bit VM the header is smaller, as section 2 shows.

The header of an ordinary object in HotSpot contains two pieces of information, the Mark Word and the Class Pointer; the header of an array object additionally contains the Array Length.

| Length | Content | Description |
| --- | --- | --- |
| 32/64 bits | Mark Word | Mainly records the lock state of the object; it is also used by the GC and holds the object's hash code. |
| 32/64 bits | Class Pointer (klass) | Pointer to the object's class metadata, so the object always knows which class it is an instance of. 32 bits on a 64-bit JVM with pointer compression enabled. |
| 32 bits | Array Length | The length of the array (present only if the object is an array). Together with padding it adds 8 bytes to an uncompressed 64-bit header and 4 bytes to a compressed one. |

1.1.1 Mark Word

The Mark Word stores the object's own runtime data, such as the hash code, GC generational age, lock state flags, the lock held by a thread, the biased thread ID, and the bias timestamp. This data is 32 bits long on a 32-bit VM and 64 bits long on a 64-bit VM.

Objects need to store a lot of runtime data, more than a 32-bit or 64-bit bitmap can record. However, the object header is an extra storage cost unrelated to the data the object itself defines. For the sake of space efficiency, the Mark Word is designed as a non-fixed data structure so that it can hold as much information as possible in a very small space. It reuses its storage according to the state of the object; that is, different states store different data.

By default, the Mark Word in a Java object header stores the object's hash code, generational age, and lock flag bits. In fact, the implementation of synchronized in Java relies on the Mark Word in the object header. A more detailed explanation of the Mark Word and its relationship with synchronized is given in this article: the underlying implementation principle of Synchronized in Java and detailed explanation of lock upgrade optimization.

At run time, the data stored in the Mark Word changes as the lock flag bits change. On a 32-bit VM, the Mark Word contents in each state are as follows:

| Stored content | Flag bits | State |
| --- | --- | --- |
| Object hash code, object generational age | 01 (bias bit 0) | Unlocked |
| Biased thread ID, bias timestamp, object generational age | 01 (bias bit 1) | Biased (biased lock) |
| Pointer to a Lock Record in a thread's stack | 00 | Lightweight lock |
| Pointer to a heavyweight lock (mutex) | 10 | Heavyweight lock |
| Empty (mark information used by the CMS garbage collector, empty at other times) | 11 | GC mark |
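To make the table concrete, here is a minimal sketch that classifies a Mark Word value by its lowest three bits. The class name MarkWordState and the method decode are made up for illustration; the bit positions follow the table above (two lock flag bits, then one bias bit) and are not read from a live JVM.

```java
// Sketch only: classifies a Mark Word by its low three bits, as in the table above.
class MarkWordState {
    static String decode(long markWord) {
        int lockBits = (int) (markWord & 0b11);         // lowest two bits: lock flag
        boolean biasBit = ((markWord >>> 2) & 1) == 1;  // third bit: biased-lock flag
        switch (lockBits) {
            case 0b01: return biasBit ? "biased" : "unlocked";
            case 0b00: return "lightweight lock";
            case 0b10: return "heavyweight lock";
            default:   return "GC mark";                // 0b11
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0b001)); // unlocked
        System.out.println(decode(0b101)); // biased
        System.out.println(decode(0b000)); // lightweight lock
        System.out.println(decode(0b010)); // heavyweight lock
        System.out.println(decode(0b011)); // GC mark
    }
}
```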

1.1.2 Class Pointer

The other part of the object header is the type pointer, which points from the object to its class metadata. The virtual machine uses this pointer to determine which class the object is an instance of. Not every virtual machine implementation has to keep a type pointer in the object data; in other words, looking up an object's metadata does not necessarily go through the object itself.

This data is 32 bits long on a 32-bit VM and 64 bits long on a 64-bit VM.

On a 64-bit VM, the -XX:+UseCompressedClassPointers parameter compresses the type pointer to 32 bits.

1.1.3 Array Length

If the object is a Java array, the object header must also record the array length: the virtual machine can determine the size of an ordinary Java object from its metadata, but it cannot determine the size of an array from the metadata alone if the length is unknown.

In HotSpot the length itself is stored as a 32-bit int. On a 64-bit VM without pointer compression the array header is padded to a whole word, which is why it occupies 24 bytes rather than 20 (see the size table in section 2).

The -XX:+UseCompressedOops parameter compresses ordinary object pointers on a 64-bit VM to 32 bits; with pointer compression enabled the array header occupies 16 bytes.

1.2 Instance Data

The instance data is the information the object actually stores, that is, the contents of the fields of the various types defined in program code. Fields inherited from parent classes and fields defined in the subclass are all recorded.

The storage order is affected by the VM's allocation policy parameter (-XX:FieldsAllocationStyle) and by the order in which the fields are defined in the Java source code.

HotSpot's default allocation order is long/double, int/float, short/char, byte/boolean, oop (Ordinary Object Pointer). Types separated by / have the same width and are placed in definition order, so fields of the same width are always allocated together. In addition, fields are reordered to save space: if the object header occupies 12 bytes, a field of 4 bytes or less is chosen to fill the 4-byte gap that would otherwise precede the first 8-byte field.

Primitive fields and reference fields are aligned to 4 bytes where they meet, and once the instance data has been laid out the whole object is aligned to 8 bytes.

Subject to the rules above, fields defined in a parent class come before those of a subclass. If the CompactFields parameter is true (the default), narrower fields of the subclass may also be inserted into gaps left by the parent class's fields.

1.3 Alignment padding

Alignment padding does not necessarily exist and has no special meaning; it serves only as a placeholder, mainly to make reads more efficient. HotSpot's automatic memory management system requires the starting address of every object to be a multiple of 8 bytes; in other words, the size of every object must be a multiple of 8 bytes. Without pointer compression the object header is exactly a multiple of 8 bytes (one or two times), so padding is needed only when the instance data is not aligned.

For example, Object o = new Object() occupies 16 bytes in memory with compression enabled, and the last 4 bytes are alignment padding.
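The rounding rule can be written down in a few lines. The following is a minimal sketch, with a hypothetical helper name alignUp, of how a raw size is rounded up to the next multiple of 8 bytes; it only does the arithmetic and does not measure real objects.

```java
// Sketch only: rounds a raw object size up to HotSpot's 8-byte object alignment.
class AlignmentSketch {
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    public static void main(String[] args) {
        // 12-byte compressed header, no fields: 4 bytes of padding are added.
        System.out.println(alignUp(12, 8)); // 16
        // 16-byte uncompressed header, no fields: already aligned.
        System.out.println(alignUp(16, 8)); // 16
        // 12-byte header plus one long field (placed at offset 16): 24 bytes.
        System.out.println(alignUp(16 + 8, 8)); // 24
    }
}
```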

2 Pointer compression

A 64-bit JVM typically consumes about 1.5 times as much memory as a 32-bit JVM, because object pointers double in length in a 64-bit architecture (wider addressing). If a project moves from a 32-bit to a 64-bit virtual machine, the sudden increase in memory demand can crash it.

Since JDK 6 update 14, the 64-bit JVM has officially supported the -XX:+UseCompressedOops parameter, and later releases add -XX:+UseCompressedClassPointers; both compress pointers to reduce the memory footprint. Run java -XX:+PrintCommandLineFlags -version to check whether they are enabled by default.

  1. -XX:+UseCompressedOops: compresses ordinary object pointers (oops). It is enabled by default and can be disabled with -XX:-UseCompressedOops. The data that gets compressed are: reference-typed static fields of each class, reference-typed fields of each object, and each element of an object array. Pointers to PermGen Class objects, local variables, stack elements, method arguments, return values, and NULL pointers are not compressed.
  2. -XX:+UseCompressedClassPointers: compresses the type pointer, that is, the klass pointer. It is enabled by default and can be disabled with -XX:-UseCompressedClassPointers.
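Besides the command line, the current values of both flags can be read from inside a running program. The sketch below uses the HotSpotDiagnosticMXBean from the JDK's com.sun.management package; the class name CompressionFlags and the "unavailable" fallback are assumptions made for this example, chosen so that it also runs on a VM that does not know the flag.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch only: reads a -XX flag's current value from the running HotSpot VM.
class CompressionFlags {
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue(); // "true" or "false" for boolean flags
        } catch (RuntimeException e) {
            // Unknown flag or a non-HotSpot VM: report it instead of failing.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + flag("UseCompressedOops"));
        System.out.println("UseCompressedClassPointers = " + flag("UseCompressedClassPointers"));
    }
}
```

Starting the program with -XX:-UseCompressedOops should change both printed values, which matches the dependency between the two flags described in this section.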

The two compression flags can be combined in four ways:

  1. -XX:+UseCompressedOops -XX:+UseCompressedClassPointers
  2. -XX:+UseCompressedOops -XX:-UseCompressedClassPointers
  3. -XX:-UseCompressedOops -XX:-UseCompressedClassPointers
  4. -XX:-UseCompressedOops -XX:+UseCompressedClassPointers

However, the fourth combination produces a warning:

Java HotSpot(TM) 64-Bit Server VM warning: UseCompressedClassPointers requires UseCompressedOops

This is due to JVM limitations:

  // UseCompressedOops must be on for UseCompressedClassPointers to be on.
  if (!UseCompressedOops) {
    if (UseCompressedClassPointers) {
      warning("UseCompressedClassPointers requires UseCompressedOops");
    }
    FLAG_SET_DEFAULT(UseCompressedClassPointers, false);
  }

In other words, UseCompressedClassPointers depends on UseCompressedOops. UseCompressedClassPointers is on by default, but when UseCompressedOops is turned off, UseCompressedClassPointers is turned off with it.
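The effect of that check on the four combinations can be tabulated with a tiny model. This is only a restatement of the rule in Java, with the made-up names FlagResolution and effectiveClassPointers; it does not call into the JVM.

```java
// Sketch only: models the rule "UseCompressedClassPointers requires UseCompressedOops".
class FlagResolution {
    static boolean effectiveClassPointers(boolean useCompressedOops,
                                          boolean requestedClassPointers) {
        // Class pointer compression stays on only if oop compression is on too.
        return useCompressedOops && requestedClassPointers;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean oops : values) {
            for (boolean klass : values) {
                System.out.println("oops=" + oops + ", requested klass=" + klass
                        + " -> effective klass=" + effectiveClassPointers(oops, klass));
            }
        }
    }
}
```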

Here are the sizes with and without compression:

  1. On a 32-bit system, the class pointer is 4 bytes, the Mark Word is 4 bytes, and the object header is 8 bytes.
  2. On a 64-bit system, the class pointer is 8 bytes, the Mark Word is 8 bytes, and the object header is 16 bytes.
  3. On a 64-bit system with pointer compression enabled (-XX:+UseCompressedOops, on by default), the class pointer is 4 bytes, the Mark Word is 8 bytes, and the object header is 12 bytes.
  4. If the object is an array, the header grows by 4 bytes on a 32-bit system and by 8 bytes on a 64-bit system (4 bytes with compression).
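The four cases can be reproduced with a small calculation. In this sketch the array case is modeled as a 4-byte length field followed by padding up to a whole word, which is an assumption that happens to reproduce the figures in this article; the names HeaderSizes and headerBytes are hypothetical.

```java
// Sketch only: computes object header sizes from the rules listed above.
class HeaderSizes {
    static int headerBytes(boolean is64Bit, boolean compressedClassPointers, boolean isArray) {
        int word = is64Bit ? 8 : 4;
        int markWord = word;                                        // Mark Word is one word
        int klass = (is64Bit && !compressedClassPointers) ? 8 : 4;  // class pointer
        int header = markWord + klass;
        if (isArray) {
            header += 4;                                            // array length field
            header = (header + word - 1) / word * word;             // pad to a whole word
        }
        return header;
    }

    public static void main(String[] args) {
        System.out.println(headerBytes(false, false, false)); // 8
        System.out.println(headerBytes(true, false, false));  // 16
        System.out.println(headerBytes(true, true, false));   // 12
        System.out.println(headerBytes(true, false, true));   // 24
        System.out.println(headerBytes(true, true, true));    // 16
    }
}
```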

The following table compares the data sizes of 64-bit virtual machines with and without compression:

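| Type | 64-bit, uncompressed (bytes) | 64-bit, compressed (bytes) |
| --- | --- | --- |
| boolean | 1 | 1 |
| byte | 1 | 1 |
| short | 2 | 2 |
| char | 2 | 2 |
| int | 4 | 4 |
| float | 4 | 4 |
| long | 8 | 8 |
| double | 8 | 8 |
| reference | 8 | 4 |
| Plain object header | 16 | 12 |
| Array object header | 24 | 16 |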

3 Viewing object memory with JOL

JOL (Java Object Layout) is a toolkit provided by OpenJDK that helps compute the memory layout and size of an object at run time.

JOL project page: Openjdk.java.net/projects/co…

Maven dependency:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>

3.1 VM Information

Let’s start with the basic information about the JVM:

// Requires org.openjdk.jol.vm.VM from jol-core and a JUnit @Test annotation.
@Test
public void test1() {
    // Print detailed information about the current VM
    System.out.println(VM.current().details());
}

My computer output is as follows:

# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Explanation:

  1. Line 1: a 64-bit VM is in use.
  2. Line 2: ordinary object pointer compression is enabled (-XX:+UseCompressedOops).
  3. Line 3: type pointer compression is enabled (-XX:+UseCompressedClassPointers).
  4. Line 4: object sizes are aligned to 8 bytes.
  5. Line 5: the size in bytes of a field of each type, in the order reference (object pointer), boolean, byte, char, short, int, float, long, double.
  6. Line 6: the size in bytes of an array element of each type, in the same order.

3.2 Object size

A classic Java interview question: how much memory does new Object() occupy?

// Requires org.openjdk.jol.info.ClassLayout from jol-core.
@Test
public void test2() {
    // ClassLayout: the memory layout of a class
    // parseInstance: parses the object passed in
    // toPrintable: converts the layout to a printable format

    // Print the memory layout of an Object instance
    System.out.println(ClassLayout.parseInstance(new Object()).toPrintable());
}

The output is as follows:

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total


First, what each term means:

  1. java.lang.Object object internals: the internal layout of an Object instance.
  2. OFFSET: the offset inside the object at which this part starts.
  3. SIZE: the size of this part, in bytes.
  4. TYPE DESCRIPTION: the type and description of this part.
  5. VALUE: the byte values stored in this part.

Now look at the layout and size of the object:

  1. First come 4 bytes of object header;
  2. followed by another 4 bytes of object header;
  3. followed by a further 4 bytes of object header;
  4. and finally 4 bytes marked (loss due to the next object alignment), which is alignment padding: space lost so that the next object starts on an aligned address. Since the object header takes up 12 bytes, 4 more bytes of padding are needed to reach a multiple of 8.
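The VALUE column prints each 4-byte slot three ways: raw bytes, bits, and a signed int in parentheses. The sketch below shows how the bytes map to that int, assuming the dump comes from a little-endian machine such as x86; the class name HeaderBytes and its helper are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: turns four header bytes from the JOL dump into the int JOL prints.
class HeaderBytes {
    static int littleEndianInt(int b0, int b1, int b2, int b3) {
        byte[] bytes = {(byte) b0, (byte) b1, (byte) b2, (byte) b3};
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        // Offset 0: "01 00 00 00", the low half of the Mark Word.
        int markLow = littleEndianInt(0x01, 0x00, 0x00, 0x00);
        System.out.println(markLow);         // 1
        System.out.println(markLow & 0b111); // 1, i.e. bits 001: bias bit 0, lock flag 01 (unlocked)

        // Offset 8: "e5 01 00 f8", the compressed class pointer.
        int klass = littleEndianInt(0xe5, 0x01, 0x00, 0xf8);
        System.out.println(klass);           // -134217243
    }
}
```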

To sum up:

An Object instance occupies 16 bytes of memory and consists of two parts: a 12-byte object header and 4 bytes of alignment padding.

The object header consists of two parts: an 8-byte Mark Word and a 4-byte class pointer (already compressed). The instance data is 0 bytes.

If type pointer compression is not enabled, how many bytes does the object occupy? Still 16 bytes: the header grows to 16 bytes, so the 4 bytes of padding are no longer needed.

Also note that the memory size of an array is the array header (which includes the length field) plus the number of elements times the element size (a reference element is 4 bytes when compressed and 8 bytes when not), rounded up to a multiple of 8 bytes.
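As a worked example of the array case, the following sketch plugs the header and element sizes from this article's tables into that formula. ArraySize and arrayBytes are hypothetical names, and the numbers are computed, not measured with JOL.

```java
// Sketch only: estimates array sizes from header size, element size and length.
class ArraySize {
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    static long arrayBytes(int headerBytes, int elementBytes, int length) {
        return alignUp(headerBytes + (long) elementBytes * length, 8);
    }

    public static void main(String[] args) {
        // new int[10], compressed: 16-byte header + 10 * 4 bytes
        System.out.println(arrayBytes(16, 4, 10)); // 56
        // new String[3], compressed: 16 + 3 * 4 = 28, padded to 32
        System.out.println(arrayBytes(16, 4, 3));  // 32
        // new String[3], uncompressed: 24 + 3 * 8
        System.out.println(arrayBytes(24, 8, 3));  // 48
    }
}
```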

3.3 Layout order

The layout order here mainly means the order of the instance data, which is affected by the VM's allocation policy parameter (-XX:FieldsAllocationStyle) and by the order in which the fields are defined in the Java source code.

HotSpot's default allocation order is long/double, int/float, short/char, byte/boolean, oop (Ordinary Object Pointer). Types separated by / have the same width and are placed in definition order, so fields of the same width are always allocated together. In addition, fields are reordered to save space: if the object header occupies 12 bytes, a field of 4 bytes or less is chosen to fill the 4-byte gap.

Primitive fields and reference fields are aligned to 4 bytes where they meet, and once the instance data has been laid out the whole object is aligned to 8 bytes.

Subject to the rules above, fields defined in a parent class come before those of a subclass. If the CompactFields parameter is true (the default), narrower fields of the subclass may also be inserted into gaps left by the parent class's fields.

The test case is as follows:

public class A {
    long l;
    String s1;
    String s2;
    String s3;
    byte i;
}
public class JolOrder extends A {
    String s;
    long l;
    double d;
    int i;
    short sh;
    boolean bo;
    char c;
    byte b;
    float f;
}
@Test
public void test3() {
    System.out.println(ClassLayout.parseInstance(new JolOrder()).toPrintable());
}

On my machine the test reports a total of 72 bytes for the instance. Did you guess correctly?
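The 72-byte figure can be cross-checked by hand with a simplified model of the rules above. The sketch below is an assumption-laden approximation, not HotSpot's real algorithm: it assumes a 12-byte header, 4-byte compressed references, the default order (8-byte, 4-byte, 2-byte, 1-byte, then references), and gap filling before the first 8-byte field. The names FieldLayoutSketch and layoutClass are hypothetical, and newer JDKs with a different field layout may report other numbers.

```java
// Sketch only: simplified model of the field layout rules described in this section.
class FieldLayoutSketch {
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Lays out one class's fields starting at 'start' and returns the end offset.
    // primitiveWidths: byte widths (8, 4, 2 or 1) of the primitive fields; oops: reference count.
    static int layoutClass(int start, int[] primitiveWidths, int oops, int oopSize) {
        int[] count = new int[9];                 // count[w] = number of fields of width w
        for (int w : primitiveWidths) count[w]++;
        int offset = start;
        if (count[8] > 0) {
            // Fill the gap before the first 8-byte field with narrower fields.
            int gap = alignUp(offset, 8) - offset;
            if (gap >= 4 && count[4] > 0) { count[4]--; gap -= 4; }
            while (gap >= 2 && count[2] > 0) { count[2]--; gap -= 2; }
            while (gap >= 1 && count[1] > 0) { count[1]--; gap -= 1; }
            if (gap >= oopSize && oops > 0) { oops--; }
            offset = alignUp(offset, 8);
        }
        offset += 8 * count[8] + 4 * count[4] + 2 * count[2] + count[1];
        if (oops > 0) {
            offset = alignUp(offset, oopSize) + oops * oopSize; // references come last
        }
        return alignUp(offset, oopSize);          // the next class starts on a 4-byte boundary
    }

    public static void main(String[] args) {
        int header = 12, oopSize = 4;
        // class A: long l; String s1, s2, s3; byte i
        int endOfA = layoutClass(header, new int[]{8, 1}, 3, oopSize);
        // class JolOrder: String s; long l; double d; int i; short sh; boolean bo; char c; byte b; float f
        int endOfJolOrder = layoutClass(endOfA, new int[]{8, 8, 4, 2, 1, 2, 1, 4}, 1, oopSize);
        System.out.println(endOfA);                    // 36
        System.out.println(alignUp(endOfJolOrder, 8)); // 72
    }
}
```

In this model the byte field of A fills part of the gap after the header, and the int field of JolOrder fills the 4-byte gap at offset 36, which is how the total stays at 72 bytes.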

4 Locating objects

Objects are created in order to be used, and a Java program manipulates a specific object on the heap through reference data on the stack. The Java Virtual Machine Specification only says that a reference refers to an object; it does not define how the reference should locate and access the object in the heap, so object access is implementation-dependent.

At present there are two mainstream access methods: handles and direct pointers.

4.1 Using a Handle

If handle access is used, the Java heap sets aside a block of memory as a handle pool. The reference stores the address of the object's handle, and the handle holds the addresses of the object's instance data and of its type data.

The diagram below:

4.2 Direct Pointer

If direct pointer access is used, the layout of the object in the Java heap must provide a way to reach the type data. The reference stores the object's address directly, and the type data is reached through the klass pointer in the object header.

The diagram below:

4.3 Comparison of the two methods

Each of the two access methods has its own advantages. The biggest advantage of handle access is that the reference stores a stable handle address: when the object is moved (a very common event during garbage collection), only the instance data pointer in the handle changes, and the reference itself does not need to be modified.

The biggest advantage of direct pointer access is speed. It saves the cost of one pointer lookup, and because object access is so frequent in Java, that cost adds up to a significant part of execution time. HotSpot uses direct pointers for object access, but across software development as a whole it is also common for languages and frameworks to access objects through handles.
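The trade-off can be illustrated with a toy model in which the heap is an array and an address is an index into it. Everything here (AccessModelSketch, handlePool, moveObject) is invented for the illustration and has nothing to do with how HotSpot is actually implemented.

```java
// Sketch only: toy model contrasting handle access with direct pointer access.
class AccessModelSketch {
    static final String[] heap = new String[8];  // toy heap: index = address
    static final int[] handlePool = new int[4];  // handle -> current object address

    // Handle access: two lookups (handle, then object).
    static String readViaHandle(int handle) { return heap[handlePool[handle]]; }

    // Direct pointer access: one lookup.
    static String readDirect(int address) { return heap[address]; }

    // Simulates a collector moving an object: only the handle entry is updated.
    static void moveObject(int handle, int newAddress) {
        int oldAddress = handlePool[handle];
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = null;
        handlePool[handle] = newAddress;
    }

    public static void main(String[] args) {
        heap[2] = "instance data";
        handlePool[0] = 2;
        int handleRef = 0;  // a reference that stores a handle
        int directRef = 2;  // a reference that stores the address itself

        moveObject(0, 5);
        System.out.println(readViaHandle(handleRef)); // instance data (reference unchanged)
        System.out.println(readDirect(directRef));    // null (stale until the reference is rewritten)
        directRef = 5;                                // a real collector would update this reference
        System.out.println(readDirect(directRef));    // instance data
    }
}
```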

References:

  1. In-depth Understanding of the Java Virtual Machine
  2. Java Virtual Machine Specification
