Today I'm going to talk about something fairly abstract: object headers. While learning, I kept running into topics that depend on them, such as synchronized lock optimization in the JDK and object age promotion in the JVM. Understanding how those work requires understanding object headers first, so this post also lays the groundwork for later posts on synchronized and the JVM.

Object memory composition

In Java, an instance of a class is created with the new keyword. The object is stored on the heap and assigned a memory address. This raises a few questions:

  • How does this instance exist in memory?
  • How much memory does an object occupy?
  • How are the fields of an object laid out in memory?

In the JVM, a Java object, when stored in the heap, consists of three parts:

  • Object Header: contains basic information about the heap object's layout, type, GC state, synchronization state, and identity hash code. Java objects and VM-internal objects share a common object header format.
  • Instance Data: stores the object's field values, including the fields inherited from parent classes.
  • Padding: not always present; it exists only for byte alignment.

Object header

You can find its description in the HotSpot official documentation (below). As you can see, it is a common format for both Java objects and VM-internal objects, and it consists of two words (in the computer-architecture sense). In addition, if the object is a Java array, the object header must also hold the length of the array: the virtual machine can determine the size of an ordinary Java object from its class metadata, but it cannot determine the size of an array from the array's metadata alone.

So what are these two words? Looking further through the HotSpot documentation mentioned above, we find two more definitions: Mark Word and Klass Pointer. The first word of the object header is the Mark Word and the second is the Klass Pointer.

Mark Word

Stores the object's own runtime data, such as the hash code, GC generational age, lock state flags, the lock held by a thread, the biased thread ID, the bias timestamp, and so on.

The Mark Word is 32 bits wide on a 32-bit JVM and 64 bits wide on a 64-bit JVM. In the HotSpot source, under hotspot/src/share/vm/oops, the Mark Word corresponds to the C++ header markOop.hpp, whose comments describe what it is made of. All code in this article is based on JDK 1.8.

The Mark Word stores different contents in different lock states. On a 32-bit JVM it is laid out as follows:

On a 64-bit JVM it is laid out as follows:

Although the length differs between 32-bit and 64-bit JVMs, the basic composition is the same.

  • Lock flag (lock): a 2-bit field that identifies the lock state. The value 11 marks an object that is about to be collected by the GC; in that state only these last two bits are valid.
  • biased_lock: indicates whether biased locking is in use. Normal (unlocked) and biased objects both have the lock flag 01, so the flag alone cannot tell them apart; this extra bit does.
  • age: the number of GCs the object has survived. When it reaches the threshold, the object is moved to the old generation.
  • Object hash code (hash): computed lazily the first time System.identityHashCode() is called at runtime, with the result stored here. When the object is locked (biased, lightweight, or heavyweight), the Mark Word no longer has room for the 31-bit result, so the hash code is moved to the Monitor.
  • Biased lock thread ID (JavaThread): in biased mode, when a thread holds the object, this field is set to that thread's ID. In later operations that thread does not need to try to acquire the lock again.
  • epoch: the bias timestamp. During the CAS operations of biased locking it serves as the bias identifier, indicating which lock the object is biased toward.
  • ptr_to_lock_record: in the lightweight lock state, a pointer to the lock record on the stack. When lock acquisition is uncontended, the JVM uses atomic operations instead of OS mutexes; this technique is called lightweight locking. In that case the JVM uses a CAS operation to set a pointer to the lock record in the object's header.
  • ptr_to_heavyweight_monitor: in the heavyweight lock state, a pointer to the object's Monitor. If two different threads compete for the same object at the same time, the lightweight lock must be upgraded to a Monitor to manage the waiting threads. In that case the JVM stores a pointer to the Monitor in the object's ptr_to_heavyweight_monitor field.
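To make the fields above concrete, here is a minimal sketch that pulls them out of a 64-bit value with shifts and masks. It assumes the unlocked 64-bit layout described in the comments of JDK 8's markOop.hpp (from the lowest bit upward: 2 lock bits, 1 biased_lock bit, 4 age bits, 1 unused bit, 31 hash bits). The class and method names are hypothetical, and the sample value is built by hand rather than read from a real object.

```java
// Sketch only: decodes a hand-built 64-bit Mark Word value.
// Names are hypothetical; bit positions assume the JDK 8 unlocked layout.
final class MarkWordFields {
    // Lowest bit first: lock(2) | biased_lock(1) | age(4) | unused(1) | hash(31)
    static final int LOCK_BITS = 2;
    static final int BIASED_SHIFT = 2;
    static final int AGE_SHIFT = 3;
    static final int AGE_BITS = 4;
    static final int HASH_SHIFT = 8;
    static final int HASH_BITS = 31;

    static int lock(long mark) {
        return (int) (mark & ((1L << LOCK_BITS) - 1));
    }

    static boolean biased(long mark) {
        return ((mark >>> BIASED_SHIFT) & 1L) == 1L;
    }

    static int age(long mark) {
        return (int) ((mark >>> AGE_SHIFT) & ((1L << AGE_BITS) - 1));
    }

    // Only meaningful while the object is unlocked and not biased.
    static int hash(long mark) {
        return (int) ((mark >>> HASH_SHIFT) & ((1L << HASH_BITS) - 1));
    }

    static String state(long mark) {
        switch (lock(mark)) {
            case 0b00: return "lightweight locked";
            case 0b10: return "heavyweight locked";
            case 0b11: return "marked for GC";
            default:   return biased(mark) ? "biasable" : "unlocked"; // lock flag 01
        }
    }

    public static void main(String[] args) {
        // Sample value: hash 0x1234567, age 5, not biased, lock flag 01.
        long mark = (0x1234567L << HASH_SHIFT) | (5L << AGE_SHIFT) | 0b01L;
        // prints "unlocked, age=5, hash=0x1234567"
        System.out.println(state(mark) + ", age=" + age(mark)
                + ", hash=0x" + Integer.toHexString(hash(mark)));
    }
}
```

In the locked states the upper bits hold a thread ID or a pointer instead, so age() and hash() should not be read from such a value.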

Klass Pointer

The type pointer is the object's pointer to its class metadata; the virtual machine uses it to determine which class the object is an instance of.

Instance data

If the object has fields, their data is stored here; if it has none, this part is empty. Different field types take different numbers of bytes, such as boolean (1 byte), int (4 bytes), and so on.

Alignment padding

An object may or may not have alignment padding. By default, the starting address of every object on the Java heap must be aligned to a multiple of 8 bytes. If the object header plus instance data is not an exact multiple of 8 bytes, padding fills the remaining space. If they already add up to a multiple of 8, no padding is needed.

In other words, the total size in bytes of every object must be a multiple of 8. If the object header and instance data together do not meet this requirement, alignment padding makes up the difference.
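The rounding rule can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the default 8-byte alignment; the class name, method names, and the sample header and field sizes are hypothetical.

```java
// Sketch only: rounds an object's raw size up to the next multiple of 8.
final class ObjectAlignment {
    static final int ALIGNMENT = 8; // default object alignment in bytes

    // Rounds size up to the next multiple of ALIGNMENT (must be a power of two).
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
    }

    // Bytes of padding needed after the header and the instance data.
    static long padding(long headerBytes, long fieldBytes) {
        long raw = headerBytes + fieldBytes;
        return alignUp(raw) - raw;
    }

    public static void main(String[] args) {
        System.out.println(padding(12, 0)); // 12 rounds up to 16, prints 4
        System.out.println(padding(12, 4)); // 16 is already aligned, prints 0
        System.out.println(padding(16, 1)); // 17 rounds up to 24, prints 7
    }
}
```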

Why align? One reason for field alignment is to keep each field within a single CPU cache line. If fields are not aligned, a field may straddle two cache lines: reading it may require replacing two cache lines, and storing it may pollute both at once. Both cases hurt performance. Ultimately, padding exists so that the computer can address memory efficiently.

At this point, we know the overall structure layout of objects in heap memory, as shown in the figure below

Talk is cheap, show me code

Concepts are abstract: is an object really made up this way just because someone says so? Learning requires skepticism, and a theory or concept should only be accepted after it has been verified in practice. Fortunately, OpenJDK provides a toolkit for inspecting object and virtual machine information. We just need to add the jol-core dependency, as follows:

<dependency>
  <groupId>org.openjdk.jol</groupId>
  <artifactId>jol-core</artifactId>
  <version>0.8</version>
</dependency>

There are three commonly used methods in jol-core:

  • ClassLayout.parseInstance(object).toPrintable(): shows the internal layout of an object.
  • GraphLayout.parseInstance(object).toPrintable(): shows the external information of an object, including the objects it references.
  • GraphLayout.parseInstance(object).totalSize(): returns the total size of an object.

Ordinary objects

To keep things simple, let's use a minimal class D rather than a complex object, starting with a class that has no fields:

public class D {}

Using the jol-core API, we print out the internal information of the object:

import org.openjdk.jol.info.ClassLayout;

public static void main(String[] args) {
    D d = new D();
    // Print the header, fields, and padding of this instance
    System.out.println(ClassLayout.parseInstance(d).toPrintable());
}

The printed result has four columns: OFFSET, SIZE, TYPE DESCRIPTION, and VALUE.

  • OFFSET: the offset address, in bytes.
  • SIZE: the amount of memory occupied, in bytes.
  • TYPE DESCRIPTION: the type description; (object header) denotes the object header.
  • VALUE: the value currently stored in memory at that location.

We can see that the instance d occupies 16 bytes. The object header takes 12 bytes (96 bits): 8 bytes for the Mark Word and 4 bytes for the Klass Pointer. The remaining 4 bytes are alignment padding.

The object header is 12 bytes because pointer compression is enabled by default in JDK 8. The concept of pointer compression is not covered here; if you are interested, refer to the official documentation. You can enable or disable it with VM options, for example -XX:-UseCompressedOops to turn it off.
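Before comparing layouts it can help to confirm which setting the running JVM actually uses. The following is a minimal sketch, assuming a 64-bit HotSpot JVM where the com.sun.management.HotSpotDiagnosticMXBean interface is available; the class and method names are hypothetical. On a JVM without this flag, getVMOption throws IllegalArgumentException.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch only: reads the current value of a HotSpot -XX flag at runtime.
final class CompressedOopsCheck {
    // Returns the flag's value as a string, e.g. "true" or "false".
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + vmOption("UseCompressedOops"));
    }
}
```

Running it once normally and once with -XX:-UseCompressedOops should show the value change.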

If we turn pointer compression off and print the layout again, the object header is larger. As the figure below shows, it now takes 16 bytes (128 bits): 8 bytes for the Mark Word and 8 bytes for the Klass Pointer, with no alignment padding.

Turning on pointer compression reduces the memory used by objects. Comparing the two printed layouts of D, the object header grows by 4 bytes when pointer compression is disabled. Since D has no fields, readers can try adding several reference-type fields to see a more obvious size increase. In theory, compression halves the space taken by each pointer (4 bytes instead of 8). In JDK 8 and later, pointer compression is enabled by default.

Array objects

Now let's look at the memory layout of array objects and see how it differs:

import org.openjdk.jol.info.ClassLayout;

public static void main(String[] args) {
    int[] a = {1};
    // An array header also stores the array length
    System.out.println(ClassLayout.parseInstance(a).toPrintable());
}

The displayed memory layout information is as follows

The object header takes 16 bytes: Mark Word (8 bytes), Klass Pointer (4 bytes), and array length (4 bytes). The instance data of the array takes 4 bytes, and the remaining 4 bytes are alignment padding.
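The same numbers can be used to estimate the size of an int array of any length. This is a rough sketch that assumes the header sizes reported above (pointer compression on) and 8-byte alignment; the class and method names are hypothetical, and the result is an estimate rather than a measurement.

```java
// Sketch only: estimates the heap size of an int[] from the header sizes above.
final class IntArraySize {
    static final int MARK_WORD = 8;     // bytes
    static final int KLASS_POINTER = 4; // bytes, with pointer compression on
    static final int ARRAY_LENGTH = 4;  // bytes

    static long estimate(int length) {
        long raw = MARK_WORD + KLASS_POINTER + ARRAY_LENGTH
                + (long) Integer.BYTES * length;
        return (raw + 7) / 8 * 8; // round up to a multiple of 8
    }

    public static void main(String[] args) {
        System.out.println(estimate(1)); // 16 + 4 = 20, padded to 24
        System.out.println(estimate(2)); // 16 + 8 = 24, no padding
        System.out.println(estimate(3)); // 16 + 12 = 28, padded to 32
    }
}
```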

In closing

We have now covered the memory layout of an object and the concept of the object header, especially the Mark Word. This will be very useful in our later analysis of synchronized lock optimization and of generational age in JVM garbage collection.

Recall that in the JVM, an object's age increases by 1 each time it survives a Minor GC in a Survivor space, and once it reaches a certain age it is promoted to the old generation. The default threshold is 15. The Mark Word shows why: the space reserved for the generational age is 4 bits, and the largest number 4 bits can represent is 2^4 - 1 = 15.
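The 4-bit limit is easy to check with a mask. The sketch below only illustrates the width of the field; it is not how HotSpot updates the age, and the class and method names are hypothetical.

```java
// Sketch only: shows why a 4-bit age field cannot hold a value above 15.
final class AgeField {
    static final int AGE_BITS = 4;
    static final int MAX_AGE = (1 << AGE_BITS) - 1;

    // Keeps only the bits that fit in the age field.
    static int store(int age) {
        return age & MAX_AGE;
    }

    public static void main(String[] args) {
        System.out.println(MAX_AGE);   // prints 15
        System.out.println(store(15)); // prints 15, it still fits
        System.out.println(store(16)); // prints 0, the fifth bit is lost
    }
}
```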