Preface

The Java virtual machine stack is private to each thread, so it has no thread-safety issues. The heap is more complex.

Because the heap is a region of memory shared by all threads, thread-safety issues can arise there, and garbage collection is mainly concerned with reclaiming heap space.

That makes the layout of the heap well worth understanding. Let's look at how the heap is laid out and how a Java object is laid out in memory.

Object references

Let’s start with some code:

package com.zwx.jvm;

public class HeapMemory {
    private Object obj1 = new Object();

    public static void main(String[] args) {
        Object obj2 = new Object();
    }
}

In the above code, what is the difference between obj1 and obj2 in memory?

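The method area stores the structure of each class: the runtime constant pool, field and method data, and the code for methods and constructors.

So obj1, as a field, is described in the method area as part of the class structure, and new creates an object instance that is stored in the heap. This gives the following picture (method area points to the heap). Note that obj1 is an instance field, so strictly speaking its reference value lives inside the enclosing HeapMemory instance on the heap; the picture fits most directly when the field is static.

[Figure: method area pointing to the heap]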

obj2, on the other hand, is a local variable of a method, so it is stored in the local variable table of a stack frame on the Java virtual machine stack. This is the classic stack-to-heap reference:

[Figure: stack pointing to the heap]

Now consider this: if a variable points into the heap, and the heap stores only the object instance, how does that instance know which class it belongs to?

In other words, how does the instance find its class metadata? The answer lies in how a Java object is laid out in memory.

Java object memory layout

An object's memory can be divided into three regions:

  • Object header (Header)
  • Instance data (Instance Data)
  • Alignment padding (Padding)

On a 64-bit JVM (without pointer compression enabled), a Java object is laid out as follows:

[Figure: Java object layout on a 64-bit JVM without pointer compression]

The alignment padding shown in the figure above is not always present: if the object header and instance data already add up to a multiple of 8 bytes, no padding is needed.

With the object layout in mind, let's look at a common interview question.

How many bytes does Object obj = new Object() occupy?

The size of new Object() depends on whether pointer compression is enabled:

  • Pointer compression disabled

    The size is 8 (Mark Word) + 8 (Class Pointer) = 16 bytes

  • Pointer compression enabled (the default)

    With pointer compression enabled, the Class Pointer is compressed to 4 bytes, so the final size is:

    8 (Mark Word) + 4 (Class Pointer) + 4 (alignment padding) = 16 bytes
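
The arithmetic above can be sketched in a few lines of Java. This is only an illustration of the calculation, not how the JVM computes sizes internally: the class ObjectSizeSketch and its methods are made up for this example, the header sizes and the 8-byte alignment are the ones quoted in the text, and the sketch ignores the rules the JVM applies when it lays out several fields.

```java
// Illustrative only: names are hypothetical, sizes are the ones quoted in the text.
final class ObjectSizeSketch {
    static final int MARK_WORD_BYTES = 8;
    static final int ALIGNMENT_BYTES = 8;

    // The class pointer takes 4 bytes with pointer compression, 8 bytes without.
    static int headerBytes(boolean compressed) {
        return MARK_WORD_BYTES + (compressed ? 4 : 8);
    }

    // Round a size up to the next multiple of 8 bytes.
    static int alignUp(int size) {
        return (size + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Total size of an object whose instance data occupies fieldBytes.
    static int objectBytes(boolean compressed, int fieldBytes) {
        return alignUp(headerBytes(compressed) + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println(objectBytes(true, 0));   // 8 + 4 = 12, padded to 16
        System.out.println(objectBytes(false, 0));  // 8 + 8 = 16, no padding
    }
}
```

The padding is simply the difference between the aligned size and the raw size, which is why it disappears when header plus instance data is already a multiple of 8.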

Is that really the result? Let's verify it. First, add a Maven dependency to the POM:

<dependency>
  <groupId>org.openjdk.jol</groupId>
  <artifactId>jol-core</artifactId>
  <version>0.10</version>
</dependency>

Then create a simple demo:

package com.zwx.jvm;

import org.openjdk.jol.info.ClassLayout;

public class HeapMemory {
    public static void main(String[] args) {
        Object obj = new Object();
        // Print the header, field, and padding layout of this instance
        System.out.println(ClassLayout.parseInstance(obj).toPrintable());
    }
}

The output is as follows:

[Figure: JOL output for new Object() with pointer compression enabled]

The result is 16 bytes, as expected. That is because pointer compression is enabled by default, so let's turn it off and try again.

-XX:+UseCompressedOops   enables pointer compression
-XX:-UseCompressedOops   disables pointer compression

[Figure]

Running the program again gives the following result:

[Figure: JOL output for new Object() with pointer compression disabled]

As you can see, there is no alignment padding any more, but the size is still 16 bytes.
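
To confirm which setting the running JVM actually uses, the flag can also be read at run time. The sketch below assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean interface is available; the class name CompressedOopsCheck is made up for this example. On a JVM that does not know the flag, getVMOption throws an IllegalArgumentException.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: assumes a HotSpot JVM; the class name is hypothetical.
final class CompressedOopsCheck {
    // Returns the current value of a HotSpot VM flag as a string
    // ("true" or "false" for boolean flags).
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + flagValue("UseCompressedOops"));
    }
}
```

Passing "UseCompressedClassPointers" instead reads the related flag that governs the class pointer in the object header; how the two flags interact depends on the JDK version, so it is worth checking both when the measured header size is not what you expect.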

Now let's measure the size of an object that has a field.

Create a new class with a single byte field:

package com.zwx.jvm;

public class MyItem {
    byte i = 0;
}

Then print the layout of an instance of this class, once with pointer compression on and once with it off.

package com.zwx.jvm;

import org.openjdk.jol.info.ClassLayout;

public class HeapMemory {
    public static void main(String[] args) {
        MyItem myItem = new MyItem();
        // Print the header, field, and padding layout of this instance
        System.out.println(ClassLayout.parseInstance(myItem).toPrintable());
    }
}

With pointer compression enabled, the object occupies 16 bytes:

[Figure: JOL output for MyItem with pointer compression enabled]

With pointer compression disabled, it occupies 24 bytes:

[Figure: JOL output for MyItem with pointer compression disabled]

This shows the advantage of pointer compression: if a program keeps creating large numbers of objects, the smaller footprint per object saves memory, which in turn helps performance.

Object access

Once an object has been created, we need to access it. So how is the object located when we do?

There are two main ways of accessing objects:

  • Handle access
  • Direct pointer access

Handle access

With handle access, the Java virtual machine sets aside a chunk of memory in the heap as a handle pool. The reference stores the address of the object's handle, and the handle in turn stores the addresses of the object's instance data and of its type data.

[Figure: handle access]

Direct pointer access (HotSpot virtual machine)

With direct pointer access, the reference stores the address of the object itself, and the object in turn holds the address of its type data.

[Figure: direct pointer access]

Handle access vs. direct pointer access

Comparing the two figures above, it is easy to see that handle access costs one extra pointer lookup to locate the object.

However, it also has an advantage: if an object is moved (its address changes), only the pointer in the handle pool needs to be updated, and the reference that points to the handle stays the same.

With direct pointer access, the reference itself, for example in the local variable table, has to be updated.
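
The trade-off can be mimicked with a toy model. The sketch below is not how any JVM is implemented: the class HandleVsDirectSketch, the integer "addresses" and the two maps are all invented for illustration. It only shows that a handle adds one lookup on every access, while a move leaves handle-based references untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: addresses are plain integers and all names are hypothetical.
final class HandleVsDirectSketch {
    // Toy "heap": address -> instance data.
    static final Map<Integer, String> heap = new HashMap<>();
    // Toy handle pool: handle id -> current address of the object.
    static final Map<Integer, Integer> handlePool = new HashMap<>();

    // Simulates the collector moving an object to a new address.
    static void move(int from, int to) {
        heap.put(to, heap.remove(from));
    }

    static String readViaHandle(int handleId) {
        int address = handlePool.get(handleId); // first lookup: handle -> address
        return heap.get(address);               // second lookup: address -> object
    }

    static String readDirect(int address) {
        return heap.get(address);               // single lookup
    }

    public static void main(String[] args) {
        heap.put(100, "object A");
        handlePool.put(1, 100);
        int handleRef = 1;    // what a reference holds under handle access
        int directRef = 100;  // what a reference holds under direct pointer access

        move(100, 200);
        handlePool.put(1, 200); // handle access: only the handle is updated
        directRef = 200;        // direct access: the reference itself is updated

        System.out.println(readViaHandle(handleRef)); // object A
        System.out.println(readDirect(directRef));    // object A
    }
}
```

In the handle case handleRef never changes, no matter how often the object moves; in the direct case every variable that holds the old address has to be found and rewritten.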

Heap memory

The Mark Word in the Java object header stores, among other things, the generational age of the object.

The generational age of an object can be understood as the number of garbage collections it has survived: each time an object survives a garbage collection, its age increases by 1.

In a 64-bit VM, the age occupies four bits, so the maximum age is 15 (binary 1111). The age starts at 0000 and grows with each collection the object survives.
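
The limit of 15 follows directly from the width of the field, as this small sketch shows. The class AgeBitsSketch and the capping behaviour in nextAge are assumptions made for illustration; they model only the arithmetic of a 4-bit counter, not the actual Mark Word handling in the JVM.

```java
// Illustrative only: models a 4-bit age counter, not the real Mark Word.
final class AgeBitsSketch {
    static final int AGE_BITS = 4;
    static final int MAX_AGE = (1 << AGE_BITS) - 1; // binary 1111 = 15

    // Age after surviving one more collection; four bits cannot hold more than 15.
    static int nextAge(int age) {
        return Math.min(age + 1, MAX_AGE);
    }

    public static void main(String[] args) {
        int age = 0; // 0000
        for (int gc = 1; gc <= 20; gc++) {
            age = nextAge(age);
        }
        System.out.println(age + " = " + Integer.toBinaryString(age)); // 15 = 1111
    }
}
```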

The Java heap is divided by generational age into a Young area and an Old area, and objects are allocated in the Young area first.

When an object reaches a certain age (set with -XX:MaxTenuringThreshold, 15 by default), it moves to the Old area (note: an object that is too large goes straight to the Old area).

The reason for this division is that if the whole heap were a single region, every garbage collection would have to scan all the objects in the heap, which wastes performance.

In practice, most Java objects have a very short life cycle, and once an object has survived many collections, it is reasonable to assume that the next collection will not reclaim it either.

The Young area and the Old area can therefore be collected separately. A collection of the Old area is triggered only when the Young area still has no free space after its own collection.

[Figure: heap divided into Young and Old areas]

Young area

Consider the following scenario, which shows the Young area after a garbage collection:

[Figure: Young area after a garbage collection, with scattered free space]

If an object arrives that needs the space of two objects, it will not fit, and a GC (garbage collection) is triggered.

A GC, however, affects user threads: all user threads have to be stopped during the GC so that object references do not keep changing.

Sun called this event Stop the World (STW).

So, in general, the fewer GCs the better. Yet the figure above shows room for at least 3 more objects, if only the existing objects were packed in order.

There is free space, but it is not contiguous, so the allocation fails and a GC is triggered. How can this be solved?
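
The failure described here is easy to reproduce in a toy model. In the sketch below the Young area is a row of equal-sized slots, and the particular layout (three free slots, none adjacent) is a made-up example rather than the one in the figure; the class FragmentationSketch and its methods are likewise hypothetical.

```java
// Toy model only: the slot layout and all names are invented for illustration.
final class FragmentationSketch {
    // Counts the free slots; true marks an occupied slot.
    static int freeSlots(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // An object needs `size` adjacent free slots.
    static boolean canAllocate(boolean[] used, int size) {
        return largestFreeRun(used) >= size;
    }

    public static void main(String[] args) {
        boolean[] young = {true, false, true, false, true, false};
        System.out.println(freeSlots(young));      // prints 3
        System.out.println(canAllocate(young, 2)); // prints false
    }
}
```

Three slots are free in total, yet a two-slot object cannot be placed, because what matters is the longest contiguous run, not the sum.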

The solution is to keep the objects in the Young area packed in order, which leads to dividing the Young area again into two areas: the Eden area and the Survivor area.

[Figure: Young area divided into Eden and Survivor areas]

A new object is allocated in Eden. When Eden is full, a GC is triggered.

After the GC, the surviving objects are copied to the Survivor area so that no gaps are left behind, and the Eden area can then be cleared as a whole.

This approach rests on a premise: most objects have a very short life cycle, so most objects in Eden can be reclaimed in a single garbage collection (a premise drawn from testing).

When a GC is triggered, the Survivor area is collected as well, not just Eden.

But this raises a new problem. Eden stays essentially contiguous, but the Survivor area may become fragmented, leaving its free space discontinuous.

So the Survivor area is split in two again.

[Figure: Survivor area divided into S0 and S1]

The workflow now looks like this:

When Eden is full, a GC is triggered. After the GC, the surviving objects are copied to S0 (S1 is empty), and new objects continue to be allocated in Eden.

When a GC is triggered again, S0 may have become fragmented, so instead of being reused in place, the surviving objects from S0 and from Eden are all copied to S1. S0 is now empty, and the cycle repeats with the two areas swapping roles.

If the survivors still do not fit into S0 or S1 after being copied, the survivor area really is full. The overflow then borrows space from the Old area (this is the guarantee mechanism: the old generation has to provide this allocation guarantee).

If the Old area does not have enough space either, a Full GC is triggered; if there is still not enough space after the Full GC, an OutOfMemoryError is thrown.

Note: for every copy between S0 and S1 to succeed, S0 and S1 must be the same size, and one of them must be empty at any given time.

This wastes a small amount of space, but the performance gained elsewhere makes it worthwhile.
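
The workflow above can be condensed into a toy simulation. It is a deliberately simplified sketch, not the algorithm of any real collector: the class YoungGcSketch, the reachable flag, the object-count capacity, and the rule "promote when the age reaches the threshold or the empty survivor area is full" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation only: capacities are object counts and all names are hypothetical.
final class YoungGcSketch {
    static final class Obj {
        final String name;
        int age = 0;
        boolean reachable = true;
        Obj(String name) { this.name = name; }
    }

    final int survivorCapacity;  // S0 and S1 have the same capacity
    final int tenuringThreshold; // age at which an object moves to the Old area
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor area that currently holds objects
    List<Obj> to = new ArrayList<>();   // survivor area that is kept empty
    final List<Obj> old = new ArrayList<>();

    YoungGcSketch(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.reachable) continue; // garbage is simply not copied
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                old.add(o);             // old enough, or no room left: Old area
            } else {
                to.add(o);              // copied into the empty survivor area
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;           // swap roles: the filled area becomes "from"
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        YoungGcSketch heap = new YoungGcSketch(2, 3);
        Obj a = heap.allocate("a");
        Obj b = heap.allocate("b");
        b.reachable = false;
        heap.minorGc(); // a survives with age 1, b is dropped
        heap.minorGc(); // a moves to the other survivor area, age 2
        heap.minorGc(); // a reaches age 3 and is promoted
        System.out.println("old contains a: " + heap.old.contains(a)); // true
    }
}
```

Because survivors are always copied into the empty area and the other one is cleared as a whole, neither survivor area ever accumulates gaps, which is the point of keeping two of them.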

Old area

When an object in the Young area reaches the configured age, it moves to the Old area. When the Old area is full, a Full GC is triggered. If the Full GC cannot free enough space, an OutOfMemoryError is thrown.

Glossary

A lot of new terms were introduced above, and many of them go by other names as well, so they are worth knowing.

  • Garbage collection: GC for short.
  • Minor GC: GC of the young generation.
  • Major GC: GC of the old generation. A GC of the old generation usually triggers a Minor GC as well, which amounts to a Full GC.
  • Full GC: GC of the young generation and the old generation together.
  • Young area: the young generation (also called the new generation).
  • Old area: the old generation.
  • Eden area: has no common translated name and is simply called Eden.
  • Survivor area: the survival area.
  • S0 and S1: also called from and to. Note that from and to keep swapping roles; S0 and S1 must be the same size, and one of them must always be empty.

The life cycle of an object

From the description above, you should have a general picture: an object keeps moving between Eden, S0, S1 and the Old area (except, of course, short-lived objects that are reclaimed right away).

This gives the following flow chart:

[Figure: flow chart of an object's movement through Eden, S0, S1 and the Old area]

Conclusion

This article covered how a Java object is stored in the heap and, using the memory layout of Java objects, worked through a common question about object size.

It also looked at how the heap is divided and why. GC itself is not covered in depth here; for GC algorithms, garbage collectors and related topics, see my earlier articles.

Follow me, and let's learn and improve together.

From: blog.csdn.net/zwx900102/a…