A few days ago I had a technical exchange with a friend who had been asked a question in a job interview: when the JVM performs garbage collection, surviving objects are copied to different regions, for example from the young generation to the old generation. Does the address of a copied object change? I found the question very interesting, and after some in-depth research I wrote this article.

Updating references is the responsibility of the JVM

Whenever a garbage collector moves an object, updating the references to that object is a fundamental responsibility of the JVM, whatever GC algorithm is in use. Moving an object necessarily changes the address its references must resolve to, but the JVM already takes care of that for you.

As a developer, you can think of references as abstract handles to objects without worrying about how the JVM manages object storage. But if you're doing technical research and are curious about the underlying implementation, it's worth digging deeper.

When the actual address of an object changes, the JVM updates the address used by every variable that refers to the object, so the object is moved without the program noticing.

The JVM specification only defines a reference type as a reference to an object and does not prescribe how it is implemented. Different virtual machines may therefore implement references differently. Two implementations are common: handle access and direct pointer access.

Handle access

With handle access, the heap maintains a pool of handles, and an object reference holds the location of the object's handle. The handle in turn contains the real addresses of the object's instance data and type data.

The benefit of this approach is that the handle address stored in a reference is stable: when GC moves an object, only the information stored in the handle pool needs to be updated. In particular, when several variables refer to the same handle, none of them has to be rewritten, and the value each variable holds stays the same. The drawback is the extra level of indirection, which costs some access efficiency.
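The handle indirection described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the class name `HandleAccessSketch`, the array-based "heap", and the `move` and `read` helpers are all hypothetical and do not reflect how any real JVM is written.

```java
// Toy model of handle access; every name here is hypothetical.
class HandleAccessSketch {
    // Simulated heap: index = "address", value = object payload.
    static final String[] heap = new String[8];
    // Handle pool: index = handle, value = current heap address of the object.
    static final int[] handlePool = new int[4];

    // Simulated GC move: copy the object and update only its handle.
    static void move(int handle, int newAddress) {
        int oldAddress = handlePool[handle];
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = null;
        handlePool[handle] = newAddress; // the only bookkeeping update needed
    }

    // Dereference takes two hops: reference -> handle -> object.
    static String read(int reference) {
        return heap[handlePool[reference]];
    }

    public static void main(String[] args) {
        heap[1] = "bike";
        handlePool[0] = 1;
        int refA = 0; // two variables sharing the same handle
        int refB = 0;
        move(0, 5);
        // Neither variable changed, yet both still reach the moved object.
        System.out.println(refA + " " + refB + " -> " + read(refA)); // prints "0 0 -> bike"
    }
}
```

The variables keep the value `0` across the move, which is the stability the handle pool buys at the price of the extra hop in `read`.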

Direct pointer access

Direct pointer access eliminates the middle handle pool, and object references hold the object address directly.



This approach saves one pointer dereference, so access is faster. However, the addresses held in variables must also be maintained when GC moves an object, and if several variables point to the same object, each of them has to be updated. The HotSpot virtual machine is implemented this way.
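For contrast, here is a similarly simplified sketch of direct pointer access. Again, `DirectPointerSketch`, the array "heap", and the `move` helper are hypothetical names for illustration only; a real collector finds and patches references in a far more sophisticated way.

```java
// Toy model of direct pointer access; every name here is hypothetical.
class DirectPointerSketch {
    // Simulated heap: index = "address", value = object payload.
    static final String[] heap = new String[8];

    // Simulated GC move: copy the object, then patch every reference slot
    // that held the old address. Returns how many slots were patched.
    static int move(int[] refs, int oldAddress, int newAddress) {
        heap[newAddress] = heap[oldAddress];
        heap[oldAddress] = null;
        int patched = 0;
        for (int i = 0; i < refs.length; i++) {
            if (refs[i] == oldAddress) {
                refs[i] = newAddress;
                patched++;
            }
        }
        return patched;
    }

    public static void main(String[] args) {
        heap[1] = "bike";
        int[] refs = {1, 1, 3}; // two variables point at the same object
        int patched = move(refs, 1, 5);
        // Reads need a single hop, but both variables had to be rewritten.
        System.out.println(patched + " references patched, read: " + heap[refs[0]]);
        // prints "2 references patched, read: bike"
    }
}
```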

How do I view reference addresses?

Having covered how object references are implemented, can we print an object in day-to-day development to see its address? It is often said that the string printed by an object's default toString method contains the object's reference address. Let's take a look at an example:

Bike bike = new Bike();
System.out.println(bike);

When we execute the above program, the console prints the following information:

com.secbro2.others.Bike@4dc63996

What is the string after the @? Is it the address of the object? This claim has been around for a long time. Let's look at the source of the toString method in Object:

public String toString() {
    return getClass().getName() + "@" + Integer.toHexString(hashCode());
}

From the source code, we can see that the string after the @ is not the address of the object, but just the hexadecimal representation of its hashCode.
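A quick way to check this is to compare the suffix with the hash code directly. The sketch below uses a hypothetical empty `Bike` class and a hypothetical `suffix` helper; because `Bike` does not override hashCode, its hash code is also what `System.identityHashCode` returns.

```java
// Compares the text after '@' with the object's hash code in hex.
class ToStringSuffixSketch {
    // Hypothetical stand-in for the Bike class in the example above.
    static class Bike { }

    // Returns the part of the toString output after the '@'.
    static String suffix(Object o) {
        String text = o.toString();
        return text.substring(text.lastIndexOf('@') + 1);
    }

    public static void main(String[] args) {
        Bike bike = new Bike();
        // The suffix is exactly Integer.toHexString(hashCode()).
        System.out.println(suffix(bike).equals(Integer.toHexString(bike.hashCode()))); // true
        // With no hashCode override, this is the identity hash code.
        System.out.println(suffix(bike).equals(Integer.toHexString(System.identityHashCode(bike)))); // true
    }
}
```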

So how do you print the memory address of an object? We can rely on the JOL (Java Object Layout) library. Add the following Maven dependency to the project:

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.10</version>
</dependency>

Then use it in the program as follows:

import org.openjdk.jol.vm.VM;

String answer = "42";
System.out.println("The memory address is " + VM.current().addressOf(answer));

This prints something like the following:

The memory address is 31856221536

This is the actual memory address. It can be retrieved and printed, but pointer compression is handled differently by JVMs in different environments, so we should not perform any native-memory operations based on this address. Still, the output clearly shows that the toString method does not print the object's memory address.

For the same reason, identical toString output only tells us that two objects have the same hash code; it does not guarantee that the two references point to the same object.
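A small sketch makes the point concrete. The nested `Bike` class here is hypothetical and deliberately overrides hashCode with a constant, so two distinct objects print exactly the same string.

```java
// Two distinct objects whose toString output is identical.
class SameToStringSketch {
    // Hypothetical class: hashCode is overridden to return a constant.
    static class Bike {
        @Override
        public int hashCode() {
            return 0x4dc63996;
        }
    }

    public static void main(String[] args) {
        Bike first = new Bike();
        Bike second = new Bike();
        // Same class name and same hash code, so the strings match.
        System.out.println(first.toString().equals(second.toString())); // true
        // Yet the two references point to different objects.
        System.out.println(first == second); // false
    }
}
```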

Summary

A short exchange with a friend led to some digging and turned up quite a few low-level details, which shows the value of discussion and exploration. To sum up, the JVM automatically maintains references during GC, and whether the value held by a variable changes depends on whether references are implemented with a handle pool or with direct pointers. Also, when we print an object through the default toString method, the output does not contain the object's address, only the hexadecimal form of the object's hashCode.

