

This article focuses on JVM memory partitioning and execution on the stack. This content mainly involves the following three interview questions:

How does the JVM divide memory regions?

How does the JVM manage memory efficiently?

Why is Metaspace needed, and what problems does it involve?

With these three questions in mind, you can understand memory partitioning without rote memorization: by analyzing the specific scenarios you meet at work, you gain a deeper understanding of JVM memory.

First, the first question: how does the JVM divide its memory regions? This is also a very common interview question. Many students respond to it by rote memorization, which not only fails to weave the knowledge together in the interview but also makes the answer easy to forget.

Why ask about JVM memory partitioning at all? Java prides itself on automatic memory management, which makes Java programs much easier to write than C++ with its manual memory management, complex pointers, and so on.

However, this convenience comes at a price. To manage frequent requests for and releases of memory, a pool must be introduced so that the actual release of these memory areas can be deferred.

So what we call memory collection is really an operation on this pool. Let's call this pool the heap, and for the moment treat it as a single whole.

JVM Memory layout

For a program to run, it needs data. Once you have data, you need somewhere in memory to store it. Can you recall how a C++ program does this? Is it the same?

The data structure of Java programs is very rich. Here are some examples:

  • Static member variables

  • Instance member variables

  • Local variables

  • Small, compact object declarations

  • Large, complex memory requests

Where are so many different data structures stored and how do they interact with each other? Are you often asked these questions in job interviews?

Take a look at the memory layout of the JVM. As Java has evolved, the memory layout has been tweaked. For example, Java 8 and later removed the permanent generation entirely and use Metaspace instead, so -XX:PermSize and -XX:MaxPermSize are no longer valid. But in general, the more important memory areas are fixed.

JVM memory region division is shown in the figure, from which we can see:

  • The JVM heap is shared among all threads and is the largest consumer of memory.

  • The module that can execute bytecode is called an execution engine.

  • How does the execution engine recover from a thread switch? It depends on the program counter.

  • JVM memory partitioning is closely related to multithreading. The virtual machine stack our programs use at run time and the native method stack both exist per thread.

  • Native memory contains Metaspace and some direct memory.

In general, as long as you can name these key areas, the interviewer will nod with satisfaction. But if they dig deeper, it can get painful. Let's take a closer look at the process.

The virtual machine stack

What kind of data structure is a stack? Think of loading bullets into a magazine: the last bullet loaded is the first one fired, and the bullet on top plays the role of the top of the stack.

As we mentioned above, the Java virtual machine stack exists per thread. Even if you only have a main() method, it runs as a thread. Over a thread's life cycle, the data involved in computation is frequently pushed onto and popped off the stack, and the stack's life cycle is the same as the thread's.

Each piece of data in the stack is a stack frame. Each time a Java method is called, a stack frame is created and pushed onto the stack. Once the corresponding call completes, the frame is popped off. When all frames have been popped, the thread terminates. Each stack frame contains four areas:

  • Local variable table

  • Operand stack

  • Dynamic linking

  • Return address

Our applications are built by constantly manipulating these memory spaces.
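To see per-thread stack frames at work, here is a minimal sketch (the class and method names are mine, not from the article): each recursive call pushes one more frame, and when the thread's stack is exhausted the JVM throws StackOverflowError rather than allocating more from the heap.

```java
// Each call to dive() pushes a new stack frame (local variable table,
// operand stack, and so on). When the thread's stack runs out, the JVM
// raises StackOverflowError -- the heap is not involved.
public class StackDepthDemo {
    static int depth = 0;

    static void dive() {
        depth++;   // one frame per call
        dive();
    }

    public static int measureDepth() {
        depth = 0;
        try {
            dive();
        } catch (StackOverflowError expected) {
            // the stack, not the heap, was exhausted
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measureDepth());
    }
}
```

The number of frames you get before the overflow depends on the stack size configured with -Xss and on the size of each frame, so treat the printed count as machine-specific.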

The native method stack is an area very similar to the virtual machine stack, except that it serves native methods. You can even think of the virtual machine stack and the native method stack as the same area; doing so doesn't affect our understanding of the JVM.

There's a special data type called returnAddress. Because this type exists only at the bytecode level, we don't deal with it much. To the JVM, a program is a sequence of bytecode instructions stored in the method area, and a value of type returnAddress is a pointer to the memory address of a particular instruction.

There are two interesting things to mention in this section that will make your interviewer’s eyes pop.

  • There are two layers of stack here. The first layer is the stack of frames, which corresponds to methods; the second layer is the operand stack within a frame, which corresponds to operands. Be careful not to mix them up.

  • As you can see, all bytecode instructions are abstracted into pushes and pops. The execution engine simply executes them in order, and that is enough to guarantee correctness.

This is amazing and fundamental. Let’s take a look at what’s inside from a threading perspective.

Program counter

Think about it: if our program switches between threads, how do we know where each thread's execution left off?

Because a thread cannot predict when it will get a CPU time slice, there needs to be a place to record how far the thread has run, so that it can resume quickly the next time it gets a time slice.

It's like pausing what you're doing to make a cup of tea, then picking up exactly where you left off.

A program counter is a small memory space that acts as a line number indicator of the bytecode being executed by the current thread. And that’s the progress of the current thread. Here’s a picture to help you understand the process.

As you can see, a program counter is also created per thread and works with the virtual machine stack to perform computations. The program counter tracks the current execution state, including the executing instruction, jumps, branches, loops, exception handling, and so on.

Let's take a look at what's inside the program counter. Below is bytecode output produced with the javap command. You can see that in front of each opcode there is a sequence number. These are the offset addresses (shown in the red box), which you can think of as the contents of the program counter.
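For example, a trivial method like int add(int a, int b) { return a + b; } compiles to bytecode roughly like the following (the exact listing can vary slightly by compiler version, but the shape is standard; the comments are mine):

```
int add(int, int);
  Code:
     0: iload_1    // push a onto the operand stack
     1: iload_2    // push b
     2: iadd       // pop both, push a + b
     3: ireturn    // return the top of the operand stack
```

The numbers 0 through 3 are the byte offsets of each instruction; conceptually, that offset is what the program counter holds for the thread.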

The heap

The heap is the largest area of memory in the JVM, and almost all of the objects we allocate are stored there. When we talk about garbage collection, the heap is what is being operated on.

Heap space is usually reserved at startup, but not all of it is used right away.

As objects are created frequently, the heap fills up more and more, and objects that are no longer in use need to be reclaimed periodically. In Java this is called Garbage Collection (GC).

Because objects vary in size, the heap fills with many tiny fragments after running for a long time, wasting space. So it is not enough to simply destroy objects; the heap space also needs to be cleaned up, which is a very complicated process.

When an object is created, is it allocated on the heap or on the stack? That depends on two things: the object's type and where it is declared in the Java code.

Java data can be divided into primitive types and ordinary objects.

For ordinary objects, the JVM creates the object on the heap first and then references it elsewhere, for example by storing the reference in a local variable table on the virtual machine stack.

For primitive types (byte, short, int, long, float, double, char, boolean), there are two cases.

As mentioned above, each thread has its own virtual machine stack. When you declare a variable of a primitive type inside a method body, it is allocated directly on the stack. In every other case, it goes on the heap.

Note that an int[] array is allocated on the heap: arrays are not primitive types.
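A small sketch makes the split concrete (class and method names are illustrative): a primitive argument is copied into the callee's frame, so the caller's copy is untouched, while an array argument passes only a reference to an object on the heap, so mutations are visible to the caller.

```java
public class AllocationDemo {
    // a primitive parameter is a copy in the callee's stack frame
    static void bumpInt(int n)   { n = n + 1; }

    // an array lives on the heap; only the reference is copied
    static void bumpArr(int[] a) { a[0] = a[0] + 1; }

    public static int[] run() {
        int x = 1;           // primitive local: on this thread's stack
        int[] arr = {1};     // array object: on the heap
        bumpInt(x);          // caller's x is untouched
        bumpArr(arr);        // the shared heap array is mutated
        return new int[]{x, arr[0]};  // {1, 2}
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("x=" + r[0] + ", arr[0]=" + r[1]);
    }
}
```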

This is the basic memory allocation strategy of the JVM. While the heap is shared by all threads, there are data synchronization issues if multiple threads access it. This is also a big topic, so we’ll leave you with a cliffhanger.

Metaspace

For Metaspace, let's start with a very common interview question: "Why is there a Metaspace region, and what problems come with it?"

At this point, recall the difference between classes and objects. An object is a living entity that participates in the running of a program, while a class is more like a template defining a set of properties and operations. So think about it: where in the JVM is the A.class we generated earlier placed?

To answer this question, we have to mention some Java history. Prior to Java 8, information about classes was stored in a memory area called the permanent generation (PermGen). In earlier versions, even the runtime constant pool associated with String.intern lived there. This region had a size limit and could easily run out of memory, causing the JVM to crash.

The permanent generation was abolished entirely in Java 8 in favor of Metaspace. The old PermGen was on the heap; Metaspace now lives outside the heap. That's the background. For a comparison, you can look at this graph.

Then again, the upside of Metaspace is also its downside. Being off-heap, it can use the operating system's memory, and the JVM will no longer run out of memory in the method area. However, unconstrained use would eventually exhaust the operating system, so the -XX:MaxMetaspaceSize parameter is used to control its size.
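If you do want to cap it, the launch flags look like this (the values and the MyApp class name are just examples):

```shell
java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=256m MyApp
```

-XX:MetaspaceSize sets the initial threshold that triggers the first Metaspace garbage collection, while -XX:MaxMetaspaceSize is the hard ceiling; exceeding it produces java.lang.OutOfMemoryError: Metaspace.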

The method area still exists as a concept. Its physical container is Metaspace, which stores class information, the constant pool, method data, and method code.

summary

In interviews, this part of the material often raises the following two questions.

Where are the string constants we talk about stored?

Since Java 7, the string constant pool has been placed in the heap, so the strings we create are allocated on the heap.
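A quick sketch of that behavior (the class name is mine): a string literal comes from the pool, new String(...) creates a separate object on the heap, and intern() hands back the pooled instance.

```java
public class InternDemo {
    public static boolean[] run() {
        String literal = "jvm";              // pooled at class load
        String built = new String("jvm");    // a distinct heap object
        return new boolean[] {
            literal == built,           // false: different objects
            literal == built.intern(),  // true: intern() returns the pooled instance
            literal.equals(built)       // true: same character contents
        };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```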

What’s the difference between heap, non-heap, and local memory?

We can look at a picture of their relationship. To my mind, the heap is soft, loose, and elastic; non-heap is cold and hard, with memory laid out very compactly.

As you know, when the JVM runs, it requests large chunks of heap memory from the operating system for data storage. But off-heap memory, the memory the operating system has left after that request, is also partly under the JVM's control. Typical examples are methods declared with the native keyword, along with their memory allocation and processing.

On a Linux machine, using the top or ps commands, you can see that in most cases the RSS segment (actual memory footprint) is larger than the heap allocated to the JVM.

If you provision a host with 2 GB of system memory, the JVM may only be able to use 1 GB of it; that is exactly this kind of limitation.

conclusion

The runtime area of the JVM is the stack, and the storage area is the heap. Many variables are actually fixed at compile time. The bytecode in a .class file is not that hard to understand, thanks to its mnemonics; in a later article we'll look at the nature of multithreading at the bytecode level.

The runtime behavior of the JVM, like bytecode, is low-level knowledge. This article is a preliminary introduction, and some parts are not explained in depth. Hopefully you can now build a mental picture of how Java programs run, so that when later articles refer to a particular memory region, you can place it in the overall picture.