The Java Virtual Machine needs memory to breathe, sometimes more than we would like. One of its hungriest subsystems is Metaspace, the part of the JVM that holds class metadata. With JEP 387, SAP has contributed a more frugal and elastic implementation to OpenJDK.

Although “Elastic Metaspace” is relatively unknown, it is one of the biggest external contributions to this release, with the patch itself weighing in at a whopping 25 kloc.

What is a JEP?

Java (the Java Virtual Machine and the JDK) is developed under the umbrella of OpenJDK, a large open source project run by Oracle together with other companies. SAP has been a long-term contributor to the project, with our first involvement dating back to 2012.

OpenJDK development is process-driven. Ordinary enhancements go through a process called Request for Enhancement (RFE). An RFE requires patch review but usually little more, unless the patch affects compatibility.

But significant changes to the API surface of the JVM, the Java language, or the JDK exceed the scope of an RFE. They are subject to the heavier JDK Enhancement Proposal (JEP) process, which, in a nice bit of recursion, is itself defined by a JEP. A JEP requires more extensive design and code reviews, so it usually takes longer than a simple RFE. Nonetheless, JEPs are critical to ensuring the long-term quality and compatibility of the JDK.

Most JEPs come from Oracle itself, although the process is open to everyone. This may simply come down to the sheer size of Oracle's talent pool. But JEPs from outside Oracle are possible and do happen: for example, in 2019 Red Hat contributed its well-known Shenandoah GC as JEP 189.

Off-heap memory and Metaspace

The JVM can be a resource-hungry beast. The biggest memory consumer is usually the Java heap, which is fine and expected, since it holds the actual program data. Everything else is just necessary overhead: the grease needed to make the machine run.

Users are therefore sometimes surprised to find that the Java heap accounts for only a fraction of the total memory consumed by the JVM process. But the JVM has to accommodate a lot of internal data, such as:

  • Thread stacks
  • GC control structures
  • Interned strings
  • CDS archives and text segments
  • JIT-compiled code (the code cache)
  • And many, many other things

All of this data lives outside the Java heap, either in the C heap or in manually managed memory mappings. Colloquially known as off-heap (or, less correctly, native) memory, these areas combined can exceed the size of the heap itself.
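To get a rough feel for how much memory sits outside the Java heap, the standard java.lang.management API can be queried from inside a running JVM. The following is a minimal sketch; the class name OffHeapReport is made up for illustration, and the set of pools reported depends on the JVM in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative helper (not part of any JDK API): reports memory used outside the Java heap.
class OffHeapReport {

    /** Sums the used bytes of all non-heap memory pools visible through JMX. */
    static long nonHeapUsedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("%-35s %8d KB%n", "Java heap", heap.getUsed() / 1024);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                // Pool names are JVM-specific, so none are hard-coded here.
                System.out.printf("%-35s %8d KB%n", pool.getName(), usage.getUsed() / 1024);
            }
        }
    }
}
```

Note that these pools cover only part of the off-heap memory; thread stacks and GC control structures, for example, do not show up here. On HotSpot, Native Memory Tracking (started with -XX:NativeMemoryTracking=summary and queried with jcmd <pid> VM.native_memory summary) gives a fuller picture.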

One of the largest consumers of native memory within the JVM is typically Metaspace. It is therefore worthwhile to optimize its footprint, especially because it can get out of hand if the conditions are just right: before Java 16, Metaspace did not cope well with certain, perfectly valid, class-loading patterns.

Fixing this is the main purpose of JEP 387. Metaspace was introduced with Java 8 and had remained largely unchanged ever since. It was time for an overhaul.

Class metadata

Metaspace holds class metadata. What is that?

A Java class is more than just its java.lang.Class object in the heap. When the JVM loads a class, it builds a tree of structures composed primarily of the pre-digested parts of the class file. The root of the tree is a variable-sized structure called “Klass” (yes, with a capital “K”), which contains, among other things, the class's itable and vtable. In addition, the tree contains the constant pool, method metadata, annotations, bytecode, and so on. It also contains data that is not loaded from the class file but is generated purely at runtime, such as JIT-specific counters.

A Java class begins its life when it is loaded by a classloader. During class loading, the loader creates the java.lang.Class object for the class in the heap, and parses and stores the class's metadata in Metaspace. The more classes a loader loads over its lifetime, the more metadata it accumulates in Metaspace. The classloader owns all of this metadata.

A Java class is removed (unloaded) only when the classloader that loaded it has died. The Java Language Specification defines this:

“A class or interface may be unloaded if and only if its defining class loader may be reclaimed by the garbage collector.”

This rule has some interesting consequences. A java.lang.Class object holds a reference to the java.lang.ClassLoader that loaded it. Every instance holds a reference to its java.lang.Class object. Therefore, even with no other external references to it, a classloader can only be collected once all of its classes and all of their instances are collectable. Once the classloader object is unreachable, the GC removes it and unloads all of its classes. At that point, it also frees all the class metadata that the loader has accumulated over its lifetime.

We therefore have a “bulk-free” scenario: class metadata is bound to a classloader and is released in bulk when that classloader dies (for simplicity, we ignore the exceptions to this rule here).
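The bulk-free pattern can be illustrated with a toy model. This is only a sketch of the idea and not HotSpot code: the class LoaderArena, its method names, and the use of byte arrays as a stand-in for native memory are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not HotSpot code): one arena per class loader.
// Allocations are never freed individually; everything is released when the loader dies.
class LoaderArena {
    private final List<byte[]> blocks = new ArrayList<>(); // stand-in for native memory blocks
    private final int blockSize;
    private int top;        // bump pointer inside the newest block
    private long usedBytes; // total bytes handed out so far

    LoaderArena(int blockSize) {
        this.blockSize = blockSize;
        this.top = blockSize; // "full", so the first allocation fetches a block
    }

    /** Bump-pointer allocation; returns the offset inside the current block. */
    int allocate(int size) {
        if (size <= 0 || size > blockSize) {
            throw new IllegalArgumentException("unsupported size: " + size);
        }
        if (top + size > blockSize) { // current block exhausted: start a new one
            blocks.add(new byte[blockSize]);
            top = 0;
        }
        int offset = top;
        top += size;
        usedBytes += size;
        return offset;
    }

    long usedBytes() { return usedBytes; }

    int blockCount() { return blocks.size(); }

    /** Called once when the owning loader is unloaded: all metadata goes at the same time. */
    void releaseAll() {
        blocks.clear();
        top = blockSize;
        usedBytes = 0;
    }
}
```

Because nothing is freed piecemeal, allocation can be a simple pointer bump, and releasing a loader's metadata does not require visiting individual allocations.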

Before Java 8: class metadata in PermGen

Today, class metadata lives in native memory. This was not always the case: before Java 8, it lived in the Java heap, in an area called the Permanent Generation (PermGen). The GC managed it like normal Java objects, which had several disadvantages.

As part of the Java heap, the Permanent Generation had a limited size, which had to be specified in advance at VM startup. Hitting that limit usually resulted in an unrecoverable OutOfMemoryError, so users tended to oversize PermGen. That wasted memory and address space. Being part of the heap also meant that the Permanent Generation had to be a contiguous region, which could be problematic on 32-bit platforms with their limited address space.

Another issue with PermGen was the effort required to free metadata. The GC treated it like normal Java objects: entities that can die at any point in time and then be collected. But class metadata is bound to its loader, so its life cycle is predictable. The general flexibility of garbage collection was therefore unnecessary, and its cost was wasted [6].

PermGen also made life more difficult for JVM developers. Because the metadata was in the Java heap, it was not address-stable; the GC could move it around. Handling this data inside the JVM was cumbersome because references had to be resolved into physical pointers on every access. It also made debugging the JVM and analyzing core files less pleasant.

BEA, JRockit, and the Sun JVM

In 1998, students in Stockholm built an alternative Java VM, the JRockit VM, and founded Appeal Virtual Machines. Appeal was acquired by BEA Systems in 2002, and Oracle acquired BEA in 2008.

In 2010, Oracle acquired Sun Microsystems. After this second acquisition, Oracle owned two separate JVM implementations: the JRockit VM and the original Sun JVM. The JRockit JVM was discontinued and the focus shifted to the Sun JVM.

Fortunately, Sun had open-sourced its JVM prior to the acquisition. In 2007, the OpenJDK project was founded, and most of the code base was released under GPLv2. After acquiring Sun, Oracle fortunately did not reverse this decision and continued to support OpenJDK.

The JRockit VM did not keep class metadata in the heap, but in native memory. This matched the thinking within the former Sun JVM team at the time. The decision was therefore made to scrap PermGen.

Java 8 through Java 15: The first Metaspace

The first Metaspace in Java 8 was a huge improvement over PermGen. But it also brought new problems, in the form of an occasionally very high memory footprint and poor elasticity. At a high level, these new problems were caused by class metadata leaving the cosy embrace of the Java heap and rolling its own memory allocator instead. That, it turned out, has some pitfalls.

At SAP, we investigated customer issues in this area and, through that, became more involved in Metaspace development.

  • Fixed block size

First, Metaspace block management was too rigid. Blocks came in a few fixed sizes and could never be resized. This limited their potential for reuse after the death of the original loader. The free list could fill up with blocks that were locked into the wrong size and could not be reused by Metaspace.

  • Inelasticity

The first Metaspace was also inelastic: it was unable to recover from spikes in usage.

When classes are unloaded, their metadata is no longer needed. In theory, the JVM can hand the underlying pages back to the operating system. If the system is under memory pressure, the kernel can then give these free pages to whoever needs them most, which may include other areas of the JVM itself. Holding on to this memory for some possible future class loading is wasteful.

But the old Metaspace held on to most of its memory by keeping freed blocks on the free list. To be fair, there was a mechanism to return memory to the operating system by unmapping empty virtual space nodes. But this mechanism was very coarse-grained and was easily defeated by even moderate Metaspace fragmentation. Moreover, it did not apply to the class space at all.

  • High overhead per class loader

In the old Metaspace, small class loaders were disproportionately affected by high memory overhead. If the size of your loader fell into certain “sweet spot” size ranges, you paid far more than the loader actually needed. For example, a loader that allocated about 20K of metadata would consume about 80K internally, wasting 75% or more of the allocated space.

These amounts are small, but they add up quickly when dealing with swarms of small loaders. This problem mainly affected scenarios with automatically generated class loaders, such as dynamic languages implemented on top of Java.

Java 16: Metaspace, reinvented

The Metaspace code base had become unwieldy and difficult to maintain, so we decided to start from scratch and implement it cleanly. This work required a JEP because its size and the risks involved put it outside the scope of a normal RFE. It called for more careful review, more testing, and collaboration with both Oracle's runtime and GC teams.

With Java 16, JEP 387 shipped and a new Metaspace was born. It retains the basic principles of the old Metaspace architecture, with an arena allocator at its core that sits on top of its own virtual memory layer. But there are key differences.

Block geometry in the old Metaspace was rigid and inflexible. Blocks came mainly in three fairly arbitrarily spaced sizes and were difficult to merge and split. This inefficient geometry quickly led to fragmentation once class unloading started, and it was also the reason each classloader was so expensive.

The new Metaspace uses a new allocation scheme to manage blocks in memory, based on the buddy allocation algorithm [13]. The algorithm is fast and efficient, packs memory tightly, and is very good at preventing fragmentation. It manages all of this at a very low run-time cost.

The buddy allocation algorithm is very old, dating back to the 1960s. It is widely used in C heap implementations and in virtual memory management in operating systems. For example, the Linux kernel uses a variant of this algorithm to manage physical pages.

A typical buddy allocator manages blocks with power-of-two sizes. That makes it a poor choice for implementing an “end-user” allocation scheme like malloc(), because memory is wasted whenever an allocation is not an exact power of two. For the way Metaspace uses buddy allocation, this limitation does not matter: the blocks managed by the buddy allocator are not the end product of metadata allocation, but rather the coarser-grained building blocks used to implement Metaspace arenas.
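The cost of power-of-two sizing is easy to quantify. The sketch below rounds a requested size up to the next power of two and reports how many bytes the rounding wastes; the class and method names are made up for illustration and do not correspond to anything in the JVM.

```java
// Illustrative only: shows the internal fragmentation caused by power-of-two block sizes.
class PowerOfTwoWaste {

    /** Smallest power of two that is >= size (size must be between 1 and 2^30). */
    static int roundUpToPowerOfTwo(int size) {
        if (size < 1 || size > (1 << 30)) {
            throw new IllegalArgumentException("unsupported size: " + size);
        }
        int highest = Integer.highestOneBit(size); // largest power of two <= size
        return highest == size ? size : highest << 1;
    }

    /** Bytes lost if a request of the given size is served from a power-of-two block. */
    static int wastedBytes(int size) {
        return roundUpToPowerOfTwo(size) - size;
    }
}
```

A request of 1025 bytes, for instance, would occupy a 2048-byte block and waste 1023 bytes, which is why this scheme is a poor fit for serving individual small allocations directly.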

Very simplified, buddy allocation in Metaspace works like this:

  • A classloader requests space for metadata; its arena needs a new block and requests one from the block manager.
  • The block manager searches the free list for a block equal to or larger than the requested size.
  • If it finds a block larger than requested, it splits that block in half repeatedly until one fragment has the requested size.
  • It then hands one of the fragments to the requesting loader and adds the remaining fragments back to the free list.
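The allocation steps above can be sketched as follows. This is a simplified model and not the HotSpot implementation: the class name, the block sizes (1K to 16K), and the use of plain integer offsets instead of real memory are assumptions chosen to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified buddy allocation model (hypothetical sizes, offsets instead of real memory).
class BuddySplitSketch {
    static final int MIN_SIZE = 1024;      // assumed smallest block size
    static final int MAX_SIZE = 16 * 1024; // assumed size of the single root block

    // freeLists.get(i) holds the offsets of free blocks of size MIN_SIZE << i.
    private final List<Deque<Integer>> freeLists = new ArrayList<>();

    BuddySplitSketch() {
        for (int size = MIN_SIZE; size <= MAX_SIZE; size <<= 1) {
            freeLists.add(new ArrayDeque<>());
        }
        freeLists.get(freeLists.size() - 1).push(0); // start with one free root block
    }

    /** Smallest level whose block size is >= size. */
    static int levelFor(int size) {
        int level = 0;
        while ((MIN_SIZE << level) < size) {
            level++;
        }
        return level;
    }

    /** Returns the offset of a block of at least the given size, or -1 if none is free. */
    int allocate(int size) {
        if (size < 1 || size > MAX_SIZE) {
            throw new IllegalArgumentException("unsupported size: " + size);
        }
        int wanted = levelFor(size);
        int level = wanted;
        // Search the free lists for a block equal to or larger than the request.
        while (level < freeLists.size() && freeLists.get(level).isEmpty()) {
            level++;
        }
        if (level == freeLists.size()) {
            return -1;
        }
        int offset = freeLists.get(level).pop();
        // Split in half repeatedly; the upper half of each split goes back to the free list.
        while (level > wanted) {
            level--;
            freeLists.get(level).push(offset + (MIN_SIZE << level));
        }
        return offset;
    }

    /** Number of free blocks on the level that serves the given size. */
    int freeCount(int size) {
        return freeLists.get(levelFor(size)).size();
    }
}
```

Requesting 1K from a fresh 16K root block leaves one free block each of 8K, 4K, 2K, and 1K behind, which are then available for later requests without further splitting.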

Block release works in reverse order:

  • The classloader dies; its arena dies with it and returns all of its blocks to the block manager.
  • The block manager marks each block as free and checks its neighboring block (its “buddy”). If the buddy is also free, it merges the two blocks into one larger block.
  • It repeats the process recursively until it meets a buddy that is still in use, or until it reaches the maximum block size (and thus maximum defragmentation).
  • Large free blocks are then uncommitted to return memory to the operating system.
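The merging steps above can be sketched in the same simplified style. Again, this is a model under stated assumptions and not HotSpot code: block sizes range from 1K to 16K, blocks are identified by their offset within one 16K region that starts out fully in use, and the final uncommit step is not modelled. The buddy of a block is found by flipping the bit of the offset that corresponds to the block size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simplified buddy merging model (hypothetical sizes; the region starts out fully in use).
class BuddyMergeSketch {
    static final int MIN_SIZE = 1024;      // assumed smallest block size
    static final int MAX_SIZE = 16 * 1024; // assumed largest block size

    // free.get(i) holds the offsets of free blocks of size MIN_SIZE << i.
    private final List<TreeSet<Integer>> free = new ArrayList<>();

    BuddyMergeSketch() {
        for (int size = MIN_SIZE; size <= MAX_SIZE; size <<= 1) {
            free.add(new TreeSet<>());
        }
    }

    /** Frees a block, merging it with free buddies; returns the size of the resulting free block. */
    int release(int offset, int size) {
        boolean validSize = size >= MIN_SIZE && size <= MAX_SIZE && Integer.bitCount(size) == 1;
        if (!validSize || offset < 0 || offset % size != 0) {
            throw new IllegalArgumentException("bad block: offset=" + offset + " size=" + size);
        }
        int level = Integer.numberOfTrailingZeros(size / MIN_SIZE);
        while (size < MAX_SIZE) {
            int buddy = offset ^ size; // the neighbor that forms a larger block with this one
            if (!free.get(level).remove(buddy)) {
                break; // buddy is still in use (or split further): stop merging
            }
            offset = Math.min(offset, buddy); // the merged block starts at the lower offset
            size <<= 1;
            level++;
        }
        free.get(level).add(offset);
        return size;
    }

    /** True if a free block of exactly this size is recorded at this offset. */
    boolean isFree(int offset, int size) {
        int level = Integer.numberOfTrailingZeros(size / MIN_SIZE);
        return level < free.size() && free.get(level).contains(offset);
    }
}
```

Freeing the two 1K blocks at offsets 0 and 1024 yields one free 2K block; freeing the remaining blocks of the region eventually crystallizes everything back into a single 16K block.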

Like a self-healing ice sheet, blocks splinter as they are allocated and crystallize back into larger units as they are freed. It is an excellent way to prevent fragmentation even if this process repeats endlessly, for example in a JVM that loads and unloads large numbers of classes over its lifetime.

Now, “Talk is cheap, show me the code.” The code is there. So how does the new Metaspace shape up against the old one?

Advantages:

  • Better elasticity
  • Less overhead per class loader
  • Smaller Metaspace footprint for small Java programs

The new Metaspace in Java 16 saves memory; how much depends on the scenario. Elasticity and reduced fragmentation benefit large applications with long uptimes. The lower overhead per class loader helps in scenarios with many fine-grained loaders. The new Metaspace code base is also cleaner and simpler, which reduces our maintenance costs and makes future enhancements easier.