Performance Considerations

Garbage collection performance has two primary measures. Throughput is the percentage of total time not spent in garbage collection, considered over long periods of time. Throughput includes the time spent in allocation (although tuning for allocation speed is generally not needed). Pauses are the times when an application appears unresponsive because garbage collection is taking place.
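As a rough illustration of these two measures, the sketch below derives throughput and the longest pause from a list of pause durations and a total elapsed time. The class and method names are hypothetical, and the numbers in main are invented for illustration only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: derives the two headline GC measures from raw timings.
final class GcMetrics {
    // Throughput = percentage of total time NOT spent in garbage collection.
    static double throughputPercent(List<Double> pauseSecs, double totalSecs) {
        double gcSecs = 0.0;
        for (double p : pauseSecs) {
            gcSecs += p;
        }
        return 100.0 * (totalSecs - gcSecs) / totalSecs;
    }

    // Longest single pause, the figure an interactive program cares about.
    static double maxPauseSecs(List<Double> pauseSecs) {
        double max = 0.0;
        for (double p : pauseSecs) {
            max = Math.max(max, p);
        }
        return max;
    }

    public static void main(String[] args) {
        List<Double> pauses = Arrays.asList(0.25, 0.25, 1.5); // illustrative values
        System.out.println("throughput %: " + throughputPercent(pauses, 100.0));
        System.out.println("longest pause s: " + maxPauseSecs(pauses));
    }
}
```

Two runs with the same throughput can have very different longest pauses, which is why the two measures are reported separately.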

Users have different requirements for garbage collection. For example, some people argue that the right metric for a Web server is throughput, because pauses during garbage collection may be tolerable or simply masked by network latency. However, in interactive graphics programs, even brief pauses can have a negative impact on the user experience.

Some users are sensitive to other considerations. Footprint is the working set of a process, measured in pages and cache lines. On systems with limited physical memory or many processes, footprint can dictate scalability. Promptness is the time between when an object becomes dead and when its memory becomes available, which is an important consideration in distributed systems, including remote method invocation (RMI).

In general, a particular choice of generation sizes is a trade-off between these considerations. For example, a very large young generation may maximize throughput, but does so at the expense of footprint, promptness, and pause times. Young generation pauses can be minimized by using a small young generation, at the expense of throughput. To a first approximation, the size of one generation does not affect the collection frequency and pause times of the other generation.

There is no one right way to size generations. The best choice depends on how the application uses memory and on what the user requires. For this reason, the virtual machine’s default choice of garbage collector is not always optimal and can be overridden by the user with command-line options.

 

Garbage collector, heap, and runtime compiler

In J2SE platform version 1.4.2, the following selections were made by default:

  • Serial garbage collector
  • Initial heap size of 4 MB
  • Maximum heap size of 64 MB
  • Client runtime compiler

In J2SE platform version 5.0, a class of machine referred to as a server-class machine has been defined as a machine with:

  • Two or more physical processors
  • 2 GB or more of physical memory

On a server-class machine, the following are selected by default:

  • Throughput garbage collector
  • Initial heap size of 1/64 of physical memory, up to 1 GB
  • Maximum heap size of 1/4 of physical memory, up to 1 GB
  • Server runtime compiler
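The server-class test and the default heap sizes above amount to a few lines of arithmetic. The sketch below is only an illustration of those rules: the class and method names are hypothetical, it is not the virtual machine's own code, and it ignores any platform-specific exceptions.

```java
// Sketch of the server-class rules described above (hypothetical names).
final class ServerClassDefaults {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Server-class: at least 2 physical processors and at least 2 GB of memory.
    static boolean isServerClass(int physicalProcessors, long physicalMemoryBytes) {
        return physicalProcessors >= 2 && physicalMemoryBytes >= 2 * GB;
    }

    // Initial heap: 1/64 of physical memory, capped at 1 GB.
    static long defaultInitialHeap(long physicalMemoryBytes) {
        return Math.min(physicalMemoryBytes / 64, GB);
    }

    // Maximum heap: 1/4 of physical memory, capped at 1 GB.
    static long defaultMaxHeap(long physicalMemoryBytes) {
        return Math.min(physicalMemoryBytes / 4, GB);
    }

    public static void main(String[] args) {
        long mem = 8 * GB; // illustrative machine with 8 GB of physical memory
        System.out.println("server-class: " + isServerClass(4, mem));
        System.out.println("initial heap MB: " + defaultInitialHeap(mem) / MB);
        System.out.println("max heap MB: " + defaultMaxHeap(mem) / MB);
    }
}
```

On the illustrative 8 GB machine, a quarter of physical memory exceeds the cap, so the default maximum heap stays at 1 GB.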

 

Behavior-based tuning

Prior to J2SE platform version 5.0, tuning garbage collection consisted mainly of specifying the size of the entire heap and possibly the sizes of the generations within it. Other controls include the size of the survivor spaces in the young generation and the threshold for promotion from the young generation to the old generation. Tuning required a series of experiments with different values for these parameters, and the use of specialized tools or simply good judgment to decide when garbage collection was performing well.

In version 5.0, two parameters based on the desired behavior of the application were introduced:

  • Maximum pause time goal
  • Application throughput goal

Garbage collection can be tuned by setting these goals. It should be emphasized that the goals cannot always be met. The application needs a heap large enough to hold at least all of its live data, and a minimum heap size may also prevent the goals from being reached.

 

Maximum pause time goal

  • The pause time is the period during which the garbage collector stops the application and recovers space that is no longer in use. The intent of the maximum pause time goal is to limit the longest of these pauses. The garbage collector maintains an average pause time and a variance on that average. The average is taken from the start of execution, but is weighted so that more recent pauses count more heavily. If the average plus the variance of the pause times is greater than the maximum pause time goal, the garbage collector considers the goal unmet.
  • The maximum pause time goal is specified with the command-line flag

-XX:MaxGCPauseMillis=<nnn>

  • This is interpreted as a hint to the garbage collector that pauses of <nnn> milliseconds or less are desired. The garbage collector adjusts the Java heap size and other garbage collection parameters in an attempt to keep pauses shorter than <nnn> milliseconds. By default, there is no maximum pause time goal. These adjustments can cause garbage collections to occur more frequently, reducing the overall throughput of the application. In some cases, the desired pause time goal cannot be met.
  • The throughput collector is a generational collector, so the young and old generations are collected separately. An average and a variance are kept for each generation. The maximum pause time goal is applied separately to the average plus the variance of each generation's collections, and each generation may individually fail to meet the goal.
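The check described above can be sketched as a running average and variance that are compared against the goal. The exponential weighting and the ALPHA value below are assumptions made for illustration; the text says only that recent pauses count more heavily, and this is not the collector's actual implementation. The class name is hypothetical.

```java
// Sketch of a pause-goal check with a decaying average and variance.
final class PauseGoalTracker {
    private static final double ALPHA = 0.25; // weight of the newest pause (assumed)
    private double mean;
    private double variance;
    private boolean seeded;

    // Fold one observed pause (in milliseconds) into the running statistics.
    void record(double pauseMillis) {
        if (!seeded) {
            mean = pauseMillis;
            variance = 0.0;
            seeded = true;
            return;
        }
        double delta = pauseMillis - mean;
        mean += ALPHA * delta;
        variance = (1.0 - ALPHA) * (variance + ALPHA * delta * delta);
    }

    // Following the text literally: the goal is unmet when
    // average + variance is greater than the maximum pause time goal.
    boolean goalMet(double maxPauseMillis) {
        return mean + variance <= maxPauseMillis;
    }

    public static void main(String[] args) {
        PauseGoalTracker young = new PauseGoalTracker(); // one tracker per generation
        young.record(10);
        young.record(10);
        System.out.println("goal met before spike: " + young.goalMet(50));
        young.record(100); // a single long pause
        System.out.println("goal met after spike: " + young.goalMet(50));
    }
}
```

Because the variance term reacts strongly to outliers, one unusually long pause can mark the goal as unmet even when the average is still well below it. A separate tracker per generation mirrors the per-generation bookkeeping described above.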

 

Throughput goal

  • The throughput goal is measured in terms of the time spent collecting garbage and the time spent outside of garbage collection (called application time). The goal is specified with the command-line flag

-XX:GCTimeRatio=<nnn>

  • The ratio of garbage collection time to application time is then 1:<nnn>, so the fraction of total time spent in garbage collection is

1 / (1 + <nnn>)

For example, -XX:GCTimeRatio=19 sets a goal of 1/20, or 5%, of total time spent in garbage collection.
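The arithmetic behind the flag can be checked with a small sketch. The class and method names are hypothetical; only the formula 1 / (1 + <nnn>) comes from the text.

```java
// Sketch of the GCTimeRatio arithmetic (hypothetical helper).
final class GcTimeRatio {
    // Fraction of total time allowed for garbage collection at a given flag value.
    static double gcTimeFraction(int ratio) {
        return 1.0 / (1.0 + ratio);
    }

    // Inverse: the flag value that corresponds to a desired GC time percentage.
    static long ratioForPercent(double gcPercent) {
        return Math.round(100.0 / gcPercent - 1.0);
    }

    public static void main(String[] args) {
        System.out.println("fraction at 19: " + gcTimeFraction(19));
        System.out.println("flag value for 5%: " + ratioForPercent(5.0));
    }
}
```

The inverse function is handy when the requirement is stated as a percentage, as in the 5% example above, rather than as a ratio.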

 

  • The time spent in garbage collection is the combined time of the young generation and old generation collections. If the throughput goal is not being met, the sizes of the generations are increased in an effort to increase the time the application can run between collections.

 

Footprint goal

  • If the throughput and maximum pause time goals have both been met, the garbage collector reduces the size of the heap until one of the goals (invariably the throughput goal) can no longer be met. The goal that is not being met is then addressed.
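The interplay of the three goals can be pictured as a small decision function. This is a sketch under stated assumptions: the text says only that the footprint goal is pursued once the other two are met, so checking the pause goal before the throughput goal is an assumption here, and the class, enum, and method names are hypothetical.

```java
// Sketch of how the three goals might be ordered (assumed priority).
final class GoalPolicy {
    enum Action { SHRINK_FOR_PAUSE, GROW_FOR_THROUGHPUT, SHRINK_FOR_FOOTPRINT }

    static Action next(boolean pauseGoalMet, boolean throughputGoalMet) {
        if (!pauseGoalMet) {
            return Action.SHRINK_FOR_PAUSE;      // assumed to be checked first
        }
        if (!throughputGoalMet) {
            return Action.GROW_FOR_THROUGHPUT;   // larger generations, fewer collections
        }
        return Action.SHRINK_FOR_FOOTPRINT;      // both goals met: reduce the heap
    }

    public static void main(String[] args) {
        System.out.println(next(true, true));
        System.out.println(next(true, false));
    }
}
```

Repeatedly applying the last branch until the throughput goal fails, and then the second branch until it is met again, produces the kind of back-and-forth heap sizing the text describes.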

 

Tuning strategy

  • Do not choose a maximum heap size unless you know that you need a heap larger than the default maximum. Choose a throughput goal that is sufficient for your application.
  • The heap grows or shrinks to a size that supports the chosen throughput goal. Expect some oscillation in heap size during initialization and when the application's behavior changes.
  • If the heap grows to its maximum size, in most cases this means the throughput goal cannot be met within that maximum. Set the maximum heap size to a value close to the total physical memory on the platform, but not so large that the application starts to swap. Run the application again. If the throughput goal is still not met, the application time goal is too high for the memory available on the platform.
  • If the throughput goal can be met but pauses are too long, select a maximum pause time goal. Doing so may mean that the throughput goal is no longer met, so choose values that are an acceptable compromise for the application.

 

The size of the heap often oscillates as the garbage collector tries to satisfy competing goals, even when the application has reached a steady state. The pressure to meet the throughput goal (which may require a larger heap) competes with the goals for maximum pause time and minimum footprint (both of which may require a smaller heap).

 

Other changes

Thread-local allocation buffers

Thread-local allocation buffers are resized dynamically. When a thread-local allocation buffer fills up, the size of its replacement depends on the allocation pattern of the thread: threads that allocate more memory get larger buffers.

 

Measurement and Method

Throughput and footprint are best measured with metrics specific to the application. For example, the throughput of a web server can be tested with a client load generator, while the footprint of the server can be measured on the Solaris operating system with the pmap command. Pauses due to garbage collection, on the other hand, are easily estimated by inspecting the diagnostic output of the virtual machine itself.
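As a complement to reading the diagnostic output, an application can also query collection times from inside the process. The sketch below uses the standard java.lang.management API, which this guide does not itself describe; the class name is hypothetical, and the reported times are whatever each collector chooses to report, so the result should be treated as a rough estimate only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: estimate GC overhead from inside the running application.
final class GcTimeProbe {
    // Total milliseconds the collectors report having spent so far.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Percentage of virtual machine uptime not attributed to garbage collection.
    static double throughputPercent() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 100.0;
        }
        return 100.0 * (uptime - totalGcMillis()) / uptime;
    }

    public static void main(String[] args) {
        System.out.println("GC ms so far: " + totalGcMillis());
        System.out.println("approx. throughput %: " + throughputPercent());
    }
}
```

This gives an aggregate figure only; individual pause lengths still have to come from the diagnostic output.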

The command-line argument -verbose:gc prints information at every collection. Note that the format of the -verbose:gc output may change between releases of the J2SE platform. For example, here is output from a large server application:

[GC 325407K->83000K(776768K), 0.2300771 secs]
[GC ..., 0.2454258 secs]
[Full GC 267628K->83769K, 1.8479984 secs]

Here we see two minor collections and one major collection. The numbers before and after the arrow

325407K->83000K (in the first line)

indicate the combined size of live objects before and after garbage collection, respectively. After minor collections, the count includes objects that are not necessarily alive but cannot be reclaimed, either because they are directly alive or because they are within or referenced from the tenured generation. The number in parentheses

(776768K) (in the first line)

is the total available space, not counting the space in the permanent generation; it is the total heap minus one of the survivor spaces. The minor collection took about a quarter of a second.

0.2300771 secs (in the first line)
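The fields just described can be pulled out of a log line with a regular expression. This is a minimal sketch that assumes the line layout shown in the example; because the format may change between releases, the pattern is illustrative rather than robust. The class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: parse one -verbose:gc line of the form
// "[GC 325407K->83000K(776768K), 0.2300771 secs]".
final class VerboseGcLine {
    // Whitespace is optional so lightly reformatted logs still parse.
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC)\\s+(\\d+)K\\s*->\\s*(\\d+)K\\s*\\((\\d+)K\\),\\s*([\\d.]+)\\s*secs\\]");

    final boolean full;        // true for a major ("Full GC") collection
    final long beforeK;        // occupancy before the collection
    final long afterK;         // occupancy after the collection
    final long capacityK;      // total available space
    final double seconds;      // duration of the collection

    private VerboseGcLine(Matcher m) {
        full = m.group(1).equals("Full GC");
        beforeK = Long.parseLong(m.group(2));
        afterK = Long.parseLong(m.group(3));
        capacityK = Long.parseLong(m.group(4));
        seconds = Double.parseDouble(m.group(5));
    }

    // Returns null when the line does not match the expected layout.
    static VerboseGcLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        return m.matches() ? new VerboseGcLine(m) : null;
    }

    public static void main(String[] args) {
        VerboseGcLine g = parse("[GC 325407K->83000K(776768K), 0.2300771 secs]");
        System.out.println((g.beforeK - g.afterK) + "K reclaimed in " + g.seconds + " s");
    }
}
```

Lines that omit a field, or that come from a release with a different layout, simply yield null, which lets a caller skip them instead of failing.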

The format of the major collection in the third line is similar. The flag -XX:+PrintGCDetails prints additional information about the collections. The additional information printed with this flag is liable to change with each version of the virtual machine, since it follows the needs of Java virtual machine development. An example of -XX:+PrintGCDetails output from J2SE platform version 1.5 using the serial garbage collector is shown here.

[GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]

This shows that the minor collection recovered about 98% of the young generation

DefNew: 64575K->959K(64576K)

and took about 46 milliseconds.

0.0457646 secs

The usage of the entire heap was reduced to about 51%

196016K->133633K(261184K)

and, as the final time shows, the collection had some slight additional overhead over and above the collection of the young generation:

0.0459067 secs
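The percentages quoted above follow directly from the numbers in the log line. The sketch below reproduces the arithmetic; the class and method names are hypothetical.

```java
// Sketch of the percentage arithmetic behind the figures quoted above.
final class HeapMath {
    // Share of the pre-collection occupancy that the collection recovered.
    static double reclaimedPercent(long beforeK, long afterK) {
        return 100.0 * (beforeK - afterK) / beforeK;
    }

    // Occupancy after the collection, relative to the available capacity.
    static double occupancyPercent(long afterK, long capacityK) {
        return 100.0 * afterK / capacityK;
    }

    public static void main(String[] args) {
        // Young generation: 64575K->959K
        System.out.println("young reclaimed %: " + reclaimedPercent(64575, 959));
        // Entire heap: 196016K->133633K(261184K)
        System.out.println("heap occupancy %: " + occupancyPercent(133633, 261184));
        // Overhead beyond the young generation collection, in seconds
        System.out.println("extra overhead s: " + (0.0459067 - 0.0457646));
    }
}
```

The same two functions apply to any generation reported in the detailed output, since every segment uses the before, after, and capacity layout.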

The flag -XX:+PrintGCTimeStamps additionally prints a time stamp at the start of each collection.

111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs] 111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]

The collection starts about 111 seconds into the execution of the application, and the minor collection starts at about the same time. In addition, information is shown for a major collection, delineated by Tenured. The usage of the tenured generation was reduced to about 10%

18154K->2311K(24576K)

and the collection took about 0.13 seconds.

0.1290354 secs
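The per-generation segments in such a line can be extracted by name. This is a sketch that assumes the layout of the example line above; the detailed output format varies between releases, so the pattern is illustrative only, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: pull named generation segments out of a detailed GC log line.
final class GenerationSegments {
    // Matches segments such as "Tenured: 18154K->2311K(24576K), 0.1290354 secs".
    private static final Pattern SEGMENT = Pattern.compile(
        "(\\w+):\\s*(\\d+)K\\s*->\\s*(\\d+)K\\s*\\((\\d+)K\\),\\s*([\\d.]+)\\s*secs");

    // Maps each generation name to {beforeK, afterK, capacityK}, in log order.
    static Map<String, long[]> parse(String line) {
        Map<String, long[]> out = new LinkedHashMap<String, long[]>();
        Matcher m = SEGMENT.matcher(line);
        while (m.find()) {
            out.put(m.group(1), new long[] {
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4))
            });
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs] "
            + "111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] "
            + "26282K->2311K(32704K), 0.1293306 secs]";
        long[] tenured = parse(line).get("Tenured");
        System.out.println("tenured occupancy %: " + 100.0 * tenured[1] / tenured[2]);
    }
}
```

The time stamps and the whole-heap totals are skipped because they are not preceded by a generation name, so only the DefNew and Tenured segments are returned for the example line.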