This article focuses on the JVM memory structure, along with simple tuning scenarios.

Preface

The JVM series is advanced Java content, which I had planned to study after building up basic Java knowledge for a while. However, because I need to follow up on production problems at work, I urgently need to fill in this knowledge. This article is mainly a brief record of what I learned, followed by a record of the problem analysis.

The JVM problem

I recently transferred to the personnel department of Xiaomi and found that two production machines were running GC at a very high frequency. The following is the GC log:

Here’s what you need to know if you want to understand this GC log.

Garbage collection algorithm

How do I determine if an object is dead?

In general, there are two ways to determine if an object has been destroyed:

  • Reference counting algorithm: adds a reference counter to each object. The counter increases by 1 each time the object is referenced somewhere and decreases by 1 each time a reference becomes invalid. When the counter is 0, the object is no longer referenced.
  • Reachability analysis algorithm: searches along reference chains starting from a set of root nodes called “GC Roots”. Objects on a reference chain are not collected; objects that cannot be reached from any GC Root are considered dead.

As shown in the figure above, green objects that are on GC Roots’ reference chain are not collected by the garbage collector, and gray objects that are not on GC Roots’ reference chain are considered recyclable.
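The reachability analysis described above can be sketched as a graph traversal. This is only a minimal illustration, not how HotSpot implements it: the class name `ReachabilitySketch`, the string object ids and the `demoGraph` object graph are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {

    /** Returns every object reachable from the GC roots by following references. */
    static Set<String> reachable(Map<String, List<String>> references, List<String> gcRoots) {
        Set<String> alive = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (alive.add(current)) {
                // First visit: queue everything this object references.
                pending.addAll(references.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return alive;
    }

    /** Hypothetical object graph: A and B hang off a root, C and D only reference each other. */
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> references = new HashMap<>();
        references.put("root", Arrays.asList("A"));
        references.put("A", Arrays.asList("B"));
        references.put("C", Arrays.asList("D"));
        references.put("D", Arrays.asList("C"));
        return references;
    }

    public static void main(String[] args) {
        Set<String> alive = reachable(demoGraph(), Arrays.asList("root"));
        System.out.println("B alive: " + alive.contains("B"));        // reachable via root -> A -> B
        System.out.println("C collectible: " + !alive.contains("C")); // no path from any root
    }
}
```

In this toy graph, C and D still reference each other, so a plain reference counter would never reach 0 for them, yet neither is reachable from a root and both count as recyclable.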

So, what are GC Roots? The following objects can serve as GC Roots:

  • Objects referenced in the Java virtual machine stack, such as the parameters, local variables, and temporary variables of the methods called by each thread.
  • Objects referenced by class static fields in the method area, such as static variables of a reference type.
  • Objects referenced by constants in the method area.
  • Objects referenced in the native method stack.
  • References internal to the Java virtual machine, such as the Class objects of primitive types and some resident exception objects.
  • Objects held as locks by synchronized.

Garbage collection algorithm

Mark-sweep algorithm

As the name suggests, the mark-sweep algorithm marks invalid objects and then clears them.

With the mark-sweep algorithm, you can clearly see that after garbage collection the heap is left with many scattered fragments. When memory is allocated for a large object, a sufficiently large contiguous block may not be found, and garbage collection has to be triggered again. In addition, if the Java heap contains a large number of garbage objects, the collector has to do a lot of marking and sweeping, which inevitably lowers collection efficiency.
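The fragmentation problem can be illustrated with a toy heap of equal-sized slots. This is a minimal sketch under that simplifying assumption; `MarkSweepSketch` and its methods are hypothetical names, and the mark phase is taken as given input.

```java
import java.util.Arrays;

class MarkSweepSketch {

    /** Sweep phase: frees every slot that the mark phase did not flag as live. */
    static boolean[] sweep(boolean[] occupied, boolean[] marked) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && marked[i]; // live objects stay where they are
        }
        return after;
    }

    /** Total number of free slots. */
    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean used : occupied) {
            if (!used) free++;
        }
        return free;
    }

    /** Longest run of adjacent free slots, i.e. the largest object that still fits. */
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean used : occupied) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true); // a completely full 8-slot heap
        boolean[] marked = {true, false, true, false, true, false, true, false};
        boolean[] after = sweep(occupied, marked);
        System.out.println("free slots: " + freeSlots(after));
        System.out.println("largest contiguous run: " + largestFreeRun(after));
    }
}
```

With every second object dead, half of the toy heap is free after the sweep, but no two free slots are adjacent, so an object needing two slots cannot be placed.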

Mark-copy algorithm

The mark-copy algorithm splits the Java heap into two halves and uses only one of them at a time; during garbage collection, all surviving objects are moved to the other half.

An obvious disadvantage of the mark-copy algorithm is that only half of the heap is usable at a time, which lowers Java heap utilization.
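The copy step can be sketched as follows, assuming a toy half-heap of named slots where `null` marks an empty slot. `CopyingSketch` and `copyLive` are hypothetical names, and liveness is supplied as input rather than computed.

```java
import java.util.Arrays;

class CopyingSketch {

    /**
     * Copies the live objects of the active half into the idle half, packed from index 0.
     * The returned array plays the role of the other half of the heap.
     */
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[next++] = fromSpace[i]; // survivors end up contiguous
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"a", "b", "c", "d"};
        boolean[] live = {true, false, false, true};
        System.out.println(Arrays.toString(copyLive(fromSpace, live))); // [a, d, null, null]
    }
}
```

The survivors land next to each other, so there is no fragmentation, at the price of keeping a second array of the same size idle.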

Mark-compact algorithm

The mark-compact algorithm is a compromise: it marks objects in the same way as the previous two algorithms. After marking, however, the live objects are moved to one end of the heap, and the area beyond them is cleared directly. This avoids memory fragmentation and wastes no heap space. However, every collection suspends all user threads, and collecting old-generation objects takes especially long, which is very bad for the user experience.
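The compaction step can be sketched on the same kind of toy heap, this time in place. The names `MarkCompactSketch` and `compact` are hypothetical, and the mark result is passed in as an assumption.

```java
import java.util.Arrays;

class MarkCompactSketch {

    /**
     * Slides live objects to the start of the heap in place and clears everything after them.
     * Returns the index of the first free slot after compaction.
     */
    static int compact(String[] heap, boolean[] live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[next++] = heap[i]; // next never overtakes i, so nothing live is overwritten
            }
        }
        for (int i = next; i < heap.length; i++) {
            heap[i] = null; // the area beyond the live objects is cleared directly
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        boolean[] live = {true, false, true, false, false, true};
        int firstFree = compact(heap, live);
        System.out.println(Arrays.toString(heap) + ", first free slot: " + firstFree);
    }
}
```

Unlike the copying sketch, no second space is needed, but every live object may have to move, which is part of why the pause grows with the amount of live data.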

Garbage collector

Garbage collectors in the HotSpot VM and their applicable scenarios:

The memory model

Antimemory model

The memory model has several important points:

  • JVM memory is divided into heap memory and non-heap memory. Heap memory is further divided into the young generation and the old generation. Non-heap memory is the permanent generation (replaced by Metaspace since JDK 8).
  • The default ratio of Young to Old is 1:2 (-XX:NewRatio=2), so the young generation takes up one third of the heap.
  • The young generation is divided into Eden and Survivor, and Survivor is divided into From and To. The default ratio of Eden, From and To is 8:1:1.
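The way these ratios split a heap can be worked through in a few lines. This is a rough sketch, assuming that NewRatio means old divided by young and SurvivorRatio means Eden divided by one Survivor space. The 3000 MB heap is a made-up example, and a real JVM also aligns and may adapt these sizes, so actual numbers can differ slightly.

```java
class HeapLayoutSketch {

    /** Young generation size, where newRatio = old / young (as in -XX:NewRatio). */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Eden size, where survivorRatio = eden / one survivor space (as in -XX:SurvivorRatio). */
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    /** Size of each of the two survivor spaces. */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMb = 3000;     // hypothetical heap size
        int newRatio = 2;       // young:old = 1:2
        int survivorRatio = 8;  // Eden:From:To = 8:1:1
        long young = youngMb(heapMb, newRatio);
        System.out.println("young = " + young + "M, old = " + (heapMb - young) + "M");
        System.out.println("eden = " + edenMb(young, survivorRatio)
                + "M, each survivor = " + survivorMb(young, survivorRatio) + "M");
    }
}
```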

GC type

  • Minor GC/Young GC: garbage collection of the new generation;
  • Major GC/Old GC: garbage collection of the old generation;
  • Full GC: garbage collection of the entire Java heap and the method area.

How Minor GC works

Typically, newly created objects are stored in the Eden region of the new generation. When the first Minor GC is triggered, the surviving objects in Eden are moved to one of the Survivor regions. On the next Minor GC, the surviving objects in Eden, together with those in that Survivor region, are moved to the other Survivor region. As you can see, only one of the two Survivor regions is in use at any time, so only one Survivor region's worth of space is wasted.

Two points to note:

  • Each time an object survives a garbage collection, its generational age increases by 1; when the age reaches 15, the object is promoted to the old generation.
  • If Eden does not have enough space when memory is allocated for a large object, the large object goes directly to the old generation.
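The aging rule in the first point can be simulated in a few lines. This is a toy model only: `MinorGcSketch`, `Obj` and the fixed threshold constant are hypothetical, and Eden and the two Survivor regions are collapsed into a single "young" list.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcSketch {
    // Promotion age used in the text; treated here as a fixed constant.
    static final int TENURING_THRESHOLD = 15;

    /** A simulated object: its GC age and whether something still references it. */
    static class Obj {
        int age;
        boolean referenced;
        Obj(boolean referenced) { this.referenced = referenced; }
    }

    /**
     * One simulated Minor GC: unreferenced objects are dropped, survivors age by one,
     * and survivors that reach the threshold are appended to the old generation.
     * Returns the objects that remain in the young generation.
     */
    static List<Obj> minorGc(List<Obj> young, List<Obj> old) {
        List<Obj> stillYoung = new ArrayList<>();
        for (Obj o : young) {
            if (!o.referenced) continue; // garbage: not copied anywhere
            o.age++;
            if (o.age >= TENURING_THRESHOLD) old.add(o);
            else stillYoung.add(o);
        }
        return stillYoung;
    }

    /** Counts the Minor GCs needed until a permanently referenced object is promoted. */
    static int gcsUntilPromotion() {
        List<Obj> young = new ArrayList<>();
        List<Obj> old = new ArrayList<>();
        young.add(new Obj(true));
        int gcs = 0;
        while (old.isEmpty()) {
            young = minorGc(young, old);
            gcs++;
        }
        return gcs;
    }

    public static void main(String[] args) {
        System.out.println("promoted after " + gcsUntilPromotion() + " Minor GCs");
    }
}
```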

How Full GC works

The old generation stores long-lived objects. When it fills up, a Full GC is triggered, which is the kind of GC people hear about most. During a Full GC, all threads are stopped until the GC completes. For applications with strict response-time requirements, Full GC should be minimized to avoid response timeouts.

A few points to note:

  • Full GC takes much longer than Minor GC and normally occurs far less often; if it occurs frequently, there is a performance problem.
  • The mark-sweep algorithm generates a large amount of memory fragmentation; if sufficient contiguous memory cannot be found later for a large object, a GC is triggered early.

Both Minor GC and Full GC produce pauses, known as stop-the-world. Minor GC pauses are short, while Full GC pauses are long and leave the system unresponsive, which greatly affects system performance. Therefore, monitoring and analyzing Full GC logs is extremely important in performance tuning.
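Besides reading logs, collection counts and accumulated GC time can also be read from inside the process through the standard `java.lang.management` API. The sketch below is one possible way to do that and is not part of the original analysis. The class name `GcStatsSketch` is hypothetical, and the collector names it prints depend on which garbage collector the JVM runs with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStatsSketch {

    /** One line per collector: name, number of collections so far, accumulated time in ms. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```

Sampling this summary periodically and comparing consecutive values gives a rough view of how often each collector runs and how much time it costs.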

The GC log

Enabling GC Logs

To save effort, here is content taken directly from online sources:

Understanding GC Logs

Minor GC logs:

Full GC log:

Common parameters for the JVM

There are also some printing and CMS parameters, which I will not list here.

GC log analysis and optimization

Online machine configuration:

  • 16 GB of memory
  • 4-core CPU

Before optimization

Back to our original screenshot:

Through analysis and calculation, the following data can be obtained:

  • Old generation: 5870976/(1024*1024) = 5.6G
  • New generation: 546176/1024 = 533M
  • Eden: 273152/1024 = 266M
  • From: 273024/1024 = 266M
  • To: 273024/1024 = 266M

The following conclusions are drawn:

  • New generation + Old generation = 5.6 + 533/1024 = 6.1G
  • New generation : Old generation = 533 : (5.6*1024) = 1:10.7
  • Eden:From:To = 1:1:1
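The arithmetic above can be reproduced with a short program. It assumes the raw values from the log are in KB, which is what dividing by 1024 to get megabytes implies. `GcLogMathSketch` and its helpers are hypothetical names.

```java
import java.util.Locale;

class GcLogMathSketch {

    static double kbToMb(long kb) { return kb / 1024.0; }

    static double kbToGb(long kb) { return kb / (1024.0 * 1024.0); }

    public static void main(String[] args) {
        // Raw sizes taken from the figures listed above (assumed to be KB).
        long oldKb = 5870976, newKb = 546176, edenKb = 273152, fromKb = 273024;

        System.out.println(String.format(Locale.ROOT, "old generation: %.1f G", kbToGb(oldKb)));
        System.out.println(String.format(Locale.ROOT, "new generation: %.1f M", kbToMb(newKb)));
        System.out.println(String.format(Locale.ROOT, "heap total:     %.1f G", kbToGb(oldKb + newKb)));
        System.out.println(String.format(Locale.ROOT, "new:old = 1:%.1f", (double) oldKb / newKb));

        // The reported new generation size equals Eden plus a single survivor space.
        System.out.println("eden + from == new: " + (edenKb + fromKb == newKb));
    }
}
```

The last line shows that the 533M figure for the new generation is Eden plus one Survivor space, which is consistent with only one Survivor space being in use at a time.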

Let’s look at the online configuration again:

Verify the calculation result using the configuration:

  • “-Xmx6000m -Xms6000m” sets the JVM heap size to 6000/1024 = 5.8G, which roughly matches the 6.1G heap size calculated earlier (the excess may belong to the permanent generation).
  • “-Xmn800m” sets the new generation to 800M, and Eden + From + To is 798M, which basically matches.
  • “-XX:SurvivorRatio=1”: there is a formula for this that you can look up online; it gives Eden:From:To = 1:1:1, which exactly matches our calculation.

SurvivorRatio computation formula is: blog.csdn.net/flyfhj/arti…
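For reference, the commonly cited relationship can be written out as below, assuming SurvivorRatio is Eden divided by one Survivor space. The symbols R (the SurvivorRatio value) and Y (the new generation size set by -Xmn) are introduced here for illustration only.

```latex
\begin{aligned}
\text{Eden} &= \frac{R}{R+2}\,Y, \qquad \text{From} = \text{To} = \frac{1}{R+2}\,Y \\
R = 1,\ Y = 800\,\text{M}: \quad \text{Eden} &= \text{From} = \text{To} = \frac{800}{3}\,\text{M} \approx 266.7\,\text{M}
\end{aligned}
```

This agrees with the roughly 266M per region read from the log.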

After optimization

Points to be optimized:

  • Less than half of the machine's memory is currently in use, so the JVM memory needs to be resized;
  • Eden is too small, only 266M; this is the main reason for the frequent Minor GCs, so it needs to be enlarged;
  • The ratio of new generation to old generation should be adjusted from 1:10.7 to 1:2;
  • The Eden:From:To ratio of the new generation should be adjusted from 1:1:1 to 8:1:1.

Optimized configuration:

Optimized online log:

Heap before GC invocations=3 (full 1):
 par new generation   total 2764800K, used 2524705K [0x00000005cc000000, 0x0000000687800000, 0x0000000687800000)
  eden space 2457600K, 100% used [0x00000005cc000000, 0x0000000662000000, 0x0000000662000000)
  from space 307200K,  21% used [0x0000000674c00000, 0x0000000678d885c0, 0x0000000687800000)
  to   space 307200K,   0% used [0x0000000662000000, 0x0000000662000000, 0x0000000674c00000)
 concurrent mark-sweep generation total 5120000K, used 15613K [0x0000000687800000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 62116K, capacity 62680K, committed 63288K, reserved 1105920K
  class space    used 6639K, capacity 6781K, committed 6816K, reserved 1048576K
(Allocation Failure) 35.225: [ParNew: 2524705K->252467K(2764800K), 0.2682475 secs] [Times: user=1.05 sys=0.00, real=0.27 secs]
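To read the ParNew entry of such a log programmatically, a small regular expression is enough. The sketch below is illustrative only: `ParNewLineSketch` and its methods are hypothetical, and the pattern tolerates stray spaces and a lowercase `k` because copied logs are often slightly mangled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParNewLineSketch {
    // Matches "ParNew: <before>K-><after>K(<capacity>K), <pause> secs".
    static final Pattern PAR_NEW = Pattern.compile(
            "ParNew:\\s*(\\d+)[Kk]\\s*->\\s*(\\d+)[Kk]\\s*\\((\\d+)[Kk]\\)\\s*,\\s*([\\d.]+)\\s*secs");

    private static Matcher match(String line) {
        Matcher m = PAR_NEW.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no ParNew entry in: " + line);
        }
        return m;
    }

    /** KB freed in the new generation by this Minor GC (usage before minus usage after). */
    static long reclaimedKb(String line) {
        Matcher m = match(line);
        return Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
    }

    /** Pause reported for the ParNew collection, in seconds. */
    static double pauseSeconds(String line) {
        return Double.parseDouble(match(line).group(4));
    }

    public static void main(String[] args) {
        String line = "[ParNew: 2524705K->252467K(2764800K), 0.2682475 secs]";
        System.out.println("reclaimed KB: " + reclaimedKb(line));
        System.out.println("pause secs:   " + pauseSeconds(line));
    }
}
```

For the entry above, this Minor GC brought new generation usage down from 2524705K to 252467K in about 0.27 seconds.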

Optimized results:

  • The JVM memory size is 10000M, or about 9.7G
  • Eden is 2.6G, 10 times larger than before
  • The ratio of new generation to old generation is 1:2
  • The ratio of Eden:From:To is 8:1:1

This is probably not yet optimal, since the JVM memory size could still be increased, so we need to observe it in production for a while and then see how to optimize it further.

If you found this useful, a like is appreciated. For more articles, follow the WeChat official account “Lou Zai advanced road”.