The original link: http://www.dubby.cn/detail.html?id=9059

1. Overview

Hardware and software requirements

Operating system requirements: Windows XP or higher, Mac OS X or Linux. Please note that these tests were done on Windows 7 and have not yet been tested on all platforms. However, everything should work fine on OS X or Linux. Of course, it’s even better if your machine has more than one core.
Java 7 Update 9 or later.
The latest Java 7 Demos and Samples zip file.

Prerequisites

Java 7u9 or later is installed.
Download the demos and samples zip file from the official website and unzip it into a directory such as C:\javademos.

2. Java and the JVM

Java overview

Java is a programming language and computing platform first released by Sun Microsystems in 1995. It is the underlying technology that powers Java programs, including utilities, games, and business applications. Java runs on more than 850 million personal computers and on billions of devices worldwide, including mobile and TV devices. Java is made up of a number of key components that, together, form the Java platform.

Java Runtime Environment

When you download Java, you get the Java Runtime Environment (JRE). The JRE consists of the Java Virtual Machine (JVM), the Java core class libraries, and supporting Java class libraries. All three are required to run Java programs on your computer. With Java 7, Java applications can run as desktop applications on the operating system, as desktop applications installed from the Web using Java Web Start, or as Web-embedded applications in a browser (using JavaFX).

Java programming language

Java is an object-oriented programming language with the following features.

Platform independence: Java applications are compiled into bytecode, which is stored in class files and loaded by the JVM. Because applications run in the JVM rather than directly on the operating system, they can run on many different operating systems. (Translator: write once, run anywhere. The JVM handles platform compatibility for us, although of course it cannot be truly platform independent.)
Object-oriented: Java takes many of the features of C and C++ and improves on them.
Automatic garbage collection: Java allocates and frees memory automatically, so programs are not burdened with that task. (Translator: but you wouldn't be reading this article if you already knew all about GC mechanics.)
Rich standard library: Java ships with a large number of ready-made classes for tasks such as input/output, networking, and date handling.

JDK

The Java Development Kit (JDK) is a series of toolkits required to develop Java applications. With the JDK, you can compile and run programs you write in Java. In addition, the JDK provides tools for packaging and distributing applications.

The JDK and the JRE share the Java Application Programming Interfaces (Java API). The Java API is a collection of prepackaged libraries that developers can use directly. It makes common tasks easier, such as string manipulation, date/time processing, networking, and implementing data structures (e.g. lists, maps, stacks, and queues).

JVM

The Java Virtual Machine (JVM) is an abstract computing machine. The JVM is a program that looks like a machine to the programs written to execute in it. In this way, Java programs are written to the same set of interfaces and libraries. Each JVM implementation for a specific operating system translates the Java programming instructions into instructions and commands that run on the local operating system. This is how Java programs achieve platform independence.

The first prototype implementation of the Java Virtual Machine, done at Sun, emulated the Java virtual machine instruction set in software hosted by a handheld device resembling a contemporary personal digital assistant. Oracle's current implementations provide Java virtual machines for mobile, desktop, and server devices. However, the Java virtual machine does not assume any particular implementation technology, host hardware, or host operating system. It is only a specification and is not inherently interpreted: it can just as well be implemented by compiling its instruction set to that of a silicon CPU. It can also be implemented in microcode or directly in silicon.

The Java virtual machine knows nothing about the Java programming language, except the specific binary format, the class file format. Class files contain Java virtual machine instructions (or bytecode) and symbol tables, along with other auxiliary information.

For security, the Java virtual machine imposes strong syntactic and structural constraints on the code in a class file. However, any language whose functionality can be expressed as a valid class file can be hosted by the Java virtual machine. Because of this, implementers of many other languages compile their languages into class files and hand them to the JVM for execution, in order to benefit from the machine-independent platform the JVM provides.

Exploring the JVM Architecture

HotSpot architecture

The HotSpot JVM has an architecture that supports a strong foundation of features and capabilities, and that makes high performance and massive scalability possible. For example, the HotSpot JVM JIT compilers generate dynamic optimizations. In other words, they make optimization decisions while the Java application is running and generate high-performance native machine instructions targeted at the underlying system architecture. In addition, thanks to the maturing evolution and continuous engineering of its runtime environment and multithreaded garbage collector, the HotSpot JVM scales well even on the largest available computer systems.

The main components of the JVM include the classloader, the runtime data area, and the execution engine.

Key components of HotSpot

The following figure highlights the key components of the JVM as they relate to performance.

Three components of the JVM are the focus when tuning performance: the heap, the garbage collector, and the JIT compiler. The heap is where your object data is stored. This area is managed by the garbage collector selected at startup. Most tuning options relate to sizing the heap and choosing the most appropriate garbage collector for your situation. The JIT compiler also has a big impact on performance, but rarely needs tuning with newer JVMs.
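As a small illustration of the first two components, the sketch below asks a running JVM for its maximum heap size and the names of its active garbage collectors through the standard java.lang.management API. The class name JvmTuningInfo is made up for this example, and the collector names that get printed depend on the JVM and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original tutorial.
class JvmTuningInfo {
    // Maximum heap size the JVM will attempt to use, in bytes (driven by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapBytes() / (1024 * 1024));
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Running it once with the default collector and once with a different collector flag is a quick way to confirm which collector a given command line actually selects.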

Performance basics

Typically, when tuning a Java application, the focus is on one of two main goals: responsiveness or throughput. We will review these concepts as the tutorial progresses.

Responsiveness

Responsiveness refers to how quickly an application or system responds to a request. Examples include:

How quickly desktop applications respond to UI events (clicks, swipes, etc.).
The speed at which a website returns pages.
The speed at which database query results are returned.

For an application focused on responsiveness, long pauses are not acceptable. The goal of optimization is generally to speed up response.

Throughput

Throughput is concerned with how much work an application or system can accomplish in a given amount of time. Examples include:

The number of transactions completed in a given time.
The number of jobs that can be completed in a batch in an hour.
The number of queries the database can complete in an hour.

Long pauses are acceptable for applications focused on throughput, because the focus is on the amount of work done over a longer period of time rather than on finishing any single request as quickly as possible.
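One rough way to put a number on throughput is the fraction of elapsed time that was not spent in garbage collection. The sketch below computes that from the standard management beans. The class name GcThroughput is made up for this example, and the accumulated collection time reported by the JVM is only an approximation of pause time, especially for collectors that do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the original tutorial.
class GcThroughput {
    // Fraction of elapsed time not spent in GC (1.0 means no GC time at all).
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    // Sum of the accumulated collection time over all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("Approximate throughput: " + throughput(totalGcMillis(), uptime));
    }
}
```

A responsiveness-oriented application would look at the length of individual pauses instead, which this aggregate figure hides.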

3. G1 collector

The G1 collector (Garbage-First collector) is a server-style garbage collector targeted at multi-processor machines with large memories. It meets pause time goals with high probability while achieving high throughput. The G1 collector has been supported since JDK 7 Update 4. It is designed for applications that:

Can run concurrently with application threads, like the CMS collector.
Need free space compacted without long GC-induced pauses.
Need more predictable GC pause durations.
Do not want to sacrifice much throughput.
Do not require a much larger Java heap.

G1 is planned as the long-term replacement for the Concurrent Mark-Sweep collector (CMS). Comparing G1 with CMS, there are differences that make G1 a better solution. One difference is that G1 is a compacting collector. G1 compacts sufficiently to avoid fine-grained free lists for allocation altogether, relying instead on regions. This considerably simplifies parts of the collector and mostly eliminates potential fragmentation problems. In addition, G1 offers more predictable garbage collection pauses than the CMS collector and lets users specify a desired pause target.

Summary of G1

The older garbage collectors (Serial, Parallel, CMS) all structure the heap into three sections of fixed size: young generation, old generation, and permanent generation.

Every object ends up in one of these three sections.

The G1 collector divides heap memory in a different way.

The heap is divided into a set of equally sized heap regions, each of which is a contiguous range of virtual memory. Sets of regions are assigned the roles of Eden, survivor, or old, but there is no fixed size for them. This provides greater flexibility in memory usage.

When performing garbage collection, G1 operates in a manner similar to the CMS collector. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the marking phase completes, G1 knows which regions are mostly empty. It collects these regions first, which usually yields a large amount of free space. This is why this method of garbage collection is called Garbage-First. As the name suggests, G1 concentrates its collection and compaction activity on the areas of the heap that are likely to be full of reclaimable objects, that is, garbage. G1 uses a pause prediction model to meet a user-defined pause time target and selects the number of regions to collect based on that target.

The regions that G1 identifies as ripe for reclamation are garbage collected using evacuation. G1 copies objects from one or more regions of the heap to a single region, compacting and freeing memory in the process. This evacuation is performed in parallel on multiple processors to decrease pause times and increase throughput. Thus, with each garbage collection, G1 continuously works to reduce fragmentation while staying within the user-defined pause times. This is beyond the capability of both previous methods: the CMS (Concurrent Mark-Sweep) garbage collector does not do compaction, and ParallelOld garbage collection performs only whole-heap compaction, which results in considerable pause times.

Note that G1 is not a real-time collector. It meets the pause time target with high probability, but not with absolute certainty. Based on data from previous collections, G1 estimates how many regions can be collected within the target time specified by the user. The collector therefore has a reasonably accurate model of the cost of collecting regions, and it uses this model to determine which regions and how many regions to collect while staying within the pause time target.
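The idea of collecting the most garbage-rich regions first while staying inside a pause budget can be sketched as a simple greedy selection. This is only an illustration of the principle, not HotSpot's actual algorithm: the Region class, its fields, and the per-region cost estimates are all hypothetical, and the real prediction model is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Simplified, hypothetical model of garbage-first region selection.
class GarbageFirstSelection {
    static final class Region {
        final int id;
        final long garbageBytes;      // space that would be reclaimed
        final double predictedCostMs; // estimated time to evacuate the region

        Region(int id, long garbageBytes, double predictedCostMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedCostMs = predictedCostMs;
        }
    }

    // Sort by garbage (most first) and take regions until the next one
    // would push the predicted pause past the goal.
    static List<Integer> chooseCollectionSet(List<Region> candidates, double pauseGoalMs) {
        List<Region> sorted = new ArrayList<Region>(candidates);
        Collections.sort(sorted, new Comparator<Region>() {
            @Override
            public int compare(Region a, Region b) {
                return Long.compare(b.garbageBytes, a.garbageBytes);
            }
        });
        List<Integer> chosen = new ArrayList<Integer>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.predictedCostMs > pauseGoalMs) {
                break;
            }
            spentMs += r.predictedCostMs;
            chosen.add(r.id);
        }
        return chosen;
    }
}
```

Because the costs are predictions, a selection like this can still overshoot the goal in practice, which matches the point that the pause target is met with high probability rather than with certainty.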

Note: G1 has both concurrent phases (which run alongside application threads, e.g. refinement, marking, cleanup) and parallel phases (multi-threaded, e.g. stop-the-world pauses). Full GC is still single-threaded, but if tuned properly your application should be able to avoid full GCs.

G1 memory usage

If you migrate from ParallelOldGC or CMS to G1, you will likely see a larger JVM process size. This is largely related to "accounting" data structures such as Remembered Sets and Collection Sets.

Remembered Sets, or RSets, track object references into a given region. There is one RSet per region in the heap. The RSet makes it possible to collect a region in parallel with, and independently of, other regions. The overall footprint of RSets is less than 5%.

Collection Sets, or CSets, are the sets of regions that will be collected in a GC. All live data in a CSet is evacuated (copied/moved) during the GC. The regions in a CSet can be Eden, survivor, and/or old generation. CSets add less than 1% to the size of the JVM.
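To make the role of an RSet more concrete, here is a deliberately coarse sketch that records, for each region, which other regions hold references into it. The class and method names are invented for this example, and tracking at whole-region granularity is a simplification chosen for readability, not a description of how HotSpot stores its remembered sets.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, region-granularity model of remembered sets.
class RememberedSets {
    // Key: region being pointed into. Value: regions that hold references into it.
    private final Map<Integer, Set<Integer>> rsets = new HashMap<Integer, Set<Integer>>();

    // Called when an object in fromRegion stores a reference to an object in toRegion.
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion == toRegion) {
            return; // references inside one region need no tracking
        }
        Set<Integer> referrers = rsets.get(toRegion);
        if (referrers == null) {
            referrers = new TreeSet<Integer>();
            rsets.put(toRegion, referrers);
        }
        referrers.add(fromRegion);
    }

    // Regions that must be scanned for incoming references when 'region' is collected.
    Set<Integer> regionsToScanFor(int region) {
        Set<Integer> referrers = rsets.get(region);
        if (referrers == null) {
            return Collections.<Integer>emptySet();
        }
        return Collections.unmodifiableSet(referrers);
    }
}
```

With this information a collector only needs to look at the recorded referrer regions, rather than the whole heap, to find the references into a region it wants to evacuate.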

Recommended use cases for G1

G1's first focus is to provide a solution for users running applications that need a large heap with limited GC latency. This means a heap size of around 6GB or more and a stable, predictable pause time below 0.5 seconds.

Applications currently running with the CMS or ParallelOldGC garbage collector will benefit from switching to G1 if they have one or more of the following traits.

Full GCs take too long or happen too often.
The object allocation rate or promotion rate varies significantly.
Long GC or compaction pauses (longer than 0.5 to 1 second) are unwanted.

Note: If you are using CMS or ParallelOldGC and your application is not experiencing long GC pauses, it is fine to stay with your current collector. Switching to G1 is not a requirement for using the latest JDK.

4. Review of the CMS collector

Review of generational GC and CMS

The Concurrent Mark-Sweep (CMS) collector (also known as the concurrent low-pause collector) collects the old (tenured) generation. It tries to minimize the pauses caused by garbage collection by doing most of the garbage collection work concurrently with the application threads. Normally the concurrent low-pause collector does not copy or compact live objects: a garbage collection is done without moving them. If fragmentation becomes a problem, allocate a larger heap.

Note: On the young generation, the CMS collector uses the same algorithm as the parallel collector.

Collection phase of CMS

CMS performs the following phases when collecting the old generation:

Phase	Description
1. Initial mark (stop-the-world)	Objects in the old generation are "marked" as reachable, including objects that may be reachable from the young generation. Pause times are generally short.
2. Concurrent marking	While the application threads keep running, the old generation object graph is traversed concurrently to find reachable objects. Application threads run during the concurrent phases, and any object allocated in the old generation during those phases (including promoted objects) is immediately marked as live.
3. Remark (stop-the-world)	Finds objects that were missed during the concurrent marking phase, that is, objects updated by Java application threads after the collector had finished tracing them.
4. Concurrent sweep	Collects the objects identified as unreachable during the marking phases. The space of dead objects is added to a free list for later allocation. Dead objects may be coalesced at this point. Note that live objects are not moved.
5. Resetting	Clears the collected data structures in preparation for the next collection.
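The core of the phases in the table, marking reachable objects and then sweeping the rest onto a free list without moving anything, can be sketched on a toy heap. The sketch is single-threaded and ignores the concurrent and remark aspects entirely. The class name and the array-based heap representation are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy, single-threaded mark-sweep; not how CMS is actually implemented.
class MarkSweepSketch {
    // refs[i] lists the slots that the object in slot i points to.
    // Returns the free list: indexes of slots reclaimed by the sweep.
    static List<Integer> markAndSweep(int[][] refs, int[] roots) {
        boolean[] marked = new boolean[refs.length];
        Deque<Integer> work = new ArrayDeque<Integer>();
        for (int root : roots) {
            work.push(root);
        }
        // Mark: everything transitively reachable from the roots is live.
        while (!work.isEmpty()) {
            int obj = work.pop();
            if (marked[obj] || refs[obj] == null) {
                continue;
            }
            marked[obj] = true;
            for (int child : refs[obj]) {
                work.push(child);
            }
        }
        // Sweep: unmarked slots go on the free list; live objects stay in place.
        List<Integer> freeList = new ArrayList<Integer>();
        for (int i = 0; i < refs.length; i++) {
            if (!marked[i]) {
                refs[i] = null;
                freeList.add(i);
            }
        }
        return freeList;
    }
}
```

With slots 0 and 2 live and slots 1, 3, and 4 swept, the free slot 1 is left stranded between two live objects. That is the fragmentation that comes from never moving live objects.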

Review of the garbage collection steps

1. CMS heap structure

The heap is split into three spaces.

The young generation is split into Eden and two survivor spaces. The old generation is one contiguous space. Objects are collected in place, and no compaction is done unless there is a full GC.

2. How young GC works in CMS

The young generation is colored green and the old generation blue. This is what the heap might look like after your application has been running for a while: objects are scattered around the old generation.

With CMS, old generation objects are deallocated in place. They are not moved, and the space is not compacted unless there is a full GC.

3. Young generation collection

Live objects are copied from Eden and one survivor space to the other survivor space. Objects that have reached the age threshold are promoted to the old generation.

4. After young GC

After a young GC, Eden and one of the survivor spaces are empty.

In the picture, the dark blue objects are those that have just been promoted from the young generation to the old generation. The green objects are surviving young generation objects that have not yet met the criteria for promotion.

5. Old generation collection with CMS

Two stop-the-world events take place: initial mark and remark. CMS kicks off when the old generation reaches a certain occupancy threshold.

(1) Initial mark is a short pause in which reachable objects are marked. (2) Concurrent marking finds live objects while the application continues to run. Finally, (3) remark finds the live objects that were missed during (2).

6. Old generation collection: concurrent sweep

Objects that were not marked in the previous phases are deallocated in place. There is no compaction.

Note: Unmarked objects == dead objects

7. Old generation collection: after sweeping

After the (4) sweeping phase, you can see that a lot of memory has been freed. You will also notice that no compaction has been done, so fragmentation remains. (Translator: I can't take it anymore. This sentence has been repeated countless times.)

Finally, CMS goes through the (5) resetting phase and waits for the next time the GC threshold is reached.

5. G1 step by step

The G1 collector takes a different approach to laying out the heap.

1. G1 heap structure

The heap is one memory area split into many fixed-size regions.

The region size is chosen by the JVM at startup. The JVM generally targets around 2000 regions, each from 1 MB to 32 MB in size.
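The relationship between heap size, the target of roughly 2000 regions, and the 1 MB to 32 MB bounds can be sketched as follows. The exact target of 2048 regions and the rounding down to a power of two are assumptions made for this example. The JVM's own heuristic may differ in its details.

```java
// Hypothetical sizing helper; an approximation, not the JVM's exact heuristic.
class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_BYTES = 1 * MB;
    static final long MAX_REGION_BYTES = 32 * MB;
    static final long TARGET_REGION_COUNT = 2048; // assumption standing in for "about 2000"

    // Pick a region size so the heap splits into roughly TARGET_REGION_COUNT regions,
    // rounded down to a power of two and clamped to the 1 MB .. 32 MB range.
    static long regionSize(long heapBytes) {
        long raw = heapBytes / TARGET_REGION_COUNT;
        if (raw < MIN_REGION_BYTES) {
            return MIN_REGION_BYTES;
        }
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    public static void main(String[] args) {
        long heap = 8L * 1024 * MB; // an 8 GB heap
        long size = regionSize(heap);
        System.out.println("Region size (MB): " + size / MB + ", regions: " + heap / size);
    }
}
```

One consequence of the lower bound is that a small heap, such as the 50 MB heap used in the demo command later in this article, ends up with far fewer than 2000 regions.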

2. G1 memory allocation

Each region is mapped to a logical role: Eden, survivor, or old.

The colors in the picture show which role each region has. During a collection, live objects are evacuated (copied or moved) from one region to another. Regions are designed to be collected in parallel, with or without stopping the other application threads.

As shown, a region can be Eden, survivor, or old. In addition, there is a fourth type, known as humongous regions, for storing large objects. In general, objects that exceed 50% of the size of a single region are placed there, stored as a set of contiguous regions. Finally, the last type of region is the unused area of the heap.

Note: At the time of writing, collecting humongous objects has not been optimized, so it is best to avoid creating objects of this size.
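The 50% rule for large objects is simple arithmetic, sketched below. The class and method names are made up for this example. The check follows the wording above (an object that exceeds half a region), and the region count is the number of contiguous regions needed to hold the object.

```java
// Hypothetical helper illustrating the humongous-object rule described above.
class HumongousCheck {
    // True when the object is larger than half of one region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous regions needed to hold the object (rounded up).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1024L * 1024L;    // 1 MB regions
        long array = 600L * 1024L;      // a 600 KB array
        System.out.println("Humongous? " + isHumongous(array, region));
        System.out.println("Regions needed: " + regionsNeeded(array, region));
    }
}
```

With small regions, even a moderately sized array or buffer crosses the threshold, which is one reason a larger region size can help applications that allocate many big arrays.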

3. The young generation in G1

The heap is split into approximately 2000 regions, each from 1 MB to 32 MB in size. Blue regions hold old generation objects and green regions hold young generation objects.

Note: Unlike with the older collectors, the regions of a generation do not need to be contiguous. Under G1, the young and old generations can be scattered across the heap.

4. Young GC in G1

Live objects are evacuated (copied/moved) to one or more survivor regions. Objects that have reached the age threshold are promoted to old generation regions.

This is a stop-the-world pause. During it, the Eden size and survivor size are calculated for the next young GC, and accounting information such as the pause time of this collection is kept to help with that calculation.

This approach makes it very easy to resize regions, making them bigger or smaller as needed.

5. After a young GC in G1

Live objects have been evacuated to survivor regions or old generation regions.

To summarize, a young GC in G1 has these characteristics:

The heap is a single memory space split into regions.
The young generation is made up of a set of non-contiguous regions. This makes it easy to grow or shrink the young generation when needed.
Young GCs are stop-the-world events.
The young GC is done in parallel using multiple threads.
Live objects are copied to new survivor or old generation regions.
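The aging and promotion decision made during a young GC can be sketched like this. It is a toy model: the Obj and Result classes are invented for this example, liveness is given as a flag rather than computed, and the exact point at which the real collector increments and compares an object's age is not taken from the source.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of evacuation with a tenuring threshold; names are hypothetical.
class YoungEvacuationSketch {
    static final class Obj {
        final String name;
        final boolean live;
        int age; // number of young collections survived so far

        Obj(String name, boolean live, int age) {
            this.name = name;
            this.live = live;
            this.age = age;
        }
    }

    static final class Result {
        final List<String> survivors = new ArrayList<String>();
        final List<String> promoted = new ArrayList<String>();
    }

    // Live objects age by one collection; those reaching the threshold are
    // promoted, the rest go to a survivor region. Dead objects are not copied.
    static Result evacuate(List<Obj> youngObjects, int tenuringThreshold) {
        Result result = new Result();
        for (Obj obj : youngObjects) {
            if (!obj.live) {
                continue; // its space is reclaimed when the source region is freed
            }
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                result.promoted.add(obj.name);
            } else {
                result.survivors.add(obj.name);
            }
        }
        return result;
    }
}
```

Since only live objects are copied, the cost of such a collection depends on how much survives rather than on how much garbage there is.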

G1 old generation collection

Like the CMS collector, G1 is designed to be a low-pause collector for old generation objects. The table below describes the phases of a G1 old generation collection.

G1 collection phases: the concurrent marking cycle

The G1 collector performs the following phases on the old generation. Note that some of them are part of a young GC.

Phase	Description
1. Initial mark (stop-the-world)	This is a stop-the-world event. It piggybacks on a normal young GC and marks the survivor regions (root regions) that may hold references to objects in the old generation.
2. Root region scanning	Scans the survivor regions for references into the old generation. This happens while the application continues to run. The phase must be completed before a young GC can occur.
3. Concurrent marking	Finds live objects over the entire heap. This happens while the application is running. This phase can be interrupted by young GCs.
4. Remark (stop-the-world)	Completes the marking of live objects in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB), which is much faster than the algorithm used in CMS.
5. Cleanup (stop-the-world and concurrent)	1. Performs accounting on live objects and completely free regions (stop-the-world). 2. Scrubs the RSets (stop-the-world). 3. Resets the empty regions and returns them to the free list (concurrent).
*. Copying (stop-the-world)	Stop-the-world pauses that evacuate (copy or move) live objects to new unused regions. If only young generation regions are evacuated, the log shows `GC pause (young)`. If both young and old generation regions are evacuated, the log shows `GC Pause (mixed)`.
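When skimming GC logs, the markers quoted in the table are enough to tell the pause types apart. The sketch below classifies a log line by looking for those markers, ignoring case. The class name and the returned labels are made up for this example, the initial-mark case assumes the marker is spelled `initial-mark`, and real log lines carry additional fields that vary with JVM version and logging flags.

```java
import java.util.Locale;

// Hypothetical classifier keyed on the pause markers quoted in the table above.
class GcPauseKind {
    static String classify(String logLine) {
        String s = logLine.toLowerCase(Locale.ROOT);
        if (!s.contains("gc pause")) {
            return "other";
        }
        if (s.contains("(mixed)")) {
            return "mixed";
        }
        if (s.contains("(young)")) {
            // A young pause can also carry the initial mark of a marking cycle.
            return s.contains("initial-mark") ? "young+initial-mark" : "young";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify("GC pause (young)"));
        System.out.println(classify("GC Pause (mixed)"));
    }
}
```

Counting how many pauses fall into each category gives a quick impression of how often old generation regions are being evacuated.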

Now that we have a rough definition of each stage, let’s take a closer look at what each step actually does.

6. Initial marking phase

Initial marking of live objects piggybacks on a young GC. In the GC log it appears as GC pause (young)(initial-mark).

7. Concurrent marking phase

If empty regions are found (marked with an "X", meaning all objects in them are dead), they are removed immediately in the remark phase. The accounting information that determines liveness is also calculated, and it is used to optimize the next GC.

8. Remark phase

Empty regions are removed and reclaimed. The liveness of every region is now calculated.

9. Copying/cleanup phase

G1 selects the regions with the lowest liveness, since those can be collected fastest. They are collected at the same time as a young GC, so young and old generation regions are reclaimed together. In the GC log this appears as GC Pause (mixed).

10. After the copying/cleanup phase

The selected regions have been collected and compacted into the dark blue and dark green regions shown in the picture.

Summary of old generation GC

The key points about G1 collection of the old generation are:

Concurrent marking phase
- Liveness information is calculated concurrently while the application is running.
- The liveness information identifies which regions are best reclaimed during an evacuation pause.
- There is no sweeping phase as in CMS.
Remark phase
- Uses the snapshot-at-the-beginning (SATB) algorithm, which is much faster than the algorithm used by CMS.
- Completely empty regions are reclaimed.
Copying/cleanup phase
- The young generation and the old generation are reclaimed at the same time.
- Old generation regions are selected based on their liveness.

6. Command-line options and best practices

Basic command line

To enable the G1 collector, use -XX:+UseG1GC

Here we use the Java2Demo sample (first change to its directory, demo/jfc/Java2D):

java -Xmx50m -Xms50m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -jar Java2Demo.jar

Main parameters

-XX:+UseG1GC: tells the JVM to use the G1 collector.
-XX:MaxGCPauseMillis=200: sets a target for the maximum GC pause time. This is a soft goal: the JVM makes its best effort to achieve it, so the target will sometimes be missed. The default value is 200 ms.
-XX:InitiatingHeapOccupancyPercent=45: the percentage of heap occupancy at which a concurrent GC cycle is started. It is a percentage of the entire heap, not of one generation. A value of 0 means GC cycles run constantly. The default value is 45.
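The InitiatingHeapOccupancyPercent threshold is a comparison against the occupancy of the whole heap. A minimal sketch of that arithmetic follows. The class and method names are invented for this example, and the use of "greater than or equal" at the boundary is an assumption rather than a statement about the JVM's exact comparison.

```java
// Hypothetical illustration of the occupancy threshold arithmetic.
class MarkingTrigger {
    // True when whole-heap occupancy has reached ihopPercent of the heap size.
    // With ihopPercent = 0 the condition always holds.
    static boolean shouldStartMarking(long usedBytes, long heapBytes, int ihopPercent) {
        if (heapBytes <= 0) {
            throw new IllegalArgumentException("heapBytes must be positive");
        }
        // Integer arithmetic avoids floating-point rounding at the boundary.
        return usedBytes * 100 >= heapBytes * (long) ihopPercent;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 50 MB heap as in the demo command, default threshold of 45 percent.
        System.out.println(shouldStartMarking(20 * mb, 50 * mb, 45));
        System.out.println(shouldStartMarking(23 * mb, 50 * mb, 45));
    }
}
```

For the 50 MB demo heap, 45 percent works out to 22.5 MB of total occupancy, so a lower value starts marking cycles earlier and a higher value later.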

Best practices

Here are a few best practices to follow when using G1.

Do not set the young generation size

Explicitly setting the young generation size with -Xmn interferes with G1's default behavior.

G1 will no longer respect the pause time target for collections. In effect, setting the young generation size disables the -XX:MaxGCPauseMillis goal.
G1 can no longer expand and contract the young generation as needed, because the size is fixed.

Response time metrics

Instead of using the average response time as the basis for setting -XX:MaxGCPauseMillis, consider choosing a value that meets your goal 90% of the time or more. This means that 90% of the users (clients) making a request will not see a response time higher than the goal. Remember that the pause time is only a target, so there is no guarantee that it will always be met.
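To see what a 90% goal looks like on measured data, the sketch below computes a percentile over a set of response time samples using the nearest-rank method. The class name, the choice of nearest-rank, and the sample values are all assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentile over response time samples.
class PercentileGoal {
    // Smallest sample such that at least pct percent of samples are <= it.
    static long percentile(long[] samplesMs, int pct) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (pct * sorted.length + 99) / 100; // ceil(pct/100 * n) in integers
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] samples = {120, 80, 95, 400, 110, 90, 100, 105, 85, 130}; // made-up data
        System.out.println("90th percentile (ms): " + percentile(samples, 90));
        System.out.println("Median (ms): " + percentile(samples, 50));
    }
}
```

A single slow outlier, like the 400 ms sample here, pulls an average upward but leaves the 90th percentile untouched, which is why a percentile describes what most requests experience more faithfully.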

What is an evacuation failure?

An evacuation (promotion) failure occurs when the JVM runs out of heap regions during a GC, either for survivors or for promoted objects. The heap cannot expand because it is already at its maximum size. When -XX:+PrintGCDetails is used, this is indicated in the GC log by to-space overflow. This is expensive!

GC still has to continue, so space has to be freed up.
Objects that could not be copied have to be tenured in place.
Any updates to the RSets of regions in the CSet have to be regenerated.
All of these steps are expensive.

How do I avoid evacuation failures?

Increase the heap size.
- Increase -XX:G1ReservePercent=n. The default is 10.
- G1 creates a false ceiling by trying to keep the reserve memory free in case more to-space is needed. (Translator: always leave a little room, so that things go easier later.)
Start the marking cycle earlier.
Increase the number of marking threads with -XX:ConcGCThreads=n.

Complete G1 command-line options

Here is the complete list of G1 command-line options. Keep the best practices above in mind when using them.

Option and default value	Description
-XX:+UseG1GC	Use the G1 collector.
-XX:MaxGCPauseMillis=n	Sets a target for the maximum GC pause time. This is a soft goal, and the JVM will make its best effort to achieve it.
-XX:InitiatingHeapOccupancyPercent=n	Percentage of heap occupancy at which a concurrent GC cycle is started. It is used by collectors such as G1 that trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one of the generations. A value of 0 means "run GC cycles constantly". The default value is 45.
-XX:NewRatio=n	Ratio of old generation size to young generation size. The default value is 2.
-XX:SurvivorRatio=n	Ratio of Eden size to survivor space size. The default value is 8.
-XX:MaxTenuringThreshold=n	Maximum value of the tenuring threshold, the age at which an object is promoted. The default value is 15.
-XX:ParallelGCThreads=n	Sets the number of threads used during the parallel phases of the collector. The default value depends on the platform on which the JVM is running.
-XX:ConcGCThreads=n	Sets the number of threads used by the concurrent phases of the collector. The default value depends on the platform on which the JVM is running.
-XX:G1ReservePercent=n	Sets the amount of heap reserved as a false ceiling to reduce the chance of an evacuation failure. The default value is 10.
-XX:G1HeapRegionSize=n	G1 subdivides the heap into uniformly sized regions. This option sets the size of each region. The default value is calculated from the heap size. The minimum is 1 MB and the maximum is 32 MB.
