A performance report that says only "CPU usage is high" rarely helps locate a problem, because high CPU usage can have many different causes. vmstat breaks CPU time into five columns (us, sy, id, wa, st). When application code consumes CPU, which is the case most often encountered in performance analysis, the time shows up in the us column. There is also the case where the code generates an unusually large number of system calls, in which case sy is high as well (this is rare; I have seen it only once in my career). For Java, we do not need a particularly complex profiling tool to trace high CPU usage back to the code.

Before describing the analysis method in detail, we need to talk about thread state transitions. Let's start with system-level thread state transitions.



From this transition diagram, you can see that after a thread is created, it enters the ready state, in which it waits for a CPU. A thread in the running state is actually executing on a CPU. Note the difference.

The r column shown by vmstat counts both ready and running threads (this varies by operating system, but it holds for most Linux systems). This is worth noting because many vmstat explanations on the web say that the r column is the number of running processes, or the number of running threads, which is not accurate. Here is an example (the same one used in the example below):



This is top running on one of my cloud servers. You can see that only one task is currently in the running state. What does vmstat show?



The server has only two CPUs, so if r meant the number of running processes or threads, the value shown here would be impossible, because two CPUs can run at most two threads at the same time.

So keep in mind that the r value includes both threads waiting for a CPU (ready) and threads executing on a CPU (running).
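The arithmetic behind this point can be sketched in a few lines of Java. This is only an illustration: the class name RunQueueCheck and the sample r value of 10 are hypothetical, not taken from the screenshots. Since at most one thread can run per CPU, anything in r beyond the CPU count must be in the ready state.

```java
public class RunQueueCheck {
    // At most `cpus` threads can be running at once, so at least
    // (r - cpus) of the threads counted in r must be ready, not running.
    static int minimumReadyThreads(int r, int cpus) {
        return Math.max(0, r - cpus);
    }

    public static void main(String[] args) {
        // CPU count as seen by this JVM
        int cpus = Runtime.getRuntime().availableProcessors();
        // Hypothetical r value; pass the value you read from vmstat as the first argument
        int r = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        System.out.println("r=" + r + ", cpus=" + cpus
                + ", at least " + minimumReadyThreads(r, cpus) + " thread(s) waiting for a CPU");
    }
}
```

With r=10 on a two-CPU server, at least 8 of the counted threads are waiting rather than running.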

I will explain the other system-level thread states when there is time. Some people may think the other states need no explanation, but in performance analysis, thread states are tied to certain performance counters. For example, a thread is suspended when it has used up its CPU time slice and is temporarily switched out, and a thread is blocked while it waits for a condition to be met. Both of these states can show up alongside high CPU usage, and during analysis this information gives us a direction. So just saying that CPU is high does not help, because CPU can be high for a variety of reasons.

Since this article is about Java, let's also look at thread state transitions in Java.



From this diagram, we can see that a Java thread can be in a variety of states (please look up the explanation of each state). To see what threads in these states are doing, you need to print the stack, where you can see the corresponding code (for other compiled languages, you can also see the running code on the stack). Stack analysis is a very important part of performance analysis, but because today's goal is only to explain how to trace high CPU usage down to the code level, I will not go further into thread states here; I will write about them in a later article.
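To get a feel for states and stacks without any external tool, here is a minimal sketch that prints the state and stack of every thread in the current JVM using Thread.getAllStackTraces(). The class name ThreadStateDump and its output format are my own for illustration; it is not a substitute for jstack, which inspects another process.

```java
import java.util.Map;

public class ThreadStateDump {
    // Render one thread as a name/state header followed by its stack frames.
    static String describe(Thread t, StackTraceElement[] stack) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
        for (StackTraceElement frame : stack) {
            sb.append("\tat ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Snapshot of all live threads in this JVM with their current stacks
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            System.out.print(describe(e.getKey(), e.getValue()));
        }
    }
}
```

The state printed here is one of the java.lang.Thread.State values (NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED).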

Example:

First, run a CPU-consuming Java program (the example here was written by a member of the 7D Group; if you want to try this yourself, you can find a small example online or write one). Then check the output of vmstat.
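The original demo is not reproduced in this article. As a stand-in, here is a minimal sketch of a CPU-consuming program. The class name CpuBurnDemo, the thread names, and the duration argument are all hypothetical; the thread count of 10 simply mirrors the number of busy threads seen later with pidstat.

```java
public class CpuBurnDemo {
    // Busy-loop for roughly the given number of milliseconds; returns the iteration count.
    static long burn(long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long iterations = 0;
        while (System.nanoTime() < deadline) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 10;
        // Duration in milliseconds; the short default only keeps a bare run from pinning the machine.
        long millis = args.length > 0 ? Long.parseLong(args[0]) : 2_000L;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> burn(millis), "cpu-burn-" + i);
            workers[i].start();
        }
        // Wait for all workers so the process stays alive while you inspect it
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

To leave enough time for top, pidstat, and jstack, pass a longer duration, for example java CpuBurnDemo 600000.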



From the image above, you can see that the left window is running a CPU-consuming demo, while the right window shows that the system's CPU is completely consumed. The process ID can be found with the top command:



Now it’s time to look at which threads in this process are consuming CPU.

pidstat shows that 10 threads are consuming CPU (sorry, I overwrote the pidstat screenshot and did not want to start over, so there is no screenshot here). Here is the command, so you can run it yourself if you are interested.

pidstat -p 10846 -u -d -t -w -h 1 1000

The output of this command shows multiple threads consuming CPU. The thread IDs are 10861, 10862, 10863, and so on.

Take a thread dump with jstack.

[root@7dgroup ~]# jstack -l 10846 > 10846.threaddump

Then open the generated file.



nid stands for native ID and corresponds to the system-level thread ID (TID) reported by pidstat. The TID is displayed in decimal, while nid is displayed in hexadecimal.

Let's convert one of the thread IDs to hexadecimal so we can find it.

[root@7dgroup ~]# printf '%x\n' 10861

2a6d
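The same conversion can be done in Java if you prefer to script it. This is a small sketch with a hypothetical class name, NidConverter, and it assumes the nid=0x... format described above.

```java
public class NidConverter {
    // Decimal thread ID (as shown by pidstat) -> hexadecimal nid value (as shown in the dump)
    static String toNid(int tid) {
        return "0x" + Integer.toHexString(tid);
    }

    // Hexadecimal nid value -> decimal thread ID
    static int toTid(String nid) {
        return Integer.parseInt(nid.substring(2), 16);
    }

    public static void main(String[] args) {
        int tid = args.length > 0 ? Integer.parseInt(args[0]) : 10861;
        System.out.println("nid=" + toNid(tid)); // for 10861 this prints nid=0x2a6d
    }
}
```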

The value 2a6d matches an nid in the thread dump file.


In the stack of the matching thread, you can see that the code to look at is line 13 of cputEstThreadDemo.java.
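The lookup above, from a decimal thread ID to its entry in the thread dump, can also be scripted. The following is a minimal sketch, assuming each thread entry starts with a header line containing nid=0x... and ends at the next blank line. The class name ThreadDumpSearch is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ThreadDumpSearch {
    // Return the header line and stack lines of the thread whose nid equals the given decimal TID.
    static List<String> findThread(List<String> dumpLines, int tid) {
        // \b keeps nid=0x2a6d from also matching a longer value such as nid=0x2a6d1
        Pattern marker = Pattern.compile("nid=0x" + Integer.toHexString(tid) + "\\b");
        List<String> block = new ArrayList<>();
        boolean inBlock = false;
        for (String line : dumpLines) {
            if (!inBlock) {
                if (marker.matcher(line).find()) {
                    inBlock = true;
                    block.add(line);
                }
            } else if (line.trim().isEmpty()) {
                break; // a blank line ends the stack of this thread
            } else {
                block.add(line);
            }
        }
        return block;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java ThreadDumpSearch <dump-file> <decimal-tid>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String line : findThread(lines, Integer.parseInt(args[1]))) {
            System.out.println(line);
        }
    }
}
```

For the dump taken above, this would be run as java ThreadDumpSearch 10846.threaddump 10861.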



As this example shows, high CPU consumption in Java code can be analyzed using only system-level commands and the commands that ship with the JDK. Because this example is very simple, the steps are clear. In an application with a lot of code and a lot of logic, however, you may see the CPU constantly switching between threads, so you may need to take several thread dumps and go through them one by one. With the help of analysis tools, complex applications can often be analyzed with less effort.
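One rough way to go through several dumps is to count which frame sits at the top of RUNNABLE threads across all of them; frames that keep reappearing are good candidates to inspect first. This is only a sketch of that idea, not a replacement for a real analysis tool. The class name HotFrameCounter is hypothetical, and the parsing assumes the "java.lang.Thread.State:" and "at ..." line layout of jstack output.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HotFrameCounter {
    // Count how often each frame appears at the top of a RUNNABLE thread's stack.
    static Map<String, Integer> countTopFrames(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        boolean expectTopFrame = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("java.lang.Thread.State: RUNNABLE")) {
                expectTopFrame = true;
            } else if (expectTopFrame && line.startsWith("at ")) {
                counts.merge(line.substring(3), 1, Integer::sum);
                expectTopFrame = false; // only the first frame after the state line counts
            } else if (line.isEmpty()) {
                expectTopFrame = false; // a blank line ends the current thread entry
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of one thread dump file
        List<String> all = new ArrayList<>();
        for (String file : args) {
            all.addAll(Files.readAllLines(Paths.get(file)));
        }
        countTopFrames(all).entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Keep in mind that a RUNNABLE thread is not necessarily on a CPU at the moment of the dump, so the counts are a hint that should be checked against the per-thread CPU figures from pidstat.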

This is just one approach.

Many people have already written many articles on Java analysis; I wrote this one to make this series more complete.