These are my study notes on the Geek Time column “Linux Performance Optimization in Practice”, lesson 05 | Basics: An application's CPU usage has reached 100%. What should I do?

CPU utilization

To keep track of CPU time, Linux triggers timer interrupts at a predefined tick rate (expressed as HZ in the kernel) and uses the global variable jiffies to record the number of ticks since boot. Each timer interrupt increments jiffies by 1. The tick rate HZ is a configurable kernel option and can be set to 100, 250, 1000, and so on. Different systems may use different values, which you can check by querying the kernel configuration file under /boot. For example, on my system the tick rate is 250, which means 250 timer interrupts per second.

$ grep 'CONFIG_HZ=' /boot/config-$(uname -r)
CONFIG_HZ=250

Because the tick rate HZ is a kernel option, user-space programs cannot access it directly. For their convenience, the kernel also provides a user-space tick rate, USER_HZ, which is always fixed at 100, so one user-space tick is 1/100 of a second. A user-space program therefore does not need to care what HZ is set to in the kernel, because it always sees the fixed value USER_HZ.

Linux exposes the internal state of the system to user space through the /proc virtual file system, and /proc/stat provides the system's CPU and task statistics. For example, if you only care about the CPU, you can run the following command:

# Keep only the per-CPU data
$ cat /proc/stat | grep ^cpu
cpu  280580 7407 286084 172900810 83602 0 583 0 0 0
cpu0 144745 4181 176701 86423902 52076 0 301 0 0 0
cpu1 135834 3226 109383 86476907 31525 0 282 0 0 0

The output is a table. The first column is the CPU label, such as cpu0 and cpu1; the first row, labeled cpu with no number, is the sum over all CPUs. The remaining columns are the cumulative number of ticks the CPU has spent in different scenarios, in units of USER_HZ, that is, 10 ms (1/100 of a second), so they are effectively the CPU time spent in each scenario. You do not need to memorize the column order; just look it up with man proc when you need it. You should, however, understand what each column means, because these are important CPU usage metrics that you will see in many other performance tools. Let me go through them one by one.

  • user (often abbreviated to us) is user-mode CPU time. Note that it does not include the nice time below, but it does include guest time.
  • nice (often abbreviated to ni) is low-priority user-mode CPU time, that is, the CPU time of processes whose nice value has been adjusted to between 1 and 19. Note that nice values range from -20 to 19, and the higher the value, the lower the priority.
  • system (often abbreviated to sys) is kernel-mode CPU time.
  • idle (often abbreviated to id) is idle time. Note that it does not include time spent waiting for I/O (iowait).
  • iowait (often abbreviated to wa) is CPU time spent waiting for I/O.
  • irq (often abbreviated to hi) is CPU time spent handling hard interrupts.
  • softirq (often abbreviated to si) is CPU time spent handling soft interrupts.
  • steal (often abbreviated to st) is CPU time taken by other virtual machines while the system is running in a virtual machine.
  • guest (often abbreviated to guest) is the time spent running other operating systems through virtualization, that is, the CPU time spent running virtual machines.
  • guest_nice (often abbreviated to gnice) is the time spent running virtual machines at low priority.
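
The column order described in the list above can be made concrete with a short Python sketch. It assumes the ten-field layout documented in man proc and a USER_HZ of 100; the names FIELDS, parse_cpu_line, and ticks_to_seconds are illustrative only and do not come from the column.

```python
# Field names in the order they appear on each "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")

USER_HZ = 100  # assumed; on a live system os.sysconf("SC_CLK_TCK") reports it


def parse_cpu_line(line):
    """Split one 'cpu...' line of /proc/stat into (label, {field: ticks})."""
    label, *values = line.split()
    ticks = dict(zip(FIELDS, (int(v) for v in values)))
    return label, ticks


def ticks_to_seconds(ticks):
    """Convert a tick count in USER_HZ units to seconds."""
    return ticks / USER_HZ


# Sample row taken from the cpu0 line of the output shown earlier.
label, ticks = parse_cpu_line(
    "cpu0 144745 4181 176701 86423902 52076 0 301 0 0 0")
print(label, "user ticks:", ticks["user"],
      "user seconds:", ticks_to_seconds(ticks["user"]))
```

Looking fields up by name rather than by position avoids having to remember the column order at all.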

What we usually call CPU usage is the percentage of total CPU time spent on anything other than idle time. It can be expressed by the formula:

    CPU usage = 1 - (idle time / total CPU time)
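
The definition above can be sketched in Python. Because the counters in /proc/stat accumulate from boot, this sketch applies the definition to the difference between two samples, which gives the average usage over that interval. The function names and the two sample lines are hypothetical; the code also assumes that guest and guest_nice are already counted inside user and nice, so it leaves them out of the total.

```python
def cpu_times(stat_line):
    """Return (idle_ticks, total_ticks) from an aggregate 'cpu' line."""
    values = [int(v) for v in stat_line.split()[1:]]
    idle = values[3]
    # Sum user..steal only: guest and guest_nice (the last two fields) are
    # assumed to be included in user and nice, so adding them would
    # count that time twice.
    total = sum(values[:8])
    return idle, total


def cpu_usage(before, after):
    """Average CPU usage (0.0 to 1.0) between two samples of the 'cpu' line."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    total_delta = total1 - total0
    if total_delta == 0:
        return 0.0  # no ticks elapsed between the samples
    return 1 - (idle1 - idle0) / total_delta


# Hypothetical samples, made up for illustration.
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 160 0 70 1200 70 0 0 0 0 0"
print(f"CPU usage over the interval: {cpu_usage(before, after):.1%}")
```

On a Linux machine, the two samples could be obtained by reading the first line of /proc/stat twice with a short sleep in between.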



How do I check the CPU usage

top and ps are the most commonly used performance analysis tools. top shows the overall CPU and memory usage of the system as well as the resource usage of each process; ps shows only the resource usage of each process.

$ top
top - 11:58:59 up 9 days, 22:47,  1 user,  load average: 0.03, 0.02, 0.00
Tasks: 123 total,   1 running,  72 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  8169348 total,  5606884 free,   334640 used,  2227824 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  7497908 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0   78088   9288   6696 S   0.0  0.1   0:16.83 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.05 kthreadd
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H
...

In this output, the %Cpu(s) line shows the average over all CPUs by default; press the number 1 to switch to per-CPU usage. Further down, after the blank line, comes the real-time information for each process. Each process has a %CPU column that shows its CPU usage: the sum of user-mode and kernel-mode usage, including CPU used in the process's user space, kernel-mode CPU used through system calls, and time spent waiting in the run queue. In a virtualized environment, it also includes the CPU used to run virtual machines.

top does not break this figure down into user-mode and kernel-mode usage for each process. For that level of detail, use pidstat, a tool dedicated to analyzing per-process CPU usage. The pidstat command below prints five sets of per-process CPU usage at one-second intervals, including:

  • user-mode CPU usage (%usr);
  • kernel-mode CPU usage (%system);
  • CPU usage for running virtual machines (%guest);
  • percentage of time spent waiting for the CPU (%wait);
  • total CPU usage (%CPU).

The final Average section shows the averages of the five sets of data.

# Output one set of data every second, five sets in total
$ pidstat 1 5
15:56:02      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
15:56:03        0     15006    0.00    0.99    0.00    0.00    0.99     1  dockerd
...
Average:      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0     15006    0.00    0.99    0.00    0.00    0.99     -  dockerd
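
As a rough illustration of where per-process percentages like these can come from, the Python sketch below derives %usr, %system, and their sum from two samples of /proc/<pid>/stat, whose fields 14 and 15 (utime and stime) are documented in man proc. This is an assumption about one way to compute such figures, not a description of how pidstat is implemented; it ignores %guest and %wait, and the function names and the truncated sample lines are hypothetical.

```python
USER_HZ = 100  # assumed; os.sysconf("SC_CLK_TCK") gives the real value


def process_ticks(stat_text):
    """Return (utime, stime) in ticks from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces or
    # ')' characters, so split after the last ')'. The remaining fields
    # start at field 3 (state), which makes utime index 11 and stime index 12.
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[11]), int(rest[12])


def process_cpu_percent(before, after, interval_seconds):
    """Return (%usr, %system, %CPU) for one process over the interval."""
    u0, s0 = process_ticks(before)
    u1, s1 = process_ticks(after)
    scale = 100.0 / (USER_HZ * interval_seconds)
    usr = (u1 - u0) * scale
    system = (s1 - s0) * scale
    return usr, system, usr + system


# Hypothetical samples taken one second apart, truncated after field 17.
BEFORE = "15006 (dockerd) S 1 15006 15006 0 -1 4194560 1000 0 0 0 500 300 0 0"
AFTER = "15006 (dockerd) S 1 15006 15006 0 -1 4194560 1000 0 0 0 500 301 0 0"

usr, system, total = process_cpu_percent(BEFORE, AFTER, 1.0)
print(f"%usr={usr:.2f} %system={system:.2f} %CPU={total:.2f}")
```

One extra kernel-mode tick in one second corresponds to about 1% of a CPU, which is the same order of magnitude as the dockerd row in the pidstat output.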

What can I do if the CPU usage is too high?

Here are the two ways I most often use perf to analyze CPU performance problems. The first is perf top. Like top, it displays in real time the functions or instructions that consume the most CPU clock, so it can be used to find hot functions. The interface looks like this:

$ perf top
Samples: 833  of event 'cpu-clock', Event count (approx.): 97742399
Overhead  Shared Object       Symbol
   7.28%  perf                [.] 0x00000000001f78a4
   4.72%  [kernel]            [k] vsnprintf
   4.32%  [kernel]            [k] module_get_kallsym
   3.65%  [kernel]            [k] _raw_spin_unlock_irqrestore
...

The first line of the output contains three values: the number of samples (Samples), the event type (event), and the total event count (Event count). In this example, perf collected 833 samples of the cpu-clock event, and the total event count is 97742399. Below that is a table in which each row contains four fields:

  • The first, Overhead, is the share of all samples attributed to the symbol's performance events, expressed as a percentage.
  • The second, Shared Object, is the dynamic shared object in which the function or instruction resides, such as the kernel, a process name, a dynamic link library name, or a kernel module name.
  • The third, the bracketed marker in front of the symbol, is the type of the object. For example, [.] denotes a user-space executable or dynamic link library, while [k] denotes kernel space.
  • The last, Symbol, is the symbol name, that is, the function name. When the function name is unknown, it is shown as a hexadecimal address.

Taking the output above as an example, the perf tool itself consumes the most CPU clock, but its share is only 7.28%, which indicates that the system has no CPU performance problem. That covers the use of perf top.

The second common usage is perf record and perf report. perf top shows performance information in real time, but it does not save the data, so it cannot be used for offline or later analysis. perf record saves the data, and perf report parses and displays the saved data.

$ perf record # Press Ctrl+C to stop sampling
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.452 MB perf.data (6093 samples) ]

$ perf report # Show a report similar to perf top
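
To make the four-field layout of the perf top rows described earlier more concrete, here is a small Python sketch that splits one row into those fields. It assumes rows shaped like the sample output in this article, with a shared object name that contains no spaces; the names ROW and parse_perf_row are illustrative, and only the [.] and [k] markers mentioned in the text are mapped.

```python
import re

# One table row: overhead percentage, shared object, type marker, symbol.
ROW = re.compile(
    r"^\s*(?P<overhead>\d+(?:\.\d+)?)%\s+(?P<dso>\S+)\s+"
    r"\[(?P<kind>.)\]\s+(?P<symbol>.+?)\s*$"
)

# Only the two markers discussed in the text; anything else is "other".
KINDS = {".": "user", "k": "kernel"}


def parse_perf_row(line):
    """Return a dict for a perf top row, or None if the line is not a row."""
    match = ROW.match(line)
    if match is None:
        return None
    return {
        "overhead": float(match.group("overhead")),
        "shared_object": match.group("dso"),
        "space": KINDS.get(match.group("kind"), "other"),
        "symbol": match.group("symbol"),
    }


row = parse_perf_row("   4.72%  [kernel]            [k] vsnprintf")
print(row)
```

Header lines such as the one starting with Overhead do not match the pattern, so parse_perf_row returns None for them.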

Test the performance of this Nginx service

# Test Nginx performance with 10 concurrent requests, 100 requests in total
$ ab -c 10 -n 100 http://192.168.0.10:10000/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd,
...
Requests per second:    11.63 [#/sec] (mean)
Time per request:       859.942 [ms] (mean)
...

Run the top command on a new terminal and press the number 1 to switch to the per-CPU usage:

$ top

How do you find out which function in php-fpm is driving the CPU usage up? Let’s analyze it with perf. In the first terminal, run the following perf command:

# -g enables call-graph analysis; -p selects the process by its PID
$ perf top -g -p 21515

Summary

CPU usage is the most intuitive and most commonly used indicator of system performance, and it is usually the first thing we look at when troubleshooting performance problems. It is therefore important to be familiar with what it means, and in particular to understand the user (%user), nice (%nice), system (%system), I/O wait (%iowait), hard interrupt (%irq), and soft interrupt (%softirq) categories of CPU usage. For example:

  • If user CPU and nice CPU are high, user-mode processes are consuming a lot of CPU, so focus on the performance of those processes.
  • If system CPU is high, kernel mode is consuming a lot of CPU, so focus on the performance of kernel threads or system calls.
  • If I/O wait CPU is high, the time spent waiting for I/O is long, so check whether the storage system has an I/O problem.
  • If hard or soft interrupt CPU is high, the interrupt handlers are consuming a lot of CPU, so check the interrupt service routines in the kernel.

When you run into high CPU usage, you can use tools such as top and pidstat to identify the source of the CPU performance problem, and then use tools such as perf to identify the specific functions that are causing it.


This article is published by OpenWrite!