01 | How to learn Linux performance optimization?

1. Linux Performance Tool Atlas

2. Mind mapping

02 | Basics: how to understand the “load average”?

1. View the load average

Use the top or uptime command:

$ uptime
02:34:03 up 2 days, 20:14, 1 user, load average: 0.63, 0.83, 0.88

The three values are the load averages over the past 1 minute, 5 minutes, and 15 minutes, in that order.
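As a minimal sketch of reading those three numbers programmatically, the function below pulls them out of one line of uptime output. The function name `parse_load_average` is a hypothetical helper, not part of any tool, and the regular expression assumes the English output format shown above.

```python
import re
from typing import Tuple


def parse_load_average(uptime_line: str) -> Tuple[float, float, float]:
    """Return the 1-, 5-, and 15-minute load averages found in one line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError(f"no load average found in: {uptime_line!r}")
    one, five, fifteen = (float(value) for value in match.groups())
    return one, five, fifteen


if __name__ == "__main__":
    sample = "02:34:03 up 2 days, 20:14, 1 user, load average: 0.63, 0.83, 0.88"
    print(parse_load_average(sample))  # → (0.63, 0.83, 0.88)
```

On a live Unix system, `os.getloadavg()` from the Python standard library returns the same three values without any parsing.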

2. Average load

The load average is the average number of processes in the runnable state or the uninterruptible state per unit of time. For example, a load average of 2 on a system with 2 CPUs means that all CPUs are exactly fully occupied.

The runnable state covers processes that are using a CPU or waiting for one, that is, processes that ps shows in the R state (Running or Runnable). The uninterruptible state covers processes that are in the middle of a critical kernel-mode operation that cannot be interrupted, that is, processes that ps shows in the D state.
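To make the definition concrete, here is a small sketch that counts how many processes are in the R or D state at one instant. The function name and the sample state strings are hypothetical; on a real system the strings could come from the STAT column of `ps -eo stat`. The load average is averaged over time, so a single instantaneous count only approximates what feeds into it.

```python
from collections import Counter
from typing import Iterable


def count_load_contributors(states: Iterable[str]) -> int:
    """Count processes whose ps state starts with R (running/runnable) or D (uninterruptible)."""
    # Only the first character is the state; later characters are modifiers such as "+" or "l".
    tally = Counter(state[0] for state in states if state)
    return tally["R"] + tally["D"]


if __name__ == "__main__":
    # Hypothetical STAT column values for seven processes.
    sample_states = ["Ss", "R+", "D", "S", "Rl", "I<", "Z"]
    print(count_load_contributors(sample_states))  # → 3
```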

When the load average exceeds 70% of the number of CPUs, it is time to analyze and troubleshoot the high load.
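The 70% rule of thumb can be sketched as a simple check. Both function names are hypothetical, and the default threshold of 0.70 is only the guideline stated above, not a hard limit.

```python
def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Normalize a load average by the number of CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def needs_investigation(load_average: float, cpu_count: int, threshold: float = 0.70) -> bool:
    """Return True when the load average exceeds the given fraction of the CPU count."""
    return load_per_cpu(load_average, cpu_count) > threshold


if __name__ == "__main__":
    print(needs_investigation(2.0, 2))   # → True  (2.0 / 2 = 1.0, above 0.70)
    print(needs_investigation(0.63, 2))  # → False (0.63 / 2 = 0.315, below 0.70)
```

On a live system the CPU count could come from `os.cpu_count()`.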

Load average and CPU utilization

The load average includes not only the processes that are using the CPU, but also the processes that are waiting for the CPU or waiting for I/O.

CPU utilization measures how busy the CPU is per unit of time. It does not necessarily correspond to the load average.

Cases:

1) CPU-intensive processes: heavy CPU use raises the load average, so the two metrics move together;

2) I/O-intensive processes: waiting for I/O raises the load average, but CPU utilization is not necessarily high;

3) A large number of processes waiting for CPU scheduling: the load average rises and CPU utilization is also high.

Practice: stress and sysstat

1) stress is a Linux stress-testing tool. Here it plays the role of the misbehaving process, to simulate scenarios with an elevated load average.

sysstat is a package of common Linux performance tools for monitoring and analyzing system performance. Two commands from this package are used here, mpstat and pidstat:

mpstat is a common multi-core CPU performance analysis tool. It shows the performance metrics of each CPU in real time, as well as the average across all CPUs.

pidstat is a common process performance analysis tool. It shows real-time per-process metrics such as CPU, memory, I/O, and context switches.

2) Case investigation

CPU intensive process case:

pidstat: find the process with a high %CPU and a low %wait; that process is most likely what is driving CPU usage up.

I/O-intensive process case:

pidstat: find the process with a higher %wait and a higher %CPU; that process is most likely the cause.

Too many processes case:

pidstat: check whether many processes show a high %wait. Note that in pidstat, %wait is the share of time a task spends waiting for a CPU to run on, which is not the same as iowait.

03 | Basics: what does the often-mentioned CPU context switch mean? (Part 1)

1. CPU context

**CPU context:** the CPU registers and the program counter

**CPU context switch:** first save the CPU context of the previous task (that is, the CPU registers and the program counter), then load the CPU context of the new task and run the new task

Depending on the task, CPU context switching can be divided into several different scenarios: process context switching, thread context switching, and (hardware) interrupt context switching

**Why does heavy process switching increase CPU load?** With too many context switches, the CPU spends its time saving and restoring registers, kernel stacks, virtual memory, and other data. This shortens the time processes actually run and becomes a major cause of system performance degradation.

2. Process context

System call: one system call involves two CPU context switches (from user mode to kernel mode, then back to user mode), but the same process keeps running throughout

Process context: user-space resources such as virtual memory, stack, and global variables, and kernel-space resources such as kernel stack, CPU registers, and program counters (CPU context)

Process context switch: switching from running one process to running another. The context of the current process is saved and the context of the next process is loaded, so user-space resources are switched in addition to the CPU context.

Process switches are triggered by time-slice expiry (the CPU is shared in turn), insufficient system resources needed by the process (such as insufficient memory), the process suspending itself voluntarily (for example by calling sleep), hardware interrupts, and preemption by a higher-priority process.

Thread context

When a process has multiple threads, these threads share resources such as virtual memory and global variables. In addition, threads have their own private data, such as stacks and registers

Switching between two threads that belong to the same process: virtual memory and global variables stay in place, and only the **thread-private data such as the stack, plus the kernel stack, CPU registers, and program counter (the CPU context)** and other non-shared data need to be switched

Therefore, switching between threads of the same process costs much less than switching between processes

Interrupt context

Interrupt context switch: does not involve the user mode of the process, so there is no need to save and restore the process's virtual memory, global variables, or other user-mode resources. It covers only the state needed to run the kernel-mode interrupt service routine: CPU registers, kernel stack, hardware interrupt parameters, and so on.

04 | Basics: what does the often-mentioned CPU context switch mean? (Part 2)

1. Check the system context switch

The vmstat tool

cs (context switch) is the number of context switches per second

in (interrupt) is the number of interrupts per second

r (Running or Runnable) is the length of the ready queue, that is, the number of processes that are running or waiting for the CPU

b (Blocked) is the number of processes in an uninterruptible sleep state
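As a rough sketch of where numbers like these can come from, the code below parses the `ctxt`, `intr`, `procs_running`, and `procs_blocked` lines of `/proc/stat` and turns two samples into per-second rates. The function names and the sample text are hypothetical, and the result only approximates the vmstat columns; it is not a reimplementation of vmstat.

```python
from typing import Dict


def parse_proc_stat_counters(text: str) -> Dict[str, int]:
    """Extract system-wide counters from the contents of /proc/stat."""
    wanted = {"ctxt", "intr", "procs_running", "procs_blocked"}
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] in wanted:
            # For "intr", the first number is the total across all interrupt sources.
            counters[fields[0]] = int(fields[1])
    return counters


def vmstat_like(before: Dict[str, int], after: Dict[str, int], interval_s: float) -> Dict[str, float]:
    """Turn two samples taken interval_s seconds apart into vmstat-style values."""
    return {
        "cs": (after["ctxt"] - before["ctxt"]) / interval_s,
        "in": (after["intr"] - before["intr"]) / interval_s,
        "r": after["procs_running"],
        "b": after["procs_blocked"],
    }


if __name__ == "__main__":
    # Hypothetical excerpts of /proc/stat taken 5 seconds apart.
    first = "intr 1000 5 0\nctxt 20000\nprocs_running 2\nprocs_blocked 0\n"
    second = "intr 1500 9 0\nctxt 25000\nprocs_running 3\nprocs_blocked 1\n"
    print(vmstat_like(parse_proc_stat_counters(first), parse_proc_stat_counters(second), 5.0))
    # → {'cs': 1000.0, 'in': 100.0, 'r': 3, 'b': 1}
```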

2. View per-process context switches

To see the details for each process, use pidstat with the -w option. Two columns deserve attention.

One is cswch (shown as cswch/s), the number of voluntary context switches per second. A voluntary context switch happens when a process cannot obtain a resource it needs, for example when system resources such as I/O or memory are insufficient.

The other is nvcswch (shown as nvcswch/s), the number of involuntary context switches per second. An involuntary context switch happens when the system forcibly reschedules a process, for example because its time slice has expired while a large number of processes are competing for the CPU.
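The same two kinds of switches are also exposed as cumulative counters in `/proc/<pid>/status`, in the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields. The sketch below parses them and derives per-second rates from two samples; the function names and the sample text are hypothetical, and this is an illustration rather than how pidstat itself is implemented.

```python
from typing import Dict


def parse_context_switches(status_text: str) -> Dict[str, int]:
    """Extract the cumulative context-switch counters from /proc/<pid>/status contents."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            counters[key] = int(value.strip())
    return counters


def switch_rates(before: Dict[str, int], after: Dict[str, int], interval_s: float) -> Dict[str, float]:
    """Compute per-second voluntary and involuntary switch rates between two samples."""
    return {
        "cswch/s": (after["voluntary_ctxt_switches"] - before["voluntary_ctxt_switches"]) / interval_s,
        "nvcswch/s": (after["nonvoluntary_ctxt_switches"] - before["nonvoluntary_ctxt_switches"]) / interval_s,
    }


if __name__ == "__main__":
    # Hypothetical excerpts of /proc/<pid>/status taken 2 seconds apart.
    first = "Name:\tdemo\nvoluntary_ctxt_switches:\t100\nnonvoluntary_ctxt_switches:\t10\n"
    second = "Name:\tdemo\nvoluntary_ctxt_switches:\t300\nnonvoluntary_ctxt_switches:\t14\n"
    print(switch_rates(parse_context_switches(first), parse_context_switches(second), 2.0))
    # → {'cswch/s': 100.0, 'nvcswch/s': 2.0}
```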

More voluntary context switches indicate that processes are waiting for resources, so there may be other problems such as I/O.

More involuntary context switches indicate that processes are being forcibly rescheduled because they are competing for the CPU, which means the CPU really is the bottleneck.

More interrupts indicate that the CPU is being used by interrupt handlers; check the /proc/interrupts file to determine the specific interrupt type.
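To find which interrupt type dominates, the per-CPU counts in `/proc/interrupts` can be summed per row. The sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt with one count per CPU followed by a description); the function names and the sample text are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_interrupts(text: str) -> Dict[str, Tuple[int, str]]:
    """Map each interrupt label to (total count across CPUs, description)."""
    lines = text.strip().splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        tokens = rest.split()
        counts = []
        # Rows such as ERR carry fewer counts than there are CPUs, so stop at the first non-number.
        for token in tokens[:cpu_count]:
            if not token.isdigit():
                break
            counts.append(int(token))
        totals[label.strip()] = (sum(counts), " ".join(tokens[len(counts):]))
    return totals


def busiest(totals: Dict[str, Tuple[int, str]], n: int = 3) -> List[str]:
    """Return the labels of the n interrupts with the highest totals."""
    return sorted(totals, key=lambda label: totals[label][0], reverse=True)[:n]


if __name__ == "__main__":
    sample = """
           CPU0       CPU1
  0:         46          0   IO-APIC   2-edge      timer
RES:    2123456    1987654   Rescheduling interrupts
ERR:          0
"""
    parsed = parse_interrupts(sample)
    print(busiest(parsed, 1), parsed["RES"][0])  # → ['RES'] 4111110
```

Because the counters are cumulative since boot, comparing two samples taken a few seconds apart shows which interrupts are growing right now.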

3. Summary of investigation process

First, check the system load with uptime

Then use mpstat and pidstat to determine whether the CPU is overloaded, the processes are too busy, or there is too much I/O

Then use vmstat to analyze the number and type of context switches, to determine whether the cause is excessive I/O or fierce contention among processes

05 | Basics: an application's CPU usage has reached 100%. What should I do?

1. CPU usage

/proc/stat provides CPU and task statistics for the system

Important metrics related to CPU usage:

user (often abbreviated to us) is user-mode CPU time

nice (often abbreviated to ni) is low-priority user-mode CPU time, that is, CPU time used by processes whose nice value has been adjusted to between 1 and 19. Note that nice values range from -20 to 19, and the higher the value, the lower the priority

system (often abbreviated to sys) is kernel-mode CPU time

idle (often abbreviated to id) is idle time. Note that it does not include time spent waiting for I/O

iowait (often abbreviated to wa) is CPU time spent waiting for I/O

CPU usage is the percentage of total CPU time that is not idle time
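Since the counters in `/proc/stat` are cumulative, usage over an interval is computed from the difference between two samples. The sketch below follows the definition above, in which only idle time is excluded. The function names and sample lines are hypothetical, and only the first eight fields of the aggregate `cpu` line are read, which is an assumption made here so that guest time (already counted in user time) is not added twice.

```python
from typing import Dict

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_times(stat_text: str) -> Dict[str, int]:
    """Read the aggregate "cpu" line from the contents of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(FIELDS, (int(value) for value in parts[1:9])))
    raise ValueError("no aggregate cpu line found")


def cpu_usage_percent(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Percentage of CPU time between two samples that was not idle."""
    total = sum(after.values()) - sum(before.values())
    idle = after["idle"] - before["idle"]
    if total <= 0:
        return 0.0
    return 100.0 * (total - idle) / total


if __name__ == "__main__":
    # Hypothetical aggregate cpu lines from two reads of /proc/stat.
    first = parse_cpu_times("cpu  100 0 50 800 50 0 0 0 0 0")
    second = parse_cpu_times("cpu  200 0 100 1600 100 0 0 0 0 0")
    print(cpu_usage_percent(first, second))  # → 20.0
```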

2. Viewing CPU usage

top shows the overall CPU and memory usage of the system, as well as the resource usage of each process. ps shows only the resource usage of each process

top does not break down each process's CPU usage into user mode and kernel mode. To see those details for each process, use pidstat

3. perf

perf is a performance analysis tool built into Linux since kernel 2.6.31. It is based on sampling performance events, and can be used to analyze system events and kernel performance as well as performance problems in specific applications.