The top command lets users monitor processes and system resource usage on Linux. It is one of the most useful tools in the system administrator’s toolbox and comes preinstalled with virtually every distribution. Unlike commands such as ps, it is interactive: we can browse the list of processes, terminate processes, and so on. In this article, we’ll learn how to use the top command.

Getting started

The top command is very simple to start: just enter top in the terminal. This launches an interactive command-line application, as shown below. The upper part of the output contains statistics about process and resource usage, and the lower part contains a list of currently running processes. You can use the arrow keys and the Page Up/Page Down keys to navigate the list. To quit, press the q key.

$ top
top - 21:07:28 up 21 days,  4:31,  1 user,  load average: 0.12, 0.06, 0.07
Tasks:  33 total,   1 running,  31 sleeping,   0 stopped,   1 zombie
%Cpu(s):  0.2 us,  0.5 sy,  0.0 ni, 89.7 id,  0.0 wa,  0.0 hi,  0.0 si,  9.6 st
KiB Mem : 33554432 total, 31188884 free,   513100 used,  1852448 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used. 31188884 avail Mem
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 52601 root      39  19 1310268  14900   9836 S   0.3  0.0  22:59.21 logagent-collec
     1 root      20   0   45416   5244   3968 S   0.0  0.0   5:35.71 systemd
   340 root      20   0   64700  21336  17684 S   0.0  0.1   8:33.90 systemd-journal
   357 root      20   0  101836   2768   2312 S   0.0  0.0   0:01.13 gssproxy
   384 dbus      20   0   28632   2800   2464 S   0.0  0.0   0:00.04 dbus-daemon
   432 root      20   0   84760   5852   4984 S   0.0  0.0   0:00.01 sshd
   461 agent     20   0   52376   5200   3684 S   0.0  0.0   0:00.01 ilogtail
  1690 agent     20   0 2193388 246304  11264 S   0.0  0.7  23:45.88 java
  2527 admin     20   0  161744   4268   3704 R   0.0  0.0   0:00.72 top
  3245 root      20   0  559140  12412   5860 S   0.0  0.0  64:48.67 logagent
  3420 root      20   0  745052  58464  43820 S   0.0  0.2  11:16.32 metricbeat
  3447 root      20   0  957796  55548  43708 S   0.0  0.2  10:14.47 metricbeat
  5093 root      20   0 1905356 159280   9584 S                      java
  7458 root      20   0   13700   2564   2356 S   0.0  0.0   0:00.00 bash
  7464 root      20   0   86268   4436   3740 S   0.0  0.0   0:00.00 sudo
# ... remaining output omitted

There are many variants of top, but in the rest of this article we’ll discuss the most common one: the variant that ships with the procps-ng package. To check which variant you have, run:

$ top -v
  procps-ng version 3.3.10
Usage:
  top -hv | -bcHiOSs -d secs -n max -u|U user -p pid(s) -o field -w [cols]

Quite a few things happen in top’s interface, which we’ll examine in the next section.

Understanding the top interface – the summary area

In the first section, we saw that top’s output is divided into two parts. In this section, we will focus on the information in the upper part, which is generally called the summary area.

System time, uptime, and user session

  • System time: current system time (21:07:28)
  • Uptime: how long the system has been running (21 days, 4:31)
  • Number of active user sessions: 1
top - 21:07:28 up 21 days,  4:31,  1 user,

Active user sessions are classified as TTY or PTY sessions. In fact, if you log in to your Linux system through a desktop environment and then start a terminal emulator, you will find two active sessions.

  • TTY: a session opened physically on the machine, through the command line or a desktop environment
  • PTY: a session opened in a terminal emulator window or over SSH

If we want more information about the active user session, we can use the who command as follows:

$ who
admin    pts/0        2020-10-31 17:15 (xx.xx.xx.xx)

Memory usage

The memory section displays information about the system’s memory usage, as follows:

KiB Mem : 33554432 total, 31188208 free,   513488 used,  1852736 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used. 31188208 avail Mem

Mem and Swap display RAM and Swap space information respectively. When RAM utilization nears full, infrequently used areas of RAM are written to Swap space for later retrieval when needed. However, because disk access is slow, relying too much on Swap can hurt system performance.

About swap

  • Physical memory is the RAM actually installed in the computer. Swap is disk space that the kernel uses as an extension of RAM, which partially relieves the problem of a machine not having enough memory. While a program runs, the operating system maps its virtual memory onto physical memory and loads or evicts data as needed (paged and segmented virtual memory management). The disk area used for this purpose is called swap.

  • When a user starts a program, a process is created to run it. If enough physical memory is free, the process is loaded into memory and runs directly. If not, the kernel makes room by moving less recently used memory out to swap and brings it back in when it is accessed again. This swapping in and out lets memory be reused, so that the user does not run into the physical memory limit directly. It also shows that swap plays an important role in holding whatever has been swapped out.

  • Data is exchanged between memory and swap in units of memory pages. On Linux, the page size is typically 4 KiB. Memory and disk exchange data in blocks.

The total, free, and used values are the total size, idle size, and used size respectively. The avail Mem value is the amount of memory that can be allocated to processes without causing more swapping.

The Linux kernel also tries to reduce the number of disk accesses in several ways. It maintains a “disk cache” in RAM, which holds frequently used areas of the disk, and it collects disk writes in a “disk buffer”, which it eventually writes out to disk. The total memory they consume is the buff/cache value. This may look like memory going to waste, but it is not, because memory used by the cache is handed over to processes as needed.
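To make the relationship between these figures concrete, here is a minimal Python sketch that parses the Mem line shown above and checks that free, used, and buff/cache add up to total. The function name parse_mem_line is hypothetical, and the sketch assumes the line format printed by the procps-ng version used in this article; other versions may label or compute the fields differently.

```python
import re

def parse_mem_line(line):
    """Return the figures of a top 'Mem' line as a dict of KiB values."""
    # Each figure is printed as "<number> <label>", e.g. "33554432 total".
    pairs = re.findall(r"(\d+)\s+(total|free|used|buff/cache)", line)
    return {label: int(value) for value, label in pairs}

mem = parse_mem_line(
    "KiB Mem : 33554432 total, 31188208 free,   513488 used,  1852736 buff/cache"
)

# In this output the three categories partition the total.
accounted = mem["free"] + mem["used"] + mem["buff/cache"]
print(accounted == mem["total"])  # True

# Share of RAM held by processes versus by the kernel's buffers and cache.
used_pct = 100.0 * mem["used"] / mem["total"]
cache_pct = 100.0 * mem["buff/cache"] / mem["total"]
print(f"used {used_pct:.1f}%, buff/cache {cache_pct:.1f}%")
```

On this machine the buffers and cache take up more RAM than all processes together, which is normal on a lightly loaded system.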

Tasks

The Tasks line displays statistics about the processes running on the system:

Tasks:  33 total,   1 running,  31 sleeping,   0 stopped,   1 zombie

Total is easy to understand. It represents the total number of processes running on the system. But for a few other state-related numbers, we need a little background on how the Linux kernel handles processes.

Process execution is a mixture of I/O-bound work (such as reading from disk) and CPU-bound work (such as performing arithmetic). When a process performs I/O, the CPU would otherwise sit idle, so the OS switches to other processes in the meantime. In addition, the operating system lets a given process run for only a very short time before switching to another. This is how operating systems multitask. Doing all this requires tracking the “state” of each process. In Linux, a process may be in one of the following states:

  • Runnable (R): a process in this state is either executing on the CPU or sitting in the run queue, ready to execute.
  • Interruptible sleep (S): a process in this state is waiting for an event to complete.
  • Uninterruptible sleep (D): a process in this state is waiting for an I/O operation to complete.
  • Stopped (T): these processes have been stopped by a job control signal, such as the one sent by Ctrl+Z, or because they are being traced.
  • Zombie (Z): a process that has terminated but has not yet been cleaned up by its parent.

A process can create many children, and a child can exit while the parent is still running. The kernel’s data structures for the child must remain until the parent collects the child’s exit status. A terminated process whose data structures still exist is called a zombie process. In top’s summary, the D and S states are counted as sleeping, the T state as stopped, and the Z state as zombie.
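The mapping from state letters to the counters on the Tasks line can be sketched in Python as follows. The names CATEGORY and tally are hypothetical, the mapping covers only the five states listed above, and the split of the 31 sleeping processes into 30 in state S and 1 in state D is made up for illustration.

```python
from collections import Counter

# State letter (the S column) -> counter on the Tasks line.
CATEGORY = {
    "R": "running",
    "S": "sleeping",
    "D": "sleeping",
    "T": "stopped",
    "Z": "zombie",
}

def tally(states):
    """Count processes per Tasks-line category from a list of state letters."""
    counts = Counter(CATEGORY[s] for s in states)
    result = {"total": len(states)}
    for name in ("running", "sleeping", "stopped", "zombie"):
        result[name] = counts.get(name, 0)
    return result

# Hypothetical mix of states that reproduces the Tasks line above.
states = ["R"] + ["S"] * 30 + ["D"] + ["Z"]
print(tally(states))
# {'total': 33, 'running': 1, 'sleeping': 31, 'stopped': 0, 'zombie': 1}
```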

CPU usage

The CPU usage line shows the percentage of CPU time spent on various kinds of work.

%Cpu(s):  0.3 us,  0.4 sy,  0.0 ni, 90.3 id,  0.0 wa,  0.0 hi,  0.0 si,  9.0 st

The us value is the time the CPU spends executing processes in user space. Similarly, sy is the time spent running kernel-space code. Linux uses the nice value to indicate the priority of a process: the higher the value, the lower the priority. The default nice value can be changed, and the time spent executing processes whose nice value has been set manually is shown as ni. Next comes id, the time the CPU remains idle; most operating systems put the CPU into a power-saving mode when it is idle. Then comes wa, the time the CPU spends waiting for I/O to complete.

An interrupt is a signal to the processor about an event that requires immediate attention. Peripherals typically use hardware interrupts to inform the system about events, such as key presses on a keyboard. Software interrupts, on the other hand, are caused by specific instructions being executed on the processor. In both cases the operating system handles them, and the time spent handling hardware and software interrupts is given by hi and si, respectively.

In a virtualized environment, the hypervisor allocates CPU resources to each virtual machine (VM). If the operating system inside a VM has work to do but cannot run because the physical CPU is busy serving another VM, the time lost in this way is “stolen” time, displayed as st.
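These percentages come from cumulative tick counters that the kernel exposes on the first line of /proc/stat; a monitoring tool samples them twice and looks at the difference. The Python sketch below illustrates that calculation. The function names are hypothetical, and the two sample lines are made-up values chosen to reproduce the CPU line above, not real measurements.

```python
# Labels as shown by top, listed in the order /proc/stat reports them:
# user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    parts = line.split()
    # parts[0] is the literal "cpu"; the next eight numbers are the counters.
    return dict(zip(FIELDS, map(int, parts[1:9])))

def cpu_percentages(before, after):
    """Share of each category in the ticks elapsed between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    return {name: round(100.0 * delta / total, 1) for name, delta in deltas.items()}

# Two hypothetical samples taken a few seconds apart.
t0 = parse_cpu_line("cpu 1000 0 2000 90000 0 0 0 9000 0 0")
t1 = parse_cpu_line("cpu 1003 0 2004 90903 0 0 0 9090 0 0")
pct = cpu_percentages(t0, t1)
print(pct["us"], pct["sy"], pct["id"], pct["st"])  # 0.3 0.4 90.3 9.0
```

On a live Linux system, the first line of /proc/stat can be passed to parse_cpu_line directly. The sketch does not guard against two identical samples, which would make the total zero.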

Load average

The Load Average section represents the average “load” on the system over the last 1, 5, and 15 minutes.

load average: 0.11, 0.07, 0.07

Load is a measure of the amount of computation performed by the system. On Linux, the load is the number of processes in the R and D states at any given moment. The load average gives a relative measure of how long you must wait for work to get done. Here are a couple of small examples to build intuition:

  • On a single-core system, a load average of 0.4 means the system is doing only 40% of the work it can do. A load average of 1 means the system is exactly at capacity: even a little extra work would overload it. A load average of 2.12 means the system is 112% overloaded, with more work than it can handle.
  • On a multi-core system, first divide the load average by the number of CPU cores to get a comparable measure.
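The two rules above amount to a one-line calculation. Here is a small Python sketch of it; the function names load_per_core and describe are hypothetical and chosen only for this illustration.

```python
def load_per_core(load, cores):
    """Normalise a load average by the number of CPU cores."""
    return load / cores

def describe(load, cores):
    """Express a load average the way the examples above do."""
    ratio = load_per_core(load, cores)
    if ratio <= 1.0:
        return f"{ratio:.0%} of capacity"
    return f"{ratio - 1.0:.0%} overloaded"

print(describe(0.4, 1))   # 40% of capacity
print(describe(2.12, 1))  # 112% overloaded
print(describe(2.12, 4))  # 53% of capacity
```

On a live system, Python's os.getloadavg() returns the three load averages and os.cpu_count() returns the number of logical CPUs, so the same check can be run against real values.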

Moreover, the load average is not the plain arithmetic mean that most of us think of as an average. It is an “exponential moving average”, which means that each new value is computed from the current load plus a fraction of the previous load average.
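As a rough illustration of an exponential moving average, the Python sketch below updates a 1-minute average once per sampling interval. It assumes a sampling interval of about 5 seconds and uses floating-point arithmetic; the kernel itself uses fixed-point integers, so treat this as a model of the idea rather than the actual implementation. The function name update is hypothetical.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed, roughly the kernel's interval)

def update(load, active, window):
    """Blend the previous average with the current number of active tasks."""
    decay = math.exp(-SAMPLE_INTERVAL / window)
    # A fraction 'decay' of the old value is kept; the rest comes from 'active'.
    return load * decay + active * (1.0 - decay)

# Start from an idle system and keep exactly one task runnable for one minute.
load1 = 0.0
for _ in range(12):  # 12 samples x 5 s = 60 s
    load1 = update(load1, active=1, window=60.0)
print(round(load1, 2))  # 0.63
```

This is why the 1-minute figure lags behind reality: after a full minute of constant load 1, it has only reached about 0.63.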

Understanding the top interface – the task area

The summary area is relatively simple and lets you quickly get summary statistics about the running system. Detailed information, however, can only be obtained from the task area.

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 52601 root      39  19 1310268  14900   9836 S   0.3  0.0  22:59.21 logagent-collec
     1 root      20   0   45416   5244   3968 S   0.0  0.0   5:35.71 systemd
   340 root      20   0   64700  21336  17684 S   0.0  0.1   8:33.90 systemd-journal
   357 root      20   0  101836   2768   2312 S   0.0  0.0   0:01.13 gssproxy
   384 dbus      20   0   28632   2800   2464 S   0.0  0.0   0:00.04 dbus-daemon
   432 root      20   0   84760   5852   4984 S   0.0  0.0   0:00.01 sshd
   461 agent     20   0   52376   5200   3684 S   0.0  0.0   0:00.01 ilogtail
  1690 agent     20   0 2193388 246304  11264 S   0.0  0.7  23:45.88 java
  2527 admin     20   0  161744   4268   3704 R   0.0  0.0   0:00.72 top
  3245 root      20   0  559140  12412   5860 S   0.0  0.0  16:48.67 logagent
  3420 root      20   0  745052  58464  43820 S   0.0  0.2  11:16.32 metricbeat
  3447 root      20   0  957796  55548  43708 S   0.0  0.2  10:14.47 metricbeat
  5093 root      20   0 1905356 159280   9584 S                      java
  7458 root      20   0   13700   2564   2356 S   0.0  0.0   0:00.00 bash
  7464 root      20   0   86268   4436   3740 S   0.0  0.0   0:00.00 sudo

Here’s what each column means:

PID

This is the process ID, a unique positive integer that identifies the process.

USER

This is the “effective” username (mapped from the effective user ID) of the process owner. Linux assigns a real user ID and an effective user ID to each process; the latter allows a process to act on behalf of another user. (For example, a non-root user can elevate to root to install software.)

PR and NI

The NI field displays the nice value of a process. The PR field displays the scheduling priority of the process from the kernel’s perspective. The nice value affects the priority of the process.
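For ordinary (non-real-time) processes, the two columns are related by a fixed offset: PR is the nice value plus 20, which matches the rows above (NI 0 gives PR 20, NI 19 gives PR 39). The Python sketch below states that relation; the function name pr_from_nice is hypothetical, and real-time processes, which top displays differently, are outside its scope.

```python
def pr_from_nice(nice):
    """PR value top shows for an ordinary process with the given nice value."""
    if not -20 <= nice <= 19:
        raise ValueError("nice values range from -20 to 19")
    return 20 + nice

print(pr_from_nice(0))    # 20, the default
print(pr_from_nice(19))   # 39, the lowest priority
print(pr_from_nice(-20))  # 0, the highest priority for an ordinary process
```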

VIRT, RES, SHR and %MEM

VIRT, RES, and SHR are all related to the memory consumption of the process. VIRT is the total amount of virtual memory used by the process. This includes program code, data the process holds in memory, and any memory regions that have been swapped out to disk. RES is the memory the process occupies in RAM, and %MEM expresses this value as a percentage of the available physical RAM. Finally, SHR is the amount of memory that is shared with other processes.

S

This column shows the process state, using the state letters described in the Tasks section.

TIME+

The TIME+ column shows the total CPU time used by the process since it started, accurate to one hundredth of a second.

COMMAND

The COMMAND column represents the name of the current process.

Examples of using the top command

So far, we’ve discussed what the information in top’s interface means. However, in addition to displaying this information, top can manage processes, and we can control many aspects of its output. In this section, we will go through several examples. (In most of the examples below, you must press a key while top is running. These keys are case sensitive, so if you press k with Caps Lock on, you have actually typed K, and the command won’t work.)

Kill the process

If you want to kill a process, press k while top is running. A prompt appears asking for the process ID; type it and press Enter.

PID to signal/kill [default pid = 384]

Type the ID of the process you want to signal after the prompt:

PID to signal/kill [default pid = 384] 34444444444444

Next, top asks which signal to send. If you leave this blank, top sends SIGTERM, which allows the process to terminate gracefully. If you want to forcibly terminate the process, type SIGKILL here. You can also enter the signal number: the number for SIGTERM is 15 and the number for SIGKILL is 9. If you leave the process ID blank and press Enter directly, top uses the default PID shown in the prompt, which is the process at the top of the list. As mentioned earlier, we can scroll with the arrow keys, which changes the process at the top of the list and therefore the default.
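The “ask politely first, force only if necessary” pattern described above can also be scripted. The Python sketch below starts a throwaway child process and stops it the same way; the helper name stop_process and the two-second grace period are assumptions made for this example, not anything top itself does.

```python
import signal
import subprocess
import sys

def stop_process(proc, grace_seconds=2.0):
    """Send SIGTERM to a child process and escalate to SIGKILL on timeout."""
    proc.send_signal(signal.SIGTERM)  # graceful request, signal 15
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGKILL)  # cannot be caught or ignored, signal 9
        proc.wait()
    # On POSIX, a negative return code is the number of the terminating signal.
    return proc.returncode

# A throwaway child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
code = stop_process(child)
print(int(signal.SIGTERM), int(signal.SIGKILL), code)
```

For a process that is not a child of the script, os.kill(pid, signal.SIGTERM) sends the same signal to an arbitrary PID, subject to the usual permission checks.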

Sort process list

One of the most common reasons to use a tool like top is to find out which process is consuming the most resources. We can sort the list with the following keys:

  • M: sort by memory usage
  • P: sort by CPU usage
  • N: sort by process ID
  • T: sort by CPU time (the TIME+ column)

By default, top displays all results in descending order, but we can switch to ascending order by pressing the R key. You can also use the -o switch to choose the sort field when starting top. For example, to sort processes by CPU usage:

top -o %CPU

Display threads instead of processes

You’ve already seen how Linux switches between processes. Processes do not share memory or other resources with each other, which makes this switch rather slow. Like other operating systems, Linux supports a “lightweight” alternative called threads. A thread is a part of a process that shares certain areas of memory and other resources with its siblings, while still running concurrently like a process. By default, top lists processes in its output. If you want to list threads instead, press H. The Tasks line then shows Threads instead of Tasks.

Threads: 351 total,   2 running, 349 sleeping,   0 stopped,   0 zombie

The “Tasks” line in the summary area has changed to “Threads”, but the columns in the task area list have not changed. The reason is that inside the Linux kernel, threads and processes are handled with the same data structure, so each thread has its own ID, state, and so on. To switch back to the process view, press H again. Alternatively, top -H can be used to display threads from the start.

Display the full command line

By default, the COMMAND column shows only the program name. To show the full command line, including the path, press c to toggle the view, or start top with top -c.

Display parent and child processes as a tree

Press V while top is running to switch to forest view, which shows parent and child processes in a tree structure.

   432 root      20   0   84760   5852   4984 S   0.0  0.0   0:00.01 `- /usr/sbin/sshd -d
 98518 root      20   0  118432   6884   5792 S   0.0  0.0   0:00.00     `- sshd: admin [priv]
 98520 admin     20   0  118432   3648   2556 S   0.0  0.0   0:01.32         `- sshd: admin@pts/0
 98521 admin     20   0  120656   4936   3768 S   0.0  0.0   0:00.34             `- bash
130138 admin     20   0  161748   4208   3624 R   0.0  0.0   0:00.27                 `- top -c

List a user’s processes

To list a single user’s processes, press u while top is running. Then enter the user name, or leave it blank to show the processes of all users. Alternatively, run top -u XXX to show only the processes of user XXX.

KiB Swap:  2097148 total,  2097148 free,        0 used. 31179088 avail Mem
Which user (blank for all) root    # waiting for input
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
     1 root      20   0   45416   5244   3968 S   0.0  0.0   5:37.57 /usr/lib/systemd/systemd --system --deserialize 18
   340 root      20   0   72892  30836  27184 S   0.0  0.1   8:36.56 /usr/lib/systemd/systemd-journald
   357 root      20   0  101836   2768   2312 S   0.0  0.0   0:01.14 /usr/sbin/gssproxy -d
   432 root      20   0   84760   5852   4984 S   0.0  0.0   0:00.01 /usr/sbin/sshd -d

Filter processes

When there are a lot of processes to look through, simple sorting doesn’t help much. In this case, we can press o to activate top’s filter mode and then narrow the list down by entering a filter expression. A filter expression specifies a relationship between a field and a value, for example:

  • COMMAND=java: the process name is java
  • !COMMAND=java: the process name is not java
  • %CPU>3.0: CPU usage is greater than 3.0

To clear all filter criteria, press =.
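To show how such expressions select rows, here is a small Python sketch that evaluates the three example filters against a made-up process list. The function name matches and the sample rows are hypothetical. The sketch assumes that = performs a case-insensitive substring match on the field text and that < and > compare numerically, which is a simplification of what top actually does.

```python
def matches(row, expr):
    """Evaluate one filter such as 'COMMAND=java', '!COMMAND=java' or '%CPU>3.0'."""
    negate = expr.startswith("!")
    if negate:
        expr = expr[1:]
    # Find the first operator present and split the expression around it.
    for op in ("=", "<", ">"):
        if op in expr:
            field, value = expr.split(op, 1)
            break
    else:
        raise ValueError(f"no operator in {expr!r}")
    cell = row[field]
    if op == "=":
        # Assumption: '=' is a case-insensitive substring match.
        result = value.lower() in str(cell).lower()
    elif op == ">":
        result = float(cell) > float(value)
    else:
        result = float(cell) < float(value)
    # A leading '!' inverts the outcome.
    return result != negate

# Hypothetical rows, reduced to the two columns used by the examples.
rows = [
    {"COMMAND": "java", "%CPU": 4.2},
    {"COMMAND": "bash", "%CPU": 0.0},
]

for expr in ("COMMAND=java", "!COMMAND=java", "%CPU>3.0"):
    print(expr, [r["COMMAND"] for r in rows if matches(r, expr)])
```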

Conclusion

A Guide to the Linux “Top” Command

The top command is very helpful for monitoring and managing processes on Linux systems. This article has only scratched the surface, and there is a lot we haven’t covered, for example how to add more columns to top’s display. For more information, run man top to read the manual page.