Linux Cgroup Series 1: Basic Concepts

Cgroup is a Linux kernel feature: a hierarchical grouping of the processes running on a system, to which you can allocate resources such as CPU time, system memory, network bandwidth, or a combination of them. With cgroups, system administrators gain fine-grained control over allocating, prioritizing, denying, managing, and monitoring system resources, so hardware resources can be divided intelligently among applications and users, increasing overall efficiency.

A cgroup is similar to a namespace in that both group processes together, but their purposes differ: a namespace isolates resources between process groups, while a cgroup monitors and limits the resources of a group of processes.

There are two versions of cgroup: v1 and v2. v1 was implemented first and has many features, but they were added in a scattered, poorly planned way, which makes v1 somewhat inconvenient to use and maintain. v2 was designed to solve these problems in v1. Cgroup v2 claims to be production-ready, but its controller support is still limited; v2 was introduced into the kernel together with the cgroup namespace. v1 and v2 can be mixed on one system, but that is more complicated, so in practice almost nobody runs them that way.
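Which version a given machine is running can be checked from the filesystem type mounted at `/sys/fs/cgroup`; this is just a quick probe of the standard mount point, not part of the cgroup API itself:

```shell
# cgroup2fs -> the unified v2 hierarchy is mounted
# tmpfs     -> v1, with one mount per controller underneath
stat -fc %T /sys/fs/cgroup
```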

1. Why cgroup


Linux has long had both the concept of, and the need for, grouping processes, such as the session group and the process group. Later, as requirements in this area grew, for example the need to track the memory and I/O usage of a group of processes, cgroup emerged. It groups processes in a unified way and monitors and manages processes and resources on a per-group basis.

2. What is a cgroup


The term cgroup means different things in different contexts: it can refer to the whole Linux cgroup technology or to a specific group of processes.

Cgroup is a mechanism for managing processes by group under Linux. From the user's point of view, cgroup technology organizes all the processes in the system into independent trees. Each tree contains every process in the system, and each node of a tree is a process group. Each tree is attached to zero or more subsystems. The role of a tree is to group processes; the role of a subsystem is to operate on those groups. Cgroup therefore consists of two main parts:

  • subsystem: a kernel module that, once attached to a cgroup tree, performs specific operations on each node (process group) of that tree. A subsystem is often called a "resource controller", because it is mainly used to schedule or limit the resources of each process group; the name is not entirely accurate, though, since sometimes we group processes just to monitor their state, as the perf_event subsystem does. To date, Linux supports 12 subsystems, covering things such as limiting CPU usage, limiting memory usage, accounting CPU usage, and freezing and resuming a set of processes. We'll cover each of them later.
  • hierarchy: a hierarchy can be understood as a cgroup tree. Each node of the tree is a process group, and each tree is associated with zero or more subsystems. A tree contains all the processes in the Linux system, but each process can belong to only one node (process group) of that tree. A process can belong to more than one tree, i.e. to more than one process group, as long as those process groups are attached to different subsystems. Since Linux currently supports 12 subsystems, if we leave aside trees associated with no subsystem, we can build at most 12 cgroup trees, each associated with a single subsystem; of course, we could also build just one tree and associate it with all the subsystems. When a cgroup tree is attached to no subsystem, the tree merely groups processes, and it is up to the application to decide what to do with those groups; systemd is one such example.
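The kernel's view of the subsystems can be inspected from `/proc/cgroups`, which lists every subsystem the running kernel supports, the ID of the hierarchy each one is attached to, the number of cgroups in that hierarchy, and whether it is enabled:

```shell
# List every subsystem (controller) the running kernel supports.
# Fields: subsys_name  hierarchy  num_cgroups  enabled
cat /proc/cgroups
```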

3. Think of resources as a pie


On CentOS 7 systems (including Red Hat Enterprise Linux 7), resource-management settings can be moved from the process level to the application level by binding the cgroup hierarchy to the systemd unit tree. By default, systemd automatically creates a hierarchy of slice, scope, and service units (these terms are explained later) to give the cgroup tree a uniform structure. This structure can be further modified by creating custom slices with the systemctl command.

If we treat the system's resources as a pie, all resources are divided by default among three top-level cgroups: system, user, and machine. Each of these cgroups is a slice, and each slice can have its own sub-slices, as shown below:

Let’s use CPU resources as an example to explain some of the keywords in the figure above.

As shown in the figure above, three top-level slices (system, user, and machine) are created by default, and each slice gets the same share of CPU time (this matters only when the CPU is busy; if user.slice wants 100% of the CPU and the CPU is idle, user.slice gets it). The three top-level slices mean the following:

  • system.slice — the default location for all system services
  • user.slice — the default location for all user sessions. Each user session creates a sub-slice under this slice, and if the same user logs in to the system multiple times, the same sub-slice is reused.
  • machine.slice — the default location for all virtual machines and Linux containers

One way to control CPU usage is through shares. CPUShares sets a relative value for CPU time across all CPUs (cores); you can think of it as a weight. The default value is 1024, so in the figure above, httpd, sshd, crond, and gdm each have a CPU share of 1024, as do system, user, and machine.

Suppose four services are running on the system, two users are logged in, and a virtual machine is running. Also assume that each process requires as much CPU as possible (each process is busy).

  • system.slice will get 33.333% of the CPU time, and each of the four services is allocated a quarter of system.slice's share, i.e. 8.333% of the CPU time.
  • user.slice will get 33.333% of the CPU time, and each logged-in user gets 16.667%. Suppose the two users are tom and jack: if tom logs out or kills all the processes under his session, jack can use the full 33.333% of the CPU time.
  • machine.slice will get 33.333% of the CPU time. If the VM is shut down or idle, system.slice and user.slice will each receive 50% of this 33.333% of CPU resources, which is then divided equally among their sub-slices.
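The percentages above follow from simple weight arithmetic: a group's share is its weight divided by the sum of its siblings' weights. A small sketch of the math, assuming the default weight of 1024 everywhere and all groups busy:

```shell
# share of a group = its weight / sum of its siblings' weights
awk 'BEGIN {
    slice   = 1024 / (1024 * 3) * 100   # one of three busy top-level slices
    service = slice / 4                 # one of four services in system.slice
    user    = slice / 2                 # one of two sessions in user.slice
    printf "top-level slice: %.3f%%\n", slice
    printf "each service:    %.3f%%\n", service
    printf "each user:       %.3f%%\n", user
}'
```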

To control CPU resources strictly, set an upper limit on CPU usage: a ceiling that cannot be exceeded regardless of whether the CPU is otherwise busy or idle. This is set with the following two parameters:

```
cfs_period_us — the period over which CPU usage is measured, in microseconds (us)
cfs_quota_us  — the CPU time allowed within one period (single-core time; for
                multiple cores, accumulate the time when setting the value)
```
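The relationship between the two parameters and a percentage cap can be sketched as follows (using the default 100 ms period; the 20% figure is just an example):

```shell
period_us=100000                          # default cfs period: 100 ms
percent=20                                # desired cap, single core
quota_us=$(( period_us * percent / 100 ))
echo "cpu.cfs_quota_us = $quota_us"       # 20000 us of CPU time per period
```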

The systemctl parameter CPUQuota sets the upper limit of CPU usage. For example, if you want to limit the CPU usage of user Tom to 20%, you can run the following command:

```shell
$ systemctl set-property user-1000.slice CPUQuota=20%
```

When using the systemctl set-property command, you can use TAB completion:

```shell
$ systemctl set-property user-1000.slice
AccuracySec=            CPUAccounting=        Environment=   LimitCPU=       LimitNICE=     LimitSIGPENDING=   SendSIGKILL=
BlockIOAccounting=      CPUQuota=             Group=         LimitDATA=      LimitNOFILE=   LimitSTACK=        User=
BlockIODeviceWeight=    CPUShares=            KillMode=      LimitFSIZE=     LimitNPROC=    MemoryAccounting=  WakeSystem=
BlockIOReadBandwidth=   DefaultDependencies=  KillSignal=    LimitLOCKS=     LimitRSS=      MemoryLimit=
BlockIOWeight=          DeviceAllow=          LimitAS=       LimitMEMLOCK=   LimitRTPRIO=   Nice=
BlockIOWriteBandwidth=  DevicePolicy=         LimitCORE=     LimitMSGQUEUE=  LimitRTTIME=   SendSIGHUP=
```

There are many properties that can be set, but not all of them are used to configure cgroups; we only need to focus on the Block IO, CPU, and Memory ones.

If you want to set cgroup parameters through configuration files instead, a service can have a drop-in file created under the /etc/systemd/system/xxx.service.d directory, and a slice under the /run/systemd/system/xxx.slice.d directory. In fact, setting a cgroup parameter with the systemctl command-line tool is also written to a configuration file in that directory:

```shell
$ cat /run/systemd/system/user-1000.slice.d/50-CPUQuota.conf
[Slice]
CPUQuota=20%
```

View the corresponding cgroup parameter:

```shell
$ cat /sys/fs/cgroup/cpu,cpuacct/user.slice/user-1000.slice/cpu.cfs_period_us
100000

$ cat /sys/fs/cgroup/cpu,cpuacct/user.slice/user-1000.slice/cpu.cfs_quota_us
20000
```

This means that user tom can use 20 milliseconds of CPU time within each period (100 milliseconds). No matter how idle the CPU is, the CPU resources used by this user will not exceed this limit.

{{% notice note %}} The CPUQuota value can exceed 100%. For example, if the system has multiple CPU cores and the CPUQuota value is 200%, then the slice can use 2 cores of CPU time. {{% /notice %}}

4. Conclusion


This article introduced some basic concepts of cgroup, including its default settings and control tools on CentOS systems, and used CPU as an example to explain how cgroups control resources. The next article will look at concrete examples of how different cgroup settings affect performance.

5. References


  • Linux Cgroup Series (01) : An overview of Cgroups