Introduction to Docker containers - Cgroups

Posted by zack45668 on Wed, 24 Nov 2021 06:04:05 +0100

The previous article showed how namespaces provide isolation for container technology. In this one, we introduce how containers are "limited".

You may be wondering: haven't we already created an isolated environment for the container with Linux namespaces? Why does the container also need to be limited?

Because a container process is just an ordinary Linux process, it is not physically isolated. At runtime it shares the same CPU and memory with the other processes on the host. If it is not limited, it will inevitably compete with them for resources.

Inside the container, process 1 sees only the container's own view of the system, because the namespace hides everything else. On the host, however, the same process (say, process 100) still competes on equal terms with every other process. This means that although process 100 appears to be isolated, the resources it uses (such as CPU and memory) can be taken at any time by other processes (or other containers) on the host. Of course, process 100 may also eat up all the resources itself. Neither is reasonable behavior for a "sandbox".

Linux Cgroups is the Linux kernel feature used to set resource limits for processes.

The full name of Linux Cgroups is Linux Control Groups. Its main function is to cap the resources that a group of processes can use, including CPU, memory, disk, network bandwidth, etc.


In Linux, Cgroups exposes its interface to users through the file system: it is organized as files and directories under the /sys/fs/cgroup path. On a CentOS machine, I can display them with the mount command:

$ mount -t cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)

If nothing is listed, install the cgroup package directly with yum install libcgroup.
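
Another way to see which subsystems the kernel offers is to read /proc/cgroups. The following is a minimal sketch, assuming the usual four-column layout of that file (subsys_name, hierarchy, num_cgroups, enabled); the function name list_enabled_subsystems is made up for this illustration.

```shell
# Print the name of every cgroup subsystem marked as enabled.
# Expected input format (as in /proc/cgroups): a '#' header line, then
#   subsys_name  hierarchy  num_cgroups  enabled
# $1 is optional and defaults to /proc/cgroups (hypothetical helper).
list_enabled_subsystems() {
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Usage on a live host:
#   list_enabled_subsystems
```

The names it prints should match the subsystem names that appear in the mount output above.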

You can see that under /sys/fs/cgroup there are many subdirectories such as cpuset, cpu and memory, also known as subsystems. These are the types of resources that Cgroups can restrict on this machine. Under each subsystem's directory, you can see the specific ways in which that resource can be limited. For example, the cpu subsystem has the following configuration files:

$ ls /sys/fs/cgroup/cpu
aegis   cgroup.clone_children  cgroup.procs          cpuacct.stat   cpuacct.usage_percpu  cpu.cfs_quota_us  cpu.rt_runtime_us  cpu.stat  notify_on_release  system.slice  user.slice
assist  cgroup.event_control   cgroup.sane_behavior  cpuacct.usage  cpu.cfs_period_us     cpu.rt_period_us  cpu.shares         docker    release_agent      tasks

If you are familiar with Linux CPU management, you will notice keywords such as cfs_period and cfs_quota in this output. These two parameters are used together: they limit the processes in a group to a total of cfs_quota microseconds of CPU time within each period of cfs_period microseconds. How are these configuration files used? You create a directory under the corresponding subsystem. For example, in the /sys/fs/cgroup/cpu directory:

root@centos:/sys/fs/cgroup/cpu$ mkdir container
root@centos:/sys/fs/cgroup/cpu$ ls container/
cgroup.clone_children cpu.cfs_period_us cpu.rt_period_us  cpu.shares notify_on_release
cgroup.procs      cpu.cfs_quota_us  cpu.rt_runtime_us cpu.stat  tasks

This directory is called a "control group". Notice that the operating system automatically generates the subsystem's resource limit files in the newly created container directory. Now, we run the following loop in the background:

$ while : ; do : ; done &
[1] 226

This is an infinite loop, which can drive a CPU to 100%. The shell reports its process ID (PID) as 226. We can use the top command to confirm that the CPU is saturated:

$ top
%Cpu0 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

From the output, you can see that CPU utilization has reached 100% (%Cpu0 :100.0 us). Looking at the files in the container directory, we can see that the container control group has no CPU quota limit yet (the value is -1) and that the CPU period is the default 100 ms (100000 us):

$ cat /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us 
-1
$ cat /sys/fs/cgroup/cpu/container/cpu.cfs_period_us 
100000
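
To make the relationship between the two values concrete, here is a small sketch that turns a quota/period pair into a percentage of one CPU. The function name cfs_limit is an assumption made for this illustration; it is not part of any Cgroups tooling.

```shell
# Convert a CFS quota/period pair (both in microseconds) into the
# share of one CPU that a control group may use.
# A negative quota (-1) means that no limit is set.
cfs_limit() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        echo "unlimited"
    else
        # Integer percentage of a single CPU; a value above 100 means
        # the group may use more than one core.
        echo "$(( quota * 100 / period ))%"
    fi
}

cfs_limit -1 100000      # the default values of a new control group
cfs_limit 20000 100000   # 20 ms of CPU time in every 100 ms
```

The first call prints "unlimited" and the second prints "20%".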

Next, we can set a limit by modifying the contents of these files. For example, write 20 ms (20000 us) to the cfs_quota file of the container group:

$ echo 20000 > /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us

Combined with the previous introduction, you should understand the meaning of this operation. It means that the processes limited by this control group can only use 20 ms of CPU time in every 100 ms, that is, only 20% of the CPU bandwidth. Next, we write the PID of the process to be restricted into the tasks file of the container group, and the setting takes effect for that process:

$ echo 226 > /sys/fs/cgroup/cpu/container/tasks 

We can use the top command to check:

$ top
%Cpu0 : 20.3 us, 0.0 sy, 0.0 ni, 79.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

As you can see, CPU utilization immediately dropped to about 20% (%Cpu0 : 20.3 us).
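
The manual steps above can be wrapped in one small helper. This is only a sketch under the assumptions of this article (a cgroup v1 host, root privileges); the function name limit_cpu is hypothetical.

```shell
# Apply a CPU quota to a process through a cgroup v1 control group.
#   $1 = control group directory (created if it does not exist)
#   $2 = quota in microseconds
#   $3 = PID of the process to restrict
# limit_cpu is a made-up helper name for this example.
limit_cpu() {
    group=$1
    quota=$2
    pid=$3
    mkdir -p "$group" || return 1
    echo "$quota" > "$group/cpu.cfs_quota_us" || return 1
    echo "$pid" > "$group/tasks"
}

# Usage as root, mirroring the steps above:
#   limit_cpu /sys/fs/cgroup/cpu/container 20000 226
# Cleanup once the experiment is over (the control group can only be
# removed after it no longer contains any tasks):
#   kill 226
#   rmdir /sys/fs/cgroup/cpu/container
```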

 

In addition to cpu, each Cgroups subsystem has its own resource limiting capabilities, for example:

  • blkio, which sets I/O limits for block devices and is generally used for disks and similar devices;
  • cpuset, which assigns dedicated CPU cores and the corresponding memory nodes to a process;
  • memory, which sets the memory usage limit for a process.
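
As an illustration of the memory subsystem, the sketch below writes a limit into memory.limit_in_bytes, the cgroup v1 file that holds a group's memory cap. The helper names to_bytes and limit_memory are assumptions made for this example.

```shell
# Convert a size such as 512m or 1g into bytes.
# Supported suffixes: k, m, g; a plain number is taken as bytes.
to_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[gG]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

# Write a memory limit into a cgroup v1 memory control group.
#   $1 = control group directory, $2 = limit (for example 512m)
limit_memory() {
    mkdir -p "$1" || return 1
    to_bytes "$2" > "$1/memory.limit_in_bytes"
}

# Usage as root on a cgroup v1 host:
#   limit_memory /sys/fs/cgroup/memory/container 512m
```

As with the cpu subsystem, the limit only applies to processes whose PIDs have been written into the group's tasks file.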

The design of Linux Cgroups is fairly easy to work with: it is essentially a subsystem directory combined with a set of resource limit files. Docker and other Linux container projects only need to create a control group (that is, a new directory) for each container under each subsystem and then, after starting the container process, write its PID into the tasks file of the corresponding control group.
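
The idea can be sketched in a few lines of shell. This is a rough illustration of the principle only, not Docker's actual implementation; the function name join_groups and its argument order are invented for the example.

```shell
# Create one control group per subsystem and add a PID to each of them.
#   $1 = cgroup root (for example /sys/fs/cgroup)
#   $2 = control group name
#   $3 = PID of the container process
#   remaining arguments = subsystems to join
# join_groups is a hypothetical helper, not a Docker or kernel interface.
join_groups() {
    root=$1
    name=$2
    pid=$3
    shift 3
    for subsystem in "$@"; do
        mkdir -p "$root/$subsystem/$name" || return 1
        echo "$pid" > "$root/$subsystem/$name/tasks" || return 1
    done
}

# Usage as root on a cgroup v1 host:
#   join_groups /sys/fs/cgroup container 226 cpu memory blkio
```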

The values written to the resource files in these control groups are specified by the parameters the user passes to docker run, as in this command:

$ docker run -it --cpu-period=100000 --cpu-quota=20000 ubuntu /bin/bash

After starting the container, we can confirm this by checking the resource limit files in the "docker" control group of the cpu subsystem in the Cgroups file system:

$ cat /sys/fs/cgroup/cpu/docker/5d5c9f67d/cpu.cfs_period_us 
100000
$ cat /sys/fs/cgroup/cpu/docker/5d5c9f67d/cpu.cfs_quota_us 
20000
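
docker run also accepts a --cpus option, which expresses the same limit as a fractional number of CPUs instead of a period/quota pair. The sketch below shows the arithmetic behind that shorthand, assuming the default period of 100000 us; the function name cpus_to_quota is made up for this example.

```shell
# Translate a fractional CPU count into a CFS quota in microseconds.
#   $1 = number of CPUs (for example 0.2 or 1.5)
#   $2 = period in microseconds (optional, defaults to 100000)
cpus_to_quota() {
    awk -v cpus="$1" -v period="${2:-100000}" \
        'BEGIN { printf "%.0f\n", cpus * period }'
}

cpus_to_quota 0.2    # quota for one fifth of a CPU
cpus_to_quota 1.5    # quota for one and a half CPUs
```

The first call prints 20000, the same quota as in the docker run command above, and the second prints 150000.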

 

So you see, Docker's container technology is not especially novel. It mainly brings together existing Linux features, such as namespaces and Cgroups, and integrates them into what we call container technology.
