Throttling CPU usage with Linux cgroups

There are a number of reasons you may want to throttle rather than limit a process's CPU usage on your system. One very good reason is to keep the CPU temperature down or to simply reduce the amount of energy a certain process uses.

Limiting versus throttling

The term "limit" is nearly always used where throttling is actually required. A good example of why the two are not interchangeable would be the current ISP industry:

Example 1: Sally signs up for super fast broadband (100 Mbps) but hasn't read the small print: she can only download 10 GB of data before her connection is terminated and she has to wait for the next month before she can continue to use her service. Sally's service is not throttled but it is limited.

Example 2: Tony signs up for a basic package (1 Mbps) as he doesn't need to use the Internet a great deal. However, he had the good sense to choose an unlimited package so that he doesn't hit any usage caps. His router syncs at 100 Mbps but he only receives a 1 Mbps service. The ISP's equipment is throttling his service, but not limiting it.

Example 3: Benedict has signed up for some deal without reading any of the details. He receives a 20 Mbps service and is happy with the speeds. Unfortunately for him after downloading 5 GB of data his download rate drops to 1 Mbps. He has limits on his service which has led to it being throttled.

There are many instances where you may wish to both limit and throttle CPU usage. The former is very easy and well documented, the latter not so much.

nice and chrt

nice and chrt are used to set scheduling priorities for processes. This is to limit their usage, not to throttle them. The processes will use as much of the CPU as the scheduler is willing to give them. An idle process (chrt -i 0) can still consume 100% CPU if there are no other processes requesting CPU time.
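
As a rough illustration of the point above, the sketch below wraps a command in the lowest nice priority. The helper name run_niced is made up for this example. It only lowers the priority of the command; it does not cap how much CPU the command uses when the machine is otherwise idle.

```shell
# Hypothetical helper: run a command at the lowest nice priority (19).
# This changes scheduling priority only. On an idle machine the command
# can still use a full CPU.
run_niced() {
    nice -n 19 "$@"
}

# Example (matho-primes is the test program used later in this post):
#   run_niced matho-primes 0 999999999
# The idle-policy equivalent, assuming chrt from util-linux is installed:
#   chrt -i 0 matho-primes 0 999999999
```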

cpulimit

cpulimit will throttle the CPU usage of a process, and this may be ideal for you, but it works after-the-fact: you need to run cpulimit after the process has started so you will probably need to use a script to find the process in the first place. The throttling is then applied. It also doesn't work well with interactive shells.

cgroups

cgroups are designed to limit and/or audit system resources. cgroups have the power to limit but also to throttle CPU usage (as well as other things) for a process from the second it is launched. Finding out how to do so is not easy.

cgroups, as the name suggests, create groups that processes can either be launched inside or moved into. The group restrictions can be edited at any time, and processes can be moved between cgroups at any time. This is a trick used by Android to give foreground tasks (apps) a better interactive response.

First things first: you need to make sure the cgroups filesystem is mounted. This can be done manually but in Ubuntu this is done with a daemon which reads /etc/cgconfig.conf:

mount {
    cpu = /sys/fs/cgroup/cpu;
    cpuacct = /sys/fs/cgroup/cpuacct;
    devices = /sys/fs/cgroup/devices;
    memory = /sys/fs/cgroup/memory;
    freezer = /sys/fs/cgroup/freezer;
}

These default mounts contain variables which can be edited by the root user. Chances are you do not want to mess with the defaults, so you will need to create a group:

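$ sudo cgcreate -a brian -g cpu:brian

Here -g cpu:brian is cgroup subsystem : cgroup path, so this creates a cgroup named brian under the cpu subsystem (the group can take any name). The -a option names the user who will own the group's files, which lets that user edit the settings without root; in this case the user is also called brian. This can also be set in cgconfig.conf.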

$ ll /sys/fs/cgroup/cpu/brian
total 0
drwxr-xr-x 2 brian root 0 Apr 10 02:42 .
drwxr-xr-x 5 root  root 0 Apr 10 02:41 ..
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cgroup.clone_children
--w--w--w- 1 brian root 0 Apr 10 02:42 cgroup.event_control
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cgroup.procs
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cpu.cfs_period_us
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cpu.cfs_quota_us
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cpu.rt_period_us
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cpu.rt_runtime_us
-rw-r--r-- 1 brian root 0 Apr 10 02:42 cpu.shares
-r--r--r-- 1 brian root 0 Apr 10 02:42 cpu.stat
-rw-r--r-- 1 brian root 0 Apr 10 02:42 notify_on_release
-rw-r--r-- 1 brian root 0 Apr 10 02:45 tasks
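
To see what a group is currently set to, the files in this listing can simply be read. The following is a minimal sketch; the helper name show_cpu_params and the default path are assumptions for illustration, so adjust the directory to your own mount point and group name.

```shell
# Hypothetical helper: print the main CPU controller settings of a cgroup.
# The default directory is an assumption based on the listing above.
show_cpu_params() {
    dir=${1:-/sys/fs/cgroup/cpu/brian}
    for f in cpu.shares cpu.cfs_period_us cpu.cfs_quota_us; do
        printf '%s = %s\n' "$f" "$(cat "$dir/$f")"
    done
}

# Example:
#   show_cpu_params /sys/fs/cgroup/cpu/brian
```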

Now, at this point it is important to remember the difference between limiting and throttling....

cpu.shares - The default value is 1024. This is a relative weight, not an absolute amount: when cgroups compete for the CPU, each one receives time in proportion to its shares. Lowering the value therefore limits the process. For example: if I set this value to 512 and a process in another cgroup with the default 1024 shares is also requesting CPU time, my process will receive roughly a third of the CPU (512 out of 1536 shares), ignoring any nice and realtime values you may have set. If nothing else wants the CPU, it still has the option to consume 100% of the idle CPU time.
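
Since cpu.shares is a relative weight between competing cgroups, the share a group ends up with under full contention is simple arithmetic. The sketch below uses a made-up helper, share_percent, and assumes every competing group is busy; with idle competitors the group can take more.

```shell
# Hypothetical helper: approximate CPU percentage a cgroup receives when
# all competing cgroups are busy. Arguments: this group's cpu.shares, then
# the combined cpu.shares of the competing sibling groups.
share_percent() {
    echo $(( $1 * 100 / ($1 + $2) ))
}

share_percent 512 1024    # 33: 512 out of 1536 total shares
share_percent 1024 1024   # 50: equal weights, an even split
```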

cpu.cfs_period_us - The default value is 100000. It is the length, in microseconds, of the period over which the standard scheduler (CFS) "accounts" the CPU time used by the group. It does little on its own.

cpu.cfs_quota_us - The default value is -1, which means no quota is enforced. Any other value is in microseconds. It may exceed cpu.cfs_period_us on a multi-core system, where a quota of twice the period allows up to two CPUs' worth of time.

The trick to throttling a process is to manipulate the last two values, namely cpu.cfs_period_us and cpu.cfs_quota_us. The quota is the amount of CPU time that the processes in the cgroup are allowed to use in each period. So a cgroup that is given 100000 out of 100000 is allowed to use 100% of one CPU (again, ignoring nice and realtime values). A cgroup given 90000 out of 100000 is allowed to run 90% of the time, 50000 out of 100000 allows 50%, and so on. In the words of the official Linux documentation:

The bandwidth allowed for a group is specified using a quota and period. Within each given "period" (microseconds), a group is allowed to consume only up to "quota" microseconds of CPU time. When the CPU bandwidth consumption of a group exceeds this limit (for that period), the tasks belonging to its hierarchy will be throttled and are not allowed to run again until the next period.
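
The relationship between quota, period and percentage can be sketched as a small calculation. The helper name quota_for_percent is hypothetical, and the percentage is taken to mean a percentage of one CPU.

```shell
# Hypothetical helper: quota (in microseconds) that caps a cgroup at the
# given percentage of one CPU for the given period.
quota_for_percent() {
    percent=$1
    period=${2:-100000}   # default cpu.cfs_period_us
    echo $(( period * percent / 100 ))
}

quota_for_percent 50           # 50000
quota_for_percent 90 100000    # 90000
quota_for_percent 2 50000      # 1000
```

A percentage above 100 gives a quota larger than the period, which on a multi-core machine allows more than one CPU's worth of time.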

Example

In this first example I have set cpu.shares = 100 for the matho-primes process, which gives its cgroup a relative weight of 100 against the default of 1024.


As you can see this has not throttled my process. Because the system has CPU time to spare the process still consumes all that it can.

In this next example I set cpu.cfs_period_us = 50000 and cpu.cfs_quota_us = 1000 for the same process.


This has had the desired effect. For every 50,000 µs time slice, the process is only allowed to use 1,000 µs (2%) and is paused until the next time slice is available. This is true regardless of the current system demand. (Note: the process can still receive less than its 2% allotted time if the system is heavily loaded or a higher priority process demands the time.)
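
One way to apply values like these is to write them straight into the group's files. The sketch below assumes the group directory already exists and that you have write permission on its files (root, or the owner chosen when the group was created); set_cpu_throttle is a made-up helper name.

```shell
# Hypothetical helper: write a period and a quota (both in microseconds)
# into a cgroup v1 cpu controller directory.
set_cpu_throttle() {
    dir=$1
    period=$2
    quota=$3
    echo "$period" > "$dir/cpu.cfs_period_us" &&
    echo "$quota" > "$dir/cpu.cfs_quota_us"
}

# Example, using the values from this post:
#   set_cpu_throttle /sys/fs/cgroup/cpu/brian 50000 1000
# With libcgroup installed, cgset should achieve the same:
#   cgset -r cpu.cfs_period_us=50000 brian
#   cgset -r cpu.cfs_quota_us=1000 brian
```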

I can check the amount of throttling that has been done at any time:

$ cat cpu.stat
nr_periods 336
nr_throttled 334
throttled_time 16181179709
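
These counters are easier to read with a little arithmetic: nr_throttled out of nr_periods gives the fraction of periods in which the group hit its quota, and throttled_time is reported in nanoseconds. A minimal sketch, with throttle_summary as a hypothetical helper name:

```shell
# Hypothetical helper: summarise a cgroup v1 cpu.stat file.
# throttled_time is reported by the kernel in nanoseconds.
throttle_summary() {
    awk '
        $1 == "nr_periods"     { periods = $2 }
        $1 == "nr_throttled"   { throttled = $2 }
        $1 == "throttled_time" { ns = $2 }
        END {
            if (periods > 0)
                printf "%.1f%% of periods throttled, %.2f s throttled\n",
                       throttled * 100 / periods, ns / 1e9
        }
    ' "${1:-cpu.stat}"
}

# Example:
#   throttle_summary /sys/fs/cgroup/cpu/brian/cpu.stat
```

For the figures shown above this works out to 99.4% of periods throttled and about 16.18 seconds of throttled time.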

To launch the process within the cgroup I used the following command:

$ cgexec -g cpu:brian matho-primes 0 999999999
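
A process that is already running can also be moved into the group instead of being launched there. The sketch below does this by writing the PID to the group's cgroup.procs file, which moves every thread of the process (the tasks file takes individual thread IDs). The helper name and default path are assumptions for illustration.

```shell
# Hypothetical helper: move a running process into a cgroup (v1) by
# writing its PID to the group's cgroup.procs file.
move_to_cgroup() {
    pid=$1
    dir=${2:-/sys/fs/cgroup/cpu/brian}
    echo "$pid" > "$dir/cgroup.procs"
}

# Example, assuming matho-primes is already running:
#   move_to_cgroup "$(pgrep -n matho-primes)"
# libcgroup offers the same operation as:
#   cgclassify -g cpu:brian <pid>
```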

Summary

1. Create the cgroup.
2. Set cpu.cfs_period_us.
3. Set cpu.cfs_quota_us.
4. Use cgexec to launch the application.

Once running, the cgroup can be edited by the owner of that group, or the process can be moved to a different cgroup. This may be handy with a cron job, to give a process more CPU time at certain times of the day.
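
As a sketch of the time-of-day idea, a script run hourly by cron could pick a quota based on the current hour and write it into the group. The policy below (a generous quota overnight, a tight one during the day) and the helper name quota_for_hour are invented for illustration, and the quotas assume the default period of 100000 µs.

```shell
# Hypothetical policy: more CPU time at night, less during the day.
# The hours and quota values are made up for illustration.
quota_for_hour() {
    hour=$1
    if [ "$hour" -ge 22 ] || [ "$hour" -lt 6 ]; then
        echo 90000    # 90% of one CPU with a 100000 µs period
    else
        echo 20000    # 20% of one CPU
    fi
}

# In a script run hourly by cron (path assumed from the examples above):
#   quota_for_hour "$(date +%H)" > /sys/fs/cgroup/cpu/brian/cpu.cfs_quota_us
```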

Notes:
  • The current (April 2015) Linux cgroup documentation doesn't mention the cpu subsystem at all and has introduced the cpusets subsystem, but the two do not do the same job. The throttling capability itself is described separately, in the kernel's scheduler documentation on CFS bandwidth control, which is where the passage quoted above comes from.
  • There are, of course, bugs.
  • Just because you have a process in one cgroup doesn't mean it cannot also be in another, but it can only be in one cgroup per hierarchy.
  • Different subsystems (cpu, memory, freezer, etc.) can place the same process in different cgroups.
  • Multiple processes can and do share cgroups, but only when told to do so.
  • Child processes remain inside the cgroups unless moved out of them, and only if they are allowed to be moved (set by policy).
  • cgroups are hierarchical and you can create sub-cgroups to limit certain processes further. A sub-cgroup cannot receive more resources than its parent cgroup.
  • There may be a slight performance penalty depending on your choice for period and quota. You will only really need to worry about this on highly optimised or incredibly large systems.
  • cgroups do not offer virtualisation or jailing of a process, though they can be used alongside these systems, and are in many circumstances (such as Android and LXC).
  • You can still set the nice and realtime values to give processes certain priorities. This will not affect the maximum CPU time allowed by the cgroup, and the CPU time will be shared in a complex manner between all processes, as one would expect from a decent operating system.
  • The cpu.cfs* values of course refer to the CFS scheduler, the cpu.rt* values refer to the realtime scheduler. It is unlikely you will want to change the realtime values unless you want fine-grained control over realtime processes.