My Service Can’t Get Realtime!

So, you have a service that requires real-time scheduling. When you run this service on your systemd system it is unable to acquire real-time scheduling, even though it runs as full root and has all possible privileges. Now you are wondering what is going on and what you can do about it.

What is Going On?

By default systemd places all system services into their own control groups in the “cpu” hierarchy. This has the benefit that a service with many worker threads or processes (think: Apache with all its gazillion CGIs and stuff) gets roughly the same amount of CPU as a service with very few worker threads (think: MySQL). Instead of evening out CPU per process, this evens out CPU per service.

Now, the “cpu” cgroup controller of the Linux kernel has one major shortcoming: every newly created cgroup needs an explicit, absolute RT time budget assigned; otherwise RT is not available to any process in the group, and an attempt to acquire it will fail with EPERM. systemd does not assign any RT time budgets to the “cpu” cgroups it creates, simply because there is no feasible way to do that: the budget needs to be specified in absolute time units and comes from a fixed pool. In other words: we’d love to assign a budget, but there are no sane values we could use. Thus, in systemd’s default configuration RT scheduling is simply not available to any system service.

Working Around the Issue

Of course, that’s quite a limitation, so here’s how you work around this:

Note that this all only applies to services. By default, user applications run in the root cgroup of the “cpu” hierarchy, which avoids these problems for normal user applications.

In the long run we hope the kernel will be fixed so that it no longer requires an RT budget to be assigned to every newly created cgroup before a process in it can acquire RT (i.e. a process’s RT budget should be derived from the nearest ancestor cgroup which has a budget assigned, rather than unconditionally from its own uninitialized budget). Ideally, we’d also like to create a per-user cgroup by default, so that users with many processes get roughly the same amount of CPU as users with very few.