embedded container environment.
⚫There are many things to think about when RT processes run in a container environment.
• Integrate tools for RT into Kubernetes: today's topic
• Security: https://blogs.oracle.com/linux/dealing-with-realtime-processes-in-linux-user-namespaces
[Figure: CPU isolation inside a container. Labels: Linux, dockerd, RT process, Non-RT process, kernel thread, Interrupt, Core 0 to Core 3.]
is not about the lowest possible latency or the maximum possible throughput. Real-time means deterministic execution time: tasks complete within a certain time and are not affected by any external process.
⚫Making a process real-time involves many tasks, and containerization makes it more difficult.
⚫Today, I only introduce the issues of CPU shielding in a container environment.
https://www.redhat.com/en/blog/going-full-deterministic-using-real-time-openstack
where on a multiprocessor system or on a CPU with multiple cores, real-time tasks can run on one CPU or core while non-real-time tasks run on another.
⚫Use cases
• Isolating RT processes
• Thermal throttling
– This use case only uses cpuset.
– Reduce power consumption by pinning background threads that are not performance-critical to LITTLE CPUs.
• NFV (Network Functions Virtualization)
– Improve NFV performance and prevent spurious packet loss.
https://en.wikipedia.org/wiki/CPU_shielding
launch RT processes
⚫Kernel threads
• Move kernel threads off the isolated core
– Ex. Use cset. The “isolcpus” kernel boot option cannot isolate kernel threads. Since kernel version 5.9, “nohz_full” supports it, except for kernel threads bound to a specific CPU.
• Set dynamic tickless behaviour
– Ex. Set up the “nohz_full” kernel boot option
• Offload RCU callbacks from the isolated core
– Ex. Set up the “rcu_nocbs” kernel boot option
• Set CPU affinity for work queues
– Ex. Modify cpumasks in /sys/devices/virtual/workqueue
and so on…
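As a rough sketch of the boot options and the workqueue cpumask above, assuming the 4-core layout from the diagrams with core 0 isolated and cores 1-3 left for housekeeping. The helper names `rt_boot_args` and `cpulist_to_mask` are made up for illustration, and the mask conversion only handles CPU numbers below 32.

```shell
# Print the boot options named on this slide for the isolated CPU list $1.
rt_boot_args() {
    printf 'nohz_full=%s rcu_nocbs=%s\n' "$1" "$1"
}

# Convert a CPU list such as "1-3" or "0,2-3" into the hexadecimal mask
# format used by cpumask files. The subshell body "( ... )" keeps the
# variables and the IFS change local to the function.
cpulist_to_mask() (
    mask=0
    IFS=,
    for part in $1; do
        lo=${part%-*}   # "2-3" -> 2, "0" -> 0
        hi=${part#*-}   # "2-3" -> 3, "0" -> 0
        cpu=$lo
        while [ "$cpu" -le "$hi" ]; do
            mask=$((mask | (1 << cpu)))
            cpu=$((cpu + 1))
        done
    done
    printf '%x\n' "$mask"
)

# Possible use on the assumed layout (run the second line as root):
#   rt_boot_args 0      # append the output to the kernel command line
#   cpulist_to_mask 1-3 > /sys/devices/virtual/workqueue/cpumask
```

For cores 1-3, `cpulist_to_mask 1-3` prints `e`, which restricts unbound workqueues to the housekeeping cores.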
Ex. Modify files under /proc/irq
• Move interrupt handling from IRQ context to kernel threads
– Ex. Set up the “threadirqs” kernel boot option
and so on…
You should adjust the settings to your use case.
OK! What about CPU shielding inside a container?
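Before moving on to containers, the /proc/irq step above can be sketched as follows. This assumes the housekeeping cores are 1-3 and uses the per-IRQ `smp_affinity_list` files; the function name `move_irqs` and its optional second argument (an alternative root directory, handy for trying it on a copy) are hypothetical.

```shell
# Steer every IRQ found under $2 (default: /proc/irq) to the CPU list $1.
# The subshell body keeps the variables local to the function.
move_irqs() (
    cpus=$1
    irq_root=${2:-/proc/irq}
    for f in "$irq_root"/*/smp_affinity_list; do
        [ -e "$f" ] || continue
        # Some IRQs cannot be moved and the write is rejected; report and go on.
        { echo "$cpus" > "$f"; } 2>/dev/null ||
            echo "could not move ${f%/smp_affinity_list}" >&2
    done
)

# Possible use (as root), keeping interrupts off core 0:
#   move_irqs 1-3
```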
container
⚫cpuset alone is incomplete for CPU shielding.
• When the --cpuset-cpus argument is used, Docker can set CPU affinity.
• But it cannot keep anything other than user processes off those CPUs.
[Figure: Linux and dockerd on Core 0 to Core 3 with an RT process and non-RT processes. Labels: kernel thread, Interrupt, "Outside the scope of this presentation", "How to move??".]
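One way to see what --cpuset-cpus leaves behind is to list every task that is still allowed to run on the container's core. The sketch below reads `Cpus_allowed_list` from /proc/<pid>/status, where kernel threads show up next to user processes. The function names `cpu_in_list` and `tasks_allowed_on` are made up for illustration, and the optional second argument (an alternative proc root) is only there to make the function easy to try out.

```shell
# Succeed if CPU $1 is contained in CPU list $2 (e.g. "0-3" or "0,2-3").
cpu_in_list() (
    found=1   # exit status convention: 1 means "not found"
    IFS=,
    for part in $2; do
        lo=${part%-*}
        hi=${part#*-}
        if [ "$1" -ge "$lo" ] && [ "$1" -le "$hi" ]; then
            found=0
        fi
    done
    exit "$found"
)

# Print "PID NAME" for every task under $2 (default: /proc) whose allowed
# CPU list includes CPU $1.
tasks_allowed_on() (
    proc_root=${2:-/proc}
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        pid=${status%/status}
        pid=${pid##*/}
        name=$(awk '/^Name:/ {print $2}' "$status" 2>/dev/null)
        list=$(awk '/^Cpus_allowed_list:/ {print $2}' "$status" 2>/dev/null)
        [ -n "$list" ] || continue   # task exited while we were reading
        if cpu_in_list "$1" "$list"; then
            echo "$pid $name"
        fi
    done
)

# Possible use: what may still run on core 0?
#   tasks_allowed_on 0
```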
is a tool to manipulate cpusets.
• cset can isolate both user processes and kernel threads, except for kernel threads bound to a specific CPU.
⚫How to isolate
• cset creates the 'system' and 'user' directories at the root of the cpuset controller and operates on cpusets through them.
• The 'system' cpuset contains the CPUs used for unimportant tasks.
• The 'user' cpuset contains the CPUs used for important tasks.
– The 'user' cpuset is the shield.
https://github.com/lpechacek/cpuset/blob/v1.6/doc/tutorial.txt
the container on the isolated core by cset.
⚫What happened?
• cset creates the 'system' and 'user' directories.
• Docker launches the container with the --cpuset-cpus argument.
– Docker (runc) also creates its own cpuset directory (Ex. /sys/fs/cgroup/cpuset/docker) and tries to launch the container from it.
– But cset has already made its cpusets exclusive by default.
– # echo 1 > cpuset.cpu_exclusive
– So Docker fails to launch the container.
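The conflict can be checked before starting the container by looking at which cpusets next to Docker's are marked exclusive. A minimal sketch, assuming cgroup v1 with the cpuset controller mounted at /sys/fs/cgroup/cpuset as in the example path above; the function name `exclusive_cpusets` is hypothetical.

```shell
# Print "NAME CPUS" for every child cpuset of $1 (default:
# /sys/fs/cgroup/cpuset) that has cpuset.cpu_exclusive set to 1.
exclusive_cpusets() (
    root=${1:-/sys/fs/cgroup/cpuset}
    for dir in "$root"/*/; do
        [ -r "${dir}cpuset.cpu_exclusive" ] || continue
        if [ "$(cat "${dir}cpuset.cpu_exclusive")" = "1" ]; then
            printf '%s %s\n' "$(basename "$dir")" "$(cat "${dir}cpuset.cpus")"
        fi
    done
)

# Possible use after "cset shield" has run:
#   exclusive_cpusets
```

If the exclusive cpusets listed already cover the core passed to --cpuset-cpus, a sibling cpuset such as 'docker' cannot claim that core, which matches the failure described on this slide.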
[Figure: cpuset hierarchy under /sys/fs/cgroup/cpuset on Core 0 to Core 3. /user/cpuset.cpu_exclusive and /system/cpuset.cpu_exclusive are created by cset (exclusive, shielded by cset); /docker/cpuset.cpu_exclusive is created by Docker.]
you use cgroupfs driver
1. Create the isolated cpuset as 'docker'
# cset shield --userset=docker -c 0 -k on
2. Launch your Docker container
3. If needed, move processes to the non-isolated cpuset when you launch an unimportant container
⚫It is difficult to maintain cpuset…
• Using systemd driver
• Using KVM
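The three workaround steps could be scripted roughly as below. This is a sketch under several assumptions: the image name `rt-app:latest` and the PID are placeholders, the `run` wrapper and the function names are made up, and `cset proc --move --pid=... --toset=system` is assumed to be the way to move a process into the non-isolated 'system' cpuset (check `cset proc --help` on your version). Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: shield core $1 and name the shielded cpuset 'docker'.
shield_for_docker() {
    run cset shield --userset=docker -c "$1" -k on
}

# Step 2: launch image $2 pinned to the shielded core $1.
launch_rt_container() {
    run docker run -d --cpuset-cpus="$1" "$2"
}

# Step 3: move process $1 into the non-isolated 'system' cpuset.
move_to_system() {
    run cset proc --move --pid="$1" --toset=system
}

# Possible use (as root):
#   shield_for_docker 0
#   launch_rt_container 0 rt-app:latest
#   move_to_system 1234
```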
0
[Figure: cpuset hierarchy under /sys/fs/cgroup/cpuset on Core 0 to Core 3: /docker/cpuset.cpu_exclusive (exclusive, shielded by cset) and /system/cpuset.cpu_exclusive. Labels: Created by cset, Created by Docker.]
in Kubernetes?
⚫An explicitly reserved CPU list is supported since Kubernetes v1.17
• A new kubelet flag defines an explicit CPU set for OS system daemons and Kubernetes system daemons.
• This option is specifically designed for Telco/NFV.
• Moving system daemons, Kubernetes daemons and interrupts/timers is out of scope.
– In CentOS, you can do this using the tuned toolset.
https://v1-18.docs.kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list
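As an illustration of the reserved CPU list, the sketch below derives a value for the kubelet `--reserved-cpus` flag by reserving every core that is not isolated. It assumes the 4-core layout from the diagrams with core 0 kept for RT work; the function names `cpu_in_list` and `reserved_cpus_flag` are hypothetical, and how the flag reaches the kubelet (command line or configuration file) depends on your setup.

```shell
# Succeed if CPU $1 is contained in CPU list $2 (e.g. "0" or "0,2-3").
cpu_in_list() (
    found=1   # exit status convention: 1 means "not found"
    IFS=,
    for part in $2; do
        lo=${part%-*}
        hi=${part#*-}
        if [ "$1" -ge "$lo" ] && [ "$1" -le "$hi" ]; then
            found=0
        fi
    done
    exit "$found"
)

# Print a --reserved-cpus flag listing every CPU in 0..($1 - 1) that is
# not part of the isolated CPU list $2.
reserved_cpus_flag() (
    total=$1
    reserved=
    cpu=0
    while [ "$cpu" -lt "$total" ]; do
        if ! cpu_in_list "$cpu" "$2"; then
            reserved="${reserved:+$reserved,}$cpu"
        fi
        cpu=$((cpu + 1))
    done
    printf '%s\n' "--reserved-cpus=$reserved"
)

# Possible use: 4 cores, core 0 isolated for the RT container.
#   reserved_cpus_flag 4 0
```

As the slide notes, the flag only tells Kubernetes which cores belong to system daemons; moving the daemons and interrupts onto those cores still has to be done separately.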
⚫Keep the architecture simple to reduce maintenance costs
• Try new kernel features to reduce the number of tools needed
⚫Integrate tools for RT into Kubernetes
[Figure: kubelet on Core 0 to Core 3. Labels: kernel thread, Interrupt, Reserved CPU list, "isolcpus, nohz_full and so on", Inside Container, CPU isolation.]
• Processes
• Kernel threads
• Interrupt
⚫Containerization makes CPU shielding more difficult.
• Integrate tools for RT into Kubernetes
• Consider security
⚫Diversity is important for OSS.
• To get patches into mainline, we need to understand different use cases.