since cgroup v2.
⚫If the parent cgroup disables a controller, that controller cannot be enabled in the descendant cgroups.
• The cgroup.subtree_control file controls which controllers are enabled in the descendant cgroups.
[Figure: cgroup tree under /sys/fs/cgroup with /cgtest1 and /cgtest2, each containing cpu.* and memory.* files]
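The top-down rule above can be modeled with bitmasks. This is a minimal user-space sketch, not kernel code: the controller bit values and the function names `child_available` and `enable_in_subtree` are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical controller bits for this sketch; these are not the kernel's IDs. */
enum {
    CTRL_CPU    = 1u << 0,
    CTRL_MEMORY = 1u << 1,
    CTRL_IO     = 1u << 2
};

/* A child cgroup can only use what its parent enabled in cgroup.subtree_control. */
unsigned child_available(unsigned parent_subtree_control)
{
    return parent_subtree_control;
}

/*
 * Model of enabling controllers in a cgroup's own subtree_control.
 * The request is rejected if it names a controller the cgroup does not have.
 */
bool enable_in_subtree(unsigned available, unsigned *subtree_control, unsigned want)
{
    if (want & ~available)
        return false;          /* the parent did not hand this controller down */
    *subtree_control |= want;
    return true;
}
```

With a parent that enables only cpu and memory, a child can pass cpu further down, but a request for io is refused and leaves the child's mask unchanged.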
But some controllers require privileges because they use eBPF.
⚫cgroup-aware OOM killer
⚫PSI per cgroup
⚫(NEW) utilization clamping support
• Accounts for the actual computational power assigned to task groups, considering the actual CPU frequency (which depends on the operation of schedutil) and asymmetric-capacity systems such as Arm's big.LITTLE.
https://github.com/torvalds/linux/commit/2480c093130f64ac3a410504fa8b3db1fc4b87ce
• Each cgroup has a “cgroup.controllers” file, which lists all controllers available for the cgroup to enable.
⚫But some controllers are not listed in that file.
⚫Why?
https://www.kernel.org/doc/html/v5.9/admin-guide/cgroup-v2.html
the default hierarchy.
• cgrp_dfl_inhibit_ss_mask
⚫Some controllers are implicitly enabled on the default hierarchy.
• cgrp_dfl_implicit_ss_mask
⚫When the system boots, the kernel sets up the above masks.

    if (ss->implicit_on_dfl)
        cgrp_dfl_implicit_ss_mask |= 1 << ss->id;
    else if (!ss->dfl_cftypes)
        cgrp_dfl_inhibit_ss_mask |= 1 << ss->id;

https://github.com/torvalds/linux/blob/v5.9/kernel/cgroup/cgroup.c#L5740-L5743
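The mask setup can be replayed in user space to see which controllers end up hidden. The sketch below is an assumption-laden model: `struct mock_subsys`, `struct dfl_masks`, `compute_dfl_masks`, and `visible_mask` are hypothetical names, and `visible_mask` reflects my reading of how the root cgroup filters both masks out of its controller list, not a quoted kernel function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct cgroup_subsys (hypothetical). */
struct mock_subsys {
    const char *name;
    int id;
    bool implicit_on_dfl;   /* models ss->implicit_on_dfl */
    bool has_dfl_cftypes;   /* models ss->dfl_cftypes != NULL */
};

struct dfl_masks {
    uint16_t implicit;      /* models cgrp_dfl_implicit_ss_mask */
    uint16_t inhibit;       /* models cgrp_dfl_inhibit_ss_mask */
};

/* Same branch structure as the boot-time snippet, applied to mock data. */
struct dfl_masks compute_dfl_masks(const struct mock_subsys *ss, size_t n)
{
    struct dfl_masks m = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (ss[i].implicit_on_dfl)
            m.implicit |= (uint16_t)(1u << ss[i].id);
        else if (!ss[i].has_dfl_cftypes)
            m.inhibit |= (uint16_t)(1u << ss[i].id);
    }
    return m;
}

/* Assumed model: controllers in either mask are not offered for enabling. */
uint16_t visible_mask(uint16_t all_subsys, struct dfl_masks m)
{
    return (uint16_t)(all_subsys & ~(m.implicit | m.inhibit));
}
```

For example, feeding in a controller with default-hierarchy files, one marked implicit, and one with neither leaves only the first in the visible mask.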
access to devices with each cgroup.
⚫There are three files to control behavior.
• devices.allow adds entries to the allowlist of devices.
• devices.deny removes entries from the allowlist.
• devices.list shows the devices currently allowed.
⚫Interface
• Ex. Allow cgroup 1 to read and mknod /dev/null as below:

    # echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

https://www.kernel.org/doc/html/v5.9/admin-guide/cgroup-v1/devices.html
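Each rule written to devices.allow or devices.deny has the form "type major:minor access", where type is a, b, or c and access is a combination of r, w, and m. The following is a small sketch of building such a rule string; `format_device_rule` is a hypothetical helper, and it only formats the text without touching any cgroup file.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a device rule such as "c 1:3 mr" into buf.
 * Returns 0 on success, -1 on an invalid type/access string or a short buffer.
 */
int format_device_rule(char *buf, size_t len, char type,
                       unsigned dev_major, unsigned dev_minor,
                       const char *access)
{
    /* type must be one of: a (all), b (block), c (char) */
    if (type == '\0' || strchr("abc", type) == NULL)
        return -1;

    /* access must be a non-empty combination of r, w, m */
    if (access == NULL || access[0] == '\0')
        return -1;
    for (const char *p = access; *p != '\0'; p++)
        if (strchr("rwm", *p) == NULL)
            return -1;

    int n = snprintf(buf, len, "%c %u:%u %s",
                     type, dev_major, dev_minor, access);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling it with type 'c', major 1, minor 3, and access "mr" yields the same string as the echo example on the slide.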
some reasons.
• Ex. To control network access
⚫The eBPF program is attached to a specific cgroup.
⚫BPF_PROG_TYPE_CGROUP_DEVICE was introduced in kernel version 4.15.
program to a cgroup
• https://github.com/torvalds/linux/blob/v5.9/kernel/bpf/syscall.c#L4187
→ bpf_prog_attach at kernel/bpf/syscall.c#L2839
→ cgroup_bpf_prog_attach at kernel/bpf/cgroup.c#L762
→ cgroup_bpf_attach at kernel/cgroup/cgroup.c#L6496
→ __cgroup_bpf_attach at kernel/bpf/cgroup.c#L433
the link to a cgroup, and
 *                         propagate the change to descendants
 * @cgrp: The cgroup which descendants to traverse
 * @prog: A program to attach
 * @link: A link to attach
 * @replace_prog: Previously attached program to replace if BPF_F_REPLACE is set
 * @type: Type of attach operation
 * @flags: Option flags
 *
 * Exactly one of @prog or @link can be non-null.
 * Must be called with cgroup_mutex held.
 */
int __cgroup_bpf_attach(struct cgroup *cgrp,
                        struct bpf_prog *prog, struct bpf_prog *replace_prog,
                        struct bpf_cgroup_link *link,
                        enum bpf_attach_type type, u32 flags)
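From user space, this kernel path is typically reached through the bpf(2) system call with the BPF_PROG_ATTACH command. The sketch below shows how the attach request for a device program might be filled in; `fill_cgroup_device_attach` and `attach_device_prog` are hypothetical helper names, and the sketch assumes the caller already holds an open file descriptor for the cgroup directory and one for a loaded BPF_PROG_TYPE_CGROUP_DEVICE program.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the BPF_PROG_ATTACH request: which cgroup, which program, which hook. */
void fill_cgroup_device_attach(union bpf_attr *attr, int cgroup_fd,
                               int prog_fd, unsigned flags)
{
    memset(attr, 0, sizeof(*attr));
    attr->target_fd = (__u32)cgroup_fd;      /* fd of the cgroup directory */
    attr->attach_bpf_fd = (__u32)prog_fd;    /* fd of the loaded eBPF program */
    attr->attach_type = BPF_CGROUP_DEVICE;   /* device access hook */
    attr->attach_flags = flags;              /* e.g. 0 or BPF_F_ALLOW_MULTI */
}

/* Issue the raw system call; returns -1 with errno set on failure. */
long attach_device_prog(int cgroup_fd, int prog_fd, unsigned flags)
{
    union bpf_attr attr;

    fill_cgroup_device_attach(&attr, cgroup_fd, prog_fd, flags);
    return syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

The request is kept separate from the system call so the field layout can be inspected without privileges; the actual attach needs sufficient capabilities and valid descriptors.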
abstraction layer.
⚫OCI runtime spec
• This specification was originally designed for cgroup v1.
• But some container runtimes can convert a cgroup v1 configuration for use with v2.
⚫libcgroup is currently developing support for the cgroup v2 interfaces.
• https://github.com/libcgroup/libcgroup/issues/12
v2.
• Like uclamp, it targets not only cloud systems but also embedded systems.
⚫cgroup v2 changed the interfaces and the way resources are controlled.
• Some cgroup v1 controllers are not supported on the default hierarchy.
⚫eBPF is important for cgroup v2.