September 05, 2023

A tale of two plugins: safely extending the Kubernetes Scheduler with WebAssembly



  1. 1 A tale of two plugins: safely extending the Kubernetes Scheduler with WebAssembly

    Kensei Nakada (@sanposhiho)
  2. 3 Hello! Kia ora! こんにちは!

    Kensei Nakada (sanposhiho)
    • Platform Engineer
    • Kubernetes SIG-Scheduling approver
    • Kubernetes contributor award 2022 winner
  4. 5 Agenda

    01 The scheduler extensions
    02 The wasm extension on the scheduler
    03 The wasm extension deep-dive
    04 Project status / What’s next
  5. 6 Pod・Node

    • Pod: a group of containers, which is the smallest execution unit in Kubernetes.
    • Node: a virtual or physical machine where Pods run.
  6. 7 Kubernetes scheduler

    The component that literally schedules each Pod to a Node. It checks many factors (resources, affinity, volumes, etc.) and decides the best Node for the Pod.
  7. 9 Control your scheduling

    • Built-in scheduling constraints
      ◦ The scheduling constraints on Pod spec: control the scheduling per Pod.
      ◦ KubeSchedulerConfiguration: control the scheduling per cluster.
    • Extend the scheduler
      ◦ Extender: via webhook.
      ◦ Plugin: via your own scheduler plugin.
  11. 16 Control your scheduling

    • Built-in scheduling constraints
      ◦ The scheduling constraints on Pod spec: control the scheduling per Pod.
      ◦ KubeSchedulerConfiguration: control the scheduling per cluster.
    • Extend the scheduler
      ◦ Extender: via webhook.
      ◦ Plugin: via your own scheduler plugin.
      ◦ WebAssembly: via WebAssembly plugin.
  12. 17 Extend your scheduler

    Unique use cases sometimes require extending your scheduler.
    • Requirements of batch jobs:
      ◦ Start several Pods at the same time (coscheduling)
      ◦ Elastic resource quota (capacityscheduling)
    • You’ll find many other use cases in kubernetes-sigs/scheduler-plugins.
  14. 19 Extender

    A webhook-based extension of the scheduler. Each webhook is called at a specific point in scheduling:
    • Filter: Filter in the Scheduling Framework
    • Prioritize: Score in the Scheduling Framework
    • Preempt: PostFilter in the Scheduling Framework
    • Bind: Bind in the Scheduling Framework
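To make the webhook idea concrete, here is a minimal sketch of a Filter-style endpoint. The payload types (`filterArgs`, `filterResult`), the `/filter` path, and the "blocked Node" rule are all assumptions made up for illustration; they are simplified stand-ins, not the real extender API types. The request is sent in-process so the example runs without a cluster.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// filterArgs and filterResult are simplified, hypothetical payloads for this
// sketch; they are not the real extender API types.
type filterArgs struct {
	PodName   string   `json:"podName"`
	NodeNames []string `json:"nodeNames"`
}

type filterResult struct {
	NodeNames   []string          `json:"nodeNames"`
	FailedNodes map[string]string `json:"failedNodes"`
}

// filterNodes drops every Node listed in blocked (a made-up rule) and records
// why each dropped Node failed.
func filterNodes(args filterArgs, blocked map[string]bool) filterResult {
	res := filterResult{NodeNames: []string{}, FailedNodes: map[string]string{}}
	for _, name := range args.NodeNames {
		if blocked[name] {
			res.FailedNodes[name] = "blocked by the example extender"
			continue
		}
		res.NodeNames = append(res.NodeNames, name)
	}
	return res
}

// filterHandler exposes filterNodes as a webhook endpoint.
func filterHandler(blocked map[string]bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var args filterArgs
		if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(filterNodes(args, blocked)); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	}
}

func main() {
	// Play the scheduler's role: send one Filter request to the handler in-process.
	body, err := json.Marshal(filterArgs{PodName: "demo", NodeNames: []string{"node-a", "node-b"}})
	if err != nil {
		panic(err)
	}
	req := httptest.NewRequest(http.MethodPost, "/filter", bytes.NewReader(body))
	rec := httptest.NewRecorder()
	filterHandler(map[string]bool{"node-b": true}).ServeHTTP(rec, req)

	var res filterResult
	if err := json.NewDecoder(rec.Body).Decode(&res); err != nil {
		panic(err)
	}
	fmt.Println("feasible:", res.NodeNames, "failed:", res.FailedNodes)
}
```

Every scheduling decision pays for an encode, a request, and a decode like the ones above, which is where the latency cost of an extender comes from.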
  15. 20 Extender

    • 👍 No need to rebuild the scheduler. (Just pass the URL via config.)
    • 👍 Flexibility of implementation.
    • 👎 It badly affects the scheduling latency.
    • (👎 See this for more disadvantages.)
  16. 23 Plugin (Scheduling framework)

    Scheduling Framework: the pluggable architecture of the scheduler.
    • Decouples all scheduling logic from the scheduler’s core implementation.
    • One scheduling factor = one plugin.
    • You can extend the scheduler by creating your own plugins.
  18. 29 Plugin (Scheduling framework)

    • 👍 More extension points are available.
    • 👍 No overhead to call plugins.
    • 👎 Cannot be used casually. (It requires rebuilding the scheduler, etc.)
  21. 32 Hurdles for Plugin extension

    • Maintenance cost
      ◦ You need to fork the scheduler and keep it consistent with your Kubernetes version.
    • There should be only one scheduler in the cluster.
      ◦ You need to let all Pods go through the new scheduler in some way.
      ◦ You may need to convince the people managing the scheduler in your cluster (your infra team, cloud vendor, etc.).
      ◦ You may need to maintain multiple scheduling plugins owned by different teams in the scheduler.

    Can we call it a true pluggable system? 🤔
  22. 34 WebAssembly

    WebAssembly is a way to safely run code compiled from other languages. You may have heard of it in the context of browsers, but…
    • Wasm runtimes execute wasm guests (xxxx.wasm).
    • Wasm guests import functions from the host.
      ◦ = They cannot do anything else.
  23. 37 Wasm plugin: how it works

    The wasm plugin is implemented to follow the Scheduling Framework, and it hosts a wasm runtime inside it!
  24. 38 TinyGo SDK

    It’s very tough for people unfamiliar with wasm to write their own wasm guest that satisfies the ABIs. We’re providing a TinyGo SDK so that people can develop wasm plugins with an experience similar to Golang native plugins: you just need to implement interfaces.
  26. 42 Add your wasm plugin

    The wasm plugins are portable, easy to distribute from the communities, and easy to apply to your scheduler.

    Wow, it can use http(s)!
  27. 43 It's a balanced solution between the extender and the Golang plugin.

    • 👍 All extension points are available.
    • 👍 No need to change the scheduler’s code, no need to rebuild!
    • 👍 Easy to distribute plugins. (via http(s))
    • 👍 It can be written in many languages.
    • 👎 A bad impact on the latency.
    • 👎 Limitations peculiar to wasm.

    So… will wasm extension replace all plugins?
  29. 48 So… will wasm extension replace all plugins?

    A Golang plugin can still be the best if…
    • The scheduler’s latency is super critical in your cluster.
      ◦ The bigger your cluster gets, the faster scheduling needs to be.
    • You need to do heavy calculation or handle tons of various objects.
      ◦ This is due to inlined GC and the latency of passing objects from host to guest. (Both will be discussed in a later section.)
  31. 52 ABI

    The contract of how the host and the guest communicate. Just like an API, but with B instead of P (Application Binary Interface).
  32. 53 The wasm function can only operate on its own memory

    WebAssembly’s sandbox model and its limitations:
    • The guest can only operate on its own memory.
    • The guest memory is exported to the host so that the host can read or write anything.
    • Only numeric types are supported.
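Because only numbers cross the boundary, anything richer has to be described by numbers. The sketch below shows one common convention, packing a linear-memory offset and a length into a single 64-bit value. It is an illustration only and not necessarily the encoding this project uses; the byte slice stands in for the guest's linear memory, and all function names are hypothetical.

```go
package main

import "fmt"

// packPtrLen packs a 32-bit memory offset and a 32-bit length into one uint64,
// so that a "string" can cross the boundary as a single numeric value.
func packPtrLen(ptr, size uint32) uint64 {
	return uint64(ptr)<<32 | uint64(size)
}

// unpackPtrLen reverses packPtrLen.
func unpackPtrLen(v uint64) (ptr, size uint32) {
	return uint32(v >> 32), uint32(v)
}

// readString reads the described bytes from mem, a byte slice standing in for
// the guest's exported linear memory. It reports false if the range is invalid.
func readString(mem []byte, v uint64) (string, bool) {
	ptr, size := unpackPtrLen(v)
	end := uint64(ptr) + uint64(size)
	if end > uint64(len(mem)) {
		return "", false
	}
	return string(mem[ptr:end]), true
}

func main() {
	mem := make([]byte, 64)
	copy(mem[8:], "node-a") // the guest writes the string into its memory

	v := packPtrLen(8, 6) // only this number is handed to the other side
	s, ok := readString(mem, v)
	fmt.Println(s, ok)
}
```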
  33. 54 How to pass things from host

    Example ABI to get the URI from the host (http-wasm.io):
    • Guest: allocates enough memory for the URI and gives the host the linear memory offset and the maximum length in bytes.
    • Host: puts the URI there and tells the guest its length.
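The exchange above can be simulated in plain Go. This is a sketch under stated assumptions: the `host` type, `getURI`, and `guestReadURI` are hypothetical names, a byte slice stands in for the guest's linear memory, and the choice to write nothing when the buffer is too small is an assumption of this sketch.

```go
package main

import "fmt"

// host simulates the host side. memory stands in for the guest's exported
// linear memory, and uri is the value the host wants to hand over.
type host struct {
	memory []byte
	uri    string
}

// getURI writes the URI at offset when it fits into limit bytes and always
// returns the URI length, so the guest can detect a too-small buffer.
func (h *host) getURI(offset, limit uint32) uint32 {
	n := uint32(len(h.uri))
	if n > limit || uint64(offset)+uint64(n) > uint64(len(h.memory)) {
		return n // nothing is written in this sketch
	}
	copy(h.memory[offset:], h.uri)
	return n
}

// guestReadURI plays the guest: it offers a buffer (offset, limit) and reads
// back exactly the number of bytes the host reported.
func guestReadURI(h *host, offset, limit uint32) (string, bool) {
	if uint64(offset)+uint64(limit) > uint64(len(h.memory)) {
		return "", false // the offered buffer is outside linear memory
	}
	n := h.getURI(offset, limit)
	if n > limit {
		return "", false // buffer too small; a real guest could retry with n bytes
	}
	return string(h.memory[offset : offset+n]), true
}

func main() {
	h := &host{memory: make([]byte, 64), uri: "/api/v1/pods"}

	uri, ok := guestReadURI(h, 16, 32) // large enough buffer
	fmt.Println(uri, ok)

	_, ok = guestReadURI(h, 16, 4) // too small: the guest learns the needed length
	fmt.Println(ok)
}
```

Returning the length even on failure lets the guest retry with a buffer of the right size instead of guessing.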
  34. 56 Protobuf encoding

    One function fetches a protobuf-encoded Pod, Node, etc. To address the performance concern, we do:
    • Lazy loading
    • Cache
    • Faster garbage collection
    • (Other enhancements planned for the future)
  35. 57 Lazy loading

    Only get objects from the host when actually accessing them. In the slide's example, the Pod is fetched, but the NodeList isn’t.
  36. 58 Lazy loading with caching

    The scheduler refers to the same resource status throughout one scheduling cycle. → Why don’t we have a cache in the wasm guest? With lazy loading and caching, we reduce the number of host-guest communications as much as possible:
    • Fetch objects from the host only when the guest needs them.
    • Fetch the same object from the host only once during one scheduling cycle.
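The two rules above can be sketched as a small per-cycle cache. Everything here is hypothetical: `hostAPI` stands in for whatever host functions the guest imports, and the call counters exist only to make the number of host round trips visible.

```go
package main

import "fmt"

// hostAPI is a hypothetical stand-in for the host functions a guest imports.
// The counters make the number of host calls visible.
type hostAPI struct {
	podCalls      int
	nodeListCalls int
}

func (h *hostAPI) fetchPod() string {
	h.podCalls++
	return "default/demo-pod"
}

func (h *hostAPI) fetchNodeList() []string {
	h.nodeListCalls++
	return []string{"node-a", "node-b"}
}

// cycleCache holds the objects fetched during one scheduling cycle.
type cycleCache struct {
	host        *hostAPI
	pod         string
	podLoaded   bool
	nodes       []string
	nodesLoaded bool
}

// Pod fetches the Pod from the host on first access and reuses it afterwards.
func (c *cycleCache) Pod() string {
	if !c.podLoaded {
		c.pod, c.podLoaded = c.host.fetchPod(), true
	}
	return c.pod
}

// NodeList is fetched lazily in the same way.
func (c *cycleCache) NodeList() []string {
	if !c.nodesLoaded {
		c.nodes, c.nodesLoaded = c.host.fetchNodeList(), true
	}
	return c.nodes
}

// Reset drops the cached objects; it would be called when a new cycle starts.
func (c *cycleCache) Reset() {
	*c = cycleCache{host: c.host}
}

func main() {
	h := &hostAPI{}
	cache := &cycleCache{host: h}

	cache.Pod()
	cache.Pod() // served from the cache, no second host call
	fmt.Println("pod calls:", h.podCalls, "node list calls:", h.nodeListCalls)

	cache.Reset() // next scheduling cycle
	cache.Pod()
	fmt.Println("pod calls:", h.podCalls, "node list calls:", h.nodeListCalls)
}
```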
  37. 59 Garbage collection overhead

    • Wasm has only one thread, and GC is inlined.
    • Inlined GC overhead was over half the latency of a plugin execution 😢
  38. 60 nottinygc

    wasilibs/nottinygc requires some flags, but performs much better. We saw around a 50% latency reduction in some scenarios.
  39. 61 Benchmark tests

    Performance is the critical factor for the project. We have two levels of benchmark tests:
    • Plugin-level benchmark tests, using Golang’s benchmark test feature.
      ◦ How much time is spent in which part.
    • scheduler_perf, running the wasm plugin in the scheduler and observing the scheduler’s metrics.
      ◦ How much the wasm plugin actually slows down the scheduling.
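As a rough idea of what "how much time is spent in which part" can look like at the plugin level, the sketch below times two made-up phases separately with `testing.Benchmark` from the standard library. The `nodeInfo` type, the JSON payload, and the scoring rule are assumptions for illustration and are not taken from the project's benchmarks.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// nodeInfo is a hypothetical, simplified object passed from host to guest.
type nodeInfo struct {
	Name    string `json:"name"`
	FreeCPU int64  `json:"freeCPU"`
}

// decodeNode stands in for the "pass the object to the guest" part.
func decodeNode(payload []byte) (nodeInfo, error) {
	var n nodeInfo
	err := json.Unmarshal(payload, &n)
	return n, err
}

// scoreNode stands in for the plugin logic: the more CPU is left after placing
// the Pod, the higher the score (0 to 100).
func scoreNode(n nodeInfo, requestCPU int64) int64 {
	if n.FreeCPU <= 0 || requestCPU > n.FreeCPU {
		return 0
	}
	return (n.FreeCPU - requestCPU) * 100 / n.FreeCPU
}

func main() {
	payload := []byte(`{"name":"node-a","freeCPU":4000}`)
	node, err := decodeNode(payload)
	if err != nil {
		panic(err)
	}

	// Benchmark each part on its own to see where the time goes.
	decode := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if _, err := decodeNode(payload); err != nil {
				b.Fatal(err)
			}
		}
	})
	var sink int64 // keeps the scoring result alive
	score := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink += scoreNode(node, 1000)
		}
	})
	_ = sink

	fmt.Printf("decode: %d ns/op\n", decode.NsPerOp())
	fmt.Printf("score:  %d ns/op\n", score.NsPerOp())
}
```

The absolute numbers depend on the machine; the point is only that splitting the measurement per phase shows which part dominates.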
  40. 63 Project status

    It’s at an early stage, but it already has enough functionality to write a simple wasm plugin! See the examples if you’re interested!
  41. 64 What’s left to do

    • Support all extension points.
    • Get resources other than Pods and Nodes.
    • Further performance improvements.
    • Guest examples in other languages.
  42. 65 That’s all!

    • Join us in #sig-scheduling on Kubernetes Slack!
    • Shout out to all contributors so far, especially Adrian for tons of contributions, and to everyone who helped bring me here to Wellington, especially Chris.

    We’re running through many things in 30 min 🏃💨. Thanks, all!
  43. 68 Pod is created → started on Node

    The Pod is created and the scheduler notices it.
  44. 69 Pod is created → started on Node

    The scheduler decides where the Pod should go (= scheduling), for example Node A.
  45. 73 Scheduling factors

    The scheduler needs to consider many things:
    • Resource requests on the Pod.
    • Affinity requirements on the Pod. (PodAffinity, NodeAffinity)
    • How Pods are currently spread across each domain. (PodTopologySpread)
    • Taints on Nodes / Tolerations on the Pod.
    • …etc
  46. 74 Scheduling Framework

    The architecture inside the scheduler:
    • The scheduler is composed of many plugins.
      ◦ One scheduling factor = one plugin (NodeAffinity plugin, etc.)
    • Each plugin works at one or more extension points.
      ◦ Filter: filtering out Nodes that don’t fit the requirements
      ◦ Score: scoring the remaining Nodes
      ◦ …
  47. 78 Scheduling Framework - Filter

    Role: filtering out Nodes that shouldn’t or cannot run the Pod. For example:
    • Nodes that don’t have enough resources to run the Pod
    • Nodes that don’t match a required NodeAffinity on the Pod
    • …
  48. 80 Scheduling Framework - Score

    Role: scoring each Node that passed all Filter plugins. For example, give higher scores to Nodes that:
    • already have the container image(s) of the Pod
    • match a preferred NodeAffinity on the Pod
    • …
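The Filter and Score roles can be sketched as a tiny pipeline. This is a simplified illustration, not the real Scheduling Framework interfaces: the `Pod` and `Node` structs, the two interfaces, and both example plugins are hypothetical and carry only the fields needed here.

```go
package main

import "fmt"

// Pod and Node are heavily simplified, hypothetical stand-ins.
type Pod struct {
	Name       string
	RequestCPU int64
}

type Node struct {
	Name     string
	FreeCPU  int64
	HasImage bool
}

// FilterPlugin rejects Nodes that cannot run the Pod.
type FilterPlugin interface {
	Filter(pod Pod, node Node) bool
}

// ScorePlugin ranks the Nodes that passed every filter.
type ScorePlugin interface {
	Score(pod Pod, node Node) int64
}

// fitsResources filters out Nodes without enough free CPU.
type fitsResources struct{}

func (fitsResources) Filter(pod Pod, node Node) bool {
	return node.FreeCPU >= pod.RequestCPU
}

// imageLocality prefers Nodes that already have the container image.
type imageLocality struct{}

func (imageLocality) Score(_ Pod, node Node) int64 {
	if node.HasImage {
		return 100
	}
	return 0
}

// schedule runs all filters, sums the scores of the feasible Nodes, and returns
// the best one. On a tie, the first Node seen wins.
func schedule(pod Pod, nodes []Node, filters []FilterPlugin, scorers []ScorePlugin) (string, bool) {
	best, bestScore, found := "", int64(-1), false
	for _, node := range nodes {
		feasible := true
		for _, f := range filters {
			if !f.Filter(pod, node) {
				feasible = false
				break
			}
		}
		if !feasible {
			continue
		}
		var total int64
		for _, s := range scorers {
			total += s.Score(pod, node)
		}
		if total > bestScore {
			best, bestScore, found = node.Name, total, true
		}
	}
	return best, found
}

func main() {
	nodes := []Node{
		{Name: "node-a", FreeCPU: 500, HasImage: true},
		{Name: "node-b", FreeCPU: 2000, HasImage: false},
		{Name: "node-c", FreeCPU: 2000, HasImage: true},
	}
	pod := Pod{Name: "demo", RequestCPU: 1000}
	name, ok := schedule(pod, nodes, []FilterPlugin{fitsResources{}}, []ScorePlugin{imageLocality{}})
	fmt.Println(name, ok)
}
```

Each scheduling factor lives in its own small type, which mirrors the "one scheduling factor = one plugin" idea.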
  49. 82 KubeSchedulerConfiguration

    The cluster-level scheduling configuration:
    • Enable/disable plugins in the scheduler
    • Change plugins’ behavior for all scheduling
  50. 85 Several attempts to make it easier

    We explored the Golang standard package named plugin, but gave up. You can see our investigation and discussion here:
    • https://github.com/kubernetes/kubernetes/issues/106705
    • https://github.com/kubernetes/kubernetes/issues/100723
  51. 86 Powered by WebAssembly with Golang

    People have started to use Wasm to provide extensibility: load and run the wasm binary in the host. In Golang, there are already some examples:
    • Dapr Wasm middleware
    • Trivy Modules
    • knqyf263/go-plugin
  52. 87 Why not normal Go GOOS=wasip1 GOARCH=wasm?

    • Lack of exported functions
    • Performance problems

    See this for more information.