Kubernetes Controllers - are they loops or events?
Tim Hockin
February 20, 2021
Transcript
Kubernetes Controllers: Are they loops or events? Tim Hockin (@thockin)
v1
Background on “reconciliation”: https://speakerdeck.com/thockin/kubernetes-what-is-reconciliation
Background on “edge vs. level”: https://speakerdeck.com/thockin/edge-vs-level-triggered-logic
Usually when we talk about controllers, we refer to them as a “loop”.
Imagine a controller for Pods (aka the kubelet). It has two jobs: 1) actuate the pod API, and 2) report status on pods.
What you’d expect looks something like:
[Diagram: a Node running kubelet with pods a, b, c, talking to the Kubernetes API]
1. kubelet → API: Get all pods
2. API → kubelet: { name: a, ... } { name: b, ... } { name: c, ... }
3. kubelet: for each pod p { if p is running { verify p config } else { start p }; gather status }
4. kubelet → API: Set status
...then repeat (aka “a poll loop”)
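In Go, that naive poll loop might look something like the sketch below. Pod, podAPI, and podRuntime are hypothetical stand-ins for the real API client and the node’s container runtime:

```go
package controller

import "time"

// Pod is a trimmed-down, hypothetical stand-in for the real Pod object.
type Pod struct {
	Name   string
	Status string
}

// podAPI and podRuntime are hypothetical interfaces standing in for the
// Kubernetes API client and the node's container runtime.
type podAPI interface {
	GetAllPods() ([]Pod, error)
	SetStatus(p Pod) error
}

type podRuntime interface {
	IsRunning(name string) bool
	Verify(p Pod) error
	Start(p Pod) error
	Status(name string) string
}

// pollLoop reads the full desired state, makes it so, and repeats.
func pollLoop(api podAPI, rt podRuntime, period time.Duration) {
	for {
		pods, err := api.GetAllPods()
		if err == nil {
			for _, p := range pods {
				if rt.IsRunning(p.Name) {
					rt.Verify(p) // running: verify its config
				} else {
					rt.Start(p) // not running: start it
				}
				p.Status = rt.Status(p.Name) // gather status
				api.SetStatus(p)             // report it back
			}
		}
		time.Sleep(period) // ...then repeat
	}
}
```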
Here’s where it matters
[Diagram: the same Node and pods; a user runs `kubectl delete pod b`, removing b from the API]
1. kubelet → API: Get all pods
2. API → kubelet: { name: a, ... } { name: c, ... }
3. kubelet: I have “b” but the API doesn’t - delete it!
4. kubelet → API: Set status (pods a and c)
This is correct level-triggered reconciliation: read desired state, make it so.
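Continuing the hypothetical sketch above, the “I have b but the API doesn’t” step is just a set difference between actual and desired state:

```go
// reconcileDeletes is the "I have it but the API doesn't" half of the
// loop. desired is what the API returned; running is what the node is
// actually running; deleteFn is a hypothetical runtime hook.
func reconcileDeletes(desired []Pod, running []string, deleteFn func(name string)) {
	want := make(map[string]bool, len(desired))
	for _, p := range desired {
		want[p.Name] = true
	}
	for _, name := range running {
		if !want[name] {
			deleteFn(name) // not in desired state - delete it
		}
	}
}
```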
Some controllers are implemented this way, but it’s inefficient at scale.
Imagine thousands of controllers (kubelet, kube-proxy, DNS, ingress, storage...) polling continuously.
We need to achieve the same behavior more efficiently
We could poll less often, but then it takes a long (and variable) time to react - not a great UX.
Enter the “list-watch” model
[Diagram: the same Node and pods, but now kubelet keeps a local cache]
1. kubelet → API: Get all pods
2. API → kubelet: { name: a, ... } { name: b, ... } { name: c, ... }
3. kubelet stores the result in its cache
4. kubelet → API: Watch all pods (leaving an open stream for changes)
5. kubelet: for each pod p in the cache { if p is running { verify p config } else { start p }; gather status }
6. kubelet → API: Set status
We trade memory (the cache) for other resources (API server CPU in particular).
There’s no point in polling my own cache, so what happens next?
Remember that watch we did earlier? That’s an open stream for events.
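In practice, controllers rarely hand-roll this; client-go’s shared informers do the initial list, keep the cache in sync from the watch stream, and deliver events to handlers. A rough sketch (error handling and the DeleteFunc tombstone case omitted):

```go
package controller

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchPods sets up a shared informer: one LIST, then a WATCH that keeps
// the local cache in sync and fires the event handlers.
func watchPods(clientset kubernetes.Interface, stopCh <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			// a pod appeared in the cache - kick off reconciliation
		},
		DeleteFunc: func(obj interface{}) {
			if pod, ok := obj.(*corev1.Pod); ok {
				_ = pod // the API said to delete this pod - reconcile
			}
		},
	})

	factory.Start(stopCh)
	cache.WaitForCacheSync(stopCh, podInformer.HasSynced)
}
```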
[Diagram: the same Node, kubelet, and cache; a user runs `kubectl delete pod b`, removing b from the API]
1. API → kubelet (watch stream): Delete: { name: b, ... }
2. kubelet updates its cache: { name: a, ... } { name: c, ... }
3. kubelet: the API said to delete pod “b” - delete it from the node
4. The node is left running pods a and c
“But you said edge-triggered is bad!”
It is! But this isn’t edge-triggered.
The cache is updated by events (edges), but we are still reconciling state.
“???”
The controller can be restarted at any time and the cache will be reconstructed - we can’t “miss an edge*” (* modulo bugs, read on).
Even if you miss an event, you can still recover the state.
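One common way to keep this level-triggered is to have event handlers only mark which objects need attention, while a worker reconciles against the current cached state. A minimal hand-rolled sketch of that idea (real controllers typically use client-go’s workqueue):

```go
package controller

import (
	"sync"
	"time"
)

// dirtySet records which pods *might* have changed. Event handlers only
// mark names; they never act on the event payload itself.
type dirtySet struct {
	mu    sync.Mutex
	names map[string]bool
}

func (d *dirtySet) mark(name string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.names == nil {
		d.names = map[string]bool{}
	}
	d.names[name] = true
}

// drain returns everything marked since the last drain.
func (d *dirtySet) drain() []string {
	d.mu.Lock()
	defer d.mu.Unlock()
	out := make([]string, 0, len(d.names))
	for n := range d.names {
		out = append(out, n)
	}
	d.names = map[string]bool{}
	return out
}

// worker reconciles each dirty pod against the *current* cached state,
// so a burst of edges collapses into one level-triggered pass.
func worker(d *dirtySet, reconcile func(name string)) {
	for {
		for _, name := range d.drain() {
			reconcile(name)
		}
		time.Sleep(100 * time.Millisecond) // hypothetical idle interval
	}
}
```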
Ultimately it’s all just software, and software has bugs. Controllers should re-list periodically to get full state...
...but we’ve put a lot of energy into making sure that our list-watch is reliable.
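For illustration, a bare-bones version of that “watch, but periodically re-list” loop against the raw client might look like this; reconcileAll and handleEvent are hypothetical hooks, and informers hide this logic (with much more care) inside their Reflector:

```go
package controller

import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// listWatchLoop LISTs full state, then WATCHes for changes, and returns
// to a full LIST whenever the stream breaks or the re-list timer fires.
func listWatchLoop(
	ctx context.Context,
	clientset kubernetes.Interface,
	reconcileAll func(pods []corev1.Pod), // hypothetical: act on full state
	handleEvent func(ev watch.Event), // hypothetical: update cache, reconcile
) {
	for {
		// A full LIST recovers complete state even if events were missed.
		list, err := clientset.CoreV1().Pods("").List(ctx, metav1.ListOptions{})
		if err != nil {
			time.Sleep(time.Second)
			continue
		}
		reconcileAll(list.Items)

		// WATCH from where the LIST left off.
		w, err := clientset.CoreV1().Pods("").Watch(ctx, metav1.ListOptions{
			ResourceVersion: list.ResourceVersion,
		})
		if err != nil {
			continue
		}
		relist := time.NewTimer(10 * time.Minute) // hypothetical re-list period
	stream:
		for {
			select {
			case ev, ok := <-w.ResultChan():
				if !ok {
					break stream // stream closed: go re-list
				}
				handleEvent(ev)
			case <-relist.C:
				break stream // timer fired: go re-list
			}
		}
		w.Stop()
		relist.Stop()
	}
}
```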