Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kubernetes Controllers - are they loops or events?
Search
Tim Hockin
February 20, 2021
Technology
11
4k
Kubernetes Controllers - are they loops or events?
Tim Hockin
February 20, 2021
Tweet
Share
More Decks by Tim Hockin
See All by Tim Hockin
Kubernetes in the 2nd Decade
thockin
0
410
Why Service is the worst API in Kubernetes, and what we can do about it
thockin
2
970
Kubernetes Pod Probes
thockin
6
4.6k
Go Workspaces for Kubernetes
thockin
2
1.1k
Code Review in Kubernetes
thockin
2
1.8k
Multi-cluster: past, present, future
thockin
0
530
Kubernetes Network Models (why is this so dang hard?)
thockin
9
1.9k
KubeCon EU 2020: SIG-Network Intro and Deep-Dive
thockin
8
1.3k
A Non-Technical Kubernetes Talk (KubeCon EU 2020)
thockin
3
620
Other Decks in Technology
See All in Technology
カンファレンスに託児サポートがあるということ / Having Childcare Support at Conferences
nobu09
1
580
Data Hubグループ 紹介資料
sansan33
PRO
0
2.2k
Performance Insights 廃止から Database Insights 利用へ/transition-from-performance-insights-to-database-insights
emiki
0
280
AI時代こそ求められる設計力- AWSクラウドデザインパターン3選で信頼性と拡張性を高める-
kenichirokimura
3
320
サイバーエージェント流クラウドコスト削減施策「みんなで金塊堀太郎」
kurochan
3
1.9k
やる気のない自分との向き合い方/How to Deal with Your Unmotivated Self
sanogemaru
0
510
"プロポーザルってなんか怖そう"という境界を超えてみた@TSUDOI by giftee Tech #1
shilo113
0
200
AWS IoT 超入門 2025
hattori
0
340
AWS Control Tower に学ぶ! IAM Identity Center 権限設計の第一歩 / IAM Identity Center with Control Tower
y___u
0
170
Oracle Base Database Service 技術詳細
oracle4engineer
PRO
12
80k
GoでもGUIアプリを作りたい!
kworkdev
PRO
0
150
ニッポンの人に知ってもらいたいGISスポット
sakaik
0
150
Featured
See All Featured
Designing Experiences People Love
moore
142
24k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
132
19k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Music & Morning Musume
bryan
46
6.8k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
45
2.5k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
33
2.5k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
Large-scale JavaScript Application Architecture
addyosmani
514
110k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
We Have a Design System, Now What?
morganepeng
53
7.8k
Build your cross-platform service in a week with App Engine
jlugia
232
18k
Transcript
Kubernetes Controllers Are they loops or events? Tim Hockin @thockin
v1
Background on “reconciliation”: https://speakerdeck.com/thockin/kubernetes-what-is-reconciliation
Background on “edge vs. level”: https://speakerdeck.com/thockin/edge-vs-level-triggered-logic
Usually when we talk about controllers we refer to them
as a “loop”
Imagine a controller for Pods (aka kubelet). It has 2
jobs: 1) Actuate the pod API 2) Report status on pods
What you’d expect looks something like:
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c for each pod
p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b
...then repeat (aka “a poll loop”)
Here’s where it matters
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b
Node Kubernetes API a kubelet c c a b kubectl
delete pod b
Node Kubernetes API a kubelet c Get all pods c
a b
Node Kubernetes API a kubelet c { name: a, ...
} { name: c, ... } c a b
Node Kubernetes API a kubelet c I have “b” but
API doesn’t - delete it! c a b
Node Kubernetes API a kubelet c Set status c a
This is correct level-triggered reconciliation Read desired state, make it
so
Some controllers are implemented this way, but it’s inefficient at
scale
Imagine thousands of controllers (kubelet, kube-proxy, dns, ingress, storage...) polling
continuously
We need to achieve the same behavior more efficiently
We could poll less often, but then it takes a
long (and variable) time to react - not a great UX
Enter the “list-watch” model
Node Kubernetes API a kubelet b c Get all pods
Node Kubernetes API a kubelet b c { name: a,
... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Watch all pods
Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet b c Cache: { name:
a, ... } { name: b, ... } { name: c, ... } for each pod p { if p is running { verify p config } else { start p } gather status }
Node Kubernetes API a kubelet b c Set status c
a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
We trade memory (the cache) for other resources (API server
CPU in particular)
There’s no point in polling my own cache, so what
happens next?
Remember that watch we did earlier? That’s an open stream
for events.
Node Kubernetes API a kubelet b c c a b
kubectl delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c c a b kubectl
delete pod b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: b, ... } { name: c, ... }
Node Kubernetes API a kubelet c Delete: { name: b,
... } c a b Cache: { name: a, ... } { name: c, ... }
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a b API said to delete pod “b”.
Node Kubernetes API a kubelet c Cache: { name: a,
... } { name: c, ... } c a API said to delete pod “b”.
“But you said edge-triggered is bad!”
It is! But this isn’t edge-triggered.
The cache is updated by events (edges) but we are
still reconciling state
“???”
The controller can be restarted at any time and the
cache will be reconstructed - we can’t “miss an edge*” * modulo bugs, read on
Even if you miss an event, you can still recover
the state
Ultimately it’s all just software, and software has bugs. Controllers
should re-list periodically to get full state...
...but we’ve put a lot of energy into making sure
that our list-watch is reliable.