A Tale of Two KEPs

#KubeCon #CloudNativeCon
A Tale of Two KEPs: How the Community is Taming Kubernetes’ CrashLoopBackOff
Yang Li, Google Cloud

How many of you have suffered from CrashLoopBackOff?

Ever wanted to fix Kubernetes, but didn't know how to write a KEP?

Voices from Issue #57291
• "Success" Exits (Exit Code 0):
  ◦ Don't punish a good container
• Early Recovery:
  ◦ Faster retries for transient errors
• Late Recovery:
  ◦ Give us a manual reset button

Workarounds
• Pod: Bash wrappers (while true; do app.py; done)
• Cluster: Custom "Pod Reaper" operators
• Node: Forking K8s to patch Kubelet binaries

Typical Modern Workloads
• Task Isolation: Fast in-place session resets without Pod rescheduling overhead.
• Fast Restart on Failure: Transient, recoverable errors causing massive cascade delays in AI/ML.
• Critical Sidecars: Infra kills (e.g., OOMKilled proxies) isolating perfectly healthy main apps.

Alternatives Considered
• Flat-rate restarts for Succeeded Pods
• RestartPolicy: Rapid
• …

Kubelet Overhead Analysis

Kubelet Overhead Analysis (cont.)
• 110 Crashing Pods
• 5 QPS at API Traffic Peak
• 2x Kubelet CPU

From v1.32 to v1.34: KEP-4603

KubeletConfiguration

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# container restart delays will start at 10s, increasing
# 2x each time they are restarted, to a maximum of 100s
crashLoopBackOff:
  maxContainerRestartPeriod: "100s"

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# delays between container restarts will always be 2s
crashLoopBackOff:
  maxContainerRestartPeriod: "2s"
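The two configurations above can be read as a capped exponential schedule. The following is a minimal Python sketch of that reading, not the Kubelet's actual implementation: the function name `backoff_schedule` and its parameters are hypothetical, the 10s start and 2x growth are taken from the config comments, and any backoff reset or jitter is ignored.

```python
def backoff_schedule(max_period_s, restarts, initial_s=10.0, factor=2.0):
    """Return the delay in seconds before each of the first `restarts` restarts.

    Assumption: the delay starts at `initial_s`, is multiplied by `factor`
    after every restart, and never exceeds `max_period_s`.
    """
    cap = float(max_period_s)
    # A cap below the initial delay wins, so every delay equals the cap.
    delay = min(float(initial_s), cap)
    delays = []
    for _ in range(restarts):
        delays.append(delay)
        delay = min(delay * factor, cap)
    return delays


print(backoff_schedule(100, 6))  # [10.0, 20.0, 40.0, 80.0, 100.0, 100.0]
print(backoff_schedule(2, 4))    # [2.0, 2.0, 2.0, 2.0]
```

With a 100s maximum, the cap is reached on the fifth restart. With a 2s maximum, the cap sits below the 10s starting delay, which is why the second config's comment says the delay is always 2s.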

v1.35 onwards: KEP-4603 and KEP-5593

Let’s do some demos

Demo Cluster Setup (v1.35)
• worker1
  ◦ 2s max backoff (KEP-5593 beta with configuration)
• worker2
  ◦ 60s max backoff (KEP-4603 alpha)
• worker3
  ◦ 300s max backoff (pre-v1.35 default)
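To get a rough feel for how the three workers differ, the sketch below sums the time a continuously crashing container would spend waiting across its first ten restarts. The max values come from the slide. The initial delays are assumptions that the slide does not state: 10s for worker1 and worker3, and 1s for worker2 under the KEP-4603 alpha behaviour. `total_wait` is a hypothetical helper, and the model ignores container run time and any backoff reset.

```python
def total_wait(initial_s, max_s, restarts):
    """Sum the capped, doubling backoff delays over `restarts` restarts."""
    cap = float(max_s)
    delay = min(float(initial_s), cap)  # a cap below the initial delay wins
    total = 0.0
    for _ in range(restarts):
        total += delay
        delay = min(delay * 2, cap)
    return total


# (initial delay, max backoff) in seconds; initial delays are assumptions.
workers = {
    "worker1": (10, 2),    # 2s max backoff
    "worker2": (1, 60),    # 60s max backoff
    "worker3": (10, 300),  # 300s max backoff
}

for name, (initial_s, max_s) in workers.items():
    waited = total_wait(initial_s, max_s, 10)
    print(f"{name}: {waited:.0f}s waiting over 10 restarts")
# worker1: 20s, worker2: 303s, worker3: 1810s
```

Under these assumptions, ten crashes cost about 20 seconds of waiting on worker1 and roughly half an hour on worker3.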

Scenario 1 - Task Isolation

Scenario 2 - Fast Restart

Scenario 3 - Sidecar Restart

Key Takeaways
• Container restarts are heavy; Kubelet has real physical limits
• KEP-4603 and KEP-5593 give cluster operators granular, safe control over recovery times
• Open source is a marathon; pragmatic splits unblock years of frustration

Shoutouts @lauralorenz @hankfreund SIG Node

Any questions?
