Performance Tuning for Kubernetes Controllers
Akihiro Ikezoe
March 16, 2023
Kubernetes Meetup Tokyo #56
2023/03/16
https://k8sjp.connpass.com/event/275280/
Transcript
[Diagram: a Controller made up of an Informer, multiple Workers, and a Reconciler]
https://github.com/kubernetes/enhancements/issues/1602
https://kubernetes.io/docs/reference/instrumentation/metrics/
https://kubernetes.io/docs/concepts/cluster-administration/system-traces/
https://cybozu-go.github.io/moco/metrics.html
https://github.com/cybozu-go/moco/pull/500
[Diagram: Argo CD in a Kubernetes cluster: Application Controller, ArgoCD Server, Repo Server, and Application resources]
[Diagram: application-controller internals: Informers watch Application resources and deliver events to two worker pools, the Status Processors and the Operation Processors]
workqueue_depth{job="kube-controller-manager",name="volumes"}

histogram_quantile(0.99,
  sum(rate(
    rest_client_rate_limiter_duration_seconds_bucket{
      job="kube-controller-manager"
    }[1m]
  )) by (le)
)

kube-controller-manager: PersistentVolume Controller

--kube-api-qps
https://github.com/zoetrope/kubbernecker
# 99th percentile of Reconcile duration
histogram_quantile(0.99,
  sum(
    rate(controller_runtime_reconcile_time_seconds_bucket[1m])
  ) by (job, controller, le)
)

# Reconcile rate, broken down by result
sum(rate(controller_runtime_reconcile_total[1m])) by (job, controller, result)

# 99th percentile of the time items spend waiting in the work queue
histogram_quantile(0.99,
  sum(rate(workqueue_queue_duration_seconds_bucket[1m])) by (job, name, le)
)

# Work queue depth
sum(workqueue_depth) by (job, name)
import (
	"context"
	"net/url"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	clmetrics "k8s.io/client-go/tools/metrics"
	crmetrics "sigs.k8s.io/controller-runtime/pkg/metrics"
)

var (
	// rateLimiterDelay records how long requests wait in the client-go rate limiter.
	rateLimiterDelay = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "rest_client_rate_limiter_duration_seconds",
			Help:    "client-go rate limiter delay in seconds. Broken down by verb, and host.",
			Buckets: []float64{0.005, 0.025, 0.1, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 15.0, 30.0, 60.0},
		},
		[]string{"verb", "host"},
	)

	// Compile-time check that latencyAdapter implements clmetrics.LatencyMetric.
	_ clmetrics.LatencyMetric = &latencyAdapter{}
)

func init() {
	// Expose the histogram through controller-runtime's metrics registry.
	crmetrics.Registry.MustRegister(rateLimiterDelay)
	adapter := latencyAdapter{
		metric: rateLimiterDelay,
	}
	// Have client-go report rate limiter latency to the adapter.
	clmetrics.RateLimiterLatency = &adapter
}

type latencyAdapter struct {
	metric *prometheus.HistogramVec
}

// Observe records one rate limiter delay, labeled by HTTP verb and API server host.
func (c *latencyAdapter) Observe(_ context.Context, verb string, u url.URL, latency time.Duration) {
	c.metric.WithLabelValues(verb, u.Host).Observe(latency.Seconds())
}
# 99th percentile of the Rate Limiter delay
histogram_quantile(0.99,
  sum(
    rate(rest_client_rate_limiter_duration_seconds_bucket[1m])
  ) by (job, verb, le)
)
# Application Reconcile time (Status Processor)
{job=~"argocd/argocd-application-controller"}
  | logfmt
  | msg = "Reconciliation completed"
  | line_format "{{.application}}: {{.time_ms}}"

# Application Reconcile time (Operation Processor)
{job=~"argocd/argocd-application-controller"}
  | logfmt
  | msg = "sync/terminate complete"
  | line_format "{{.application}}: {{.duration}}"
# Debug logs emitted when an Application is refreshed
{job=~"argocd/argocd-application-controller"}
  | logfmt
  | level = "debug" msg =~ "Refreshing app .*"

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
data:
  # Set the Application Controller log level to debug (default "info")
  controller.log.level: "debug"
$ kubectl port-forward svc/argocd-application-controller-metrics -n argocd 8082:8082

# Collect a CPU profile (30 seconds by default)
$ curl localhost:8082/debug/pprof/profile > cpu.pprof

# Dump goroutines (URL quoted so the shell does not interpret "?")
$ curl 'localhost:8082/debug/pprof/goroutine?debug=1'
--otlp-address
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
data:
  # Number of application status processors (default 20)
  controller.status.processors: "20"
  # Number of application operation processors (default 10)
  controller.operation.processors: "10"
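import ctrl "sigs.k8s.io/controller-runtime"

// ... (omitted) ...

cfg, err := ctrl.GetConfig()
if err != nil {
	return err
}
// Raise the client-side rate limit for requests to the API server.
cfg.QPS = 50
// Allow short bursts of up to 1.5x the sustained QPS.
cfg.Burst = int(cfg.QPS * 1.5)
mgr, err := ctrl.NewManager(cfg, ctrl.Options{ ... })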