Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Monitoring in Motion: Challenges of Monitoring ...
Search
Ilan Rabinovitch
February 26, 2016
Technology
0
110
Monitoring in Motion: Challenges of Monitoring Containers and Kuberntes
Ilan Rabinovitch
February 26, 2016
Tweet
Share
More Decks by Ilan Rabinovitch
See All by Ilan Rabinovitch
Monitoring in Motion - ContainerCon 2016
irabinovitch
0
100
Data Driven Post Mortems at Datadog - LinuxCon 2016
irabinovitch
1
210
Introduction to Docker Monitoring
irabinovitch
0
150
OSCON 2016 - Monitoring in Motion
irabinovitch
2
180
Monitoring OpenStack at Lithium (OpenStack Summit Austin 2016)
irabinovitch
0
72
LinuxFest Northwest 2016 - Monitoring 101
irabinovitch
0
44
Monitoring ECS and Dynamic Infrastructure
irabinovitch
0
110
Doing DevOps Right with Datadog + Pagerduty
irabinovitch
0
130
Docker Usage Patterns - Docker Meetup Palo Alto - Nov 2015
irabinovitch
0
73
Other Decks in Technology
See All in Technology
プロダクト開発の品質を守るAIコードレビュー:事例に見る導入ポイント
moongift
PRO
1
430
【2026年版】生成AIによる情報システムへのインパクト
taka_aki
0
170
NW構成図の自動描画は何が難しいのか?/netdevnight3
corestate55
2
420
技術キャッチアップ効率化を実現する記事推薦システムの構築
yudai00
2
140
GoとWasmでつくる軽量ブラウザUI
keyl0ve
0
130
2026年のAIエージェント構築はどうなる?
minorun365
10
2.3k
教育現場のプロンプトエンジニアリング問題を 解決するAIエージェントを作成してみた
ryoshun
0
120
Claude Codeと駆け抜ける 情報収集と実践録
sontixyou
1
960
Interop Tokyo 2025 ShowNet Team Memberで学んだSRv6を基礎から丁寧に
miyukichi_ospf
0
180
歴史に敬意を! パラシュートVPoEが組織と共同で立ち上がる信頼醸成オンボーディング
go0517go
PRO
0
210
AI活用を"目的"にしたら、データの本質が見えてきた - Snowflake Intelligence実験記 / chasing-ai-finding-data
pei0804
0
580
競争優位を生み出す戦略的内製開発の実践技法
masuda220
PRO
2
430
Featured
See All Featured
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
287
14k
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
750
Practical Orchestrator
shlominoach
191
11k
From π to Pie charts
rasagy
0
140
Rebuilding a faster, lazier Slack
samanthasiow
85
9.4k
Claude Code のすすめ
schroneko
67
210k
We Analyzed 250 Million AI Search Results: Here's What I Found
joshbly
1
830
Color Theory Basics | Prateek | Gurzu
gurzu
0
210
Test your architecture with Archunit
thirion
1
2.2k
State of Search Keynote: SEO is Dead Long Live SEO
ryanjones
0
140
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
130
How to train your dragon (web standard)
notwaldorf
97
6.5k
Transcript
Monitoring In Motion Challenges in Monitoring Kubernetes & Containers Cloud
Native SF Meetup Feb 25, 2016 Ilan Rabinovitch Director, Community Datadog
About Me • Long time Datadog user. • Prior to
Datadog built automation and monitoring tooling at Ooyala and Edmunds.com • SCALE and TXLF Co-Founder Ilan Rabinovitch Datadog
[email protected]
@irabinovitch
Agenda • Monitoring 101 - Crash Course • Challenges in
Monitoring Dynamic Infrastructure • Demo Time • Questions?
Monitoring Everything
None
@honest_update on Twitter
Quick Overview of Datadog • Monitoring for modern applications. •
Time series storage of metrics and events. • Trending, alerting and anomaly detection. • Hundreds of integrations out of the box.
Monitoring 101: Categorization More at: http://goo.gl/t1Rgcg
None
Monitoring 101: Focus on symptoms More at: http://goo.gl/t1Rgcg
Recurse until you find root cause. More at: http://goo.gl/t1Rgcg
Container Monitoring Challenges
https://www.datadoghq.com/docker-adoption/
None
None
Operational Complexity •Average containers per host: N (N=4, 10/2015) •N-times
as many “hosts” to manage •Affects everything
Operational Complexity: Scale 100 instances 400 containers
Operational Complexity: Scale 160 metrics per host 640 metrics per
host
Operational Complexity: Scale 100 instances 64,000 metrics
None
Host Centric vs Service Centric
Host Centric vs Service Centric
Query Based Monitoring … … …
•Use tags, labels, etc on your hosts and metrics. •Pull
in existing labels from your infrastructure (Region, Docker Images, K8S Tags..) Query Based Monitoring By using tags, auto-adapt!
Where is my application running ? What’s the total throughput
of App X ? What’s its response time per tag ? (pod, version, DC) What’s the distribution of 5xx from Nginx per pod ?
Auto Discovery
Docker API Kubelet API Monitoring Agent Container A O A
O A O Application Container Off-The-Shelf Application (Redis, PostgreSQL, …) Containers List Metadata Additional Metadata (Pod names, RC, …) Config Backend Integration Configurations Host Level Metrics
Some Pictures Dashboards and Metrics Alerts Sharing
Demo time