Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
650
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
インフラ室事例集
mixi_engineers
PRO
2
220
All About Sansan – for New Global Engineers
sansan33
PRO
1
1.3k
小規模チームによる衛星管制システムの開発とスケーラビリティの実現
sankichi92
0
190
.NET 10 のパフォーマンス改善
nenonaninu
2
4.6k
小さな判断で育つ、大きな意思決定力 / 20251204 Takahiro Kinjo
shift_evolve
PRO
1
250
事業部のプロジェクト進行と開発チームの改善の “時間軸" のすり合わせ
konifar
9
2.8k
会社紹介資料 / Sansan Company Profile
sansan33
PRO
11
390k
32のキーワードで学ぶ はじめての耐量子暗号(PQC) / Getting Started with Post-Quantum Cryptography in 32 keywords
quiver
0
160
TOAMI~投網~: フィッシングハンター支援用ブラウザ拡張ツール / TOAMI ~Casting Net~: Browser Extension Tool for Supporting Phishing Hunters
nttcom
1
120
Claude Code Getting Started Guide(en)
oikon48
0
130
Active Directory 勉強会 第 6 回目 Active Directory セキュリティについて学ぶ回
eurekaberry
16
5.9k
pmconf2025 - データを活用し「価値」へ繋げる
glorypulse
0
380
Featured
See All Featured
Testing 201, or: Great Expectations
jmmastey
46
7.8k
Designing Experiences People Love
moore
142
24k
Embracing the Ebb and Flow
colly
88
4.9k
Navigating Team Friction
lara
191
16k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Visualization
eitanlees
150
16k
We Have a Design System, Now What?
morganepeng
54
7.9k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
21
1.3k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.8k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.6k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]