Servers are doomed to fail
JBD
May 17, 2019
Transcript
Servers are doomed to fail Jaana B. Dogan
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
Systems are doomed to fail Jaana B. Dogan
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should just work.” Said no one ever.
Failure is expected. Yes, it is.
@rakyll monitoring debugging postmortem
Monitoring is about telling whether something is broken.
“99.99% of the requests should return in 100ms.”
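A minimal sketch of how an objective like the one quoted above could be checked against observed latencies. The function names and the choice to treat an empty window as compliant are assumptions made for illustration; they are not from the talk.

```python
from typing import Sequence


def slo_compliance(latencies_ms: Sequence[float], threshold_ms: float = 100.0) -> float:
    """Return the fraction of requests that finished within threshold_ms."""
    if not latencies_ms:
        # Assumption: a window with no traffic counts as fully compliant.
        return 1.0
    fast = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return fast / len(latencies_ms)


def slo_met(
    latencies_ms: Sequence[float],
    threshold_ms: float = 100.0,
    target: float = 0.9999,
) -> bool:
    """Return True when at least `target` of the requests met the threshold."""
    return slo_compliance(latencies_ms, threshold_ms) >= target
```

With a 99.99% target, two slow requests out of 10,000 are already enough to miss the objective.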
Debugging
Debugging is collaborative.
Debugging comes in flavors: logs, traces, metrics, ...
Postmortems
Blameless? Focus on identifying problems.
Collaboration: Design for collaboration.
Design for failure: Set SLOs, plan for instrumentation, plan for debugging.
Cross-stack debugging: Accountability across the stack with high-cardinality data. See speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation: Jump from monitoring/debugging data to data.
On-call debugging: Jump from distributed tracing data to on-call information. Who to page?
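One way the jump from trace data to on-call information could look, as a rough sketch. The `Span` shape, the heuristic (pick an errored span that has no errored child) and the on-call mapping are all hypothetical; real tracing systems and paging tools expose their own data models.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Span:
    """Hypothetical, simplified span record from a distributed trace."""
    span_id: str
    parent_id: Optional[str]
    service: str
    error: bool


def root_cause_service(spans: Iterable[Span]) -> Optional[str]:
    """Return the service of an errored span that has no errored child.

    Errors usually propagate upward, so the deepest errored span is a
    reasonable first guess at where the failure started.
    """
    errored = [span for span in spans if span.error]
    parents_of_errors = {span.parent_id for span in errored}
    for span in errored:
        if span.span_id not in parents_of_errors:
            return span.service
    return None


def who_to_page(spans: Iterable[Span], oncall: Dict[str, str]) -> Optional[str]:
    """Map the suspected root-cause service to its on-call contact, if any."""
    service = root_cause_service(spans)
    if service is None:
        return None
    return oncall.get(service)
```

For a trace where `frontend` calls `checkout`, which calls `payments-db`, and all three report errors, the sketch points at whoever is on call for `payments-db`.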
Dynamic collection: Capability to enable more collection in production when needed.
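A small sketch of dynamic collection, assuming it is implemented as a sampling rate that can be changed while the process runs. `DynamicSampler` and its methods are illustrative names, not an API from the talk or from any particular library.

```python
import random
import threading
from typing import Optional


class DynamicSampler:
    """Decides whether to collect a signal, with a rate adjustable at runtime."""

    def __init__(self, rate: float = 0.01, rng: Optional[random.Random] = None) -> None:
        self._lock = threading.Lock()
        self._rate = self._validate(rate)
        self._rng = rng or random.Random()

    @staticmethod
    def _validate(rate: float) -> float:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        return rate

    def set_rate(self, rate: float) -> None:
        """Raise or lower collection without restarting the service."""
        validated = self._validate(rate)
        with self._lock:
            self._rate = validated

    def should_collect(self) -> bool:
        """Return True for roughly `rate` of the calls."""
        with self._lock:
            rate = self._rate
        # random() is in [0, 1), so rate 0.0 never collects and 1.0 always does.
        return self._rng.random() < rate
```

During an incident, an operator or an admin endpoint could call `set_rate(1.0)` to collect everything, then drop back to the default once the investigation is over.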
Continuous collection: Continuously collect signals, generate fleet-wide analysis reports.
Introspection: Introspection pages provided from the services.
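As an illustration of an introspection page, the sketch below serves a JSON snapshot of process state using only the Python standard library. The `/debug/vars` path and the fields in the snapshot are assumptions chosen for the example; a real service would expose whatever state is useful to its operators.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def snapshot() -> dict:
    """Collect a few process-level facts to show on the introspection page."""
    return {
        "uptime_seconds": round(time.time() - START_TIME, 3),
        "threads": threading.active_count(),
    }


class IntrospectionHandler(BaseHTTPRequestHandler):
    """Serves the snapshot as JSON on the assumed /debug/vars path."""

    def do_GET(self) -> None:
        if self.path != "/debug/vars":
            self.send_error(404)
            return
        body = json.dumps(snapshot()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Keep the sketch quiet; a real service would log access.
        pass


def serve(port: int = 0) -> HTTPServer:
    """Start the page on localhost in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), IntrospectionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `serve(8080)` and opening `http://127.0.0.1:8080/debug/vars` would show the snapshot; binding to localhost only is a deliberate choice here, since such pages often expose internal details.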
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google