Servers are doomed to fail
JBD
May 17, 2019
Transcript
Servers are doomed to fail
Jaana B. Dogan, @rakyll
Serverless is also doomed to fail
Systems are doomed to fail
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should just work.” Said no one ever.
Failure is expected. Yes, it is.
Monitoring, debugging, postmortem
Monitoring is about telling whether something is broken.
“99.99% of the requests should return in 100ms.”
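A target like the one quoted above can be checked mechanically. The following is a minimal sketch, not something shown in the talk: it assumes latencies arrive as a plain list of milliseconds, and the function name `slo_met` and the sample data are hypothetical.

```python
def slo_met(latencies_ms, threshold_ms=100.0, target=0.9999):
    """Return (fraction within threshold, whether the target is met)."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    # Count requests that returned within the latency threshold.
    fast = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    fraction = fast / len(latencies_ms)
    return fraction, fraction >= target


# Hypothetical sample: 10,000 requests, one of them slow.
sample = [20.0] * 9999 + [450.0]
fraction, met = slo_met(sample)
print(f"{fraction:.4%} within threshold, SLO met: {met}")
```

At this target, a second slow request in the same 10,000 would already break the objective, which is why the threshold and the percentage have to be chosen together.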
Debugging
Debugging is collaborative.
Debugging comes in flavors: logs, traces, metrics, ...
Postmortems
Blameless? Focus on identifying problems.
Collaboration: Design for collaboration.
Design for failure: Set SLOs, plan for instrumentation, plan for debugging.
Cross-stack debugging: Accountability across the stack with high-cardinality data. See speakerdeck.com/rakyll/rpc-metrics-at-google
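One way to picture high-cardinality data is a measurement that carries arbitrary tags, so the same latency numbers can be sliced by any layer of the stack. This is a sketch under assumptions: the record shape, the tag names, and the `breakdown` helper are all hypothetical and are not taken from the linked deck.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical RPC measurements; every tag name and value is made up.
measurements = [
    {"latency_ms": 12, "tags": {"service": "frontend", "method": "GetCart", "zone": "us-east1-b"}},
    {"latency_ms": 340, "tags": {"service": "frontend", "method": "Checkout", "zone": "us-east1-b"}},
    {"latency_ms": 15, "tags": {"service": "frontend", "method": "GetCart", "zone": "eu-west1-c"}},
    {"latency_ms": 310, "tags": {"service": "billing", "method": "Checkout", "zone": "us-east1-b"}},
]


def breakdown(records, *keys):
    """Average latency grouped by any combination of tag keys."""
    groups = defaultdict(list)
    for record in records:
        # Missing tags fall into an explicit "unknown" bucket.
        group = tuple(record["tags"].get(key, "unknown") for key in keys)
        groups[group].append(record["latency_ms"])
    return {group: mean(values) for group, values in groups.items()}


print(breakdown(measurements, "method"))
print(breakdown(measurements, "service", "zone"))
```

Because the grouping keys are chosen at query time, the same data can answer questions per method, per service, or per zone without new instrumentation.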
Correlation: Jump from one piece of monitoring/debugging data to related data.
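A shared identifier is one common way to make such jumps possible. The sketch below assumes that a metric data point and the log lines both record a trace ID; the field names, the values, and the `logs_for` helper are hypothetical.

```python
# Hypothetical signals that share a trace ID; all values are made up.
slow_request = {"metric": "request_latency_ms", "value": 930, "trace_id": "a1b2"}

logs = [
    {"trace_id": "a1b2", "message": "cache miss for cart 17"},
    {"trace_id": "ffee", "message": "request served from cache"},
    {"trace_id": "a1b2", "message": "retrying database query"},
]


def logs_for(data_point, log_records):
    """Follow the trace ID on a metric data point to its log lines."""
    return [
        record["message"]
        for record in log_records
        if record["trace_id"] == data_point["trace_id"]
    ]


print(logs_for(slow_request, logs))
```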
On-call debugging: Jump from distributed tracing data to on-call information. Who to page?
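The "who to page?" question can be sketched as a lookup from the failing spans of a trace to an on-call table. Everything here is an assumption for illustration: the span fields, the rotation names, and the `who_to_page` helper are hypothetical, and a real system would query a paging service instead of a dictionary.

```python
# Hypothetical trace (one span per service) and on-call table.
trace = [
    {"service": "frontend", "duration_ms": 950, "error": False},
    {"service": "cart", "duration_ms": 900, "error": False},
    {"service": "inventory-db", "duration_ms": 870, "error": True},
]
on_call = {
    "frontend": "team-web",
    "cart": "team-cart",
    "inventory-db": "team-storage",
}


def who_to_page(spans, rotations):
    """Return the rotations that own the services with failing spans."""
    failing = [span for span in spans if span["error"]]
    # A set removes duplicates when one team owns several failing services.
    return sorted({rotations.get(span["service"], "unknown") for span in failing})


print(who_to_page(trace, on_call))
```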
Dynamic collection: Capability to enable more collection in production when needed.
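One simple form of dynamic collection is a sampling rate that can be raised while the service keeps running. This is a minimal sketch assuming a single in-process sampler; the `Sampler` class and its methods are hypothetical, and a production setup would typically change the rate through configuration or an admin endpoint.

```python
import random


class Sampler:
    """Decides which requests get detailed collection; the rate can change at runtime."""

    def __init__(self, rate, seed=None):
        self._random = random.Random(seed)
        self.set_rate(rate)

    def set_rate(self, rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate

    def should_collect(self):
        # random() returns a value in [0, 1), so rate 0 never collects
        # and rate 1 always collects.
        return self._random.random() < self.rate


sampler = Sampler(rate=0.0)  # normal operation: no extra collection
before = sum(sampler.should_collect() for _ in range(1000))
sampler.set_rate(1.0)        # during an incident: collect everything
after = sum(sampler.should_collect() for _ in range(1000))
print(before, after)
```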
Continuous collection: Continuously collect signals, generate fleet-wide analysis reports.
Introspection: Introspection pages provided from the services.
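An introspection page can be as small as an HTTP endpoint that reports the service's own state. The sketch below uses only the Python standard library; the `/debug/status` path, the `STATS` fields, and the helper names are hypothetical choices for illustration, not something the talk specifies.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()
STATS = {"requests_served": 0, "errors": 0}  # updated by the service itself


def snapshot():
    """Current view of the service's own state."""
    return {"uptime_seconds": round(time.time() - START_TIME, 1), **STATS}


class IntrospectionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/debug/status":  # hypothetical path
            self.send_error(404)
            return
        body = json.dumps(snapshot()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), IntrospectionHandler)
```

To try it, call `make_server(8080).serve_forever()` and open `http://127.0.0.1:8080/debug/status` in a browser.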
Monitoring, debugging, postmortem
Thank you. Jaana B. Dogan, Google