Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
540
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.1k
Other Decks in Technology
See All in Technology
GoogleのAIエージェント論 Authors: Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic
customercloud
PRO
0
160
JAWS-UG20250116_iOSアプリエンジニアがAWSreInventに行ってきた(真面目編)
totokit4
0
140
Oracle Exadata Database Service(Dedicated Infrastructure):サービス概要のご紹介
oracle4engineer
PRO
0
12k
自社 200 記事を元に整理した読みやすいテックブログを書くための Tips 集
masakihirose
2
330
Formal Development of Operating Systems in Rust
riru
1
420
技術に触れたり、顔を出そう
maruto
1
160
深層学習と3Dキャプチャ・3Dモデル生成(土木学会応用力学委員会 応用数理・AIセミナー)
pfn
PRO
0
460
ABWGのRe:Cap!
hm5ug
1
120
ゼロからわかる!!AWSの構成図を書いてみようワークショップ 問題&解答解説 #デッカイギ #羽田デッカイギおつ
_mossann_t
0
1.5k
20250116_JAWS_Osaka
takuyay0ne
2
200
.NET 最新アップデート ~ AI とクラウド時代のアプリモダナイゼーション
chack411
0
200
20250116_自部署内でAmazon Nova体験会をやってみた話
riz3f7
1
100
Featured
See All Featured
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
33
2.7k
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
A designer walks into a library…
pauljervisheath
205
24k
Art, The Web, and Tiny UX
lynnandtonic
298
20k
Designing Experiences People Love
moore
139
23k
Keith and Marios Guide to Fast Websites
keithpitt
410
22k
VelocityConf: Rendering Performance Case Studies
addyosmani
327
24k
Fireside Chat
paigeccino
34
3.1k
Testing 201, or: Great Expectations
jmmastey
41
7.2k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
3
240
Documentation Writing (for coders)
carmenintech
67
4.5k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]