Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
620
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
Create Ruby native extension gem with Go
sue445
0
160
『ホットペッパービューティー』のiOSアプリをUIKitからSwiftUIへ段階的に移行するためにやったこと
recruitengineers
PRO
1
100
roppongirb_20250911
igaiga
1
260
OCI Oracle Database Services新機能アップデート(2025/06-2025/08)
oracle4engineer
PRO
0
190
共有と分離 - Compose Multiplatform "本番導入" の設計指針
error96num
2
1.3k
AIエージェントで90秒の広告動画を制作!台本・音声・映像・編集をつなぐAWS最新アーキテクチャの実践
nasuvitz
3
420
AIフレンドリーなコードベースを目指して/登壇資料(高橋 悟生)
hacobu
PRO
0
120
Apache Spark もくもく会
taka_aki
0
150
Aurora DSQLはサーバーレスアーキテクチャの常識を変えるのか
iwatatomoya
1
1.2k
エンジニアリングマネージャーの成長の道筋とキャリア / Developers Summit 2025 KANSAI
daiksy
3
1.3k
会社紹介資料 / Sansan Company Profile
sansan33
PRO
7
380k
20250910_障害注入から効率的復旧へ_カオスエンジニアリング_生成AIで考えるAWS障害対応.pdf
sh_fk2
3
280
Featured
See All Featured
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
How to train your dragon (web standard)
notwaldorf
96
6.2k
The Art of Programming - Codeland 2020
erikaheidi
56
13k
GitHub's CSS Performance
jonrohan
1032
460k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.1k
Site-Speed That Sticks
csswizardry
10
830
Git: the NoSQL Database
bkeepers
PRO
431
66k
The Language of Interfaces
destraynor
161
25k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
139
34k
Embracing the Ebb and Flow
colly
87
4.8k
The Cult of Friendly URLs
andyhume
79
6.6k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]