Servers are doomed to fail
JBD
May 17, 2019
Transcript
Servers are doomed to fail. Jaana B. Dogan, @rakyll
Serverless is also doomed to fail
Systems are doomed to fail
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should just work.” Said no one ever.
Failure is expected. Yes, it is.
@rakyll monitoring debugging postmortem
Monitoring is about telling whether something is broken.
“99.99% of the requests should return in 100ms.”
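A target like this can be read as a check over observed request latencies. The following is a minimal sketch, assuming latencies are available as a list of milliseconds; the function names `fraction_within` and `meets_slo`, and the choice to treat "no traffic" as passing, are assumptions for illustration and not from the talk.

```python
def fraction_within(latencies_ms, threshold_ms):
    """Return the fraction of requests that finished within threshold_ms."""
    if not latencies_ms:
        return 1.0  # assumption: no traffic counts as meeting the target
    fast = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return fast / len(latencies_ms)


def meets_slo(latencies_ms, threshold_ms=100.0, target=0.9999):
    """True if at least `target` of the requests returned within threshold_ms."""
    return fraction_within(latencies_ms, threshold_ms) >= target
```

In practice the input would come from a metrics backend rather than an in-memory list, and the check would run over a rolling time window.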
Debugging
Debugging is collaborative.
Debugging comes in flavors: logs, traces, metrics, ...
Postmortems
Blameless? Focus on identifying problems.
Collaboration: Design for collaboration.
Design for failure: Set SLOs, plan for instrumentation, plan for debugging.
Cross-stack debugging: Accountability across the stack with high-cardinality data. See speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation: Jump from one kind of monitoring or debugging data to another.
On-call debugging: Jump from distributed tracing data to on-call information. Who to page?
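The jump from trace data to on-call information can be sketched as a lookup from the service behind a problematic span to its rotation. This is only an illustration: the `Span` type, the `ONCALL` ownership table, the team names, and the "first failing span, otherwise slowest span" heuristic are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Span:
    service: str
    duration_ms: float
    error: bool = False


# Hypothetical ownership table mapping a service to its on-call rotation.
ONCALL = {
    "frontend": "team-web",
    "auth": "team-identity",
    "db": "team-storage",
}


def who_to_page(spans, oncall=ONCALL):
    """Pick the rotation for the first failing span, else the slowest span.

    Returns None when the trace is empty or the service has no known owner.
    """
    if not spans:
        return None
    failing = [span for span in spans if span.error]
    culprit = failing[0] if failing else max(spans, key=lambda s: s.duration_ms)
    return oncall.get(culprit.service)
```

A real system would likely read ownership from a service catalog and the current rotation from a paging tool instead of a static dictionary.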
Dynamic collection: Capability to enable more collection in production when needed.
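One way to picture dynamic collection is a sampler whose rate can be raised while the process is running, for example during an incident. A minimal sketch, assuming a probabilistic sampler; the `DynamicSampler` class and its method names are hypothetical, not an API from the talk or from any particular library.

```python
import random
import threading


class DynamicSampler:
    """Probabilistic sampler whose rate can be changed at runtime."""

    def __init__(self, rate=0.01, rng=None):
        self._lock = threading.Lock()
        self._rate = rate
        self._rng = rng or random.Random()

    def set_rate(self, rate):
        """Change the sampling rate; the new value applies to later calls."""
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0.0 and 1.0")
        with self._lock:
            self._rate = rate

    def should_collect(self):
        """Return True if this event should be collected."""
        with self._lock:
            rate = self._rate
        # random() returns a value in [0.0, 1.0), so a rate of 0.0 never
        # collects and a rate of 1.0 always collects.
        return self._rng.random() < rate
```

An operator could call `set_rate(1.0)` through an admin endpoint to capture everything for a short period, then lower the rate again to limit overhead.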
Continuous collection: Continuously collect signals and generate fleet-wide analysis reports.
Introspection: Introspection pages provided by the services.
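An introspection page can be as small as an HTTP endpoint that reports the service's own state. The sketch below uses only the Python standard library; the `/statusz` path, the `STATS` counters, and the JSON layout are assumptions chosen for illustration rather than anything specified in the talk.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START = time.time()
STATS = {"requests": 0, "errors": 0}  # hypothetical in-process counters


def render_status(stats, started_at, now=None):
    """Render the counters plus uptime as a JSON document."""
    now = time.time() if now is None else now
    payload = {"uptime_seconds": round(now - started_at, 3), **stats}
    return json.dumps(payload, sort_keys=True)


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the (hypothetical) /statusz path is served.
        if self.path != "/statusz":
            self.send_error(404)
            return
        body = render_status(STATS, START).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve the introspection page on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), StatusHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8080/statusz` would show the current counters; binding to localhost keeps the page off public interfaces by default.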
@rakyll monitoring debugging postmortem
Thank you. Jaana B. Dogan, Google