Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
640
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
ラスベガスの歩き方 2025年版(re:Invent 事前勉強会)
junjikoide
0
920
激動の時代を爆速リチーミングで乗り越えろ
sansantech
PRO
1
250
ゼロコード計装導入後のカスタム計装でさらに可観測性を高めよう
sansantech
PRO
1
700
AI連携の新常識! 話題のMCPをはじめて学ぶ!
makoakiba
0
180
abema-trace-sampling-observability-cost-optimization
tetsuya28
0
470
戦えるAIエージェントの作り方
iwiwi
22
11k
Spec Driven Development入門/spec_driven_development_for_learners
hanhan1978
1
610
GPUをつかってベクトル検索を扱う手法のお話し~NVIDIA cuVSとCAGRA~
fshuhe
0
380
AI時代に必要なデータプラットフォームの要件とは by @Kazaneya_PR / 20251107
kazaneya
PRO
4
610
[Journal club] Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
keio_smilab
PRO
0
110
Databricks Free Editionで始めるMLflow
taka_aki
0
790
Snowflakeとdbtで加速する 「TVCMデータで価値を生む組織」への進化論 / Evolving TVCM Data Value in TELECY with Snowflake and dbt
carta_engineering
0
140
Featured
See All Featured
A designer walks into a library…
pauljervisheath
209
24k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Building a Scalable Design System with Sketch
lauravandoore
463
33k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
253
22k
Stop Working from a Prison Cell
hatefulcrawdad
272
21k
Music & Morning Musume
bryan
46
6.9k
Git: the NoSQL Database
bkeepers
PRO
431
66k
Facilitating Awesome Meetings
lara
57
6.6k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
950
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.7k
Site-Speed That Sticks
csswizardry
13
940
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]