Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.6k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
510
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.1k
Other Decks in Technology
See All in Technology
プロダクトチームへのSystem Risk Records導入・運用事例の紹介/Introduction and Case Studies on Implementing and Operating System Risk Records for Product Teams
taddy_919
1
170
ABEMA のコンテンツ制作を最適化!生成 AI x クラウド映像編集システム / abema-ai-editor
cyberagentdevelopers
PRO
1
180
ガチ勢によるPipeCD運用大全〜滑らかなCI/CDを添えて〜 / ai-pipecd-encyclopedia
cyberagentdevelopers
PRO
3
210
10分でわかるfreee エンジニア向け会社説明資料
freee
18
520k
Forget efficiency – Become more productive without the stress
ufried
0
150
APIテスト自動化の勘所
yokawasa
7
4.2k
話題のGraphRAG、その可能性と課題を理解する
hide212131
4
1.5k
カメラを用いた店内計測におけるオプトインの仕組みの実現 / ai-optin-camera
cyberagentdevelopers
PRO
1
120
物価高なラスベガスでの過ごし方
zakky
0
380
「最高のチューニング」をしないために / hack@delta 24.10
fujiwara3
21
3.5k
リンクアンドモチベーション ソフトウェアエンジニア向け紹介資料 / Introduction to Link and Motivation for Software Engineers
lmi
4
290k
チームを主語にしてみる / Making "Team" the Subject
ar_tama
4
310
Featured
See All Featured
Statistics for Hackers
jakevdp
796
220k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
191
16k
Become a Pro
speakerdeck
PRO
24
5k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
664
120k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
25
1.8k
RailsConf 2023
tenderlove
29
880
Why You Should Never Use an ORM
jnunemaker
PRO
53
9k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
48k
The Language of Interfaces
destraynor
154
24k
Raft: Consensus for Rubyists
vanstee
136
6.6k
Product Roadmaps are Hard
iamctodd
PRO
48
10k
Facilitating Awesome Meetings
lara
49
6k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]