production: an owner's manual
Igor Wiedler
April 23, 2018
from exec(ut) 2018
Transcript
production: an owner's manual
hello!
broken computers
getting sidetracked now, so sorry*
* not sorry
back to serious business
a production system is a system that serves real users
the goal of operations is to ensure services are reliable
in order to provide a good user experience
failure
app
app linux kernel cpu dram disk network power supply switches
load balancer dns submarine cables routers fiber
app linux kernel the cloud
• cosmic rays
• disk failure
• power outages
• software bugs
• ...
entropy
capacity
cascading failure
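One way to read the capacity and cascading failure slides together is as a feedback loop: when a server drops out, its share of the load lands on the survivors, which can push them over their own limits. The following is a minimal sketch under simplifying assumptions (load is split evenly, and a server fails outright once its share exceeds its capacity). The function name and all numbers are hypothetical and do not come from the talk.

```python
from __future__ import annotations


def simulate_cascade(total_load: float, capacities: list[float]) -> int:
    """Return how many servers are still alive once the system settles.

    Assumptions (illustrative only): load is split evenly across the
    servers that are alive, and any server whose share exceeds its
    capacity fails, which pushes its share onto the rest.
    """
    alive = list(capacities)
    while alive:
        share = total_load / len(alive)
        survivors = [c for c in alive if c >= share]
        if len(survivors) == len(alive):
            break  # stable: nobody is over capacity
        alive = survivors  # failed servers drop out, load is redistributed
    return len(alive)


# Four servers with headroom absorb the load.
print(simulate_cascade(300, [100, 100, 100, 100]))  # 4
# One slightly weaker server fails first and takes the others with it.
print(simulate_cascade(330, [100, 100, 100, 80]))   # 0
```

In the second call the weakest server fails at a share of 82.5, the remaining three then each receive 110, and all of them fail. Real systems degrade less cleanly, but the shape of the problem is the same.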
system design
redundancy
scale
p1 m3 c1 m2 m1 p2 c2
data storage
protocols
monitoring
many components, many req/s
measure all the things?
✅ ⏱
golden signals
• latency
• traffic
• errors
• saturation
  0 -  50 [1620]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (74.55%)
 50 - 100 [ 447]: ∎∎∎∎∎∎∎∎∎∎ (20.57%)
100 - 150 [  49]: ∎ (2.25%)
150 - 200 [  15]: (0.69%)
200 - 250 [  15]: (0.69%)
250 - 300 [  10]: (0.46%)
300 - 350 [   6]: (0.28%)
350 - 400 [   1]: (0.05%)
400 - 450 [   0]: (0.00%)
450 - 500 [   4]: (0.18%)
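A histogram like the one above can be produced with a few lines of code. This is a minimal sketch, not the tool used in the talk. It assumes the values are request latencies in milliseconds, and the function name, bucket width, and bar scale are illustrative choices. Values outside the displayed range are left out of the buckets but still count toward the percentage total, which is also an assumption.

```python
from collections import Counter


def latency_histogram(latencies_ms, bucket_width=50, max_ms=500, bar_scale=2.0):
    """Return one text line per bucket: range, count, bar, percentage.

    Values below 0 or at/above max_ms are not shown in any bucket but
    still count toward the total used for percentages (an assumption).
    One bar character is drawn per `bar_scale` percent.
    """
    counts = Counter(
        int(ms // bucket_width) for ms in latencies_ms if 0 <= ms < max_ms
    )
    total = len(latencies_ms)
    lines = []
    for i in range(max_ms // bucket_width):
        count = counts.get(i, 0)
        pct = 100.0 * count / total if total else 0.0
        bar = "∎" * int(pct / bar_scale)
        lo, hi = i * bucket_width, (i + 1) * bucket_width
        lines.append(f"{lo:>3} - {hi:<3} [{count:>4}]: {bar} ({pct:.2f}%)")
    return lines


# Hypothetical sample values, only to show the output shape.
for line in latency_histogram([12, 18, 61, 499, 700]):
    print(line)
```

A histogram shows the long tail that an average hides: most requests in the slide finish in the first bucket, but a handful take nearly ten times as long.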
saturation traffic latency errors
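To make the four golden signals concrete, here is a minimal sketch that derives them from a window of finished requests. The signal names come from the slides. Everything else is an assumption for illustration: the `Request` record shape, the nearest-rank percentile, and measuring saturation as in-flight requests divided by a concurrency limit.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Request:
    """One finished request (hypothetical record shape)."""
    duration_ms: float
    ok: bool


def percentile(sorted_values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100 over sorted values."""
    rank = (p * len(sorted_values) + 99) // 100  # integer ceil(p * n / 100)
    return sorted_values[max(rank, 1) - 1]


def golden_signals(
    requests: Sequence[Request],
    window_s: float,
    in_flight: int,
    max_in_flight: int,
) -> Dict[str, float]:
    """Summarize latency, traffic, errors, and saturation for one window."""
    if not requests:
        raise ValueError("need at least one request in the window")
    durations = sorted(r.duration_ms for r in requests)
    errors = sum(1 for r in requests if not r.ok)
    return {
        "latency_p50_ms": percentile(durations, 50),
        "latency_p99_ms": percentile(durations, 99),
        "traffic_rps": len(requests) / window_s,
        "error_rate": errors / len(requests),
        # Assumed definition: how full the service is right now.
        "saturation": in_flight / max_in_flight,
    }


# Hypothetical window: 100 requests in 10 seconds, the first 5 failed.
sample = [Request(duration_ms=float(i), ok=(i > 5)) for i in range(1, 101)]
print(golden_signals(sample, window_s=10.0, in_flight=40, max_in_flight=50))
```

Latency is reported as percentiles rather than a mean so that slow outliers stay visible.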
humans
oops, deleted the database
bad human!
why does this button even exist?
app linux kernel cpu dram disk network power supply switches
load balancer dns submarine cables routers fiber humans
epic failure is almost always systemic
failure
recap
• a production system serves real users
• users like things that work and are fast
• epic failure is almost always systemic
thx @igorwhilefalse