Eric Sigler
November 09, 2017
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Transcript
Eric Sigler, Head of DevOps, PagerDuty (@esigler)
Four years of breaking things in production, on purpose.
Obligatory disclaimer: This is what works for us. Take away ideas, not dogmas.
2013: Every Friday, 1 hour.
2014: Expanding Scope
2015: Automation
2016: Adding In Randomness
Also 2016: Putting It All Together
2017: Distributing Knowledge
Failure Friday sessions: 133
Faults injected: 708
Fault injections resulting in a public postmortem: 3
Simulated full AZ failures: 4
Simulated full Region failures: 3
Simulated partial Disaster Recovery: 2
Tickets created from Failure Friday: over 225
Distinct services that had faults injected: 49
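To put the totals above in proportion, here is a small Python sketch that derives a few per-session ratios from them. The input figures are the ones on the slides; the derived ratios and the variable names are illustrative and not stated in the talk. Because the slide says "over 225" tickets, the ticket ratio is only a lower bound.

```python
# Figures as stated on the slides.
sessions = 133
faults_injected = 708
public_postmortems = 3
tickets_created_min = 225  # slide says "over 225", so this is a lower bound

# Derived ratios (not stated in the talk; computed here for illustration).
faults_per_session = faults_injected / sessions
postmortem_rate = public_postmortems / faults_injected
tickets_per_session_min = tickets_created_min / sessions

print(f"Faults injected per session: {faults_per_session:.1f}")
print(f"Injections leading to a public postmortem: {postmortem_rate:.2%}")
print(f"Tickets per session (at least): {tickets_per_session_min:.2f}")
```

Read this way, a typical session injected about five faults, and well under one percent of injections led to a public postmortem.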
Optimized for learning first, tooling second
Built the toolchain to enable other teams
Distributed chaos engineering knowledge