Eric Sigler
November 09, 2017
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Transcript
Eric Sigler, Head of DevOps, PagerDuty (@esigler)
Four years of breaking things in production, on purpose.
Obligatory disclaimer: This is what works for us. Take away ideas, not dogmas.
2013: Every Friday, 1 hour.
2014: Expanding Scope
2015: Automation
2016: Adding In Randomness
Also 2016: Putting It All Together
2017: Distributing Knowledge
Failure Friday sessions: 133
Faults injected: 708
Fault injections resulting in a public postmortem: 3
Simulated full AZ failures: 4
Simulated full Region failures: 3
Simulated partial Disaster Recovery: 2
Tickets created from Failure Friday: over 225
Distinct services that had faults injected: 49
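To put the totals above in proportion, here is a small sketch that derives a few ratios from them. The choice of ratios and all variable names are assumptions made for illustration, not figures from the talk. Because the ticket count is stated as "over 225", the tickets-per-session value is a lower bound.

```python
# Totals as stated on the slides.
sessions = 133            # Failure Friday sessions
faults_injected = 708     # faults injected across all sessions
public_postmortems = 3    # fault injections resulting in a public postmortem
tickets_created = 225     # slide says "over 225", so treat as a lower bound
services = 49             # distinct services that had faults injected


def derived_ratios():
    """Return illustrative ratios computed from the slide totals."""
    return {
        "faults_per_session": faults_injected / sessions,
        "postmortem_rate_pct": 100 * public_postmortems / faults_injected,
        "min_tickets_per_session": tickets_created / sessions,
        "faults_per_service": faults_injected / services,
    }


if __name__ == "__main__":
    for name, value in derived_ratios().items():
        print(f"{name}: {value:.2f}")
```

Read this way, the numbers come to roughly 5.3 faults per session, with about 0.42% of injections leading to a public postmortem.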
Optimized for learning first, tooling second
Built the toolchain to enable other teams
Distributed chaos engineering knowledge