Eric Sigler
November 09, 2017
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Transcript
Eric Sigler, Head of DevOps, PagerDuty (@esigler)
Four years of breaking things in production, on purpose.
Obligatory disclaimer: This is what works for us. Take away ideas, not dogmas.
Timeline, 2013 to 2017:
2013: Every Friday, 1 hour.
2014: Expanding Scope
2015: Automation
2016: Adding In Randomness
Also 2016: Putting It All Together
2017: Distributing Knowledge
Failure Friday sessions: 133
Faults injected: 708
Fault injections resulting in a public postmortem: 3
Simulated full AZ failures: 4
Simulated full Region failures: 3
Simulated partial Disaster Recovery: 2
Tickets created from Failure Friday: over 225
Distinct services that had faults injected: 49
Optimized for learning first, tooling second
Built the toolchain to enable other teams
Distributed chaos engineering knowledge