Eric Sigler
November 09, 2017
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Transcript
Eric Sigler, Head of DevOps, PagerDuty (@esigler)
Four years of breaking things in production, on purpose.
Obligatory disclaimer: This is what works for us. Take away ideas, not dogmas.
2013: Every Friday, 1 hour.
2014: Expanding Scope
2015: Automation
2016: Adding In Randomness
Also 2016: Putting It All Together
2017: Distributing Knowledge
Failure Friday sessions: 133
Faults injected: 708
Fault injections resulting in a public postmortem: 3
Simulated full AZ failures: 4
Simulated full Region failures: 3
Simulated partial Disaster Recovery: 2
Tickets created from Failure Friday: over 225
Distinct services that had faults injected: 49
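To put the counts on these slides in proportion, here is a minimal Python sketch that derives a few ratios from them. The ratios are not stated in the talk; they are plain arithmetic on the slide figures. The function name `summarize` is hypothetical, and the ticket count is treated as a lower bound because the slide only says "over 225".

```python
# Figures copied from the Failure Friday statistics slides.
SESSIONS = 133
FAULTS_INJECTED = 708
PUBLIC_POSTMORTEMS = 3
TICKETS_MIN = 225  # the slide says "over 225", so this is a lower bound
SERVICES = 49


def summarize():
    """Return derived ratios; these are illustrative, not from the talk."""
    return {
        "faults_per_session": FAULTS_INJECTED / SESSIONS,
        "postmortem_rate": PUBLIC_POSTMORTEMS / FAULTS_INJECTED,
        "min_tickets_per_session": TICKETS_MIN / SESSIONS,
        "faults_per_service": FAULTS_INJECTED / SERVICES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4f}")
```

On these numbers, that is roughly five faults per session, with well under one percent of injections leading to a public postmortem.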
Optimized for learning first, tooling second
Built the toolchain to enable other teams
Distributed chaos engineering knowledge