Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Four years of breaking things in production, on...
Search
Eric Sigler
November 09, 2017
Technology
0
55
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Eric Sigler
November 09, 2017
Tweet
Share
More Decks by Eric Sigler
See All by Eric Sigler
Instrumenting The Rest Of The Company: Hunting For Metrics
esigler
0
360
A Brief Introduction To DevOps
esigler
0
100
Humans are terrible compilers: A User's Guide
esigler
0
110
Do You Know If Your Service Is Working Properly? A Guide To Being Paranoid.
esigler
0
160
"Is there any strong objection?"
esigler
0
210
Fear, Uncertainty, and Continuous Deployment
esigler
1
120
3AM, a survey.
esigler
0
220
Strategies For Being On Call & Keeping Your Sanity At The Same Time
esigler
0
160
Engineering for Engineers
esigler
0
87
Other Decks in Technology
See All in Technology
ビジネスとデザインとエンジニアリングを繋ぐために 一人のエンジニアは何ができるか / What can a single engineer do to connect business, design, and engineering?
kaminashi
2
770
OpsJAWS34_CloudTrailLake_for_Organizations
hiashisan
0
190
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
1
360
Porting PicoRuby to Another Microcontroller: ESP32
yuuu
4
500
日経電子版 for Android の技術的課題と取り組み(令和最新版)/android-20250423
nikkei_engineer_recruiting
1
560
ドキュメント管理の理想と現実
kazuhe
1
270
OpenLane-V2ベンチマークと代表的な手法
kzykmyzw
0
120
白金鉱業Meetup_Vol.18_AIエージェント時代のUI/UX設計
brainpadpr
1
250
生成AIによるCloud Native基盤構築の可能性と実践的ガードレールの敷設について
nwiizo
7
1.3k
LiteXとオレオレCPUで作る自作SoC奮闘記
msyksphinz
0
880
【Λ(らむだ)】最近のアプデ情報 / RPALT20250422
lambda
0
130
AIにおけるソフトウェアテスト_ver1.00
fumisuke
1
290
Featured
See All Featured
Art, The Web, and Tiny UX
lynnandtonic
298
20k
Mobile First: as difficult as doing things right
swwweet
223
9.6k
Adopting Sorbet at Scale
ufuk
76
9.3k
RailsConf 2023
tenderlove
30
1.1k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
Music & Morning Musume
bryan
47
6.5k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
13
810
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
30
2k
How GitHub (no longer) Works
holman
314
140k
Side Projects
sachag
453
42k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
331
21k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Transcript
Eric Sigler, Head of DevOps, PagerDuty @esigler Four years of
breaking things in production, on purpose.
@esigler Obligatory disclaimer: This is what works for us. Take
away ideas, not dogmas.
@esigler
@esigler 2013: Every Friday, 1 hour. 2013 2014 2015 2016
2017
@esigler 2013 2014 2015 2016 2017
None
@esigler 2014: Expanding Scope 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2015: Automation 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2016: Adding In Randomness 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Also 2016: Putting It All Together 2013 2014 2015
2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2017: Distributing Knowledge 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Failure Friday sessions: 133 Faults injected: 708 Fault injections
resulting in a public postmortem: 3
@esigler Simulated full AZ failures: 4 Simulated full Region failures:
3 Simulated partial Disaster Recovery: 2
@esigler Tickets created from Failure Friday: over 225 Distinct services
that had faults injected: 49
@esigler
@esigler Optimized for learning first, tooling second Built the toolchain
to enable other teams Distributed chaos engineering knowledge
@esigler