Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Four years of breaking things in production, on...
Search
Eric Sigler
November 09, 2017
Technology
0
59
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Eric Sigler
November 09, 2017
Tweet
Share
More Decks by Eric Sigler
See All by Eric Sigler
Instrumenting The Rest Of The Company: Hunting For Metrics
esigler
0
380
A Brief Introduction To DevOps
esigler
0
110
Humans are terrible compilers: A User's Guide
esigler
0
120
Do You Know If Your Service Is Working Properly? A Guide To Being Paranoid.
esigler
0
180
"Is there any strong objection?"
esigler
0
230
Fear, Uncertainty, and Continuous Deployment
esigler
1
120
3AM, a survey.
esigler
0
230
Strategies For Being On Call & Keeping Your Sanity At The Same Time
esigler
0
160
Engineering for Engineers
esigler
0
94
Other Decks in Technology
See All in Technology
Linux カーネルが支えるコンテナの仕組み / LF Japan Community Days 2025 Osaka
tenforward
1
120
AI時代の開発を加速する組織づくり - ブログでは書けなかったリアル
hiro8ma
1
290
AI-Readyを目指した非構造化データのメダリオンアーキテクチャ
r_miura
1
300
Okta Identity Governanceで実現する最小権限の原則 / Implementing the Principle of Least Privilege with Okta Identity Governance
tatsumin39
0
170
「魔法少女まどか☆マギカ Magia Exedra」におけるバックエンドの技術選定
gree_tech
PRO
0
120
AI時代、“平均値”ではいられない
uhyo
8
2.4k
デザインとエンジニアリングの架け橋を目指す OPTiMのデザインシステム「nucleus」の軌跡と広げ方
optim
0
100
今この時代に技術とどう向き合うべきか
gree_tech
PRO
2
2.2k
HonoとJSXを使って管理画面をサクッと型安全に作ろう
diggymo
0
170
まだ間に合う! 2025年のhono/ssg事情
watany
3
620
AWS UG Grantでグローバル20名に選出されてre:Inventに行く話と、マルチクラウドセキュリティの教科書を執筆した話 / The Story of Being Selected for the AWS UG Grant to Attending re:Invent, and Writing a Multi-Cloud Security Textbook
yuj1osm
1
130
ストレージエンジニアの仕事と、近年の計算機について / 第58回 情報科学若手の会
pfn
PRO
2
410
Featured
See All Featured
Product Roadmaps are Hard
iamctodd
PRO
55
11k
The Art of Programming - Codeland 2020
erikaheidi
56
14k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
Faster Mobile Websites
deanohume
310
31k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.7k
Into the Great Unknown - MozCon
thekraken
40
2.1k
The World Runs on Bad Software
bkeepers
PRO
72
11k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.2k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
630
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
116
20k
Typedesign – Prime Four
hannesfritz
42
2.8k
Transcript
Eric Sigler, Head of DevOps, PagerDuty @esigler Four years of
breaking things in production, on purpose.
@esigler Obligatory disclaimer: This is what works for us. Take
away ideas, not dogmas.
@esigler
@esigler 2013: Every Friday, 1 hour. 2013 2014 2015 2016
2017
@esigler 2013 2014 2015 2016 2017
None
@esigler 2014: Expanding Scope 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2015: Automation 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2016: Adding In Randomness 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Also 2016: Putting It All Together 2013 2014 2015
2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2017: Distributing Knowledge 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Failure Friday sessions: 133 Faults injected: 708 Fault injections
resulting in a public postmortem: 3
@esigler Simulated full AZ failures: 4 Simulated full Region failures:
3 Simulated partial Disaster Recovery: 2
@esigler Tickets created from Failure Friday: over 225 Distinct services
that had faults injected: 49
@esigler
@esigler Optimized for learning first, tooling second Built the toolchain
to enable other teams Distributed chaos engineering knowledge
@esigler