Eric Sigler
November 09, 2017
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Transcript
Eric Sigler, Head of DevOps, PagerDuty (@esigler)
Four years of breaking things in production, on purpose.
Obligatory disclaimer: This is what works for us. Take away ideas, not dogmas.
2013: Every Friday, 1 hour.
2014: Expanding Scope
2015: Automation
2016: Adding In Randomness
Also 2016: Putting It All Together
2017: Distributing Knowledge
Failure Friday sessions: 133
Faults injected: 708
Fault injections resulting in a public postmortem: 3
Simulated full AZ failures: 4
Simulated full Region failures: 3
Simulated partial Disaster Recovery: 2
Tickets created from Failure Friday: over 225
Distinct services that had faults injected: 49
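To put the totals above in proportion, here is a minimal sketch that derives a few averages from them. The input figures are copied from the slides; the variable names, the `summarize` helper, and the choice of ratios are illustrative assumptions, not something the talk presents. The averages say nothing about how faults were actually distributed across sessions or services, and the ticket count is only a lower bound because the slide says "over 225".

```python
# Figures copied from the slides; names are hypothetical, chosen for this sketch.
sessions = 133
faults_injected = 708
public_postmortems = 3
tickets_created = 225  # slide says "over 225", so this is a lower bound
services_with_faults = 49


def summarize():
    """Return simple averages derived from the slide totals."""
    return {
        # Average number of faults injected in one Failure Friday session.
        "faults_per_session": round(faults_injected / sessions, 1),
        # Share of fault injections that led to a public postmortem, in percent.
        "postmortem_rate_pct": round(100 * public_postmortems / faults_injected, 2),
        # Lower bound on the average number of tickets created per session.
        "min_tickets_per_session": round(tickets_created / sessions, 1),
        # Average number of faults injected per distinct service.
        "faults_per_service": round(faults_injected / services_with_faults, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Read this way, the numbers work out to roughly five faults per session, with fewer than half a percent of injections ending in a public postmortem.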
Optimized for learning first, tooling second
Built the toolchain to enable other teams
Distributed chaos engineering knowledge