SRE Practices in Organizations
Narimichi Takamura
November 16, 2021
Technology
Slides from the talk at Infra Study 2nd #7, 「SREと組織」 ("SRE and Organizations").
https://forkwell.connpass.com/event/228038/
Transcript
about:me
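Motivation
• I understand the SRE tenets and the SRE practices
• But it is not obvious how to introduce SRE at my own company
• It seems to take more than IT skills, but what exactly is needed?
• Case studies from other companies are useful references
• But from what angles should we evaluate them before applying them in-house?
→ This talk covers soft skills and the design points of an SRE organization
→ I hope it helps you launch and mature an SRE organization at your own company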
Table of Contents
• Why is Organization Important in SRE?
• Soft Skills required to implement SRE
• SRE Organization Design
Why is Organization Important in SRE?
Business metrics include engineering metrics [1]
[1] Mohit Suley and Kurt Andersen, Understanding Business Metrics Can Make You a Better SRE, 2019, SREcon
SREs collaborate a lot!!
"Culture beats strategy every time"
(Chapter 31 - Communication and Collaboration in SRE)
Soft Skills required to implement SRE
Why are soft skills so important?
Chapter 31 - Communication and Collaboration in SRE
“A good SRE has an ability to critically examine a system and use that to guide them when asking questions of the system.”
(Jamie Wilkinson, SRE at Google)
Top 5 Soft Skills in SRE [3]
1. Problem solving
2. Teamwork
3. Composure under pressure
4. Written communication
5. Verbal communication
[3] Catchpoint, 2018 SRE Report
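Soft skills required of SREs
• Solving problems effectively requires the ability to work well with others
• SREs are not expected to know every answer; they need to know whom in the team or organization to ask for help and how to communicate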
Soft Skill Examples in Implementing SRE
Case: Soft Skill
• Postmortem: Blameless, Critical Thinking...
• SLI/SLO: Organizational Behavior...
• Building consensus with managers: Facilitation, Negotiation...
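Organizational Behavior
• Approaches to moving people fall into two categories
• HRM: approach through systems and mechanisms
• OB: interpersonal approach
• OB is useful in situations that involve other teams, such as SLI/SLO and postmortems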
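Underlying behavioral principles
Three important categories of foundational knowledge
Examples of foundational theories
• Individual
• ex. Spencer's iceberg model, Boyatzis's competency model, Bandura's components of self-efficacy
• Group
• ex. Lewin's organizational change process, the Tuckman model
• Leadership
• ex. charismatic leadership, servant leadership...
Bandura's components of self-efficacy [5]
[5] GLOBIS, MBA article series, "Self-transformation: back and forth, little by little" (final installment), 2015
Lewin's organizational change process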
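OB in practice: phased introduction of SLI/SLO
• Understand the organization's characteristics, then introduce SLI/SLO through the steps unfreeze → change → refreeze
• Understand the developers' behavioral principles, then lower the barriers to adopting SLI/SLO
• Work on measures that encourage behavior change, so that teams can act with SLI/SLO as the trigger
Example of phasing SLI/SLO adoption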
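SLI/SLO adoption: Phase 1
SRE first takes the lead in introducing SLI/SLO into the organization, with the aim of validating their value. Involve operations as a whole, but start within the scope that SRE can control.
1. SLIs/SLOs are defined
2. Workflows around SLI/SLO are defined
3. SRE leads SLO operations while involving the service teams
• Send alert notifications triggered by the SLO
• Hold retrospectives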
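SLI/SLO adoption: Phase 2
Aim for a setup in which SLI/SLO are operated without direct support from SRE. Move to this phase once the value of SLI/SLO has been recognized in Phase 1. It differs from Phase 1 in that more people and roles are involved. The goal is a state in which more people operate SLI/SLO from the customer's point of view.
1. SLI/SLO can be set together with PdMs, business owners and others, taking the business perspective into account
2. The service teams lead SLO operations
3. There is a setup in which Embedded SREs follow up with the service teams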
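The Phase 1 item about alerting on the SLO can be sketched roughly as follows. This is a minimal illustration, not something taken from the slides: the `Slo` class, the `checkout-availability` objective, the 99% target and the 25% `min_budget` threshold are all assumptions chosen for the example, and the error-budget formulation is one common way to turn an SLO into an alert condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A service level objective over a fixed evaluation window (hypothetical type)."""
    name: str
    target: float  # required fraction of good events, e.g. 0.99


def sli(good_events: int, total_events: int) -> float:
    """Fraction of events that were good; an empty window is treated as healthy."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Share of the error budget still unspent (1.0 = untouched, <= 0 = exhausted)."""
    allowed_bad = (1.0 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


def should_alert(slo: Slo, good_events: int, total_events: int,
                 min_budget: float = 0.25) -> bool:
    """Alert once less than `min_budget` of the error budget is left (assumed policy)."""
    return error_budget_remaining(slo, good_events, total_events) < min_budget


# Assumed example objective: 99% of checkout requests succeed.
checkout = Slo(name="checkout-availability", target=0.99)
print(should_alert(checkout, good_events=995, total_events=1000))  # half the budget left
print(should_alert(checkout, good_events=992, total_events=1000))  # about a fifth left
```

Keeping the threshold as a parameter leaves room for the retrospective to adjust it, which fits the idea of SRE starting within a scope it can control before handing operation over to service teams.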
Facilitation
• The skill of reaching a conclusion that everyone can accept
• Standard meeting-management techniques needed to prepare and run meetings effectively
"it's difficult to find someone who's lucky enough to only have useful, effective meetings. This is equally true for SRE."
(Chapter 31 - Communication and Collaboration in SRE)
SRE Organization Design
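Getting a sharper picture of the gap between ideal and reality
1. SRE often runs short of resources relative to demand
2. A structure that scales is needed
3. Keeping it scalable requires a variety of practices
4. Resources are limited in practice, so progress has to be incremental (balance is needed); do not stop thinking at that point
5. → Understand which points you can work with when building an SRE organization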
Three key points when building an SRE organization
• Roles
• Responsibilities
• Mindset
Two representative roles [6]
[6] New Relic, SRE-iously: Defining the Principles, Habits, and Practices of Site Reliability Engineering, 2018
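Responsibilities
• Make clear who is in charge of, and responsible for, each piece of work
• A RACI matrix is useful for making the following four elements explicit
• R (Responsible): the person who carries out the work
• A (Accountable): the person answerable for the outcome
• C (Consulted): the people to consult
• I (Informed): the people to keep informed
• Google's article also uses the RACI terminology [7]
[7] Alex Bramley, Are we there yet? Thoughts on assessing an SRE team’s maturity, 2021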
RACI Matrix example [8]
[8] Devops Raci Matrix Ppt Powerpoint Presentation File Format
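As a rough illustration of how a RACI matrix can be kept as data and checked, the sketch below maps each task to the RACI letter held by each role. The task names, role names and the `validate` helper are hypothetical, not taken from the slides. The check encodes a widely used convention (exactly one Accountable and at least one Responsible per task), and for simplicity it assumes each role holds a single letter per task.

```python
from enum import Enum


class Raci(Enum):
    R = "Responsible"
    A = "Accountable"
    C = "Consulted"
    I = "Informed"


def validate(matrix: dict[str, dict[str, Raci]]) -> list[str]:
    """Return a list of problems found; an empty list means the matrix passes."""
    problems = []
    for task, assignments in matrix.items():
        letters = list(assignments.values())
        accountable = letters.count(Raci.A)
        if accountable != 1:
            problems.append(
                f"{task}: expected exactly one Accountable, found {accountable}"
            )
        if Raci.R not in letters:
            problems.append(f"{task}: nobody is Responsible")
    return problems


# Hypothetical matrix: task -> role -> RACI letter.
matrix = {
    "Define SLI/SLO": {"SRE": Raci.R, "PdM": Raci.A, "Service team": Raci.C},
    "Run postmortem": {"SRE": Raci.R, "Service team": Raci.I},
}

for problem in validate(matrix):
    print(problem)
```

Here the second task is reported because no role is Accountable for it, which is the kind of gap the matrix is meant to surface.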
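Mindset
• An organization's reliability has five basic stages, each describing the organization's mindset at a given point in time [9]
• Absent: reliability is an afterthought for the organization
• Reactive: recent reliability problems are followed up, but there is no long-term investment in the system
• Proactive: potential reliability risks are identified and addressed through regular organizational processes
• Strategic: classes of risk are managed through systematic changes to architecture, products, and processes
• Visionary: the organization has reached the highest level of reliability and can promote broad reliability efforts outside the company, based on best practices and experience
[9] What’s your org’s reliability mindset? Insights from Google SREs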
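Key points about Mindset
• You do not necessarily need to be in the Strategic or Visionary phase
• It is common to have attributes that span several phases
• For example, mostly reactive with some proactive attributes
• The mindset needs to change along with the state of the organization
• e.g. reactive → proactive → strategic
• Move up through the phases by abstracting work, passing on skills, and writing down the thinking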
Lessons Learned
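Why is Organization Important in SRE?
• Because reliability is an important business metric and affects the whole company
• Reliability is closely tied to customers and is hard for the SRE team to manage on its own
• Because practicing SRE requires an organization-wide effort through a great deal of collaboration
• Because practicing on the basis of consistent tenets requires cultivating a culture and sharing values
• It has to be tackled as an organization, not by individuals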
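Soft Skills required to implement SRE
• SRE needs soft skills as well as hard skills
• Introduced examples of the soft skills that matter when bringing SRE into an organization
• Explained Organizational Behavior and Facilitation as skills that help with practices such as SLI/SLO and postmortems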
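SRE Organization Design
• Introduced three key points for building the SRE organization that suits your company
• To move forward a little at a time, shift each point in stages
• Roles: start with Pure SRE, then gradually consider Embedded SRE
• Responsibilities: SRE takes on R at first, then gradually delegates authority and moves to A or C
• Mindset: first eliminate Absent, then find the parts that can change and move them toward Reactive and Proactive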
We are Hiring! topotal.com/careers/software_engineer_sre