Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Who owns the Service Level?
Search
Takeshi Kondo
May 15, 2022
Technology
5
10k
Who owns the Service Level?
SRE NEXT 2022
https://sre-next.dev/2022/
Takeshi Kondo
May 15, 2022
Tweet
Share
More Decks by Takeshi Kondo
See All by Takeshi Kondo
SRE の考えをマネジメントに活かす / applying SRE ideas to management
chaspy
7
4.7k
RAGの簡易評価によるフィードバックサイクル実践 / Feedback cycle practice through simplified assessment of RAGs
chaspy
2
4.4k
定量データと定性評価を用いた技術戦略の組織的実践 / Systematic implementation of technology strategies using quantitative data and qualitative evaluation
chaspy
9
1.5k
エンジニアブランディングチームの KPI / KPI's of engineer branding team
chaspy
2
1.7k
「SLO Review」今やるならこうする / If I had to do the "SLO Review" again
chaspy
3
1.6k
開発者とともに作る Site Reliability Engineering / SREing with Developers
chaspy
10
7.6k
自己診断能力の獲得を目指して / Toward the acquisition of self-diagnostic skills
chaspy
1
4.4k
『スタディサプリ 中学講座』における E2E Test の運用と計測による改善 / Improved E2E testing through measurement
chaspy
0
4.1k
『スタディサプリ』における SLI/SLO の継続的改善 / Continuous improvement of SLI/SLO at StudySapuri
chaspy
1
3k
Other Decks in Technology
See All in Technology
安心してください、日本語使えますよ―Ubuntu日本語Remix提供休止に寄せて― 2024-11-17
nobutomurata
1
990
IBC 2024 動画技術関連レポート / IBC 2024 Report
cyberagentdevelopers
PRO
0
110
複雑なState管理からの脱却
sansantech
PRO
1
140
マルチプロダクトな開発組織で 「開発生産性」に向き合うために試みたこと / Improving Multi-Product Dev Productivity
sugamasao
1
300
RubyのWebアプリケーションを50倍速くする方法 / How to Make a Ruby Web Application 50 Times Faster
hogelog
3
940
The Role of Developer Relations in AI Product Success.
giftojabu1
0
120
【Startup CTO of the Year 2024 / Audience Award】アセンド取締役CTO 丹羽健
niwatakeru
0
990
Lexical Analysis
shigashiyama
1
150
Oracle Cloud Infrastructureデータベース・クラウド:各バージョンのサポート期間
oracle4engineer
PRO
28
12k
Lambda10周年!Lambdaは何をもたらしたか
smt7174
2
110
リンクアンドモチベーション ソフトウェアエンジニア向け紹介資料 / Introduction to Link and Motivation for Software Engineers
lmi
4
300k
Python(PYNQ)がテーマのAMD主催のFPGAコンテストに参加してきた
iotengineer22
0
470
Featured
See All Featured
A designer walks into a library…
pauljervisheath
204
24k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5k
What’s in a name? Adding method to the madness
productmarketing
PRO
22
3.1k
A Philosophy of Restraint
colly
203
16k
YesSQL, Process and Tooling at Scale
rocio
169
14k
GraphQLの誤解/rethinking-graphql
sonatard
67
10k
StorybookのUI Testing Handbookを読んだ
zakiyama
27
5.3k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
48k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
42
9.2k
Imperfection Machines: The Place of Print at Facebook
scottboms
265
13k
What's in a price? How to price your products and services
michaelherold
243
12k
A better future with KSS
kneath
238
17k
Transcript
Who owns the Service Level? Takeshi Kondo / @chaspy 2022/05/15
SRE NEXT 2022
Who am I chaspy chaspy_ Engineering Manager, Site Reliability at
Recruit Co., Ltd. Takeshi Kondo https://chaspy.me
લఏɿϓϩμΫτհ - ελσΟα ϓϦ
Hello, again! SRE NEXT
ࠓ͢͜ͱ SRE Λ࣮ݱ͢ΔͨΊʹඞཁͳ͜ͱ
Tl;dr / SRE Λ࣮ݱ͢ΔͨΊʹඞཁͳ͜ͱ • ࢦඪΛݩʹվળ͠ଓ͚ΔจԽ • ࣗݾ݁ԽΛࢧ͑Δ Platform •
༏ઌॱҐΛมߋ͢ΔͨΊͷٕज़ઓུ
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
લఏ෦3݄ͷ6ࣾ߹ಉษڧձͱಉ͡༰Ͱ͓͞Β͍͠·͢
SRE Λ࣮ݱ͢Δͱ
։ൃνʔϜ͕৴པੑΛ ίϯτϩʔϧ͢Δ Capability Λ ʹ͚͍ͭͯΔ͜ͱ
ͦͦ Site Reliability Engineering ͱ: Not like this • αʔϏε͕ʮߴ͍৴པੑ
(ʹ100%)ʯΛอ͍ͬͯΔ͜ͱ • SLI/SLO ΛकΕ͍ͯΔ͜ͱ • ΦϯίʔϧϩʔςʔγϣϯΛ։ൃνʔϜͰߦ͏͜ͱ https://github.com/twitter/twemoji
ͦͦ Site Reliability Engineering ͱ: Like this! • αʔϏε͕ʮϢʔβ͕ظ͢Δ৴པੑʯΛอ͍ͬͯΔ͜ͱ •
SLI/SLO Λઃఆ͠ɺඇػೳཁ݅ͱػೳཁ݅ͷ༏ઌܾఆͷ ࢦඪͱͯ͠׆༻͍ͯ͠Δ • SLO ҧ͕ൃੜͨ͠ͱ͖ʹదʹରॲͰ͖ΔΑ͏ͳϞχλ Ϧϯάํ๏ͱϙϦγʔ͕νʔϜͰಉҙ͞Ε͍ͯΔ • ্ه͕ఆظతʹݟ͞Ε͍ͯΔ https://github.com/twitter/twemoji
։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹ͚ͭΔ: Like this! SRE ։ൃ νʔϜ ։ൃνʔϜͷ৴པੑʹ ؔ͢Δ
Capability औಘ Λࢧԉ͢Δ ࣗͨͪͷαʔϏεͷ ৴པੑΛࣗͨͪͰί ϯτϩʔϧͰ͖͍ͯΔ
Team Topologies • 4ͭͷνʔϜύλʔϯ • Stream Aligned • Platform •
Enabling • Complicated Subsystem • 3ͭͷίϛϡχέʔγϣϯύλʔϯ • Collaboration • X as a Service • Facilitation https://pub.jmam.co.jp/book/b593881.html
։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹ͚ͭΔ: Like this! SRE ։ൃ νʔϜ ։ൃνʔϜͷࣗݾ݁ԽΛ ࢧ͑ΔϓϥοτϑΥʔϜͱ
จԽΛ࡞Δ Platform Team Enabling Team Stream Aligned Team ࣗͨͪͰඞཁͳͷΛ ࣗͨͪͰ༻ҙͰ͖Δ = self-contained / ࣗݾ݁Խ
• ࢧԉҰ࣌తͰɺظԽͤͯ͞ͳΒͳ͍ ༨ஊ: Enabling ͱ͍͏୯ޠʹϙδςΟϒɾωΨςΟϒ྆໘͋Δ https://ja.wikipedia.org/wiki/%E3%82%A4%E3%83%8D%E3%83%BC%E3%83%96%E3%83%AA%E3%83%B3%E3%82%B0
SRE Team ͷ Vision / Mission / Values https://blog.studysapuri.jp/entry/sre-vision-mission-values
Mission ࣗݾ݁νʔϜ͕ϓϩμΫ τΛૉૣ҆͘શʹಧ͚ଓ͚ ΔͨΊͷϓϥοτϑΥʔϜ ͱจԽΛ࡞Δ
ͳͥࣗݾ݁Խ͕ॏཁ͔ SRE ։ൃ νʔϜ ։ൃνʔϜͷࣗݾ݁ԽΛ ࢧ͑ΔϓϥοτϑΥʔϜͱ จԽΛ࡞Δ Platform Team Enabling
Team Stream Aligned Team ࣗͨͪͰඞཁͳͷΛ ࣗͨͪͰ༻ҙͰ͖Δ = self-contained / ࣗݾ݁Խ
ͳͥࣗݾ݁Խ͕ॏཁ͔: Not “VS”, but “And” • Dev vs and Ops
• Ϣʔβ͔ΒߴʹϑΟʔυόοΫΛಘΔ (DevOps) • Dev vs and Infrastructure • ηϧϑαʔϏεͰߏஙͯ͠ϦʔυλΠϜॖ • Productivity vs and Reliability • ੜ࢈ੑͱ৴པੑ૬ޓʹґଘ͢Δ
ͳͥࣗݾ݁Խ͕ॏཁ͔ • ࣗݾ݁νʔϜͳΒ • ຊ൪ڥ͔ΒϑΟʔυόοΫΛಘଓ͚ΒΕΔ • ࠷খݶͷίϛϡχέʔγϣϯͰߴͳҙࢥܾఆ͕Ͱ͖Δ • ػೳཁٻ͚ͩͰͳ͘ඇػೳཁٻʹԠ͑Δ͜ͱ͕Ͱ͖Δ
• ։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹ͚ͭͯ ͍Δ͜ͱ • ։ൃνʔϜ͕”ࣗݾ݁Խ”͍ͯ͠Δঢ়ଶ • SRE νʔϜ͜ΕΛϓϥοτϑΥʔϜͱจԽৢͰࢧ͑Δ
• ͜ΕΛ࣮ݱ͢ΔʹϓϩμΫτ։ൃʹด͡ͳ͍ଟ༷ͳࢹ͕ඞཁ • ϢʔβͷظΛΔ / Product Management • ߴ͍։ൃੜ࢈ੑ / Development Skills • ඇػೳཁٻʹͲΕ͚ͩίετΛ͔͚Δ͔ / Business Development ·ͱΊɿSRE Λ࣮ݱ͢Δͱ
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱSRE ~ @chaspy ೖࣾޙ • 2018: @chaspy ೖࣾ • 2019:
Application Platform Λ Kubernetes Ҡ • 2020: Microservices Readiness ͷඋ • αʔϏεΦʔφʔγοϓͷࡦఆ • Design Doc / Production Readiness Checklist • Self-services Infrastructure (terraform monorepo) • SLI/SLO • 2021: SLI/SLO ӡ༻Λ։ൃνʔϜʹશҠৡ Platform Team ͱͯ͠ Platform Λ࡞͍ͬͯΔ Enabling Team ͱͯ͠ ։ൃ৫ʹ SLI/SLO ͳͲͷΧϧνϟʔৢ
ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱSRE ~ 2021 • COVID-19 ྲྀߦɺΞΫηε૿େ • Platform ͷਐԽ •
Terraform monorepo • Loadtest Platform • GitHub Actions ʹΑΔ monorepo CI • ৫ͷมԽ • ٕज़ઓུάϧʔϓൃ • ࣄۀҠʹΑΓϦΫϧʔτస੶ɺQuipper ຊࢧళਫ਼ࢉ • chaspy EM ༻
ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱSRE ~ 2022 • ৽͍͠ SRE ࣮ફͷܗ • Partially Embedded
SRE • Partially Enabling SRE in development team
৫نͷਪҠ ։ൃऀ 35 53 54
73 114 43& 4 5 7 7 7 ։ൃऀελσΟαϓϦɾQuipper ྆ํͷɺWeb Engineer (frontend&backend) ͷɻNative আ֎͍ͯ͠Δɻ 2022͔ΒۀҕୗͷํΧϯτ͍ͯ͠Δɻ2021Ҏલۀҕୗͷํͱࣄ͍ͯͨ͠ɻ
ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱSRE • ͍ͣΕͷ࣌ Platform Team ͱ Enabling Team ͷৼΔ ͍Λ͍ͯ͠Δ
• ಛʹ2019͔Βʮࣗݾ݁ԽʯΛςʔϚʹɺ͓ئ͍͞Ε Δ͜ͱΛۃྗݮΒͤΔ Platform Λ࡞͖ͬͯͨ • ಉ࣌ʹ։ൃνʔϜͷʮจԽΛͭ͘Δʯ͜ͱʹ౿ΈࠐΈɺSLI/ SLO Λݟ͍ͯ͘จԽΛ৫ʹৢͨ͠ • →ʮSLO Reviewʯat SRE NEXT 2020
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
ʮSLO Reviewʯat SRE NEXT 2020 • ։ൃ৫ʹ SLO Λ Review
͍ͯ͘͠จԽΛ࡞ͬͨऔΓΈ • 2 Product, 15 Team ʹϘτϜΞοϓͰಋೖ • ొΓํͷ4εςοϓ • γεςϜͱ৫ͷΦʔφʔγοϓΛܾΊΔ • 1ਓͰϓϩηεΛ·Θ͠ɺ࣮ݱํ๏Λཱ֬͢Δ • Developer ͱҰॹʹ SLI/SLO Λఆٛ͠ɺϓϩηεΛ·Θ͢ • Error Budget Policy Λఆٛͯ͠ߦಈ͢Δ(ະ࣮ݱ) • ಘֶͨͼ • ඪ४Խ͞Εͨ SLI Λఏڙ͢Δ • ઃఆૣ͍ஈ֊ͰίʔυԽ͢Δ • ֶशۂઢΛٸޯʹ͢Δ https://blog.studysapuri.jp/entry/2020/01/30/slo-review
Α͔ͬͨ • ։ൃ৫ʹ SLO Λ Review ͍ͯ͘͠จԽΛ࡞ͬͨऔΓΈ • 2 Product,
15 Team ʹϘτϜΞοϓͰಋೖ • ొΓํͷεςοϓ • γεςϜͱ৫ͷΦʔφʔγοϓΛܾΊΔ • 1ਓͰϓϩηεΛ·Θ͠ɺ࣮ݱํ๏Λཱ֬͢Δ • Developer ͱҰॹʹ SLI/SLO Λఆٛ͠ɺϓϩηεΛ·Θ͢ • Error Budget Policy Λఆٛͯ͠ߦಈ͢Δ(ະ࣮ݱ) • ಘֶͨͼ • ඪ४Խ͞Εͨ SLI Λఏڙ͢Δ • ઃఆૣ͍ஈ֊ͰίʔυԽ͢Δ • ֶशۂઢΛٸޯʹ͢Δ https://blog.studysapuri.jp/entry/2020/01/30/slo-review ։ൃνʔϜͷೝෛՙΛపఈత ʹԼ͛Δ͜ͱʹͩ͜Θͬͨ తෆ࣮֬ੑͷݮͷͨΊ ϑΟʔυόοΫαΠΫϧΛճͨ͠
Α͘ͳ͔ͬͨʁ • ։ൃ৫ʹ SLO Λ Review ͍ͯ͘͠จԽΛ࡞ͬͨऔΓΈ • 2 Product,
15 Team ʹϘτϜΞοϓͰಋೖ • ొΓํͷεςοϓ • γεςϜͱ৫ͷΦʔφʔγοϓΛܾΊΔ • 1ਓͰϓϩηεΛ·Θ͠ɺ࣮ݱํ๏Λཱ֬͢Δ • Developer ͱҰॹʹ SLI/SLO Λఆٛ͠ɺϓϩηεΛ·Θ͢ • Error Budget Policy Λఆٛͯ͠ߦಈ͢Δ(ະ࣮ݱ) • ಘֶͨͼ • ඪ४Խ͞Εͨ SLI Λఏڙ͢Δ • ઃఆૣ͍ஈ֊ͰίʔυԽ͢Δ • ֶशۂઢΛٸޯʹ͢Δ https://blog.studysapuri.jp/entry/2020/01/30/slo-review ͳͥ͏·͍͔͘ͳ ͔ͬͨͷ͔ʁ
ʢ࠶ܝʣSRE Λ࣮ݱ͢ΔͱԿ͔ • ։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹͭ ͚͍ͯΔ͜ͱ • ʮࢦඪΛݩʹߦಈ͕Ͱ͖Δʯ͜ͱ https://blog.studysapuri.jp/entry/2020/01/30/slo-review
4-*4-0 &SSPS#VEHFU 1PMJDZ ։ൃνʔϜ ఆٛ͢Δ / ఆظతʹݟ͢ Policy ʹैͬͯߦಈ͢Δ
ʢ࠶ܝʣSRE Λ࣮ݱ͢ΔͱԿ͔ • ։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹͭ ͚͍ͯΔ͜ͱ • ʮࢦඪΛݩʹߦಈ͕Ͱ͖Δʯ͜ͱ https://blog.studysapuri.jp/entry/2020/01/30/slo-review
4-*4-0 &SSPS#VEHFU 1PMJDZ ։ൃνʔϜ ఆٛ͢Δ / ఆظతʹݟ͢ Policy ʹैͬͯߦಈ͢Δ Ͱ͖ͨ
ʢ࠶ܝʣSRE Λ࣮ݱ͢ΔͱԿ͔ • ։ൃνʔϜ͕৴པੑΛίϯτϩʔϧ͢Δ Capability Λʹͭ ͚͍ͯΔ͜ͱ • ʮࢦඪΛݩʹߦಈ͕Ͱ͖Δʯ͜ͱ https://blog.studysapuri.jp/entry/2020/01/30/slo-review
4-*4-0 &SSPS#VEHFU 1PMJDZ ։ൃνʔϜ ఆٛ͢Δ / ఆظతʹݟ͢ Policy ʹैͬͯߦಈ͢Δ ͏·͍͔͘ͳ͔ͬͨ
ͳͥ”ߦಈ͢Δ”·ͰࢸΒͳ͔ͬͨͷ͔ • ࣌ɺSLO ҧ࣌ͷΞΫγϣϯ Product Manager / Team ʹҠৡ͍ͯͨ͠ •
·ͬͨ͘ԿͰ͖ͳ͔ͬͨΘ͚Ͱͳ͍ • ͱͱνʔϜʹ༧ࢉͷ͋ΔɺվળͷͨΊͷ࣌ؒͰͰ͖Δ͜ͱ͔͠Ͱ͖ͳ͔ͬͨʢִि1ʣ • QB Day ͱݺΕΔ • Τϥʔʹର͢Δతͳରॲɺܰඍͳ Performance վળͳͲ • ΞʔΩςΫνϟมߋɺΠϯϑϥͳͲɺظతɾࠜຊతରॲ͔ͬͨ͠ • ʮࢦඪΛݩʹػೳཁٻͱඇػೳཁٻͷ༏ઌஅ͕Ͱ͖Δʯ·Ͱ౸ୡ͠ͳ͔ͬͨ • ༏ઌஅʹʹཱͨͳ͍ͷͰ͋Εɺ։ൃνʔϜʹͱͬͯΔ͜ͱ͕૿͚͑ͨͩͱݴ͑Δ
ͳͥ”ߦಈ͢Δ”·ͰࢸΒͳ͔ͬͨͷ͔ • ৴པੑͷ(ඇػೳཁٻ)ʹରॲ͢Δݖݶɾ༧ࢉෆ • ػೳ։ൃͷ༏ઌ͕ߴ͔ͬͨ
SLO ҧͨ͠Β“ϦϦʔεετοϓ”ͱ͍͏ݬ • SLO ҧͨ͠Βશͯͷ։ൃΛࢭΊͯͦͷमਖ਼Λߦ͏ͷݱ࣮ తͰͳ͍ • ͦΕΛߦ͏ͨΊʹࣄۀऀ͕͜ͷӡ༻ʹ߹ҙ͢Δඞཁ͕͋Δ • ͦΕΛߦ͏ͨΊʹ”ᘳͳSLI/SLO”͕ඞཁ͕ͩɺ”ᘳͳSLI/SLO”
ଘࡏ͠ͳ͍ • ৫͕֦େ͠ɺBiz Dev / Product / CS ͱΛ୲͍ͯ͠Δঢ়گͰɺ ݱஅͰ͜ΕΛΔͷίϛϡχέʔγϣϯίετ͕ߴ͗͢Δ
SLI/SLO ӡ༻ͷཧ: ͱ͋Δ Developer ͷίϝϯτ
ʮSLO Reviewʯat SRE NEXT 2020 ͦͷޙͷ·ͱΊ • ʮ৴པੑࢦඪΛఆΊɺ؍͢ΔʯจԽΛ࡞ͬͨ͜ͱʹՁ͕͋ͬͨ • ࣄۀઓ্ུͷҙࢥܾఆʹʹཱͭࢦඪʹҭͭ·ͰʹࢸΒͳ͔ͬͨ
• ཧ༝1. ඇػೳཁٻͱػೳཁٻͷόϥϯεΛม͑Δҙࢥܾఆݖݶɾ༧ࢉ͕ϓϩμΫτ ։ൃνʔϜʹͳ͔ͬͨ • ৽نػೳ։ൃͷΠϯηϯςΟϒ͕େ͖͍ঢ়گ • ͦͷΑ͏ͳٕज़ઓུ/ٕज़ࢿΛϓϩμΫτ All Ͱߦ͑ΔΈ͕ͳ͔ͬͨ • ཧ༝2. ৴པੑࢦඪ͕ Biz/Dev/SRE શһ͕ཧղ͍͢͠ࢦඪͰͳ͔ͬͨ • backend API ͷ SLI ϢʔβମݧΛද͓ͯ͠ΒͣɺLatency ʹؔ͢Δରॲ TPM ͷઆ໌͍͠
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
2021 ελσΟαϓϦ খதߴϓϩμΫτ։ൃ෦ ٕज़ઓུάϧʔϓൃ
ελσΟαϓϦখֶɾதֶɾߴߍɾେֶडݧߨ࠲ ελσΟαϓϦ For TEACHERS ελσΟαϓϦ For SCHOOL ݱঢ়ͷ৫ਤ: খதߴϓϩμΫτ։ൃ෦ ҎԼ17άϧʔϓ
TPM BtoB TPM BtoC TPM ForSCHOOL TPM ԣஅ BtoC BtoB QA ։ൃࢧԉ SRE ٕज़ઓུ ίʔνϯά ৽ن։ൃ1 Τϯϋϯε ֶशࢧԉ Native iOS Android ৽ن։ൃ2 ਐ࿏ओମੑ ίϛϡχέʔγϣϯࢧԉ ForSCHOOLϞόΠϧ
Disclaimer • ٕज़ઓུάϧʔϓͷ্ཱͪ͛લϚωʔδϟ͕ߦͬͨͷ • ࡢ࣌Ͱ @chaspy DevOps WG ͷ
Lead -> EM/Lead • લͷୀ৬ʹ͍ٕज़ઓུάϧʔϓͷ EM ෦͕݉ͭ͠ ͭɺଞ໊ͷ EM ͱҰॹʹӡӦ͍ͯ͠Δ • SLO ҧͷରॲ͕Ͱ͖ͳ͍͜ͱ͕ཧ༝Ͱ্ཱ͕ͪͬͨΘ͚Ͱ ͳ͍
ٕज़ઓུͱ ϓϩμΫτ։ൃ৫ʹ͓͍ͯ ԿΛΔ͔ɺԿΛΒͳ͍͔
ͳٕͥज़ઓུ͕ඞཁ͔ • Ϗδωεͷʹਵ͢ΔͨΊ • มߋΛ͍͢͠γεςϜɾίʔυɾ৫ʹ͢Δ • มߋΛ͛ΔཁҼઌճΓͯ͠ରॲ͢Δ(ٕज़తෛ࠴ͷฦࡁ) • ͦͷঢ়ଶΛظతʹҡ࣋Ͱ͖ΔೳྗΛಘΔ
ͳٕͥज़ઓུ”άϧʔϓ”͕ඞཁ͔ • ٕज़ઓུͷܾΊํ৫ʹΑͬͯҟͳΔ • 1ਓͷ CTO ͕τοϓμϯͰܾΊ͍͍ͯ • ϘτϜΞοϓͰશһ߹ٞͰܾΊ͍͍ͯ •
ͦͷதؒͰ͍͍ • ελσΟαϓϦখதߴ։ൃ৫ٕज़ઓུΛ1ਓʹґଘ͠ͳ͍ΈΛ ࡞Δ͜ͱʹઓ͍ͯ͠Δ
ٕज़ઓུάϧʔϓͱ • త • ϓϩμΫτ։ൃ৫ͱͦͷγεςϜΛΑΓมԽʹڧ͘͢Δ • ඪ • ٕज़తͳϏδϣϯͱํͷࡦఆ •
ٕज़త՝ɾෛ࠴ΛίϯτϩʔϧԼʹஔ͘ • վળαΠΫϧͷཱ֬ͱࣗݾஅೳྗͷ֫ಘ
࠷ॳͷ1Ͱͬͨ͜ͱ • ৽ن։ൃɿΤϯϋϯεɿٕज़తෛ࠴ղফ = 1:1:1 ͷ༧ࢉએݴ • ٕज़తෛ࠴ղফҊ݅ΛܾΊͯ։ൃϩʔυϚοϓʹΈࠐΉ • ٕज़՝ͷཧํ๏ͱൃੜ࣌ͷτϦΞʔδํ๏ͷܾఆ
SLO ҧ࣌ͷΞΫγϣϯ͕औΕΔΑ͏ʹΈ͕Ͱ͖ͨ • Ұఆඇػೳཁٻʹ༧ࢉ͕ͯΒΕɺൃੜٕͨ͠ज़՝Λཧ ͠ɺ࣮ߦͰ͖Δମ੍͕ͬͨ • SLO ҧͨ͠ࡍʹ࣮ࢪ͢Δ͜ͱ͕͔ٕͬͨ͠ज़՝Λܭը ʹΈࠐΊΔΑ͏ʹͳͬͨ
׆ಈମ • త • ϓϩμΫτ։ൃ৫ͱͦͷγεςϜΛΑΓมԽʹڧ͘͢Δ • ඪ • ٕज़తͳϏδϣϯͱํͷࡦఆ •
ٕज़త՝ɾෛ࠴ΛίϯτϩʔϧԼʹஔ͘ • վળαΠΫϧͷཱ֬ͱࣗݾஅೳྗͷ֫ಘ DevOps WG ԣஅWG Backend WG Frontend WG
׆ಈମ • త • ϓϩμΫτ։ൃ৫ͱͦͷγεςϜΛΑΓมԽʹڧ͘͢Δ • ඪ • ٕज़తͳϏδϣϯͱํͷࡦఆ •
ٕज़త՝ɾෛ࠴ΛίϯτϩʔϧԼʹஔ͘ • վળαΠΫϧͷཱ֬ͱࣗݾஅೳྗͷ֫ಘ DevOps WG ԣஅWG Backend WG Frontend WG
ٕज़త՝ɾෛ࠴ΛίϯτϩʔϧԼʹஔ͘ • ٕज़՝Λ͛ࠐΊΔॴɾΈΛ࡞Δ • ٕज़՝ͷ༏ઌΛ͚ͭΔ • དྷظͷ։ൃϩʔυϚοϓʹଓ͢Δ
ٕज़՝Λ͛ࠐΊΔॴɾΈΛ࡞Δ ୭Ͱٕज़՝Λ͛ࠐΊΔ spreadsheet reacji Ͱؾܰʹ͛ࠐΊΔ
͛ࠐ·Εͨ՝ͷ re fi nement ͭΒ͞ = ϖΠϯΛݴޠԽ ݱ࣌Ͱු͔Ϳ How Λهࡌ
(༏ઌ͕͚ͭΒΕΕ͍͍ͷͰɺ͜ͷཝ optional)
՝ͷ༏ઌॱҐ͚ ϖΠϯͷසͱڧͰϚοϐϯά ٕज़՝ͷ૬ରධՁΛ͚ͭΔ
དྷظͷ։ൃϩʔυϚοϓʹଓ͢Δ ϖΠϯͷԽ ։ൃϩʔυϚοϓଓঢ়گ
ͪΖΜɺᘳͰͳ͍ • ٕज़՝සͱڧͰଌΕΔͷͰͳ͍ • ఆੑతͰ͋Δ • ࢀՃϝϯόʔͷภΓ͕͋Δ͔ • ෳ member
ͷ vote ݁Ռͷॏ৺ʹஔ͍͍ͯΔͷͰਫ਼ʹٙ • ։ൃϦιʔεɺٕज़తқɺϦεΫʹΑ͙ͬͯ͢ʹऔΓ͔͔Εͳ ͍՝͋Δ • ՝ͷ༏ઌ͚ʹ͕͔͔࣌ؒΔ • etc…
͔͠͠ɺಛఆͷਓʹґଘ͠ͳ͍ελʔτϥΠϯʹཱͬͨ • ୭Ͱٕज़՝Λද໌͢Δ͜ͱ͕Ͱ͖Δ • ද໌ͨ͠՝ඞͣऔΓѻΘΕΔ • શһͰʹ͖߹͑Δ - vs
ࢲͨͪ • ظతͳ৴པੑͷ Controlability ୲อʹཱͭ
׆ಈମ • త • ϓϩμΫτ։ൃ৫ͱͦͷγεςϜΛΑΓมԽʹڧ͘͢Δ • ඪ • ٕज़తͳϏδϣϯͱํͷࡦఆ •
ٕज़త՝ɾෛ࠴ΛίϯτϩʔϧԼʹஔ͘ • վળαΠΫϧͷཱ֬ͱࣗݾஅೳྗͷ֫ಘ DevOps WG ԣஅWG Backend WG Frontend WG
DevOps WG ͷతͱ׆ಈ • తɿʢ։ൃνʔϜͷʣࣗݾஅೳྗͷ֫ಘͷͨΊʹઃஔ • ϝϯόʔ ྖҬ͝ͱͷ WebDev /
QA / SRE • ׆ಈ༰ • όϦϡʔετϦʔϜϚοϐϯάͷ࣮ࢪ • ީิͱͳΔ Metrics / Indicator ͷચ͍ग़͠ͱܭଌ • DX Criteria ͷ࣮ࢪ • όϦϡʔετϦʔϜΛ્͢ΔཁҼͷղܾ(e.g. E2E Automation) • ϓϩμΫτ։ൃ෦֎ͷใ׆ಈ • ·ͣ༗ޮͦ͏ͳ metrics ΞηεϝϯτΛݕূͨ͠
όϦϡʔετϦʔϜϚοϐϯά Meta Issue (Epic) ࡞~ ։ൃ
όϦϡʔετϦʔϜϚοϐϯά ։ൃྃ ~ εϓϦϯτϨϏϡʔ QA ~ ຊ൪ϦϦʔε
ީิͱͳΔ Metrics / Indicator ͷચ͍ग़͠ͱܭଌ όϦϡʔετϦʔϜϚοϐ ϯάΛΠϯϓοτʹɺվ ળͨ͠߹ͷϝϦοτͷ ԾઆɺͦΕΛ્ͯ͠ ͍ΔΛϐοΫΞοϓ
Q https://blog.studysapuri.jp/entry/2020/08/17/dx-criteria-system
ϓϩμΫτ։ൃ෦֎Ͱͷใ׆ಈ: BtoC All Hands Ͱͷൃද https://blog.studysapuri.jp/entry/2020/08/17/dx-criteria-system • ॴଐάϧʔϓΛ͑ͨࣄۀঢ়گΛΔ • Ϛʔέοτχϡʔε
• ࣄۀঢ়گ • ϓϩμΫτ KPI • SLI / ։ൃऀੜ࢈ੑ • ͦͷଞτϐοΫ͞·͟· • SRE ͱͳʹʁ • ϚΠΫϩαʔϏεͬͯͳʹʁ͏Ε͍͠ͷʁ
ϓϩμΫτ։ൃ෦֎Ͱͷใ׆ಈ: BtoC All Hands Ͱͷൃද • “͡Ίͯͷ SRE” ͱͯ͠։ൃ෦Ҏ֎ͷํ͚ʹൃද
ϓϩμΫτ։ൃ෦֎Ͱͷใ׆ಈ: BtoC All Hands Ͱͷൃද • “͡Ίͯͷ SRE” ͱͯ͠։ൃ෦Ҏ֎ͷํ͚ʹൃද
DevOps WG ͷతͱ׆ಈ • తɿʢ։ൃνʔϜͷʣࣗݾஅೳྗͷ֫ಘͷͨΊʹઃஔ • ʢ։ൃνʔϜ͕ʣࢦඪΛݩʹߦಈ͢ΔจԽΛ࡞Ζ͏ͱ͍ͯ͠Δ • SLI/SLO Λݟͯ৴པੑΛ؍͍ͯ͘͜͠ͱͱಉ͡Ͱ…?
SRE ͱٕज़ઓུ • DevOps WG ͷ׆ಈ ʮSRE ͷ࣮ݱʯͷจԽ໘Ͱͷ֦ு • զʑ͕ݟΔ͖ࢦඪγεςϜͷ৴པੑࢦඪ͚ͩͰͳ͍
• ͋ΒΏΔͷΛࢦඪΛݟͯɺҙࢥܾఆ͢Δ • ࠓޙ͜ͷจԽৢͦͷͷͷվળαΠΫϧΛճ͢ • 1. ީิͱͳΔ metrics ͷ༗ޮੑ͕໌Β͔ʹͳΓɺԽ͢Δ • 2. ։ൃνʔϜ͕ͦΕΛݟͯɺΞΫγϣϯΛߟ͑Δ͜ͱ͕Ͱ͖Δ • 3. ։ൃνʔϜ͕ΞΫγϣϯ->վળͷαΠΫϧΛճ͢ • 4. 1-3 ͦΕࣗମ͕͏·͍͍ͬͯ͘Δ͔ΛධՁ͢Δ
SRE ͱٕज़ઓུ: ·ͱΊ • SRE Λ࣮ݱ͢ΔͨΊʹɺSLO ҧΛͨ࣌͠ʹߦಈͰ͖Δ༧ࢉͱݖݶ ͕ඞཁ • ͦͷ্Ͱɺٕज़՝Λղܾ͢Δ༏ઌॱҐΛ͚ͭΒΕΔٕज़ઓུ͕ඞཁ
• ʰελσΟαϓϦʱখதߴϓϩμΫτ։ൃ෦Ͱ͜ͷٕज़ઓུΛ1ਓ ʹґଘͤͣɺάϧʔϓͰ࣮ݱ͢Δ͜ͱʹઓ͍ͯ͠Δ • ͋ΒΏΔͷΛࢦඪͰݟ͍ͯ͘จԽ͕৴པੑͷͨΊʹॏཁ
Agenda • લఏɿSRE Λ࣮ݱ͢ΔͱͲ͏͍͏͜ͱ͔ • ྺ࢙͔ΒৼΓฦΔʰελσΟαϓϦʱࣄۀʹ͓͚Δ SRE • 2020ʮSLO ReviewʯͰͷֶͼͱࣦഊ
• SRE ͱٕज़ઓུ • ·ͱΊͱࠓޙ
·ͱΊ / SRE Λ࣮ݱ͢ΔͨΊʹඞཁͳ͜ͱ • ࢦඪΛݩʹվળ͠ଓ͚ΔจԽ • ࣗݾ݁ԽΛࢧ͑Δ Platform •
༏ઌॱҐΛมߋ͢ΔͨΊͷٕज़ઓུ
SRE Λ࣮ݱ͢Δͱ
։ൃνʔϜ͕৴པੑΛ ίϯτϩʔϧ͢Δ Capability Λ ʹ͚͍ͭͯΔ͜ͱ
SRE “NEXT” in ʰελσΟαϓϦʱ • “৴པੑ” ʹؔͯ͠ Enabling Team ͱͯ͠ͷ
SRE Team ׂΛՌͨͭͭ͋͠Δ • SRE Team ͷࠓޙ • ΑΓ৫Λ Sustainable / Scalable ʹ͢ΔͨΊʹɺPlatform ʹؔ͢Δ ΦϯϘʔσΟϯάͷ֦ॆɺ։ൃνʔϜ͕ࣗతʹ৴པੑʹؔ͢Δ Capability शಘΛஅͰ͖ΔΞηεϝϯτΛఏڙ͢Δ • ৴པੑ͚ͩͰͳ͍ɺ։ൃੜ࢈ੑΛՌͨͤΔ Platform ։ൃʹྗ͢Δ
Who owns the Service Level? • Service Level ϓϩμΫτʹؔΘΔશһͷͷ •
શһ͕ؔ৺Λ࣋ͯΔΑ͏ͳ৴པੑࢦඪʹਐԽͤ͞·͠ΐ͏ • ϢʔβମݧΛతʹද͢ Client-side(WebFrontend/Native) Ͱͷ SLI/SLO Λ͏ • ʮࢦඪΛݟͯߦಈ͢ΔʯͦͷͷͷվળαΠΫϧΛճ͠·͠ΐ͏
Thank you! chaspy chaspy_ Engineering Manager, Site Reliability at Recruit
Co., Ltd. Takeshi Kondo https://chaspy.me