SRE を実現するための組織マネジメント / Management to achieve SRE
Takeshi Kondo
March 12, 2022
Technology
https://line.connpass.com/event/236497/
Transcript
Organizational Management to Achieve SRE
Takeshi Kondo / @chaspy
2022/03/12, 6-Company Joint SRE Study Session
Who am I chaspy chaspy_ Engineering Manager, Site Reliability at
Recruit Co., Ltd. Takeshi Kondo https://chaspy.me
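Who am I
chaspy / chaspy_
Recruit Co., Ltd., Product Division, Product Development Management Office, Product Development Office, Learning Domain Product Development Unit, K-12 Product Development Department, K-12 SRE Group, Group Manager
Takeshi Kondo
https://chaspy.me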
What I will talk about today
A case study of using Recruit Group's "mission management" to help a development team acquire SRE capabilities.
Put another way, a case study of (Partially) Embedded / Enabling SRE.
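Tl;dr
• There are two ways for a development team to acquire reliability capabilities
  • Embedded SRE (from Pure SRE) / provided from outside
  • Enabling SRE (in the Team) / grown from inside
• The best pattern depends on organization size and phase
  • Small scale / early development phase: Embedded SRE Pattern
  • Medium to large scale / once the development team has matured: Enabling SRE Pattern
• These two patterns can be designed through management
• It does not have to be all-or-nothing (100/0); practicing them even "partially" is effective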
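Disclaimer
• I present this as an example of management, but the results are thanks to the members who took on the missions, the SREs, and everyone on the development team. Thank you, as always!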
Agenda
• Premise: what does it mean to achieve SRE?
• Looking back at the history of SRE at StudySapuri
• Case study: (Partially) Embedded / Enabling SRE
• Summary and next steps
What it means to achieve SRE
The development team has acquired the capability to control reliability.
What is Site Reliability Engineering in the first place: Not like this
• The service maintains "high reliability (100%)"
• The SLIs/SLOs are being met
• The development team runs the on-call rotation
https://github.com/twitter/twemoji
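What is Site Reliability Engineering in the first place: Like this!
• The service maintains "the reliability that users expect"
• SLIs/SLOs are defined and used as an indicator for deciding priorities between non-functional and functional requirements
• The team has agreed on monitoring methods and a policy that let it respond appropriately when an SLO violation occurs
• The above are reviewed periodically
https://github.com/twitter/twemoji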
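One common way to turn an SLO into a prioritization signal is an error budget. The sketch below is an illustration only and does not appear in the slides: the 99.9% target, the request counts, the `SloWindow` type, and the function names are all assumptions, and "do reliability work first once the budget is spent" is just one policy a team might agree on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Request counts observed over one SLO window (hypothetical data shape)."""
    total_requests: int
    failed_requests: int


def error_budget_remaining(window: SloWindow, slo_target: float) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window.total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - slo_target) * window.total_requests
    return 1.0 - window.failed_requests / allowed_failures


def next_priority(window: SloWindow, slo_target: float) -> str:
    """Assumed policy: reliability work comes first once the budget is exhausted."""
    if error_budget_remaining(window, slo_target) <= 0.0:
        return "non-functional (reliability)"
    return "functional (features)"


if __name__ == "__main__":
    healthy = SloWindow(total_requests=1_000_000, failed_requests=400)
    violated = SloWindow(total_requests=1_000_000, failed_requests=1_500)
    print(next_priority(healthy, 0.999))
    print(next_priority(violated, 0.999))
```

Keeping the policy in a single function makes it easy to revisit during the periodic review mentioned in the last bullet.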
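Development teams acquire the capability to control reliability: Like this!
[Diagram: SRE supports the development team in acquiring reliability capabilities; the development team is able to control the reliability of its own services by itself.]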
Team Topologies
• 4 team patterns
  • Stream Aligned
  • Platform
  • Enabling
  • Complicated Subsystem
• 3 communication patterns
  • Collaboration
  • X as a Service
  • Facilitation
https://pub.jmam.co.jp/book/b593881.html
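Development teams acquire the capability to control reliability: Like this!
[Diagram: SRE (Platform Team / Enabling Team) builds the platform and culture that support development teams in becoming self-contained; the development team (Stream Aligned Team) can prepare what it needs by itself = self-contained.]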
SRE Team Vision / Mission / Values
https://blog.studysapuri.jp/entry/sre-vision-mission-values
Mission: build the platform and culture that let self-contained teams keep delivering the product quickly and safely.
(Before that) Product introduction
Looking back at the history of SRE at StudySapuri
• 2019: Migrated the Application Platform to Kubernetes
• 2020: Established Microservices Readiness
  • Defined service ownership
  • Design Doc / Production Readiness Checklist
  • Self-service Infrastructure (terraform monorepo)
  • SLI/SLO
• 2021: Fully handed over SLI/SLO operation to the development teams
As a Platform Team, the SRE team builds the Platform. As an Enabling Team, it fosters a culture of SLI/SLO and similar practices in the development organization.
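Change in organization size
[Chart: number of developers and SREs.]
The developer count covers Web Engineers (frontend & backend) at both StudySapuri and Quipper. Native engineers are excluded.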
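In 2021, the strategy changed to creating Enabling SREs from within the development teams
• The "reliability" concerns of the development organization were becoming more specific to applications and domains
  • Load testing
  • Domain-specific Pod Auto Scaling
  • Measuring Frontend Performance and improving SLI/SLO
  • QA automation
• Rather than having a single SRE Team act as the Enabling Team, the strategy changed to creating Enabling SREs in the development teams
https://blog.studysapuri.jp/entry/2022/02/17/sre-study-session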
Creating Enabling SREs in the development teams
Situation around 2020
[Diagram: the SRE team (Enabling Team) is Facilitating each development team (Stream Aligned Team).]
2022, present
[Diagram: the SRE team (Platform Team / Enabling Team) and a development team (Stream Aligned Team) made up of an Enabling SRE and members. Labels: Facilitation (twice), X as a Service, "becomes fractal".]
Pure SRE vs Embedded SRE
https://www.slideshare.net/newrelic/sreiously-defining-the-principles-habits-and-practices-of-site-reliability-engineering-112178269
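Situation around 2020
[Diagram: the SRE team, labeled Pure SRE, is Facilitating each development team.]
2022, present
[Diagram: the SRE team (Pure SRE) and a development team made up of an SRE and members. Labels: Embedded SRE, Facilitating (twice), X as a Service, "becomes fractal".]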
Mission management
https://github.com/twitter/twemoji
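Mission management at Recruit
• Members align their Will / Can / Must with their manager
• For each mission, the share, content, and achievement criteria are agreed on
• The reporting line for a mission does not have to be the manager of the member's own team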
Mission management at Recruit
[Diagram: an EM and four members across the SRE team and the development team. Label: 30% of the mission is set to SRE-related work.]
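What missions were actually set
• Development team member (developing the junior high school course renewal)
  • Promoting self-containment in the infrastructure domain: 30%
  • Missions for product development: 70%
• SRE member
  • Supporting developer productivity in the development team: 20%
  • Supporting Production Release: 20%
  • Missions for SRE: 60%
https://studysapuri.jp/course/junior/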
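The allocations on this slide can be written down as data and checked so that each person's missions add up to 100%. This is only a sketch: the percentages come from the slide, while the `Mission` type, the `cross_team` flag, and the function names are hypothetical and introduced purely for illustration.

```python
from typing import NamedTuple


class Mission(NamedTuple):
    """One mission with its agreed share of a member's time."""
    title: str
    percent: int
    cross_team: bool  # assumption: marks missions that cross the SRE / dev-team boundary


# Percentages are taken from the slide; the data layout itself is hypothetical.
MISSIONS = {
    "development team member": [
        Mission("Promote self-containment in the infrastructure domain", 30, True),
        Mission("Missions for product development", 70, False),
    ],
    "SRE member": [
        Mission("Support developer productivity in the development team", 20, True),
        Mission("Support Production Release", 20, True),
        Mission("Missions for SRE", 60, False),
    ],
}


def total_percent(missions: list[Mission]) -> int:
    """Sum of all mission shares; expected to be 100 for each member."""
    return sum(m.percent for m in missions)


def cross_team_share(missions: list[Mission]) -> int:
    """Share of time spent on missions that cross the team boundary."""
    return sum(m.percent for m in missions if m.cross_team)


if __name__ == "__main__":
    for role, missions in MISSIONS.items():
        if total_percent(missions) != 100:
            raise ValueError(f"{role}: missions must add up to 100%")
        print(f"{role}: {cross_team_share(missions)}% cross-team")
```

Expressing the split this way makes the "partial" nature of the arrangement explicit: neither member moves to the other team full time.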
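Mission management at Recruit
[Diagram: under the EM, a development team member has "promoting self-containment in the infrastructure domain" (30%) and product development missions (70%); an SRE member has developer productivity / Production Release support (40%) and other missions (60%).]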
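What the manager did
• Regular 1on1s with each member
• Mid-term mission retrospectives
• Created a mission tree to visualize the missions
• Set up a forum for members to explain their missions to each other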
A mission tree for visualizing missions
https://blog.studysapuri.jp/entry/2022/02/25/sre-mission-tree
What happened (1)
• The productivity improvement cycle accelerated
• The cycle of surfacing issues -> implementation -> feedback -> improvement sped up
What happened (2)
• SRE Culture spread: a pre-mortem was held
https://blog.studysapuri.jp/entry/pre-mortem
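What happened (3)
• Support for alert handling
• SRE supported the team with explanations of the alerts themselves and how to investigate them
• The response itself was carried out by the development team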
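How did it turn out?
• The full renewal of the StudySapuri junior high school course was released without any major incident
• The development team also came to handle alerts itself
https://studysapuri.jp/course/junior/
https://github.com/twitter/twemoji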
What was it that we did this time?
[Diagram, shown in three builds: the SRE team (Pure SRE) and the development team, which contains an SRE and members. A development team member acts as a (Partially) Enabling SRE, and an SRE moves in as a (Partially) Embedded SRE. The interaction label changes across the builds: Facilitation, Facilitating, Collaboration.]
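Reflections on this pattern
• Facilitating by an Enabling SRE is better built from the "inside"
  • It can be applied in a form that better fits the development team's operational style
• For technical implementation, it is better for a Pure SRE who knows the Platform well to be Embedded from the "outside" and work in Collaboration
  • Pure SREs know infrastructure such as ArgoCD and GitHub Actions well
  • Running the cycle of issue discovery, implementation, and feedback quickly makes it possible to provide a better Platform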
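Tl;dr
• There are two ways for a development team to acquire reliability capabilities
  • Embedded SRE (from Pure SRE) / provided from outside
  • Enabling SRE (in the Team) / grown from inside
• The best pattern depends on organization size and phase
  • Small scale / early development phase: Embedded SRE Pattern
  • Medium to large scale / once the development team has matured: Enabling SRE Pattern
• These two patterns can be designed through management
• It does not have to be all-or-nothing (100/0); practicing them even "partially" is effective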
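Going forward, we will work on the following to further scale the development teams
• Support for acquiring SRE capabilities
  • Adopting Enabling SREs in development teams through mission management
  • Creating and running an SRE maturity assessment
  • Onboarding support for acquiring SRE knowledge and skills
• Developer Success / support for improving development productivity
  • Developing the Platform as a Product
  • Developer Support
[Slide callout: "this case study"]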
Special Thanks
• @kyontan
  • As Embedded SRE
• @ravelll
  • As Enabling SRE
• Everyone involved in the full renewal of the StudySapuri junior high school course
• SRE team members
Thank you! chaspy chaspy_ Engineering Manager, Site Reliability at Recruit
Co., Ltd. Takeshi Kondo https://chaspy.me
Bonus: SRE maturity assessment