Takeshi Kondo
February 09, 2023
Technology
ポストモーテム運用を支える文化と技術 / Culture and Technology Supporting Postmortem Operations
https://findy.connpass.com/event/273197/
Transcript
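Culture and Technology Supporting Postmortem Operations
Takeshi Kondo / @chaspy, 2023/02/07
インシデントにどう対応してきたか?みんなで学ぶポストモーテム Lunch LT ("How have we handled incidents? Learning about postmortems together" Lunch LT)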
Who am I
Takeshi Kondo (chaspy / chaspy_)
Engineering Manager, Site Reliability and Web Application Development at Recruit Co., Ltd.
https://chaspy.me
Background: product introduction - スタディサプリ (Study Sapuri)
What I will cover today
The culture and technology that "postmortem operations" depend on
What I will not cover today
"Postmortem operations" themselves
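Outline
• Current state of postmortem operations
• History of postmortem operations
• Culture that supports postmortem operations
• Technology that supports postmortem operations
• Summary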
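Current state of postmortem operations
• After an incident occurs, "let's write a postmortem" follows
• The people involved get together and share it
• Action items are filed as issues with each team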
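Postmortems being held casually
Even minor incidents are treated as a "chance to learn"; the purpose has taken root.
The only aid in place is a Slack custom response that returns the issue template, which may lower the hurdle to writing one...?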
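An article I wrote long ago (2019…) is still being cited today
Findy invited me to speak this time because they had seen this article 🙏
Search for 「障害対応とポストモーテム スタディサプリ」 (incident response and postmortems, Study Sapuri)!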
History of postmortem operations
• The template was adapted from the SRE book
• First commit of the Issue Template: May 2019
• The template has hardly been updated since then
• TTD/TTR are recorded
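The slides do not say how TTD and TTR are defined. As a minimal sketch, assuming TTD means time to detect (start to detection) and TTR means time to recover (start to recovery), both values can be derived from three timestamps in a postmortem. The function name, the definitions, and the sample timeline are illustrative assumptions, not taken from the talk.

```python
from datetime import datetime, timedelta


def ttd_ttr(started_at: datetime, detected_at: datetime,
            recovered_at: datetime) -> tuple[timedelta, timedelta]:
    """Return (TTD, TTR) for one incident.

    Assumed definitions (not stated in the slides):
      TTD = detection time - start time
      TTR = recovery time - start time
    """
    if not started_at <= detected_at <= recovered_at:
        raise ValueError("expected started_at <= detected_at <= recovered_at")
    return detected_at - started_at, recovered_at - started_at


# Hypothetical timeline, for illustration only.
ttd, ttr = ttd_ttr(
    datetime(2023, 1, 10, 9, 0),    # incident starts
    datetime(2023, 1, 10, 9, 12),   # first alert or report
    datetime(2023, 1, 10, 10, 5),   # service recovered
)
print(f"TTD={ttd}, TTR={ttr}")
```

Some teams measure TTR from detection rather than from the start of the incident; whichever definition is used, writing it into the template keeps the numbers comparable across postmortems.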
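Culture that supports postmortems
• Make sure the blame never falls on one person
  • Design Doc
  • Production Readiness Checklist
• Respond quickly, and together
  • Incident response flow
• Learn from incidents
  • Postmortem sharing sessions
  • Postmortem reading sessions
Slide labels: standardize; foster a sense of purpose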
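Design Doc / Production Readiness Checklist
• Standardize against "careless slips"
• Having several people review makes it hard to call something "one person's fault"
• If someone slips during a solo operation with no review, the cause inevitably ends up attributed to that individual
Search for 「Production Readiness スタディサプリ」!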
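Incident response flow
• The incident response flow and incident levels are defined
• Incidents can be reported through a Slack workflow
• Reporting is encouraged even at the "could this be an incident?" stage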
Incident response flow
Example report from the recent CircleCI incident
The relevant people are mentioned automatically
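Postmortem reading sessions
• The SRE team holds postmortem reading sessions as part of onboarding
• Nobody can read them all (and they keep growing), so "recommended" postmortems are tracked with a label
  • Ones with a lot to learn from
  • Ones that help in understanding the current architecture
  • Ones that serve as a reference for how to act when an incident occurs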
8 recommended postmortems
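Summary: culture that supports postmortems
• Small measures that lower the hurdle
  • Issue Template, Slack custom response
• Standardization
  • Production Readiness Checklist, incident response flow, level definitions
• Fostering the sense that postmortems are "for learning"
  • At the start, I think the only way is to keep saying it and keep writing them
  • Searching past Slack messages shows I often asked "could you write one?" after incidents
  • chaspy has probably written the most…
  • I think writing blog posts helped too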
Technology that supports postmortems
• Are you able to prevent critical incidents ahead of time?
• Are you able to take appropriate risks?
• Is a "just in case" check easy to do?
Technology that supports postmortem operations
• Load testing
• Canary Release
• E2E Test Automation
• Database restore
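Load testing
Developers identify Performance Risk through the Production Readiness Checklist, and a load test is proposed where needed.
Load test plan template: Requirements are written down so that SRE and the development team share the same expectations.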
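Load testing
• An environment where writing Gatling code is enough to run a test
• Test results are posted to the PR
• Load tests can be iterated on quickly by trial and error
Report generation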
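Load testing
• Preparing the environment is not exactly easy, but the hurdle is lower
  • Database (restored from production; covered later)
  • Application (available once a Pull Request is created)
  • EKS Node Group
  • Test code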
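Canary Release
• Uses Argo Rollouts
• Used for changes that alter no functionality but carry high risk, such as a Rails upgrade
"Nice try, right?" We released starting at 1% and rolled back as soon as errors appeared, which kept the damage to a minimum.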
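The rollout described above (start at 1%, roll back as soon as errors appear) can be sketched as a small control loop. This is a hedged illustration of the idea only: it is not the Argo Rollouts API, and the function name, the step percentages, the error-rate probe, and the threshold are all assumptions introduced for the example.

```python
from typing import Callable, Sequence


def run_canary(
    steps: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> tuple[str, int]:
    """Walk through canary traffic percentages, rolling back on errors.

    steps: traffic percentages to try in order, e.g. (1, 10, 50, 100).
    error_rate_at: hypothetical probe returning the observed error rate
        (0.0 to 1.0) while the canary receives the given percentage.
    Returns ("promoted", last_step) or ("rolled_back", failing_step).
    """
    if not steps:
        raise ValueError("steps must not be empty")
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            # Stop here: all traffic goes back to the stable version.
            return "rolled_back", percent
    return "promoted", steps[-1]


# Hypothetical probes: one healthy release, one that fails at 1%.
healthy = run_canary((1, 10, 50, 100), lambda percent: 0.0)
failing = run_canary((1, 10, 50, 100), lambda percent: 0.2)
print(healthy)  # ('promoted', 100)
print(failing)  # ('rolled_back', 1)
```

Because the first step exposes only 1% of traffic, a release that fails immediately affects a small share of requests before the rollback, which matches the outcome reported on the slide.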
E2E Test Automation
• See the blog!
• Search: 「スタディサプリ E2E」
• It catches a fair number of defects and prevents production incidents
Database restore
• For details, see the blog for this one too!
• Search: 「スタディサプリ データベースリストア」 (Study Sapuri database restore)
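Summary
• I introduced the culture and technology that support postmortem operations
• On the process and culture side, standardization and fostering a sense of purpose are what matter
• On the technology side, it is the accumulation of recurrence-prevention work after each incident
• Culture and technology work together
• As that work accumulates, "the same incident" becomes less likely to happen
• "New incidents" become chances to learn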
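What I did not cover today (I would be glad to talk about it in the speaker talk)
• Evaluating incidents and assigning levels
• Measuring MTTR / MTTD
• How to carry out follow-up tasks while development continues
• Incidents and SLI/SLO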
Thank you!