3 Things to Avoid When Rolling Out SLI/SLO

March 20, 2026

LT slide for Tamachi.sre#3 (2026/03/19)

URL: https://tamachi-sre.connpass.com/event/381960/

Kota

Transcript

  1. 3 Things to Avoid When Rolling Out SLI/SLO
    Tamachi.sre#3 (2026/03/19)
    Money Forward, Inc. Platform and Reliability Engineering Department
    Kota Yagi
  2. Kota Yagi (Money Forward, Inc.)
    Profile:
    • SRE at Money Forward, Inc. (2024/04~)
    • sig-etcd member (2025/10~)
    Other Notes:
    • Blog・Conference
    • Hobby: Anime
    X ID: @88888888_kota
  3. Scope: What I Will (and Won't) Cover
    ✅ Will Cover
    • 3 lessons learned from rolling out SLI/SLO
    • What I changed after each failure
    ❌ Won't Cover
    • Definitions of SLI / SLO / Error Budget
    • "Success story" versions of SLI/SLO adoption
  4. Why We Started SLI/SLO
    SRE's Problem
    • No way to tell which services were reliable vs. at risk.
    • No data to guide where to invest SRE resources.
    Dev Team's Problem
    • Developers had no visibility into whether their services were actually delivering a good user experience.
    → We wanted data-driven communication about reliability.
  5. Lesson 1: The First Wall: "Why now?"
    What I argued
    • Balances Dev & Ops priorities
    • Improves service reliability
    • Enables data-driven decisions via Error Budget
    → The response: "I understand the importance, but... why now?"
    💡 I spoke in industry generics, not in terms of our organization's specific pain points.
  6. Lesson 1: Tie It to Your Org's Specific Problem
    ❌ Before
    "SLI/SLO helps balance Dev and Ops priorities. It improves service reliability."
    ✅ After
    "Right now, SRE can't tell which services need our resources. We're flying blind on resource allocation. SLI/SLO would fix that directly."
    Lesson 1: Connect the pitch to org-specific problems, not generic benefits.
  7. Lesson 2: I Tried to Roll Out to 15+ Services at Once
    15+ services, all pinged on Slack at once
    What happened
    • Multiple teams complained: "The ask is too heavy"
    • Consumed huge amounts of engineer & PM time
    • "Don't set deadlines for us without asking"
    • Trust in SRE dropped
  8. Lesson 2: Complaint ① "The SLI/SLO implementation ask is too heavy"
    💬 Actual Feedback 1
    I felt too much task to product development team from SRE from the first time. The task requires EM, engineers and product manager's time. In my idea, you should start from small steps and then make something better incrementally. I'm not sure what outcome we can get by current your team's requests, but we need to use much time for your request. I guess this is unhealthy investment and resource usage.
    → "Start small and iterate. Your current ask is way too heavy."
  9. Lesson 2: Complaint ② "Consider the team's situation when setting deadlines"
    💬 Actual Feedback 2
    I received some negative feedback from the managers about your SLI/SLO initiative. They feel that they are not heard, and you do not listen to their situation. i.e., XXX has asked you about the strictness of deadlines before and that they should be adjusted to the team's workload and not set arbitrarily.
    → "Consider the team's situation. Don't set arbitrary deadlines."
  10. Lesson 2: Start with One Team, Then Scale
    1. Pick one team: Design SLI/SLO together from scratch. Discuss requirements in depth.
    2. Learn & templatize: Accumulate know-how through trial and error.
    3. Scale out: Bring the template to the next team. Use the first as a reference success story.
    Lesson 2: Don't try to do everything at once. Learn from one team, then scale.
  11. Lesson 3: Miss the Planning Cycle = Treated as Interruption
    ❌ What I did wrong
    • Brought up SLOs after the half had already started
    • Dev teams already had full roadmaps
    • SRE requests landed as unplanned interruptions
    → Treated as low-priority side work
    ✅ What worked
    • Got manager-level buy-in before the half began
    • Registered as a shared SRE+Dev project in planning
    • Capacity was budgeted from day one
    → Work went smoothly
  12. Lesson 3: From Zero-Sum to Positive-Sum
    ⚔ Zero-Sum
    • SRE and Dev priorities conflict with each other
    • One side's project stalls
    → 🤝 Positive-Sum
    • Create a shared goal for SRE × Dev
    • Joint project
    • Wins for both sides
    • Trust grows
    Lesson 3: Get manager alignment before the half. Make it a joint SRE+Dev project.
  13. Summary: 3 Lessons
    1. Generic arguments don't move people. Tie your pitch to org-specific problems.
    2. Start with one team, then scale. Don't try to do everything at once.
    3. Pre-align SRE and Dev before the half starts. Register it as a shared project in the planning cycle.