Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
620
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Servers are doomed to fail
rakyll
3
1.6k
Serverless Containers
rakyll
1
280
Critical Path Analysis
rakyll
0
690
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
ベクトル検索のフィルタを用いた機械学習モデルとの統合 / python-meetup-fukuoka-06-vector-attr
monochromegane
2
340
AI時代のソフトウェア開発でも「人が仕様を書く」から始めよう-医療IT現場での実践とこれから
koukimiura
0
140
エラーログのマスキングの仕組みづくりに役立ったASTの話
kumoichi
0
110
2026/02/04 AIキャラクター人格の実装論 口 調の模倣から、コンテキスト制御による 『思想』と『行動』の創発へ
sr2mg4
0
720
2026年は Rust 置き換えが流行る! / 20260220-niigata-5min-tech
girigiribauer
0
220
AIに任せる範囲を安全に広げるためにやっていること
fukucheee
0
120
nilとは何か 〜interfaceの構造とnil!=nilから理解する〜
kuro_kurorrr
3
1.8k
Event Storming
hschwentner
3
1.3k
AWS Infrastructure as Code の新機能 2025 総まとめ 〜SA 4人による怒涛のデモ祭り〜
konokenj
10
3.3k
Docコメントで始める簡単ガードレール
keisukeikeda
1
100
開発ステップを細分化する、破綻しないAI開発体制
kspace
0
110
New in Go 1.26 Implementing go fix in product development
sunecosuri
0
380
Featured
See All Featured
How to Get Subject Matter Experts Bought In and Actively Contributing to SEO & PR Initiatives.
livdayseo
0
81
The Organizational Zoo: Understanding Human Behavior Agility Through Metaphoric Constructive Conversations (based on the works of Arthur Shelley, Ph.D)
kimpetersen
PRO
0
260
The agentic SEO stack - context over prompts
schlessera
0
680
Designing Experiences People Love
moore
143
24k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
140
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
287
14k
The AI Search Optimization Roadmap by Aleyda Solis
aleyda
1
5.4k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.8k
We Analyzed 250 Million AI Search Results: Here's What I Found
joshbly
1
900
Neural Spatial Audio Processing for Sound Field Analysis and Control
skoyamalab
0
200
Primal Persuasion: How to Engage the Brain for Learning That Lasts
tmiket
0
280
Joys of Absence: A Defence of Solitary Play
codingconduct
1
300
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll