$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
600
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Servers are doomed to fail
rakyll
3
1.6k
Serverless Containers
rakyll
1
270
Critical Path Analysis
rakyll
0
660
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
The Art of Re-Architecture - Droidcon India 2025
siddroid
0
120
Patterns of Patterns
denyspoltorak
0
270
SwiftUIで本格音ゲー実装してみた
hypebeans
0
490
Navigation 3: 적응형 UI를 위한 앱 탐색
fornewid
1
440
Grafana:建立系統全知視角的捷徑
blueswen
0
110
エディターってAIで操作できるんだぜ
kis9a
0
750
ELYZA_Findy AI Engineering Summit登壇資料_AIコーディング時代に「ちゃんと」やること_toB LLMプロダクト開発舞台裏_20251216
elyza
2
580
TerraformとStrands AgentsでAmazon Bedrock AgentCoreのSSO認証付きエージェントを量産しよう!
neruneruo
4
1.6k
Context is King? 〜Verifiability時代とコンテキスト設計 / Beyond "Context is King"
rkaga
10
1.4k
re:Invent 2025 のイケてるサービスを紹介する
maroon1st
0
150
ZJIT: The Ruby 4 JIT Compiler / Ruby Release 30th Anniversary Party
k0kubun
0
260
AI 駆動開発ライフサイクル(AI-DLC):ソフトウェアエンジニアリングの再構築 / AI-DLC Introduction
kanamasa
11
3.6k
Featured
See All Featured
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Reality Check: Gamification 10 Years Later
codingconduct
0
1.9k
Bridging the Design Gap: How Collaborative Modelling removes blockers to flow between stakeholders and teams @FastFlow conf
baasie
0
400
Kristin Tynski - Automating Marketing Tasks With AI
techseoconnect
PRO
0
110
Are puppies a ranking factor?
jonoalderson
0
2.4k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
How Software Deployment tools have changed in the past 20 years
geshan
0
30k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.1k
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
1.9k
Mobile First: as difficult as doing things right
swwweet
225
10k
What does AI have to do with Human Rights?
axbom
PRO
0
1.9k
Bash Introduction
62gerente
615
210k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll