RPC Metrics at Google
JBD
August 09, 2018
Programming
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
"100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
"A service is available if users cannot tell that there was an outage."
SLOs: a principled way of saying what level of downtime is acceptable.
• Error rate
• Latency expectations
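The two SLO ingredients above can be sketched as a simple check over a window of observed requests. This is a minimal illustration, not how Google evaluates SLOs: the `SLO` and `Request` types, the field names, and the thresholds are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// SLO is a hypothetical objective built from the two ingredients on the
// slide: an error-rate ceiling and a latency expectation.
type SLO struct {
	MaxErrorRate    float64       // largest acceptable fraction of failed requests
	LatencyTarget   time.Duration // a request at or under this latency counts as fast
	MinFastFraction float64       // smallest acceptable fraction of fast requests
}

// Request is one observed request.
type Request struct {
	Latency time.Duration
	Failed  bool
}

// ms is a small helper for building durations in milliseconds.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// Met reports whether the observed requests satisfy the SLO.
// An empty window trivially meets it.
func (s SLO) Met(reqs []Request) bool {
	if len(reqs) == 0 {
		return true
	}
	var failed, fast int
	for _, r := range reqs {
		if r.Failed {
			failed++
		}
		if r.Latency <= s.LatencyTarget {
			fast++
		}
	}
	total := float64(len(reqs))
	return float64(failed)/total <= s.MaxErrorRate &&
		float64(fast)/total >= s.MinFastFraction
}

func main() {
	// Illustrative thresholds only; real SLOs are far stricter.
	slo := SLO{MaxErrorRate: 0.25, LatencyTarget: ms(100), MinFastFraction: 0.75}
	window := []Request{
		{Latency: ms(20)},
		{Latency: ms(40)},
		{Latency: ms(90), Failed: true},
		{Latency: ms(300)},
	}
	fmt.Println("SLO met:", slo.Met(window))
}
```

Expressing the objective as data rather than as hard-coded checks makes it possible to evaluate the same traffic against different teams' targets.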
[Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store]
Questions infra teams want to ask:
• Are we meeting the SLO for the other team?
• What’s the impact of a product on infra?
• How much do we need to scale up if a product grows 10%?
High-Cardinality: breaking down the metrics data...
Query the collected data in various ways:
• Latency distribution for RPCs originating at Google Analytics.
• Requests that took more than 100ms for customer #123.
• Compare the latency of requests initiated at the web vs mobile frontend.
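Queries like these become possible once every measurement carries tags. The sketch below keeps tagged latencies in memory and filters them. The `Measurement` type, the tag keys (`originator`, `customer`, `frontend`), and the sample data are hypothetical; a real metrics backend would not keep raw points like this.

```go
package main

import (
	"fmt"
	"time"
)

// Measurement is one recorded request latency plus the tags it was
// recorded with. Tag keys and values are hypothetical.
type Measurement struct {
	Tags    map[string]string
	Latency time.Duration
}

func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// Where returns the measurements whose tag key equals value.
func Where(data []Measurement, key, value string) []Measurement {
	var out []Measurement
	for _, m := range data {
		if m.Tags[key] == value {
			out = append(out, m)
		}
	}
	return out
}

// SlowerThan returns the measurements with latency above the threshold.
func SlowerThan(data []Measurement, threshold time.Duration) []Measurement {
	var out []Measurement
	for _, m := range data {
		if m.Latency > threshold {
			out = append(out, m)
		}
	}
	return out
}

// Mean returns the average latency, or 0 when there is no data.
func Mean(data []Measurement) time.Duration {
	if len(data) == 0 {
		return 0
	}
	var sum time.Duration
	for _, m := range data {
		sum += m.Latency
	}
	return sum / time.Duration(len(data))
}

// sample returns made-up data for the example.
func sample() []Measurement {
	tag := func(originator, customer, frontend string) map[string]string {
		return map[string]string{"originator": originator, "customer": customer, "frontend": frontend}
	}
	return []Measurement{
		{Tags: tag("analytics", "123", "web"), Latency: ms(150)},
		{Tags: tag("analytics", "123", "mobile"), Latency: ms(40)},
		{Tags: tag("analytics", "456", "web"), Latency: ms(90)},
		{Tags: tag("reporting", "123", "mobile"), Latency: ms(220)},
	}
}

func main() {
	data := sample()
	slow := SlowerThan(Where(data, "customer", "123"), ms(100))
	fmt.Println("slow requests for customer 123:", len(slow))
	fmt.Println("mean web latency:", Mean(Where(data, "frontend", "web")))
	fmt.Println("mean mobile latency:", Mean(Where(data, "frontend", "mobile")))
}
```

A tag such as a customer ID can take a very large number of values, which is what makes the data high-cardinality.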
[Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store, annotated with originator=analytics; ...]
Blob store read errors by originator
Dynamically choose aggregation (split between recording and aggregation)
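The split between recording and aggregation can be sketched as follows: the recording side only stores values, and the reader picks a count or a distribution afterwards. The `Recorder`, `Aggregation`, `Count`, and `Distribution` names are assumptions for this example, and keeping every raw value in memory is a simplification that a production library would avoid.

```go
package main

import (
	"fmt"
	"sort"
)

// Recorder only stores raw measurements; it knows nothing about how
// they will be summarized later.
type Recorder struct {
	values []float64
}

// Record appends one raw measurement.
func (r *Recorder) Record(v float64) { r.values = append(r.values, v) }

// Aggregation is chosen at read time and turns raw values into a summary.
type Aggregation func(values []float64) []int

// Count summarizes the values as a single total.
func Count() Aggregation {
	return func(values []float64) []int { return []int{len(values)} }
}

// Distribution summarizes the values as a histogram. bounds must be sorted
// ascending. Bucket i counts values v with bounds[i-1] < v <= bounds[i];
// the last bucket counts values above every bound.
func Distribution(bounds ...float64) Aggregation {
	return func(values []float64) []int {
		buckets := make([]int, len(bounds)+1)
		for _, v := range values {
			buckets[sort.SearchFloat64s(bounds, v)]++
		}
		return buckets
	}
}

// View applies an aggregation to everything recorded so far.
func (r *Recorder) View(agg Aggregation) []int { return agg(r.values) }

func main() {
	var latencyMs Recorder
	for _, v := range []float64{5, 50, 80, 250} {
		latencyMs.Record(v)
	}
	// The same recorded data, summarized two different ways.
	fmt.Println("count:", latencyMs.View(Count()))
	fmt.Println("distribution:", latencyMs.View(Distribution(10, 100)))
}
```

Instrumented code calls only `Record`, so changing bucket boundaries or switching from a count to a distribution needs no change at the call sites.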
Exemplars
/rpcz and /statz
http://server:7777/debug/rpcz
Export? Monarch, Prometheus, and more.
import "cloud.google.com/go/pubsub"
Thank you!
JBD, Google (@rakyll)