Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
590
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
620
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
Server Less Code More - コードを書かない時代に生きるサーバーレスデザイン / server-less-code-more
gawa
1
200
Introducing FrankenPHP gRPC
dunglas
1
450
API Platform 4.2: Redefining API Development
soyuka
0
430
Design Foundational Data Engineering Observability
sucitw
3
210
print("Hello, World")
eddie
2
530
時間軸から考えるTerraformを使う理由と留意点
fufuhu
16
4.8k
Reading Rails 1.0 Source Code
okuramasafumi
0
260
JSONataを使ってみよう Step Functionsが楽しくなる実践テクニック #devio2025
dafujii
1
660
Platformに“ちょうどいい”責務ってどこ? 関心の熱さにあわせて考える、責務分担のプラクティス
estie
1
290
Tool Catalog Agent for Bedrock AgentCore Gateway
licux
7
2.6k
デザイナーが Androidエンジニアに 挑戦してみた
874wokiite
0
590
テストカバレッジ100%を10年続けて得られた学びと品質
mottyzzz
2
620
Featured
See All Featured
Statistics for Hackers
jakevdp
799
220k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
36
2.5k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
131
19k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
358
30k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
139
34k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
GitHub's CSS Performance
jonrohan
1032
460k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
23
1.4k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
850
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
930
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll