Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
520
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.6k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
510
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
CSC509 Lecture 09
javiergs
PRO
0
110
Vitest Browser Mode への期待 / Vitest Browser Mode
odanado
PRO
2
1.7k
CSC509 Lecture 08
javiergs
PRO
0
110
役立つログに取り組もう
irof
27
8.7k
Amazon Neptuneで始めてみるグラフDB-OpenSearchによるグラフの全文検索-
satoshi256kbyte
4
330
Snowflake x dbtで作るセキュアでアジャイルなデータ基盤
tsoshiro
2
430
Why Spring Matters to Jakarta EE - and Vice Versa
ivargrimstad
0
1k
破壊せよ!データ破壊駆動で考えるドメインモデリング / data-destroy-driven
minodriven
16
4.1k
外部システム連携先が10を超えるシステムでのアーキテクチャ設計・実装事例
kiwasaki
1
230
Jakarta Concurrencyによる並行処理プログラミングの始め方 (JJUG CCC 2024 Fall)
tnagao7
1
230
【Kaigi on Rails 2024】YOUTRUST スポンサーLT
krpk1900
1
250
EventSourcingの理想と現実
wenas
6
2.1k
Featured
See All Featured
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
504
140k
Producing Creativity
orderedlist
PRO
341
39k
GitHub's CSS Performance
jonrohan
1030
460k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
37
1.8k
Rails Girls Zürich Keynote
gr2m
93
13k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
32
1.8k
Embracing the Ebb and Flow
colly
84
4.4k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
28
9.1k
No one is an island. Learnings from fostering a developers community.
thoeni
19
3k
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
7
150
Fashionably flexible responsive web design (full day workshop)
malarkey
404
65k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll