RPC Metrics at Google
JBD
August 09, 2018
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that there was an outage."
@rakyll SLOs: a principled way of saying what level of downtime is acceptable.
• Error rate
• Latency expectations
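The two SLO dimensions on this slide can be checked mechanically. The following is a minimal sketch, not anything shown in the talk: the `Request` and `SLO` types, the function names, and every threshold are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Request is a hypothetical record of one served request.
type Request struct {
	Latency time.Duration
	Failed  bool
}

// SLO holds illustrative targets; none of these values come from the talk.
type SLO struct {
	MaxErrorRate     float64       // largest acceptable fraction of failed requests
	LatencyThreshold time.Duration // a request is "fast" if it finishes within this
	MinFastFraction  float64       // smallest acceptable fraction of fast requests
}

// ErrorRate returns the fraction of requests that failed.
func ErrorRate(reqs []Request) float64 {
	if len(reqs) == 0 {
		return 0
	}
	failed := 0
	for _, r := range reqs {
		if r.Failed {
			failed++
		}
	}
	return float64(failed) / float64(len(reqs))
}

// FastFraction returns the fraction of requests at or under the threshold.
func FastFraction(reqs []Request, threshold time.Duration) float64 {
	if len(reqs) == 0 {
		return 1
	}
	fast := 0
	for _, r := range reqs {
		if r.Latency <= threshold {
			fast++
		}
	}
	return float64(fast) / float64(len(reqs))
}

// Meets reports whether both the error-rate and latency targets hold.
func (s SLO) Meets(reqs []Request) bool {
	return ErrorRate(reqs) <= s.MaxErrorRate &&
		FastFraction(reqs, s.LatencyThreshold) >= s.MinFastFraction
}

func main() {
	reqs := []Request{
		{Latency: 20 * time.Millisecond},
		{Latency: 80 * time.Millisecond},
		{Latency: 150 * time.Millisecond},
		{Latency: 30 * time.Millisecond, Failed: true},
	}
	slo := SLO{MaxErrorRate: 0.25, LatencyThreshold: 100 * time.Millisecond, MinFastFraction: 0.75}
	fmt.Println("error rate:", ErrorRate(reqs))
	fmt.Println("fast fraction:", FastFraction(reqs, slo.LatencyThreshold))
	fmt.Println("meets SLO:", slo.Meets(reqs))
}
```

Expressing the latency target as a fraction of requests under a threshold, rather than as an average, is a design choice of this sketch; it keeps a few slow outliers from being hidden by many fast requests.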
@rakyll [Diagram of service dependencies. Labels: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store]
@rakyll Questions infra teams want to ask:
• Are we meeting the SLO for the other team?
• What's the impact of a product on infra?
• How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways:
• Latency distribution for RPCs originating at Google Analytics.
• Requests that took more than 100ms for customer #123.
• Compare the latency of requests initiated at the web vs. mobile frontend.
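The three example queries can be sketched over a small in-memory set of tagged measurements. This is an illustrative sketch only: the `Measurement` type, the tag keys (`originator`, `customer`, `frontend`), the sample data, and the bucket bounds are all assumptions, and a production system would query a metrics backend rather than a slice.

```go
package main

import (
	"fmt"
	"time"
)

// Measurement is one recorded RPC latency plus the tags it was recorded with.
type Measurement struct {
	Latency time.Duration
	Tags    map[string]string
}

// Filter keeps the measurements whose tags match every key/value in want.
func Filter(ms []Measurement, want map[string]string) []Measurement {
	var out []Measurement
	for _, m := range ms {
		match := true
		for k, v := range want {
			if m.Tags[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, m)
		}
	}
	return out
}

// SlowerThan keeps the measurements that took longer than d.
func SlowerThan(ms []Measurement, d time.Duration) []Measurement {
	var out []Measurement
	for _, m := range ms {
		if m.Latency > d {
			out = append(out, m)
		}
	}
	return out
}

// Distribution counts measurements per bucket. bounds are ascending upper
// bounds; the final bucket holds everything above the last bound.
func Distribution(ms []Measurement, bounds ...time.Duration) []int {
	counts := make([]int, len(bounds)+1)
	for _, m := range ms {
		i := 0
		for i < len(bounds) && m.Latency > bounds[i] {
			i++
		}
		counts[i]++
	}
	return counts
}

// Mean returns the average latency, or 0 for an empty input.
func Mean(ms []Measurement) time.Duration {
	if len(ms) == 0 {
		return 0
	}
	var sum time.Duration
	for _, m := range ms {
		sum += m.Latency
	}
	return sum / time.Duration(len(ms))
}

// sampleData returns made-up measurements for the demo.
func sampleData() []Measurement {
	tag := func(o, c, f string) map[string]string {
		return map[string]string{"originator": o, "customer": c, "frontend": f}
	}
	return []Measurement{
		{40 * time.Millisecond, tag("analytics", "123", "web")},
		{120 * time.Millisecond, tag("analytics", "123", "mobile")},
		{250 * time.Millisecond, tag("ads", "456", "web")},
		{90 * time.Millisecond, tag("analytics", "456", "mobile")},
	}
}

func main() {
	ms := sampleData()
	analytics := Filter(ms, map[string]string{"originator": "analytics"})
	fmt.Println("analytics distribution:", Distribution(analytics, 50*time.Millisecond, 100*time.Millisecond))
	slow := SlowerThan(Filter(ms, map[string]string{"customer": "123"}), 100*time.Millisecond)
	fmt.Println("slow requests for customer 123:", len(slow))
	fmt.Println("web mean:", Mean(Filter(ms, map[string]string{"frontend": "web"})))
	fmt.Println("mobile mean:", Mean(Filter(ms, map[string]string{"frontend": "mobile"})))
}
```

Every query here is a filter on tags followed by an aggregation, which is why the tags have to be attached at recording time.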
@rakyll [Diagram of service dependencies, with the tag originator=analytics; ... propagated along the calls. Labels: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store]
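One way to picture the `originator=analytics` tag travelling down the call chain is to carry it in a `context.Context`. The sketch below uses only the Go standard library and runs in a single process; all function and type names are hypothetical, and a real deployment would also have to serialize the tags across each RPC boundary, which this sketch does not show.

```go
package main

import (
	"context"
	"fmt"
)

// tagsKey is the private context key under which tags are stored.
type tagsKey struct{}

// TagsFrom returns the tags carried by ctx, or nil if there are none.
func TagsFrom(ctx context.Context) map[string]string {
	tags, _ := ctx.Value(tagsKey{}).(map[string]string)
	return tags
}

// WithTag returns a context carrying key=value on top of any existing tags.
// The map is copied so the parent context is left unchanged.
func WithTag(ctx context.Context, key, value string) context.Context {
	old := TagsFrom(ctx)
	tags := make(map[string]string, len(old)+1)
	for k, v := range old {
		tags[k] = v
	}
	tags[key] = value
	return context.WithValue(ctx, tagsKey{}, tags)
}

// NewRequestContext is what a frontend might call when a user request arrives.
func NewRequestContext(originator string) context.Context {
	return WithTag(context.Background(), "originator", originator)
}

// ErrorCounter counts errors per originator. It is not safe for concurrent use.
type ErrorCounter struct {
	byOriginator map[string]int
}

// Record attributes one error to the originator found in ctx.
func (c *ErrorCounter) Record(ctx context.Context) {
	if c.byOriginator == nil {
		c.byOriginator = make(map[string]int)
	}
	origin := TagsFrom(ctx)["originator"]
	if origin == "" {
		origin = "unknown"
	}
	c.byOriginator[origin]++
}

// Count returns the number of errors attributed to origin.
func (c *ErrorCounter) Count(origin string) int {
	return c.byOriginator[origin]
}

// blobStoreRead stands in for a storage call deep in the stack.
func blobStoreRead(ctx context.Context, errs *ErrorCounter, fail bool) {
	if fail {
		errs.Record(ctx)
	}
}

// authenticate stands in for an intermediate service that passes ctx through.
func authenticate(ctx context.Context, errs *ErrorCounter, fail bool) {
	blobStoreRead(ctx, errs, fail)
}

func main() {
	errs := &ErrorCounter{}
	analytics := NewRequestContext("analytics")
	authenticate(analytics, errs, true)
	authenticate(analytics, errs, false)
	authenticate(context.Background(), errs, true) // no originator tag
	fmt.Println("analytics errors:", errs.Count("analytics"))
	fmt.Println("unknown errors:", errs.Count("unknown"))
}
```

The intermediate `authenticate` step never inspects the tag; it only forwards the context, which is what lets the storage layer break its errors down by originator.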
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
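The split between recording and aggregation can be illustrated by keeping the recording call unaware of how the data will later be summarized. This is a simplified sketch with hypothetical names (`Recorder`, `View`, `Aggregation`); it stores every raw value for clarity, whereas a real library would more likely aggregate incrementally against the views registered at the time.

```go
package main

import "fmt"

// Measurement is one raw value plus the tags it was recorded with.
type Measurement struct {
	Value float64
	Tags  map[string]string
}

// Aggregation reduces the values of one group to a single number.
type Aggregation func(values []float64) float64

// Count returns how many values were recorded.
func Count(values []float64) float64 { return float64(len(values)) }

// Sum returns the total of the values.
func Sum(values []float64) float64 {
	total := 0.0
	for _, v := range values {
		total += v
	}
	return total
}

// Max returns the largest value, or 0 for an empty input.
func Max(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	largest := values[0]
	for _, v := range values[1:] {
		if v > largest {
			largest = v
		}
	}
	return largest
}

// View says how to summarize recorded data: group by one tag key, then
// apply an aggregation to each group.
type View struct {
	GroupBy   string
	Aggregate Aggregation
}

// Recorder stores raw measurements and knows nothing about views.
type Recorder struct {
	data []Measurement
}

// Record is the only call instrumented code needs to make.
func (r *Recorder) Record(value float64, tags map[string]string) {
	r.data = append(r.data, Measurement{Value: value, Tags: tags})
}

// Collect applies a view, chosen at collection time, to everything recorded.
func (r *Recorder) Collect(v View) map[string]float64 {
	groups := make(map[string][]float64)
	for _, m := range r.data {
		key := m.Tags[v.GroupBy]
		groups[key] = append(groups[key], m.Value)
	}
	out := make(map[string]float64, len(groups))
	for key, values := range groups {
		out[key] = v.Aggregate(values)
	}
	return out
}

func main() {
	r := &Recorder{}
	r.Record(12, map[string]string{"method": "Read"})
	r.Record(30, map[string]string{"method": "Read"})
	r.Record(7, map[string]string{"method": "Write"})

	// The same recorded data, summarized two different ways.
	fmt.Println("count by method:", r.Collect(View{GroupBy: "method", Aggregate: Count}))
	fmt.Println("max by method:", r.Collect(View{GroupBy: "method", Aggregate: Max}))
}
```

Because `Record` takes no aggregation argument, changing which views exist requires no change to the instrumented code.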
@rakyll Exemplars
@rakyll /rpcz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import "cloud.google.com/go/pubsub"
Thank you! JBD, Google
[email protected]
@rakyll