RPC Metrics at Google
JBD
August 09, 2018
Programming
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that there was an outage."
@rakyll SLOs: a principled way of saying what level of downtime is acceptable. • Error rate • Latency expectations
@rakyll [Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store]
@rakyll Questions infra teams want to ask: • Are we meeting the SLO for the other team? • What's the impact of a product on infra? • How much do we need to scale up if the product grows 10%?
@rakyll High-Cardinality: breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency distribution for RPCs that originated at Google Analytics. • Requests that took more than 100ms for customer #123. • Compare the latency of requests initiated at the web vs. the mobile frontend.
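The three queries on this slide can be sketched as filters over tagged measurements. This is a toy in-memory model, not the system described in the talk: the `Measurement` type, the tag keys (`originator`, `customer`, `frontend`), and the sample values (including the `ads` originator) are all hypothetical.

```go
package main

import "fmt"

// Measurement is one recorded RPC latency plus the tags attached to it.
type Measurement struct {
	LatencyMs float64
	Tags      map[string]string
}

// Filter returns the measurements whose tags match every key/value in want.
func Filter(ms []Measurement, want map[string]string) []Measurement {
	var out []Measurement
	for _, m := range ms {
		match := true
		for k, v := range want {
			if m.Tags[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, m)
		}
	}
	return out
}

// SlowerThan returns the measurements with latency above thresholdMs.
func SlowerThan(ms []Measurement, thresholdMs float64) []Measurement {
	var out []Measurement
	for _, m := range ms {
		if m.LatencyMs > thresholdMs {
			out = append(out, m)
		}
	}
	return out
}

// Mean returns the average latency, or 0 for an empty input.
func Mean(ms []Measurement) float64 {
	if len(ms) == 0 {
		return 0
	}
	sum := 0.0
	for _, m := range ms {
		sum += m.LatencyMs
	}
	return sum / float64(len(ms))
}

// Sample returns made-up data used only to exercise the queries.
func Sample() []Measurement {
	return []Measurement{
		{40, map[string]string{"originator": "analytics", "customer": "123", "frontend": "web"}},
		{150, map[string]string{"originator": "analytics", "customer": "123", "frontend": "mobile"}},
		{90, map[string]string{"originator": "ads", "customer": "456", "frontend": "web"}},
		{220, map[string]string{"originator": "ads", "customer": "123", "frontend": "mobile"}},
	}
}

func main() {
	ms := Sample()
	// Latencies for RPCs that originated at analytics.
	for _, m := range Filter(ms, map[string]string{"originator": "analytics"}) {
		fmt.Println("analytics latency (ms):", m.LatencyMs)
	}
	// Requests slower than 100ms for customer 123.
	slow := SlowerThan(Filter(ms, map[string]string{"customer": "123"}), 100)
	fmt.Println("slow requests for customer 123:", len(slow))
	// Web vs. mobile comparison.
	fmt.Println("web mean (ms):", Mean(Filter(ms, map[string]string{"frontend": "web"})))
	fmt.Println("mobile mean (ms):", Mean(Filter(ms, map[string]string{"frontend": "mobile"})))
}
```

Every new tag key multiplies the number of distinct series, which is why cardinality becomes the central cost in this style of metrics.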
@rakyll [Diagram: Analytics frontend server, Authentication, Reporting, Users, ..., Spanner, Blob Store; requests carry the tag originator=analytics; ...]
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
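The split between recording and aggregation can be illustrated with a small sketch: instrumented code only records tagged values, and the choice of how to group and summarize them is made separately, at query time. This is a simplification under stated assumptions. The `Recorder`, `Point`, and `Aggregation` names are hypothetical, the recorder keeps raw points in memory (a production library would more likely aggregate as data arrives), and it is not safe for concurrent use.

```go
package main

import "fmt"

// Point is one recorded value with its tags.
type Point struct {
	Value float64
	Tags  map[string]string
}

// Recorder only stores measurements; it knows nothing about aggregation.
type Recorder struct {
	points []Point
}

// Record appends a measurement. This is all the instrumented code calls.
func (r *Recorder) Record(v float64, tags map[string]string) {
	r.points = append(r.points, Point{Value: v, Tags: tags})
}

// Aggregation reduces a group of values to a single number.
type Aggregation func(values []float64) float64

// Count returns the number of values.
func Count(values []float64) float64 { return float64(len(values)) }

// Sum returns the total of the values.
func Sum(values []float64) float64 {
	total := 0.0
	for _, v := range values {
		total += v
	}
	return total
}

// Mean returns the average of the values, or 0 for an empty group.
func Mean(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	return Sum(values) / float64(len(values))
}

// Aggregate groups the recorded points by the given tag key and applies agg
// to each group. Both choices are made here, not at recording time.
func (r *Recorder) Aggregate(groupBy string, agg Aggregation) map[string]float64 {
	groups := make(map[string][]float64)
	for _, p := range r.points {
		k := p.Tags[groupBy]
		groups[k] = append(groups[k], p.Value)
	}
	out := make(map[string]float64, len(groups))
	for k, vs := range groups {
		out[k] = agg(vs)
	}
	return out
}

func main() {
	var rec Recorder
	rec.Record(10, map[string]string{"originator": "analytics", "method": "Read"})
	rec.Record(30, map[string]string{"originator": "analytics", "method": "Write"})
	rec.Record(50, map[string]string{"originator": "reporting", "method": "Read"})

	// The same recorded data, aggregated two different ways.
	fmt.Println("count by originator:", rec.Aggregate("originator", Count))
	fmt.Println("mean latency by method:", rec.Aggregate("method", Mean))
}
```

Because the recording call never names an aggregation, changing what is collected in production does not require changing or redeploying the instrumented code.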
@rakyll Exemplars
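An exemplar is a sample data point kept alongside an aggregate so that a histogram bucket can be linked back to a concrete request, typically through a trace ID. A minimal sketch, assuming a latency histogram that keeps the most recent traced measurement per bucket; the `Histogram` and `Exemplar` types and the trace ID strings are hypothetical, and the talk does not specify a retention policy.

```go
package main

import (
	"fmt"
	"sort"
)

// Exemplar is one concrete measurement kept as an example for a bucket.
type Exemplar struct {
	Value   float64
	TraceID string
}

// Histogram counts values per bucket and keeps one exemplar per bucket.
// Bounds are ascending upper bounds; the last bucket holds everything above them.
type Histogram struct {
	Bounds    []float64
	Counts    []int
	Exemplars []*Exemplar
}

// NewHistogram creates a histogram with len(bounds)+1 buckets.
func NewHistogram(bounds ...float64) *Histogram {
	return &Histogram{
		Bounds:    bounds,
		Counts:    make([]int, len(bounds)+1),
		Exemplars: make([]*Exemplar, len(bounds)+1),
	}
}

// Record adds v to its bucket. If the request was traced (traceID is not
// empty), the measurement replaces the bucket's exemplar.
func (h *Histogram) Record(v float64, traceID string) {
	// SearchFloat64s returns the first index whose bound is >= v,
	// or len(Bounds) when v exceeds every bound.
	i := sort.SearchFloat64s(h.Bounds, v)
	h.Counts[i]++
	if traceID != "" {
		h.Exemplars[i] = &Exemplar{Value: v, TraceID: traceID}
	}
}

func main() {
	h := NewHistogram(10, 100) // buckets: <=10, <=100, >100 (milliseconds)
	h.Record(5, "trace-a")
	h.Record(250, "trace-b")
	h.Record(300, "") // untraced: counted, but the exemplar is left alone

	for i, c := range h.Counts {
		if ex := h.Exemplars[i]; ex != nil {
			fmt.Printf("bucket %d: count=%d exemplar=%s (%.0fms)\n", i, c, ex.TraceID, ex.Value)
		} else {
			fmt.Printf("bucket %d: count=%d\n", i, c)
		}
	}
}
```

With this in place, a spike in the slowest bucket comes with a trace ID to open, rather than only a number.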
@rakyll /rpcz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import "cloud.google.com/go/pubsub"
Thank you! JBD, Google
[email protected]
@rakyll