Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
RPC Metrics at Google
Search
JBD
August 09, 2018
Programming
2
540
RPC Metrics at Google
JBD
August 09, 2018
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Servers are doomed to fail
rakyll
3
1.5k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
540
Monitoring and Debugging Containers
rakyll
2
1.1k
Other Decks in Programming
See All in Programming
EC2からECSへ 念願のコンテナ移行と巨大レガシーPHPアプリケーションの再構築
sumiyae
3
590
『改訂新版 良いコード/悪いコードで学ぶ設計入門』活用方法−爆速でスキルアップする!効果的な学習アプローチ / effective-learning-of-good-code
minodriven
28
4.2k
サーバーゆる勉強会 DBMS の仕組み編
kj455
1
300
アクターシステムに頼らずEvent Sourcingする方法について
j5ik2o
6
710
Fibonacci Function Gallery - Part 2
philipschwarz
PRO
0
210
月刊 競技プログラミングをお仕事に役立てるには
terryu16
1
1.2k
Swiftコンパイラ超入門+async関数の仕組み
shiz
0
180
watsonx.ai Dojo #6 継続的なAIアプリ開発と展開
oniak3ibm
PRO
0
170
ChatGPT とつくる PHP で OS 実装
memory1994
PRO
3
190
20年もののレガシープロダクトに 0からPHPStanを入れるまで / phpcon2024
hirobe1999
0
1k
責務を分離するための例外設計 - PHPカンファレンス 2024
kajitack
9
2.4k
快速入門可觀測性
blueswen
0
500
Featured
See All Featured
Making Projects Easy
brettharned
116
6k
KATA
mclloyd
29
14k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
10
870
Side Projects
sachag
452
42k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
356
29k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Into the Great Unknown - MozCon
thekraken
34
1.6k
Building Applications with DynamoDB
mza
93
6.2k
Designing Experiences People Love
moore
139
23k
How to train your dragon (web standard)
notwaldorf
89
5.8k
Visualization
eitanlees
146
15k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
230
52k
Transcript
RPC Metrics at Google JBD, Google (@rakyll)
gRPC Metrics at Google JBD, Google (@rakyll)
Request Metrics at Google JBD, Google (@rakyll)
@rakyll "100% is the wrong reliability target for basically everything."
-- Benjamin Treynor Sloss, VP of Engineering, Google
@rakyll "A service is available if users cannot tell that
there was an outage."
@rakyll Principled way of saying what level of downtime is
acceptable. • Error rate • Latency expectations SLOs
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store
@rakyll Questions infra teams want to ask: • Are we
meeting the SLO for the other team? • What’s the impact of a product on infra? • How much do we need to scale up if product grows 10%?
@rakyll High-Cardinality Breaking down the metrics data...
@rakyll Query the collected data in various ways: • Latency
distribution for RPCs originated at Google Analytics. • Requests take took more than 100ms for the customer #123. • Compare the request latency initiated at web vs mobile frontend.
@rakyll Analytics frontend server Authentication Reporting Users ... Spanner Blob
Store originator=analytics; ...
@rakyll Blob store read errors by originator
@rakyll Dynamically choose aggregation (split between recording and aggregation)
@rakyll Exemplars
@rakyll /rpz and /statz
@rakyll http://server:7777/debug/rpcz
@rakyll Export? Monarch, Prometheus, and more.
@rakyll import “cloud.google.com/go/pubsub”
@rakyll +
Thank you! JBD, Google
[email protected]
@rakyll