Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Real-time application monitoring
Search
Igor Afonov
June 19, 2012
Programming
5
1k
Real-time application monitoring
Real-time application monitoring. Coffee and code Donetsk June 2012.
Igor Afonov
June 19, 2012
Tweet
Share
More Decks by Igor Afonov
See All by Igor Afonov
Application deployment with Chef
iafonov
6
950
Web servers - and how I created my own one
iafonov
8
1.2k
Chef - Infrastructure as code
iafonov
6
660
Other Decks in Programming
See All in Programming
Spring gRPC で始める gRPC 入門 / Introduction to gRPC with Spring gRPC
mackey0225
2
490
Perplexity Slack Botを作ってAI活用を進めた話 / AI Engineering Summit プレイベント
n3xem
0
650
既存デザインを変更せずにタップ領域を広げる方法
tahia910
1
200
從零到一:搭建你的第一個 Observability 平台
blueswen
1
900
事業戦略を理解してソフトウェアを設計する
masuda220
PRO
22
6k
Datadog RUM 本番導入までの道
shinter61
1
270
イベントストーミングから始めるドメイン駆動設計
jgeem
4
830
技術懸念に立ち向かい 法改正を穏便に乗り切った話
pop_cashew
0
1.4k
A comprehensive view of refactoring
marabesi
0
300
プロダクト開発でも使おう 関数のオーバーロード
yoiwamoto
0
150
赤裸々に公開。 TSKaigiのオフシーズン
takezoux2
0
130
関数型まつり2025登壇資料「関数プログラミングと再帰」
taisontsukada
2
800
Featured
See All Featured
Building a Scalable Design System with Sketch
lauravandoore
462
33k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
107
19k
Build The Right Thing And Hit Your Dates
maggiecrowley
36
2.7k
A better future with KSS
kneath
239
17k
Visualization
eitanlees
146
16k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
6
690
KATA
mclloyd
29
14k
GraphQLとの向き合い方2022年版
quramy
46
14k
Writing Fast Ruby
sferik
628
61k
Done Done
chrislema
184
16k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
123
52k
Transcript
Real-time application monitoring Igor Afonov @iafonov
Background • SaaS application • 5 production servers (2 auxiliary
servers) • Everything managed by chef • Apache/Passenger/Rails/MySQL/Postfix
Metrics that matter • Server state • Application state
How metrics data is stored • Round-robin database • RRDTool
(C), Whisper (Python) (Minor offtopic)
Server state • Munin - storage and graphs • Monit
- alerts, basic rescue actions Shard munin-server munin-node Shard munin-node
Munin • Server pulls data from nodes • Gathers basic
server health data • A lot of custom plugins
Munin + Chef = — # Client config template munin_servers
= search(:node, "role:monitoring") <% munin_servers.sort.each do |server| -%> allow ^<%= server[:ipaddress].to_s.gsub(/\./, '\.') %>$ <% end %> # Server config template munin_nodes = search(:node, "munin:[* TO *]") <% munin_nodes.each do |system| -%> [<%= system[:hostname] %>] address <%= system[:ipaddress] %> use_node_name yes <% end %>
Application state • Subscriptions • Logins • Orders • Business
metrics • ...
Our setup Shard Monitoring Server StatsD Graphite Whisper Carbon WebApp
Shard Shard
StatsD • Lightweight proxy to graphite • 300 lines of
Javascript • Uses UDP (small size, non-blocking) • 10+ implementations • Simple protocol
StatsD • Count: counter:1|c • Measure: metric:200|ms • Gauge: value:9000|g
• Supports sampling • A lot of client-side libraries
StatsD StatsD.server = '178.22.33.88:8125' # increment counter StatsD.increment("users.new") # measure
task StatsD.measure("cron.#{task}") do task.run end # meta-programming - track subscriptions Subscription.extend StatsD::Instrument Subscription.statsd_count :subscribe, 'subscriptions'
Graphite • Python everywhere • Whisper - RRD, stores data
• Carbon - backend • Graphite - draws graphs, works with data
Graphite [stats] priority = 100 pattern = ^stats\..* retentions =
10:2160,60:10080,600:262974 (Storage schemas)
Problems • Basic UI • Complex installation and setup
telemetry.io • Fun project • SaaS application • Custom front-end
for StatsD + Graphite • Optimized for big screens (TVs) • Free • Maybe open-source
telemetry.io Client telemetry.io StatsD Graphite Whisper Carbon WebApp Custom Front-end
Client Client Client
Workflow • Get access token • Use one of available
libs or create own • Prepend token to metric name • Integrate into application StatsD.increment("#{token}.subscribers")
None
None
None
None
Implementation • Ruby on Rails • CoffeeScript • Chef (full
node bootstrap in 10 minutes)
Links • http://telemetry.io • http://code.flickr.com/blog/2008/10/27/ counting-timing/ • http://codeascraft.etsy.com/2011/02/15/ measure-anything-measure-everything/
http://iafonov.github.com/ @iafonov