Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to scale a Logging Infrastructure
Search
Paul Stack
June 03, 2015
Technology
0
170
How to scale a Logging Infrastructure
Logging infrastructure using ELK + Kafka
Paul Stack
June 03, 2015
Tweet
Share
More Decks by Paul Stack
See All by Paul Stack
Infrastructure as Software
stack72
0
60
Mirror, Mirror on the way, what is the vainest metric of them all?
stack72
1
2.3k
Continuously Delivering Infrastructure to the Cloud
stack72
0
170
DevOops 2016
stack72
0
110
The Quest for Infrastructure Management 2.0
stack72
0
130
The Biggest Trick Consultants Ever Pulled was Telling The World Continuous Delivery is Easy
stack72
1
99
The Transition from Product to Infrastructure
stack72
0
57
Continuous Delivery - the missing parts
stack72
0
900
Windows: Having its ass kicked by puppet and powershell
stack72
0
120
Other Decks in Technology
See All in Technology
静的解析で実現した効率的なi18n対応の仕組みづくり
minako__ph
1
170
Lambda10周年!Lambdaは何をもたらしたか
smt7174
2
130
【Startup CTO of the Year 2024 / Audience Award】アセンド取締役CTO 丹羽健
niwatakeru
0
1.4k
Python(PYNQ)がテーマのAMD主催のFPGAコンテストに参加してきた
iotengineer22
0
550
IBC 2024 動画技術関連レポート / IBC 2024 Report
cyberagentdevelopers
PRO
1
120
CDCL による厳密解法を採用した MILP ソルバー
imai448
3
200
OS 標準のデザインシステムを超えて - より柔軟な Flutter テーマ管理 | FlutterKaigi 2024
ronnnnn
1
310
【Pycon mini 東海 2024】Google Colaboratoryで試すVLM
kazuhitotakahashi
2
570
rootlessコンテナのすゝめ - 研究室サーバーでもできる安全なコンテナ管理
kitsuya0828
3
390
ノーコードデータ分析ツールで体験する時系列データ分析超入門
negi111111
0
430
Oracle Cloud Infrastructureデータベース・クラウド:各バージョンのサポート期間
oracle4engineer
PRO
29
13k
アジャイルでの品質の進化 Agile in Motion vol.1/20241118 Hiroyuki Sato
shift_evolve
0
190
Featured
See All Featured
XXLCSS - How to scale CSS and keep your sanity
sugarenia
246
1.3M
Fontdeck: Realign not Redesign
paulrobertlloyd
82
5.2k
Git: the NoSQL Database
bkeepers
PRO
427
64k
Done Done
chrislema
181
16k
Building Your Own Lightsaber
phodgson
103
6.1k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
27
4.3k
[RailsConf 2023] Rails as a piece of cake
palkan
52
4.9k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
16
2.1k
Gamification - CAS2011
davidbonilla
80
5k
jQuery: Nuts, Bolts and Bling
dougneiner
61
7.5k
Rails Girls Zürich Keynote
gr2m
94
13k
Agile that works and the tools we love
rasmusluckow
327
21k
Transcript
How do you scale a logging infrastructure to accept a
billion messages a day? Paul Stack http://twitter.com/stack72 mail:
[email protected]
About Me Infrastructure Engineer for a cool startup :) Reformed
ASP.NET / C# Developer DevOps Extremist Conference Junkie
Background Project was to replace the legacy ‘logging solution’
Iteration 0: A Developer created a single box with the
ELK all in 1 jar
Time to make it production ready now
None
Iteration 1: Using Redis as the input mechanism for LogStash
None
None
Enter Apache Kafka
“Kafka is a distributed publish- subscribe messaging system that is
designed to be fast, scalable, and durable” Source: Cloudera Blog
Introduction to Kafka • Kafka is made up of ‘topics’,
‘producers’, ‘consumers’ and ‘brokers’ • Communication is via TCP • Backed by Zookeeper
Kafka Topics Source: http://kafka.apache.org/documentation.html
Kafka Producers • Producers are responsible to chose what topic
to publish data to • The producer is responsible for choosing a partition to write to • Can be handled round robin or partition functions
Kafka Consumers • Consumption can be done via: • queuing
• pub-sub
Kafka Consumers • Kafka consumer group • Strong ordering
Kafka Consumers • Strong ordering
https://github.com/opentable/puppet-exhibitor
None
Iteration 2 Introduction of Kafka
None
None
Iteration 3 Further ‘Improvements’ to the cluster layout
None
The Numbers • Logs kept in ES for 30 days
then archived • 12 billion documents active in ES • ES space was about 25 - 30TB in EBS volumes • Average Doc Size ~ 1.2KB • V-Day 2015: ~750M docs collected without failure
What about metrics and monitoring?
Monitoring - Nagios • Alerts on • ES Cluster •
zK and Kafka Nodes • Logstash / Redis nodes
None
https://github.com/stack72/nagios-elasticsearch
Metrics - Kafka Offset Monitor
https://github.com/opentable/KafkaOffsetMonitor
Metrics - ElasticSearch
None
None
None
Visibility Rocks!
None
So what would I do differently?
Questions?
Paul Stack @stack72