Etsy on Migrating to Kafka (in three short years)
Hakka Labs
January 22, 2015
Full post with video here:
Transcript
Migrating to Kafka in Three Short Years: A look at the choices that defined the Etsy analytics stack
Path Dependence
Decisions made in the past limit options in the present, even if the circumstances under which those past decisions were made are no longer relevant.
In other words, we can’t upgrade the Hadoop cluster until we port all of the Cascading.jruby jobs to Scalding.
Sneak Preview
1. How Etsy built its original analytics stack
2. Handling changes prepared us to rebuild our data pipeline
3. Kafka!
Starting from scratch
Choice #1: Acquire Adtuitive
Before you can work on search, you need real analytics
Choice #2: Build a zero-impact analytics stack
Etsy is not a cloud company, but the first analytics stack was cloud-based
[Diagram: browser, CDN, EMR, S3, MySQL, FTP]
Legacy effects:
•24 hour latency on events
•48 hour latency on visits
Choice #3: Cascading.jruby
[Diagram: Hadoop, Cascading, Cascading.jruby]
Choice #4: Use GA _utma cookie to define visits
Benefits:
•Simpler ETL
•Visits computed on the client side
•Easy to reconcile against Google Analytics
Choice #5: Use the existing feature library for A/B tests
Leveraged existing experience with operational ramp-ups
Low impact: just required a logging change
Choice #6: Build analytics stack around visit-level metrics
Great for search and ads, less great for measuring engagement
Changing the tires without stopping the car
How do we instrument the iOS app? (Summer 2012)
1. Native app visits should have the same structure as Web visits
2. Native app events should use the existing data pipeline
3. The native app should buffer events and send them when convenient
Solution:
1. App uploads bundles of events to an API endpoint
2. Backend event logger curls the beacon for every event
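The buffer-and-bundle approach can be sketched roughly as follows. This is a minimal illustration rather than Etsy's implementation: the `Event` fields, the `Buffer` type, and `Unbundle` are all hypothetical names, the real client was an iOS app (Go is used here only for illustration), and the slides do not say which bundle format was used, so JSON is assumed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical analytics event; the field names are illustrative.
type Event struct {
	Name      string `json:"name"`
	Timestamp int64  `json:"timestamp"` // Unix seconds, recorded on the device
}

// Buffer accumulates events until the app decides it is convenient to upload.
type Buffer struct {
	events []Event
}

// Add queues an event without touching the network.
func (b *Buffer) Add(e Event) {
	b.events = append(b.events, e)
}

// Flush serializes all queued events as one JSON bundle and empties the
// buffer. It returns nil when there is nothing to send.
func (b *Buffer) Flush() ([]byte, error) {
	if len(b.events) == 0 {
		return nil, nil
	}
	bundle, err := json.Marshal(b.events)
	if err != nil {
		return nil, err
	}
	b.events = nil
	return bundle, nil
}

// Unbundle is the backend half: it splits a bundle back into single events
// so that each one can be forwarded to the existing beacon individually.
func Unbundle(bundle []byte) ([]Event, error) {
	var events []Event
	err := json.Unmarshal(bundle, &events)
	return events, err
}

func main() {
	var buf Buffer
	buf.Add(Event{Name: "view_listing", Timestamp: 1340000000})
	buf.Add(Event{Name: "search", Timestamp: 1340000042})

	bundle, err := buf.Flush()
	if err != nil {
		panic(err)
	}
	events, err := Unbundle(bundle)
	if err != nil {
		panic(err)
	}
	for _, e := range events {
		// The talk's backend logger issued one beacon request per event here.
		fmt.Println("forward to beacon:", e.Name, e.Timestamp)
	}
}
```

Replaying each event through the existing beacon is what lets native events reuse the Web pipeline unchanged, at the cost of one request per event on the backend.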
Side effect: We have a backend event logger that is now used all over the place
CDN diversification project (Fall 2012)
Migrated to our own beacon infrastructure
Data pipeline based on Apache, PHP, logrotate, and cron
We built our own Hadoop cluster: Etsydoop (Fall 2012)
We hired the Scalding guy (Fall 2012)
[Diagram: Hadoop, Cascading, Cascading.jruby, Scalding]
Uh oh, the Google Analytics JS hurts performance (Fall 2012)
The event logger’s GA dependency precluded async loading, hurting performance
First idea: duplicate the _utma functionality in our own code
The trouble with backend events
Visit | Time  | Logger   | Event Type
1     | 12:01 | frontend | home
1     | 12:03 | backend  | login
1     | 12:03 | frontend | view listing
1     | 1:31  | backend  | logout
2     | 1:31  | frontend | view listing
2     | 1:32  | frontend | search
2     | 1:33  | frontend | view listing
(Slide annotation: wrong visit)
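One way to read the event log above is to recompute visits from timestamps alone. The sketch below assumes a visit ends after 30 minutes of inactivity (the Google Analytics default session timeout; the slide does not state the rule Etsy used) and reads 1:31 as 13:31 on the same day. The `Event` type, `AssignVisits`, and `slideEvents` are hypothetical names. Under those assumptions the backend logout at 1:31 falls in visit 2, while the log records it under visit 1, which may be the row the slide marks as the wrong visit.

```go
package main

import (
	"fmt"
	"time"
)

// visitTimeout is an assumed inactivity limit, not a value from the talk.
const visitTimeout = 30 * time.Minute

// Event mirrors the columns of the slide's log, minus the visit number.
type Event struct {
	At     time.Time
	Logger string
	Type   string
}

// AssignVisits numbers visits from 1 and starts a new visit whenever the gap
// since the previous event exceeds timeout. Events must be sorted by time.
func AssignVisits(events []Event, timeout time.Duration) []int {
	visits := make([]int, len(events))
	visit := 0
	var last time.Time
	for i, e := range events {
		if i == 0 || e.At.Sub(last) > timeout {
			visit++
		}
		visits[i] = visit
		last = e.At
	}
	return visits
}

// at builds a timestamp on an arbitrary fixed day.
func at(hour, minute int) time.Time {
	return time.Date(2013, time.January, 1, hour, minute, 0, 0, time.UTC)
}

// slideEvents reproduces the rows of the slide in their original order.
func slideEvents() []Event {
	return []Event{
		{at(12, 1), "frontend", "home"},
		{at(12, 3), "backend", "login"},
		{at(12, 3), "frontend", "view listing"},
		{at(13, 31), "backend", "logout"},
		{at(13, 31), "frontend", "view listing"},
		{at(13, 32), "frontend", "search"},
		{at(13, 33), "frontend", "view listing"},
	}
}

func main() {
	events := slideEvents()
	visits := AssignVisits(events, visitTimeout)
	for i, e := range events {
		fmt.Println(visits[i], e.At.Format("15:04"), e.Logger, e.Type)
	}
}
```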
Complete rewrite of our ETL jobs (Spring/Summer 2013)
Backend page-view events (Fall 2013)
2014: the next phase
EventPipe goals
Use POST rather than multiple GET requests to prevent data loss
Use JSON rather than query strings for comprehensibility
Validate beacon data before it enters the data pipeline
Use a binary serialization format for long-term storage
Use Kafka for data transfer to escape the batch paradigm
Eliminate individual beacon servers as points of failure
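The JSON and validation goals above can be illustrated with a small sketch. The talk does not show the real beacon schema, so the `Beacon` fields and the rules in `Validate` are assumptions chosen only to show the idea: decode one POST body containing a JSON array, and reject bad data before it enters the pipeline.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Beacon is a hypothetical event payload; the real schema is not in the talk.
type Beacon struct {
	EventType string `json:"event_type"`
	VisitID   string `json:"visit_id"`
	Timestamp int64  `json:"timestamp"`
}

// Validate checks the fields this sketch treats as required.
func (b Beacon) Validate() error {
	switch {
	case b.EventType == "":
		return errors.New("missing event_type")
	case b.VisitID == "":
		return errors.New("missing visit_id")
	case b.Timestamp <= 0:
		return errors.New("missing or invalid timestamp")
	}
	return nil
}

// ParseBeacons decodes one POST body holding a JSON array of beacons and
// rejects the whole batch if any beacon fails validation.
func ParseBeacons(body []byte) ([]Beacon, error) {
	var beacons []Beacon
	if err := json.Unmarshal(body, &beacons); err != nil {
		return nil, fmt.Errorf("malformed JSON: %w", err)
	}
	for i, b := range beacons {
		if err := b.Validate(); err != nil {
			return nil, fmt.Errorf("beacon %d: %w", i, err)
		}
	}
	return beacons, nil
}

func main() {
	good := []byte(`[{"event_type":"search","visit_id":"v1","timestamp":1400000000}]`)
	bad := []byte(`[{"event_type":"search","timestamp":1400000000}]`)

	beacons, err := ParseBeacons(good)
	fmt.Println(len(beacons), err)

	_, err = ParseBeacons(bad)
	fmt.Println(err)
}
```

Rejecting the whole batch is one possible policy; dropping only the invalid beacons and keeping the rest would be an equally reasonable choice.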
How do we handle the impedance mismatch between Apache/PHP and Kafka?
Wrote a server in Go to serialize beacons in Thrift and send them to Kafka
Use Apache for SSL termination
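The shape of such a server can be sketched with the Go standard library alone. This is not the EventPipe code: the `Producer` interface and `memoryProducer` are hypothetical stand-ins for a real Kafka client, the topic name and URL path are made up, and the Thrift serialization step is replaced by passing each JSON beacon through unchanged. SSL termination is assumed to happen upstream, as on the slide.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Producer is a hypothetical stand-in for a Kafka client.
type Producer interface {
	Send(topic string, payload []byte) error
}

// memoryProducer keeps messages in memory so the sketch runs without a broker.
type memoryProducer struct {
	messages [][]byte
}

func (p *memoryProducer) Send(topic string, payload []byte) error {
	p.messages = append(p.messages, payload)
	return nil
}

// BeaconHandler accepts a POST whose body is a JSON array of beacons and
// forwards each element to the producer as its own message.
func BeaconHandler(p Producer, topic string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST required", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "unreadable body", http.StatusBadRequest)
			return
		}
		var beacons []json.RawMessage
		if err := json.Unmarshal(body, &beacons); err != nil {
			http.Error(w, "malformed JSON", http.StatusBadRequest)
			return
		}
		for _, b := range beacons {
			// The server in the talk serialized to Thrift at this point;
			// this sketch forwards the raw JSON instead.
			if err := p.Send(topic, b); err != nil {
				http.Error(w, "producer unavailable", http.StatusServiceUnavailable)
				return
			}
		}
		w.WriteHeader(http.StatusAccepted)
	})
}

// post is a demo helper: it sends one request and returns the status code.
func post(h http.Handler, method, body string) int {
	req := httptest.NewRequest(method, "/beacon", strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	producer := &memoryProducer{}
	h := BeaconHandler(producer, "beacons")

	status := post(h, "POST", `[{"event_type":"search"},{"event_type":"view_listing"}]`)
	fmt.Println("status:", status, "messages:", len(producer.messages))
	fmt.Println("GET status:", post(h, "GET", ""))
}
```

Hiding the client behind a small interface keeps the HTTP side testable without a running broker, which is the only reason `memoryProducer` exists here.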
Still to come
Real-ish time ETL
Streaming infrastructure
Offline processing for more products
Other Kafka applications
Takeaways
Every choice you make has long-term implications
Fixing stuff creates new opportunities
@rafeco http://rc3.org