Kafka Will Get The Message Across, Guaranteed.
David Zuelke
December 02, 2016
Presentation given at SymfonyCon 2016 in Berlin, Germany.
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED.
SymfonyCon 2016, Berlin, Germany
David Zuelke
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
KEEP CALM AND LOOK AT THE WEBSITE
"Basically it is a massively scalable pub/sub message queue architected as a distributed transaction log."
"so it's a queue?"
it's not a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
1 foo
2 bar
3 baz
4 hi

1 create document: "foo", data: "…"
2 update document: "foo", data: "…"
3 create document: "bar", data: "…"
4 remove document: "foo"
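The numbered entries above can be replayed in order to rebuild the current state, which is the point of a write-ahead log. A minimal sketch, assuming a hypothetical in-memory array as the log and placeholder data values ("v1", "v2") in place of the elided payloads:

```php
<?php
declare(strict_types=1);

// Hypothetical log entries mirroring the slide: [sequence, operation, document id, data].
// The data values are placeholders, not taken from the talk.
$log = [
    [1, 'create', 'foo', 'v1'],
    [2, 'update', 'foo', 'v2'],
    [3, 'create', 'bar', 'v1'],
    [4, 'remove', 'foo', null],
];

/**
 * Rebuild the document store by applying the log entries in order.
 */
function replay(array $log): array
{
    $state = [];
    foreach ($log as [$seq, $op, $id, $data]) {
        switch ($op) {
            case 'create':
            case 'update':
                $state[$id] = $data;   // last write wins
                break;
            case 'remove':
                unset($state[$id]);
                break;
        }
    }
    return $state;
}

$state = replay($log);
print_r($state);
```

Replaying all four entries leaves only "bar"; replaying just the first two would leave "foo" with its updated data. Any reader that applies the same entries in the same order arrives at the same state.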
never corrupts
sequential I/O
every message will be read at least once, no random access
FileChannel.transferTo (shovels data straight from e.g. the disk cache to the network interface, without copying it through application memory)
"HI, I AM KAFKA"
"Buckle up while we process (m|b|tr)illions of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 … (producer writes, consumer reads)
can have many subscribers
1 2 3 4 5 6 7 8 … (producer writes, consumerA reads, consumerB reads)
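How several subscribers can read the same stream independently can be sketched with an append-only list and one offset per consumer. This is an illustration only: `InMemoryTopic` and its methods are hypothetical names, not part of Kafka or php-rdkafka.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for a single-partition topic.
final class InMemoryTopic
{
    /** @var string[] records in append order; the array index is the offset */
    private array $records = [];

    /** Append a record and return its offset. */
    public function append(string $payload): int
    {
        $this->records[] = $payload;
        return count($this->records) - 1;
    }

    /** Return all records from $offset onwards; reading removes nothing. */
    public function readFrom(int $offset): array
    {
        return array_slice($this->records, $offset);
    }
}

$topic = new InMemoryTopic();
foreach (['a', 'b', 'c'] as $payload) {
    $topic->append($payload);
}

// Each consumer only remembers its own position in the stream.
$offsets = ['consumerA' => 0, 'consumerB' => 2];

$seenByA = $topic->readFrom($offsets['consumerA']); // everything so far
$seenByB = $topic->readFrom($offsets['consumerB']); // only the newest record
```

Because reading never removes a record, a slow consumer and a fast consumer do not interfere with each other; the only per-consumer state is the offset.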
can be partitioned
P0: 1 2 3 4 5 6 7 …
P1: 1 2 3 4 …
P2: 1 2 3 4 5 6 7 8 …
P3: 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
all records are retained, whether consumed or not, up to a configurable limit
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used for auto-partitioning)
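Key-based auto-partitioning boils down to hashing the key and taking the result modulo the number of partitions. The sketch below uses PHP's `crc32()` purely as an assumption for illustration; real clients ship their own partitioners and hash functions, and `partitionFor` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Map a message key to a partition number in [0, $partitionCount).
 * The hash function is an assumption, not what any particular client uses.
 */
function partitionFor(string $key, int $partitionCount): int
{
    // Mask the sign bit so the result is non-negative on 32-bit builds too.
    return (crc32($key) & 0x7fffffff) % $partitionCount;
}

$first  = partitionFor('user-42', 4);
$second = partitionFor('user-42', 4);
```

The same key always lands on the same partition, which is what keeps all messages for one user in order relative to each other. Changing the partition count changes the mapping.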
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER
    $rk = new RdKafka\Producer();
    $rk->addBrokers("127.0.0.1");
    $topic = $rk->newTopic("test");

    // no partition and no key: the partitioner picks a partition
    $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose");
    // no partition, but a key: the same key always maps to the same partition
    $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId());
    // explicit partition
    $topic->produce(1, 0, "This will always be sent to partition 1");
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a partition)
can themselves produce new messages!
BASIC CONSUMER
    $conf = new RdKafka\Conf();
    $conf->set('group.id', 'myConsumerGroup');
    $rk = new RdKafka\Consumer($conf);
    $rk->addBrokers("127.0.0.1");

    $topicConf = new RdKafka\TopicConf();
    // configuration values are passed as strings
    $topicConf->set('auto.commit.interval.ms', '100');
    $topic = $rk->newTopic("test", $topicConf);

    // read partition 0, resuming from the stored offset
    $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
    while (true) {
        $msg = $topic->consume(0, 120*10000); // partition, timeout in ms
        do_something($msg);
    }
AT-MOST ONCE DELIVERY
    $conf = new RdKafka\Conf();
    $conf->set('group.id', 'myConsumerGroup');
    $rk = new RdKafka\Consumer($conf);
    $rk->addBrokers("127.0.0.1");

    $topicConf = new RdKafka\TopicConf();
    // string 'false'; a PHP boolean false would be passed on as an empty string
    $topicConf->set('auto.commit.enable', 'false');
    $topic = $rk->newTopic("test", $topicConf);

    $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
    while (true) {
        $msg = $topic->consume(0, 120*10000);
        // store the offset first: a crash in do_something() skips the message
        $topic->offsetStore($msg->partition, $msg->offset);
        do_something($msg);
    }
AT-LEAST ONCE DELIVERY
    $conf = new RdKafka\Conf();
    $conf->set('group.id', 'myConsumerGroup');
    $rk = new RdKafka\Consumer($conf);
    $rk->addBrokers("127.0.0.1");

    $topicConf = new RdKafka\TopicConf();
    // string 'false'; a PHP boolean false would be passed on as an empty string
    $topicConf->set('auto.commit.enable', 'false');
    $topic = $rk->newTopic("test", $topicConf);

    $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
    while (true) {
        $msg = $topic->consume(0, 120*10000);
        do_something($msg);
        // store the offset last: a crash before this line repeats the message
        $topic->offsetStore($msg->partition, $msg->offset);
    }
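The only difference between the two consumers is whether the offset is stored before or after processing. The effect of that ordering can be simulated without a broker. The sketch below is a toy model under stated assumptions: `simulate` is a hypothetical function, and the "crash" happens exactly once, between the two steps for one chosen message.

```php
<?php
declare(strict_types=1);

/**
 * Simulate a consumer that crashes once, between storing the offset and
 * processing (or the reverse) for message $crashAt, then restarts from the
 * stored offset. Returns how often each message was processed.
 */
function simulate(array $messages, bool $storeOffsetFirst, int $crashAt): array
{
    $n = count($messages);
    $storedOffset = 0;                    // next offset to read after a restart
    $processed = array_fill(0, $n, 0);
    $crashed = false;

    while ($storedOffset < $n) {          // each pass is one consumer run
        for ($i = $storedOffset; $i < $n; $i++) {
            $crashNow = ($i === $crashAt && !$crashed);

            // first step
            if ($storeOffsetFirst) {
                $storedOffset = $i + 1;   // at-most-once: store, then process
            } else {
                $processed[$i]++;         // at-least-once: process, then store
            }

            if ($crashNow) {
                $crashed = true;
                continue 2;               // restart from the stored offset
            }

            // second step
            if ($storeOffsetFirst) {
                $processed[$i]++;
            } else {
                $storedOffset = $i + 1;
            }
        }
    }
    return $processed;
}

$atMostOnce  = simulate(['a', 'b', 'c'], true, 1);
$atLeastOnce = simulate(['a', 'b', 'c'], false, 1);
```

With the crash at the second message, storing first leaves that message unprocessed (counts 1, 0, 1), while storing last processes it twice (counts 1, 2, 1). Neither ordering yields exactly one processing run for every message in every crash scenario.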
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
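THE BYZANTINE GENERALS
"together we can beat the monsters. let's both attack at 07:00?"
"confirm, we attack at 07:00" ☠
(this two-party exchange is usually called the Two Generals' Problem: neither side can be sure its last message arrived)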
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square • Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest Twitter firehose and turn it into a pointless demo ;)
messaging, of course
track user activity
record runtime metrics
IoT
replicate information between data centers
billing!
"shock absorber" between systems to avoid overload of DBs, APIs, etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING!
Questions? Ask me: @dzuelke