Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kafka Will Get The Message Across, Guaranteed.
Search
David Zuelke
December 02, 2016
Programming
0
780
Kafka Will Get The Message Across, Guaranteed.
Presentation given at SymfonyCon 2016 in Berlin, Germany.
David Zuelke
December 02, 2016
Tweet
Share
More Decks by David Zuelke
See All by David Zuelke
Your next Web server will be written in... PHP
dzuelke
0
140
Getting Things Done
dzuelke
1
400
Your next Web server will be written in... PHP
dzuelke
2
250
Your next Web server will be written in... PHP
dzuelke
3
1.1k
Kafka Will Get The Message Across, Guaranteed.
dzuelke
0
250
Heroku at BattleHack Venice 2015
dzuelke
0
120
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
1.4k
The Twelve-Factor App: Best Practices for Modern Web Applications
dzuelke
4
410
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
470
Other Decks in Programming
See All in Programming
1年目の私に伝えたい!テストコードを怖がらなくなるためのヒント/Tips for not being afraid of test code
push_gawa
0
210
Kubernetes History Inspector(KHI)を触ってみた
bells17
0
230
仕様変更に耐えるための"今の"DRY原則を考える / Rethinking the "Don't repeat yourself" for resilience to specification changes
mkmk884
3
570
Rubyで始める関数型ドメインモデリング
shogo_tksk
0
120
Amazon S3 TablesとAmazon S3 Metadataを触ってみた / 20250201-jawsug-tochigi-s3tables-s3metadata
kasacchiful
0
170
Pythonでもちょっとリッチな見た目のアプリを設計してみる
ueponx
1
570
Pulsar2 を雰囲気で使ってみよう
anoken
0
240
メンテが命: PHPフレームワークのコンテナ化とアップグレード戦略
shunta27
0
120
Grafana Loki によるサーバログのコスト削減
mot_techtalk
1
130
Grafana Cloudとソラカメ
devoc
0
170
楽しく向き合う例外対応
okutsu
0
150
第3回 Snowflake 中部ユーザ会- dbt × Snowflake ハンズオン
hoto17296
4
370
Featured
See All Featured
The World Runs on Bad Software
bkeepers
PRO
67
11k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
32
2.1k
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
Why You Should Never Use an ORM
jnunemaker
PRO
55
9.2k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
226
22k
Fashionably flexible responsive web design (full day workshop)
malarkey
406
66k
The Cult of Friendly URLs
andyhume
78
6.2k
Building a Modern Day E-commerce SEO Strategy
aleyda
38
7.1k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
44
7k
4 Signs Your Business is Dying
shpigford
182
22k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
133
33k
Raft: Consensus for Rubyists
vanstee
137
6.8k
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED. SymfonyCon 2016 Berlin,
Germany
David Zuelke
None
[email protected]
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
None
KEEP CALM AND LOOK AT THE WEBSITE
None
"Basically it is a massively scalable pub/sub message queue. architected
as a distributed transaction log."
"so it's a queue?"
it's not a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
None
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
None
1 foo 2 bar 3 baz 4 hi
1 create document: "foo", data: "…" 2 update document: "foo",
data: "…" 3 create document: "bar", data: "…" 4 remove document: "foo"
None
never corrupts
sequential I/O
None
sequential I/O
every message will be read at least once, no random
access
FileChannel.transferTo (shovels data straight from e.g. disk cache to network
interface, no copying via RAM)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions
of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 … producer
writes consumer reads
can have many subscribers
1 2 3 4 5 6 7 8 … producer
writes consumerB reads consumerA reads
can be partitioned
P0 1 2 3 4 5 6 7 … P1
1 2 3 4 … P2 1 2 3 4 5 6 7 8 … P3 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
None
all records are retained, whether consumed or not, up to
a configurable limit
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used
for auto-partitioning)
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER $rk = new RdKafka\Producer(); $rk->addBrokers("127.0.0.1"); $topic = $rk->newTopic("test");
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose"); $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId()); $topic->produce(1, 0, "This will always be sent to partition 1");
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a
partition)
can themselves produce new messages!
None
BASIC CONSUMER $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk =
new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.interval.ms', 100); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); }
AT-MOST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); $topic->offsetStore($msg->partition, $msg->offset); do_something($msg); }
AT-LEAST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); $topic->offsetStore($msg->partition, $msg->offset); }
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE BYZANTINE GENERALS "together we can beat the monsters. let's
both attack at 07:00?" "confirm, we attack at 07:00" ☠
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square
• Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest Twitter firehose and turn it into a pointless demo
;)
None
messaging, of course
track user activity
record runtime metrics
IoT
replicate information between data centers
billing!
"shock absorber" between systems to avoid overload of DBs, APIs,
etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke &
[email protected]