Event streaming fundamentals with Apache Kafka
Keith Resar
February 24, 2022
Transcript
Event Streaming Fundamentals with Apache Kafka
Keith Resar, Sr. Kafka Developer, @KeithResar
Data-Driven Operations
The Rise of Event Streaming
• 2010: Apache Kafka created at LinkedIn
• 2022: Most Fortune 100 companies trust and use Kafka
A company is built on _DATA FLOWS_ but all we have are _DATA STORES_
Example Application Architecture
• Serving Layer (Microservices, Elastic, etc.)
• Java Apps with Kafka Streams or ksqlDB
• Continuous Computation
• High-Throughput Event Streaming Platform
• API-Based Clustering
Apache Kafka is an Event Streaming Platform
1. Storage
2. Pub / Sub
3. Processing
Storage
Core Abstractions
• DB → table
• Hadoop → file
• Kafka → log
Immutable Event Log
New messages are added at the end of the log.
(Diagram: the log runs from Old to New.)
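The append-only idea on this slide can be sketched in a few lines of Python. This is a toy model only: the `EventLog` class and its method names are made up for illustration and are not part of any Kafka client.

```python
class EventLog:
    """Toy append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset):
        """Return the record stored at the given offset."""
        return self._records[offset]

    def __len__(self):
        return len(self._records)


log = EventLog()
first = log.append(b"click:home")   # oldest record, offset 0
second = log.append(b"click:cart")  # newest record, offset 1
```

There is deliberately no update or delete method: existing records keep their offsets, and new ones only ever land at the end.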
Messages are KV Bytes
• key: byte[]
• value: byte[]
• Headers => [Header]
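Since keys and values are plain bytes, serialization is left to the application. A minimal sketch, assuming JSON values and UTF-8 keys; the `Message` class, the `encode_order` helper, and the `content-type` header are illustrative names, not Kafka API:

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Message:
    """Toy stand-in for a record: key and value are opaque bytes."""
    key: Optional[bytes]
    value: bytes
    headers: Tuple[Tuple[str, bytes], ...] = ()


def encode_order(order_id: str, order: dict) -> Message:
    """Serialize an order into bytes (hypothetical application helper)."""
    return Message(
        key=order_id.encode("utf-8"),
        value=json.dumps(order, sort_keys=True).encode("utf-8"),
        headers=(("content-type", b"application/json"),),
    )


msg = encode_order("order-42", {"item": "book", "qty": 1})
decoded = json.loads(msg.value.decode("utf-8"))  # the reader must know the format
```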
Messages Inside Topics
• Clicks
• Orders
• Customers
Topics are similar to database tables
Topics Divide into Partitions
Messages are guaranteed to be strictly ordered within a partition.
(Diagram: the Clicks topic split into partitions P0, P1, P2.)
Pub / Sub
Producing Data
New messages are added at the end of the log.
Consuming Data
Consume via sequential data access starting from a specific offset: read to offset & scan.
Distinct Consumer Positions
• Sally: offset 12
• Fred: offset 3
• Rick: offset 9
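Each reader tracks its own position, so one reader advancing does not affect the others. A small sketch of that idea, using the offsets from the slide; the `Consumer` class, its `poll` method, and the 15-record log are assumptions for illustration and do not mirror a real client API:

```python
# A shared log with offsets 0..14 (contents are made up).
log = [f"event-{i}" for i in range(15)]


class Consumer:
    """Toy reader that remembers its own offset into a shared log."""

    def __init__(self, name, offset=0):
        self.name = name
        self.offset = offset

    def poll(self, log, max_records=2):
        """Read sequentially from the current offset, then advance it."""
        records = log[self.offset:self.offset + max_records]
        self.offset += len(records)
        return records


sally = Consumer("Sally", offset=12)
fred = Consumer("Fred", offset=3)
rick = Consumer("Rick", offset=9)

fred_batch = fred.poll(log)  # only Fred's position moves
```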
Producing to Kafka - No Key
Partitions: P0, P1, P2, P3
Messages will be produced in a round-robin fashion.
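Round-robin placement of keyless messages can be sketched as follows. This is a simplified model of what the slide describes, with four in-memory partitions and a hypothetical `produce` function; real clients may behave differently (for example, by sticking to one partition per batch).

```python
from itertools import cycle

NUM_PARTITIONS = 4
partitions = {p: [] for p in range(NUM_PARTITIONS)}
next_partition = cycle(range(NUM_PARTITIONS))  # 0, 1, 2, 3, 0, 1, ...


def produce(value):
    """Append a keyless message to the next partition in rotation."""
    p = next(next_partition)
    partitions[p].append(value)
    return p


assigned = [produce(f"msg-{i}") for i in range(6)]
```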
Producing to Kafka - With Key
Partitions: P0, P1, P2, P3
hash(key) % numPartitions = N
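The formula above can be tried out directly. In this sketch CRC32 from Python's standard library stands in for the hash function; that is an assumption made for the example (Kafka's Java client uses a murmur2 hash), so the partition numbers here will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """hash(key) % numPartitions, with CRC32 as a stand-in hash."""
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition, which is what keeps
# all messages for one key in order relative to each other.
p_first = partition_for(b"customer-7", 4)
p_again = partition_for(b"customer-7", 4)
```

One consequence of the modulo: changing the partition count can move existing keys to different partitions.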
Consuming from Kafka - Single
Partitions: P0, P1, P2, P3
A single consumer reads from all partitions.
Consuming from Kafka - Multiple
Partitions: P0, P1, P2, P3
Consumers can be split into multiple groups, each of which operates in isolation.
(Diagram labels: Consumer Group Coordinator, Consumers, Consumer Group)
Grouped Consumers
Partitions: P0, P1, P2, P3
Consumers can be split into multiple groups, each of which operates in isolation.
(Diagram: one consumer in the group is marked with an X.)
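How partitions are divided within a group, and what happens when a member drops out, can be sketched with a simple round-robin assignment. The `assign` function and the consumer names are hypothetical, and this is not one of Kafka's actual assignment strategies; it only illustrates that each group independently covers every partition.

```python
def assign(partitions, consumers):
    """Spread partitions over one group's consumers, round-robin."""
    if not consumers:
        return {}
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


partitions = [0, 1, 2, 3]

# Group A has two members and splits the four partitions between them.
before = assign(partitions, ["a1", "a2"])

# If a2 goes away, the remaining member takes over its partitions.
after = assign(partitions, ["a1"])

# Group B is assigned independently and also sees every partition.
group_b = assign(partitions, ["b1", "b2", "b3", "b4"])
```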
Linearly Scalable Architecture
• Many producer machines
• Many consumer machines
• Many broker machines
Single topic, no bottleneck!
Replicate for Fault Tolerance
(Diagram labels: Broker A, Broker B, Message, ✓, Leader, Replicate)
Partition Leadership / Replication
(Diagram: Brokers 1 to 4 hold three replicas each of Partitions 0 to 3; every replica is marked as either Leader or Follower.)
Replication Provides Resiliency
Replica followers become leaders on machine failure.
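Follower promotion can be illustrated with a small model. The replica layout below and the rule "first live replica in the list leads" are assumptions chosen for this sketch; a real cluster chooses leaders through its controller and tracks which replicas are in sync, which is ignored here.

```python
# partition -> broker ids holding a replica (hypothetical layout)
replicas = {
    0: [1, 2, 3],
    1: [2, 3, 4],
    2: [3, 4, 1],
    3: [4, 1, 2],
}


def elect_leaders(replicas, live_brokers):
    """Pick the first live replica of each partition as its leader."""
    return {
        partition: next((b for b in brokers if b in live_brokers), None)
        for partition, brokers in replicas.items()
    }


before = elect_leaders(replicas, {1, 2, 3, 4})
after = elect_leaders(replicas, {1, 2, 3})  # broker 4 has failed
```

In this layout only Partition 3 was led by broker 4, so it is the only partition whose leader changes.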
Partition Leadership / Replication (continued)
(Diagram sequence: the replica group holding Partitions 1, 2, and 3 drops out, and additional replicas labeled Partition 2, Partition 1, and Partition 3 appear on the remaining brokers.)
The log is a type of durable messaging system
Similar to a traditional messaging system (ActiveMQ, Rabbit, etc.) but with:
• Far better scalability
• Built-in fault tolerance/HA
• Storage
Origins in Stream Processing
• Serving Layer (Microservices, Elastic, etc.)
• Java Apps with Kafka Streams or ksqlDB
• Continuous Computation
• High-Throughput Event Streaming Platform
• API-Based Clustering
Processing
Streaming is the toolset for working with events as they move!
What is stream processing?
(Diagram: auth attempts → possible fraud)
What is stream processing?
(Chart axes: User Population, Coding Sophistication; label: streams)
• Core developers who use Java/Scala
• Core developers who don’t use Java/Scala
• Data engineers, architects, DevOps/SRE
• BI analysts
Standing on the Shoulders of Streaming Giants
(Diagram axes: Ease of use, Flexibility)
• ksqlDB and ksqlDB UDFs, powered by Kafka Streams
• Kafka Streams, powered by the Producer and Consumer APIs
• Producer, Consumer APIs
What is stream processing?
CREATE STREAM possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 MINUTE)
  GROUP BY card_number
  HAVING count(*) > 3;
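To make the windowing logic concrete, here is a rough Python equivalent of the query's counting rule: group attempts by card number and by fixed five-minute window, then keep groups with more than three attempts. It runs over a finished list rather than a live stream, and the `possible_fraud` function and sample data are made up for illustration.

```python
from collections import Counter

WINDOW_SECONDS = 5 * 60  # tumbling window size: 5 minutes
THRESHOLD = 3            # flag when count(*) > 3


def possible_fraud(attempts):
    """attempts: iterable of (timestamp_in_seconds, card_number)."""
    counts = Counter()
    for ts, card in attempts:
        # Tumbling windows do not overlap: each timestamp falls into
        # exactly one window, identified here by its start time.
        window_start = ts - (ts % WINDOW_SECONDS)
        counts[(card, window_start)] += 1
    return {key: n for key, n in counts.items() if n > THRESHOLD}


attempts = [
    (0, "A"), (10, "A"), (20, "A"), (30, "A"),  # four attempts in one window
    (40, "B"), (290, "B"),                      # two attempts, below threshold
    (310, "A"), (320, "B"),                     # these fall in the next window
]
flagged = possible_fraud(attempts)
```

Only card "A" in the window starting at second 0 crosses the threshold; its fifth attempt at second 310 belongs to the next window and is counted separately.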
Wrap Up
developer.confluent.io
Learn Kafka. Start building with Apache Kafka at Confluent Developer.
Free eBooks
• Designing Event-Driven Systems, Ben Stopford
• Kafka: The Definitive Guide, Neha Narkhede, Gwen Shapira, Todd Palino
• Making Sense of Stream Processing, Martin Kleppmann
• I ❤ Logs, Jay Kreps
http://cnfl.io/book-bundle
Thank You
@KeithResar, Kafka Developer, confluent.io