Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
380
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
380
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
83
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
170
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
230
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
210
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
290
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
480
Other Decks in Technology
See All in Technology
短期間でRAGシステムを実現 お客様と歩んだ生成AI内製化への道のり
taka0709
1
170
GTC 2025 : 가속되고 있는 미래
inureyes
PRO
0
150
Raycast AI APIを使ってちょっと便利なAI拡張機能を作ってみた
kawamataryo
1
240
CloudComposerによる大規模ETL 「制御と実行の分離」の実践
leveragestech
0
160
2025/10/27 JJUGナイトセミナー WildFlyとQuarkusの 始め方
megascus
0
110
Kotlinで型安全にバイテンポラルデータを扱いたい! ReladomoラッパーをAIと実装してみた話
itohiro73
3
220
어떤 개발자가 되고 싶은가?
arawn
1
420
設計に疎いエンジニアでも始めやすいアーキテクチャドキュメント
phaya72
26
17k
プロダクトエンジニアとしてのマインドセットの育み方 / How to improve product engineer mindset
saka2jp
1
130
AWSが好きすぎて、41歳でエンジニアになり、AAIを経由してAWSパートナー企業に入った話
yama3133
2
230
Data Engineering Guide 2025 #data_summit_findy by @Kazaneya_PR / 20251106
kazaneya
PRO
7
950
datadog-incident-management-intro
tetsuya28
0
120
Featured
See All Featured
How to Think Like a Performance Engineer
csswizardry
27
2.2k
Navigating Team Friction
lara
190
15k
The Pragmatic Product Professional
lauravandoore
36
7k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.7k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.1k
4 Signs Your Business is Dying
shpigford
186
22k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.3k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.7k
Code Review Best Practice
trishagee
72
19k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None