Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
400
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
390
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
91
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
180
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
240
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
220
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
300
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
480
Other Decks in Technology
See All in Technology
人工知能のための哲学塾 ニューロフィロソフィ篇 第零夜 「ニューロフィロソフィとは何か?」
miyayou
0
200
日本Rubyの会: これまでとこれから
snoozer05
PRO
6
250
Directions Asia 2025 _ Let’s build my own secretary (AI Agent) Part 1 & 2
ryoheig0405
0
100
2025-12-27 Claude CodeでPRレビュー対応を効率化する@機械学習社会実装勉強会第54回
nakamasato
4
1.3k
ソフトウェアエンジニアとAIエンジニアの役割分担についてのある事例
kworkdev
PRO
1
340
Next.js 16の新機能 Cache Components について
sutetotanuki
0
210
日本の AI 開発と世界の潮流 / GenAI Development in Japan
hariby
2
710
コールドスタンバイ構成でCDは可能か
hiramax
0
120
複雑さを受け入れるか、拒むか? - 事業成長とともに育ったモノリスを前に私が考えたこと #RSGT2026
murabayashi
0
130
『君の名は』と聞く君の名は。 / Your name, you who asks for mine.
nttcom
1
130
Agent Skillsがハーネスの垣根を超える日
gotalab555
7
4.9k
[PR] はじめてのデジタルアイデンティティという本を書きました
ritou
0
440
Featured
See All Featured
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
140
New Earth Scene 8
popppiees
0
1.3k
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
680
Ten Tips & Tricks for a 🌱 transition
stuffmc
0
39
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
0
26
Designing Experiences People Love
moore
143
24k
Information Architects: The Missing Link in Design Systems
soysaucechin
0
720
How to Talk to Developers About Accessibility
jct
1
92
Git: the NoSQL Database
bkeepers
PRO
432
66k
A Guide to Academic Writing Using Generative AI - A Workshop
ks91
PRO
0
170
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
180
How to train your dragon (web standard)
notwaldorf
97
6.5k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None