Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
190
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
370
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
370
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
82
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
170
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
230
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
210
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
280
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
470
Other Decks in Technology
See All in Technology
【Λ(らむだ)】最近のアプデ情報 / RPALT20250729
lambda
0
230
AI人生苦節10年で会得したAIがやること_人間がやること.pdf
shibuiwilliam
1
270
Agent Development Kitで始める生成 AI エージェント実践開発
danishi
0
120
Lambda management with ecspresso and Terraform
ijin
2
140
AIに目を奪われすぎて、周りの困っている人間が見えなくなっていませんか?
cap120
1
430
データ基盤の管理者からGoogle Cloud全体の管理者になっていた話
zozotech
PRO
0
340
마라톤 끝의 단거리 스퍼트: 2025년의 AI
inureyes
PRO
1
690
リリース2ヶ月で収益化した話
kent_code3
1
190
Tableau API連携の罠!?脱スプシを夢見たはずが、逆に依存を深めた話
cuebic9bic
3
210
Kiroから考える AIコーディングツールの潮流
s4yuba
4
670
【CEDEC2025】ブランド力アップのためのコンテンツマーケティング~ゲーム会社における情報資産の活かし方~
cygames
PRO
0
240
AI関数が早くなったので試してみよう
kumakura
0
120
Featured
See All Featured
How to train your dragon (web standard)
notwaldorf
96
6.1k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Building a Scalable Design System with Sketch
lauravandoore
462
33k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.6k
Large-scale JavaScript Application Architecture
addyosmani
512
110k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
251
21k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
35
2.5k
A Tale of Four Properties
chriscoyier
160
23k
Why You Should Never Use an ORM
jnunemaker
PRO
58
9.5k
Statistics for Hackers
jakevdp
799
220k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
126
53k
Building an army of robots
kneath
306
45k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None