Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
190
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
340
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
350
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
73
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
160
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
220
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
180
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
240
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
94
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
450
Other Decks in Technology
See All in Technology
AWS re:Invent 2024 recap in 20min / JAWSUG 千葉 2025.1.14
shimy
1
100
Oracle Exadata Database Service(Dedicated Infrastructure):サービス概要のご紹介
oracle4engineer
PRO
0
12k
Bring Your Own Container: When Containers Turn the Key to EDR Bypass/byoc-avtokyo2024
tkmru
0
860
Visual StudioとかIDE関連小ネタ話
kosmosebi
1
380
.NET AspireでAzure Functionsやクラウドリソースを統合する
tsubakimoto_s
0
190
デザインシステムを始めるために取り組んだこと - TechTrain x ゆめみ ここを意識してほしい!リファクタリング勉強会
kajitack
2
130
2025年の挑戦 コーポレートエンジニアの技術広報/techpr5
nishiuma
0
150
0→1事業こそPMは営業すべし / pmconf #落選お披露目 / PM should do sales in zero to one
roki_n_
PRO
1
1.6k
DMMブックスへのTipKit導入
ttyi2
1
110
[JSAC 2025 LT] Introduction to MITRE ATT&CK utilization tools by multiple LLM agents and RAG
4su_para
1
100
JuliaTokaiとJuliaLangJaの紹介 for NGK2025S
antimon2
1
130
AWS Community Builderのススメ - みんなもCommunity Builderに応募しよう! -
smt7174
0
190
Featured
See All Featured
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
52k
Building Adaptive Systems
keathley
38
2.4k
Facilitating Awesome Meetings
lara
51
6.2k
How GitHub (no longer) Works
holman
312
140k
Making the Leap to Tech Lead
cromwellryan
133
9k
Site-Speed That Sticks
csswizardry
3
270
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
226
22k
Learning to Love Humans: Emotional Interface Design
aarron
274
40k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5.1k
Raft: Consensus for Rubyists
vanstee
137
6.7k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.4k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None