Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
190
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
360
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
360
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
76
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
160
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
220
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
190
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
260
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
98
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
460
Other Decks in Technology
See All in Technology
試験は暗記より理解 〜効果的な試験勉強とその後への活かし方〜
fukazawashun
0
340
ブラウザのレガシー・独自機能を愛でる-Firefoxの脆弱性4選- / Browser Crash Club #1
masatokinugawa
1
370
Automatically generating types by running tests
sinsoku
1
310
Amebaにおける Platform Engineeringの実践
kumorn5s
6
890
Vision Pro X Text to 3D Model ~How Swift and Generative Al Unlock a New Era of Spatial Computing~
igaryo0506
0
260
SREが実現する開発者体験の革新
sansantech
PRO
0
160
食べログが挑む!飲食店ネット予約システムで自動テスト無双して手動テストゼロを実現する戦略
hagevvashi
1
150
Ops-JAWS_Organizations小ネタ3選.pdf
chunkof
2
110
SREの視点で考えるSIEM活用術 〜AWS環境でのセキュリティ強化〜
coconala_engineer
1
240
Would you THINK such a demonstration interesting ?
shumpei3
1
140
AWSのマルチアカウント管理 ベストプラクティス最新版 2025 / Multi-Account management on AWS best practice 2025
ohmura
3
180
AIエージェント開発における「攻めの品質改善」と「守りの品質保証」 / 2024.04.09 GPU UNITE 新年会 2025
smiyawaki0820
0
390
Featured
See All Featured
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Writing Fast Ruby
sferik
628
61k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
5
520
Bash Introduction
62gerente
611
210k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
129
19k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
135
33k
Build your cross-platform service in a week with App Engine
jlugia
229
18k
A better future with KSS
kneath
239
17k
For a Future-Friendly Web
brad_frost
176
9.7k
Statistics for Hackers
jakevdp
798
220k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.2k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None