Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[NYJavaSig] Riding The Distributed Streams
Search
Viktor Gamov
February 03, 2017
Technology
1
200
[NYJavaSig] Riding The Distributed Streams
Presentation on Hazelcast and Distributed Streams.
Presented on NYJavaSig
Viktor Gamov
February 03, 2017
Tweet
Share
More Decks by Viktor Gamov
See All by Viktor Gamov
Processing Streaming Data with KSQL
vikgamov
4
380
[VirtualJUG] Apache Kafka — A Streaming Data Platform
vikgamov
3
380
[SF JUG] Apache Kafka — A Streaming Data Platform
vikgamov
4
82
[OracleCode NYC-2018] Apache Kafka A Streaming Data Platform
vikgamov
1
170
[OracleCode NYC-2018] Rethinking Stream Processing with KStreams and KSQL
vikgamov
2
230
[JBreak-2018] Это кто там твитить про #jbreak?
vikgamov
0
210
[DevNexus-2018] Apache Kafka A Streaming Data Platform
vikgamov
2
290
[DataSciCon] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
110
[Philly JUG] Divide, Distribute and Conquer: Stream v. Batch
vikgamov
0
470
Other Decks in Technology
See All in Technology
生成AIを活用したZennの取り組み事例
ryosukeigarashi
0
200
Where will it converge?
ibknadedeji
0
180
Shirankedo NOCで見えてきたeduroam/OpenRoaming運用ノウハウと課題 - BAKUCHIKU BANBAN #2
marokiki
0
130
職種別ミートアップで社内から盛り上げる アウトプット文化の醸成と関係強化/ #DevRelKaigi
nishiuma
2
130
AI ReadyなData PlatformとしてのAutonomous Databaseアップデート
oracle4engineer
PRO
0
170
AIAgentの限界を超え、 現場を動かすWorkflowAgentの設計と実践
miyatakoji
0
130
"複雑なデータ処理 × 静的サイト" を両立させる、楽をするRails運用 / A low-effort Rails workflow that combines “Complex Data Processing × Static Sites”
hogelog
3
1.9k
ZOZOのAI活用実践〜社内基盤からサービス応用まで〜
zozotech
PRO
0
170
Why Governance Matters: The Key to Reducing Risk Without Slowing Down
sarahjwells
0
110
20250929_QaaS_vol20
mura_shin
0
110
Optuna DashboardにおけるPLaMo2連携機能の紹介 / PFN LLM セミナー
pfn
PRO
1
880
それでも私はContextに値を詰めたい | Go Conference 2025 / go conference 2025 fill context
budougumi0617
4
1.2k
Featured
See All Featured
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
30
2.9k
Building a Scalable Design System with Sketch
lauravandoore
462
33k
GraphQLとの向き合い方2022年版
quramy
49
14k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
Why You Should Never Use an ORM
jnunemaker
PRO
59
9.6k
Into the Great Unknown - MozCon
thekraken
40
2.1k
Designing Experiences People Love
moore
142
24k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
32
2.2k
Learning to Love Humans: Emotional Interface Design
aarron
274
40k
[RailsConf 2023] Rails as a piece of cake
palkan
57
5.9k
The Cult of Friendly URLs
andyhume
79
6.6k
Transcript
None
> whoami • Solutions Architect @Hazelcast • Hang out with
awesome people • @gamussa in internetz Please, follow me in Twitter I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute
and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction represents a sequence of
elements • Is not a data structure • Convey elements from a source through a pipeline of operations • Operation doesn’t modify a source
Why I should care about Stream API? • You’re Java
developer
What does regular Java developer think about Scala? advanced
Why I should care about Stream API? • You’re Java
developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() •
sorted()
None
None
None
Problem • One does not simply put all Big Data
in one machine
Problem • Data doesn’t fit just one machine
Problem • One does not simply put all Big Data
in one machine • Data is too important to have it only one machine
None
CACHES
Replication on Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2
Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
None
None
None
Green Primary Green Backup Green Shard
None
Problem • Lambda serialization 26
27
Solution • serializable version of the interfaces • Introducing DistributedStream
28
29
None
31 Jet Streams
None
What’s Hazelcast Jet? • General purpose distributed data processing framework
• Based on Direct Acyclic Graph to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink 33
None
DAG 35
Job Execution 36
None
Future (It’s bright!) • Memory module for processing big data
• Higher level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine • Public release – Feb
7th. • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note
[email protected]
• Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • Java Stream API provides very white range of
data processing tools • War And Piece – is a Big (a lot of data) Book! • Now we’re pretty sure that Andrew and Pierre are the main characters
None