[NYJavaSig] Riding The Distributed Streams
Viktor Gamov
February 03, 2017
Presentation on Hazelcast and Distributed Streams.
Presented at NYJavaSig.
Transcript
> whoami • Solutions Architect @Hazelcast • Hang out with awesome people • @gamussa in internetz • Please follow me on Twitter, I’m very interesting ©
Agenda • Refreshing knowledge on Java 8 Streams • Distribute and Conquer • Distributed Data • Distributed Streams • How we did all this
Java 8 Streams
Java 8 Streams… • An abstraction that represents a sequence of elements • Not a data structure • Conveys elements from a source through a pipeline of operations • Operations don’t modify the source
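The points on this slide can be illustrated with a small sketch. The class name `StreamBasics` and the sample words are made up for illustration; only standard `java.util.stream` calls are used. It shows a source, two intermediate operations, and a terminal operation, and that the source list is left unchanged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: class name and data are not from the talk.
public class StreamBasics {

    // Runs a pipeline over the source and returns a new list.
    // The source list itself is never modified.
    static List<String> upperCaseLongWords(List<String> source) {
        return source.stream()                  // the source
                .filter(w -> w.length() > 3)    // intermediate operation
                .map(String::toUpperCase)       // intermediate operation
                .collect(Collectors.toList());  // terminal operation
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("war", "and", "peace", "pierre");
        System.out.println(upperCaseLongWords(words)); // [PEACE, PIERRE]
        System.out.println(words);                     // [war, and, peace, pierre]
    }
}
```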
Why should I care about the Stream API? • You’re a Java developer
What does a regular Java developer think about Scala? Advanced
Why should I care about the Stream API? • You’re a Java developer • Many Java developers know Java • It’s all about data processing
java.util.stream operations • map(), flatMap(), filter() • reduce(), collect() • sorted()
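A word-frequency count is one way to see these operations working together. This is a minimal sketch, not code from the talk: the class `WordCount`, its method names, and the tokenizing regex are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative only: names and the tokenizing rule are assumptions.
public class WordCount {

    // Returns the most frequent words, most frequent first (ties broken alphabetically).
    static List<String> topWords(List<String> lines, int limit) {
        Map<String, Long> counts = lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\W+"))) // flatMap(): lines -> words
                .filter(w -> !w.isEmpty())                          // filter(): drop empty tokens
                .map(String::toLowerCase)                           // map(): normalize case
                .collect(Collectors.groupingBy(Function.identity(),
                        Collectors.counting()));                    // collect(): word -> count

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey())) // sorted()
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // reduce(): folds all word lengths into a single total.
    static int totalLength(List<String> words) {
        return words.stream().map(String::length).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Pierre and Andrew", "Andrew, Pierre; Pierre!");
        System.out.println(topWords(lines, 2)); // [pierre, andrew]
    }
}
```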
Problem • One does not simply put all Big Data in one machine
Problem • Data doesn’t fit on just one machine
Problem • One does not simply put all Big Data in one machine • Data is too important to keep on only one machine
CACHES
Replication or Sharding? http://book.mixu.net/distsys/single-page.html
Solution • Use Distributed Map aka IMap
What’s Hazelcast IMDG? • In-memory Data Grid • Apache v2 Licensed • Distributed • Caches (IMap, JCache) • Java Collections (IList, ISet, IQueue) • Messaging (Topic, RingBuffer) • Computation (ExecutorService, M-R)
[Diagram legend: Primary • Backup • Shard]
Problem • Lambda serialization
Solution • Serializable versions of the functional interfaces • Introducing DistributedStream
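The idea of a serializable variant of a functional interface can be sketched in plain Java. This is not Hazelcast's actual code: `SerializableFunction` and the helper methods are hypothetical names. It only shows that a lambda assigned to a plain `Function` cannot be written with Java serialization, while one assigned to an interface that also extends `Serializable` survives a round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

// Illustrative only: not the Hazelcast implementation.
public class SerializableLambda {

    // Hypothetical name: a Function that is also Serializable.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        }
        return bos.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static <T> T fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        }
    }

    // A lambda typed as plain Function is not Serializable, so writing it fails.
    static boolean plainLambdaSerializes() {
        Function<String, Integer> plain = String::length;
        try {
            toBytes(plain);
            return true;
        } catch (IOException e) { // NotSerializableException is an IOException
            return false;
        }
    }

    // The same lambda typed as SerializableFunction can be written and read back.
    static int roundTrip(String input) throws IOException, ClassNotFoundException {
        SerializableFunction<String, Integer> f = String::length;
        SerializableFunction<String, Integer> copy = fromBytes(toBytes(f));
        return copy.apply(input);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(plainLambdaSerializes()); // false
        System.out.println(roundTrip("Pierre"));     // 6
    }
}
```

Deserializing a lambda still requires the class that defined it to be available on the receiving side, which is where class distribution across the cluster becomes relevant.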
Jet Streams
What’s Hazelcast Jet? • General-purpose distributed data processing framework • Based on a Directed Acyclic Graph (DAG) to model data flow • Built on top of Hazelcast IMDG • Comparable to Apache Spark or Apache Flink
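To make "a DAG models the data flow" concrete, here is a toy sketch in plain Java. It is not the Jet API: `TinyDag`, `vertex`, `edge`, and `executionOrder` are hypothetical names, and the vertex labels are invented. Vertices stand for processing steps, edges for data flowing between them, and a topological sort (Kahn's algorithm) yields an order in which each step comes after its inputs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy DAG, not Hazelcast Jet's DAG class.
public class TinyDag {
    // vertex name -> names of downstream vertices
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    TinyDag vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    TinyDag edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit vertices whose inputs have all been emitted.
    List<String> executionOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) inDegree.put(v, 0);
        for (List<String> targets : edges.values())
            for (String t : targets) inDegree.merge(t, 1, Integer::sum);

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, d) -> { if (d == 0) ready.add(v); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.poll();
            order.add(v);
            for (String t : edges.get(v))
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
        }
        if (order.size() != edges.size())
            throw new IllegalStateException("cycle detected: not a DAG");
        return order;
    }

    public static void main(String[] args) {
        TinyDag dag = new TinyDag()
                .edge("source", "tokenize")
                .edge("tokenize", "count")
                .edge("count", "sink");
        System.out.println(dag.executionOrder()); // [source, tokenize, count, sink]
    }
}
```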
DAG
Job Execution
Future (It’s bright!) • Memory module for processing big data • Higher-level streaming and batching APIs • Reactive Streams • Distributed Classloading • Integrations (HDFS/YARN/Mesos)
Your fuel, our Jet Engine • Public release: Feb 7th • Developer Preview today - yay! • http://hazelcast.org/jet-signup • Send me a note [email protected] • Follow @hazelcast and @gamussa (duh!!) • Your questions #hazelcast #hazelcastjet
Conclusion • The Java Stream API provides a very wide range of data processing tools • War and Peace is a big (a lot of data) book! • Now we’re pretty sure that Andrew and Pierre are the main characters