[NYJavaSig] Riding The Distributed Streams
Viktor Gamov
February 03, 2017
Presentation on Hazelcast and distributed streams.
Presented at NYJavaSig.
Transcript
> whoami
• Solutions Architect @Hazelcast
• Hang out with awesome people
• @gamussa in internetz
Please follow me on Twitter, I'm very interesting ©
Agenda
• Refreshing knowledge on Java 8 Streams
• Distribute and Conquer
• Distributed Data
• Distributed Streams
• How we did all this
Java 8 Streams
Java 8 Streams…
• An abstraction that represents a sequence of elements
• Not a data structure
• Conveys elements from a source through a pipeline of operations
• Operations don't modify the source
Why should I care about the Stream API?
• You're a Java developer
What does a regular Java developer think about Scala?
• Advanced
Why should I care about the Stream API?
• You're a Java developer
• Many Java developers know Java
• It's all about data processing
java.util.stream operations
• map(), flatMap(), filter()
• reduce(), collect()
• sorted()
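A minimal sketch of how these operations chain into a pipeline, using only the standard `java.util.stream` API. The class name `StreamWordCount`, the method `countWords`, and the sample sentences are illustrative assumptions, not taken from the slides.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative class name; not from the talk.
class StreamWordCount {

    // Counts how often each word occurs. The input list is left untouched,
    // because stream operations do not modify their source.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\W+"))) // one line -> many words
                .map(String::toLowerCase)                           // normalize case
                .filter(word -> !word.isEmpty())                    // drop empty tokens
                .collect(Collectors.groupingBy(
                        word -> word,
                        TreeMap::new,                               // sorted keys for stable output
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Pierre met Andrew", "Andrew thanked Pierre");
        System.out.println(countWords(lines));
        // prints {andrew=2, met=1, pierre=2, thanked=1}
    }
}
```

The pipeline is lazy: nothing is computed until the terminal `collect()` call runs.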
Problem
• One does not simply put all Big Data on one machine
Problem
• Data doesn't fit on just one machine
Problem
• One does not simply put all Big Data on one machine
• Data is too important to keep on only one machine
CACHES
Replication or Sharding?
• http://book.mixu.net/distsys/single-page.html
Solution
• Use a distributed map, aka IMap
What's Hazelcast IMDG?
• In-memory Data Grid
• Apache v2 licensed
• Distributed
• Caches (IMap, JCache)
• Java Collections (IList, ISet, IQueue)
• Messaging (Topic, RingBuffer)
• Computation (ExecutorService, M-R)
[Diagram legend: Green Primary, Green Backup, Green Shard]
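The idea of combining sharding (each key has one primary owner) with replication (another node holds a backup copy) can be sketched in plain Java. This is a toy, single-process model under stated assumptions: the class `ShardedMapSketch`, the hash-modulo placement, and the "next node holds the backup" rule are all hypothetical simplifications and are not Hazelcast's actual partitioning scheme.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: each "node" is just a pair of in-memory maps.
class ShardedMapSketch {
    private final List<Map<String, String>> primaries = new ArrayList<>();
    private final List<Map<String, String>> backups = new ArrayList<>();

    ShardedMapSketch(int nodeCount) {
        if (nodeCount < 2) {
            throw new IllegalArgumentException("need at least 2 nodes to hold a backup");
        }
        for (int i = 0; i < nodeCount; i++) {
            primaries.add(new HashMap<>());
            backups.add(new HashMap<>());
        }
    }

    // Sharding: the key's hash selects exactly one primary owner.
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), primaries.size());
    }

    // Replication: the node after the owner keeps a backup copy (assumed rule).
    int backupOf(String key) {
        return (ownerOf(key) + 1) % primaries.size();
    }

    void put(String key, String value) {
        primaries.get(ownerOf(key)).put(key, value);
        backups.get(backupOf(key)).put(key, value);
    }

    // Reads go to the owner; if that node is considered failed, use the backup.
    String get(String key, int failedNode) {
        int owner = ownerOf(key);
        return owner != failedNode
                ? primaries.get(owner).get(key)
                : backups.get(backupOf(key)).get(key);
    }

    public static void main(String[] args) {
        ShardedMapSketch map = new ShardedMapSketch(3);
        map.put("pierre", "main character");
        int owner = map.ownerOf("pierre");
        // Even with the owner marked as failed, the backup still answers.
        System.out.println(map.get("pierre", owner)); // prints main character
    }
}
```

Sharding alone spreads data that does not fit on one machine; the backup copy addresses the second problem, that data is too important to live on only one machine.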
Problem
• Lambda serialization
Solution
• Serializable versions of the functional interfaces
• Introducing DistributedStream
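The serialization problem and the "serializable interface" fix can be reproduced with the JDK alone. In the sketch below, `SerializableFunction` and `canSerialize` are hypothetical names chosen for illustration; they are not the types used by Hazelcast's DistributedStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

class LambdaSerializationSketch {

    // Hypothetical name: a Function that is also Serializable.
    // A lambda targeting this interface is compiled as a serializable lambda.
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable { }

    // Tries Java serialization and reports whether the object could be written.
    static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Function<String, Integer> plain = String::length;
        SerializableFunction<String, Integer> serializable = String::length;

        System.out.println(canSerialize(plain));        // prints false
        System.out.println(canSerialize(serializable)); // prints true
    }
}
```

A function that has to run on another cluster member must travel over the network, which is why the plain `java.util.function` interfaces are not enough on their own.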
Jet Streams
What's Hazelcast Jet?
• General-purpose distributed data processing framework
• Based on a Directed Acyclic Graph (DAG) to model data flow
• Built on top of Hazelcast IMDG
• Comparable to Apache Spark or Apache Flink
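To make "a DAG models data flow" concrete, here is a small plain-Java sketch that stores vertices and edges and derives an order in which every vertex comes after the vertices feeding it. `DagSketch` and its methods are hypothetical and are not the Hazelcast Jet API; a real engine would typically run vertices concurrently rather than one after another, so the computed order only illustrates the dependencies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy DAG; vertex names stand in for processing steps.
class DagSketch {
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    // Data flows from 'from' to 'to'.
    DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit vertices whose inputs are all satisfied.
    List<String> executionOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        edges.keySet().forEach(v -> inDegree.put(v, 0));
        edges.values().forEach(targets ->
                targets.forEach(t -> inDegree.merge(t, 1, Integer::sum)));

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, degree) -> { if (degree == 0) ready.add(v); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph has a cycle, so it is not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        DagSketch dag = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "count")
                .edge("count", "sink");
        System.out.println(dag.executionOrder());
        // prints [source, tokenize, count, sink]
    }
}
```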
DAG
Job Execution
Future (It's bright!)
• Memory module for processing big data
• Higher-level streaming and batching APIs
• Reactive Streams
• Distributed Classloading
• Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine
• Public release: Feb 7th
• Developer Preview today, yay!
• http://hazelcast.org/jet-signup
• Send me a note: [email protected]
• Follow @hazelcast and @gamussa (duh!!)
• Your questions: #hazelcast #hazelcastjet
Conclusion
• The Java Stream API provides a very wide range of data processing tools
• War and Peace is a big book (a lot of data)!
• Now we're pretty sure that Andrew and Pierre are the main characters