[NYJavaSig] Riding The Distributed Streams
Viktor Gamov
February 03, 2017
Presentation on Hazelcast and Distributed Streams.
Presented at NYJavaSig.
Transcript
> whoami
• Solutions Architect @Hazelcast
• Hang out with awesome people
• @gamussa in internetz
Please follow me on Twitter, I’m very interesting ©
Agenda
• Refreshing knowledge on Java 8 Streams
• Distribute and Conquer
• Distributed Data
• Distributed Streams
• How we did all this
Java 8 Streams
Java 8 Streams…
• An abstraction that represents a sequence of elements
• Not a data structure
• Conveys elements from a source through a pipeline of operations
• Operations don’t modify the source
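The points above can be illustrated with a minimal sketch. The class name `StreamBasics` and the sample names are made up for illustration; only standard `java.util.stream` calls are used. The pipeline reads from a list, filters and transforms the elements, and leaves the source list untouched.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamBasics {
    // Builds a pipeline over the source: keep names of at most 5 characters,
    // upper-case them, and collect the results into a new list.
    public static List<String> upperCaseShortNames(List<String> names) {
        return names.stream()                      // the list is only the source
                .filter(n -> n.length() <= 5)      // intermediate operation
                .map(String::toUpperCase)          // intermediate operation
                .collect(Collectors.toList());     // terminal operation
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Andrew", "Pierre", "Natasha", "Anna", "Boris");
        List<String> result = upperCaseShortNames(names);
        System.out.println("result: " + result);
        // The source still holds its original elements after the pipeline ran.
        System.out.println("source: " + names);
    }
}
```

The stream itself stores nothing: it only describes how elements flow from the source to the terminal operation.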
Why should I care about the Stream API?
• You’re a Java developer
What does a regular Java developer think about Scala? Advanced
Why should I care about the Stream API?
• You’re a Java developer
• Many Java developers know Java
• It’s all about data processing
java.util.stream operations
• map(), flatMap(), filter()
• reduce(), collect()
• sorted()
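As a rough sketch of how these operations combine, here is a small word-frequency count over lines of text. The class name `WordCount` and its method names are hypothetical, and the tokenization (split on non-word characters, lower-cased) is an assumption, not something taken from the talk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCount {
    // flatMap() splits lines into words, filter() drops empty tokens,
    // collect() groups equal words and counts them.
    public static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // sorted() orders entries by descending count (ties broken alphabetically),
    // map() extracts the word itself.
    public static List<String> topWords(List<String> lines, int limit) {
        return countWords(lines).entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    // reduce() folds all per-word counts into one total.
    public static long totalWords(List<String> lines) {
        return countWords(lines).values().stream().reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Pierre met Andrew",
                "Andrew and Pierre met Natasha",
                "Pierre");
        System.out.println(topWords(lines, 2));
        System.out.println(totalWords(lines));
    }
}
```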
Problem
• One does not simply put all Big Data in one machine
Problem
• Data doesn’t fit on just one machine
Problem
• One does not simply put all Big Data in one machine
• Data is too important to keep on only one machine
CACHES
Replication or Sharding? http://book.mixu.net/distsys/single-page.html
Solution
• Use a distributed map, aka IMap
What’s Hazelcast IMDG?
• In-memory Data Grid
• Apache v2 Licensed
• Distributed
• Caches (IMap, JCache)
• Java Collections (IList, ISet, IQueue)
• Messaging (Topic, RingBuffer)
• Computation (ExecutorService, M-R)
[Diagram legend: primary, backup, shard]
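The idea of shards with a primary and a backup copy can be sketched in plain Java. Everything here is an illustrative assumption: the class `ShardingSketch`, the hash-modulo partitioning, and the "next member holds the backup" rule are not Hazelcast's actual placement algorithm.

```java
import java.util.Arrays;

public class ShardingSketch {
    // Map a key to a partition (shard). floorMod keeps the result
    // non-negative even when hashCode() is negative.
    public static int partitionFor(Object key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Hypothetical placement rule: partitions are spread round-robin over members.
    public static int primaryMember(int partition, int memberCount) {
        return partition % memberCount;
    }

    // Hypothetical backup rule: the backup lives on the next member, so the
    // data survives the loss of the primary (requires at least two members).
    public static int backupMember(int partition, int memberCount) {
        return (partition + 1) % memberCount;
    }

    public static void main(String[] args) {
        int partitions = 8;
        int members = 3;
        for (String key : Arrays.asList("Andrew", "Pierre", "Natasha")) {
            int p = partitionFor(key, partitions);
            System.out.printf("%s -> partition %d, primary on member %d, backup on member %d%n",
                    key, p, primaryMember(p, members), backupMember(p, members));
        }
    }
}
```

Sharding spreads the data so it fits across machines; the backup copy addresses the "too important to keep on one machine" problem.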
Problem
• Lambda serialization
Solution
• Serializable versions of the interfaces
• Introducing DistributedStream
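The serialization problem and the "serializable interface" fix can be demonstrated with the JDK alone. This is a sketch under assumptions: `SerializableFunction` and `SerializableLambdaDemo` are hypothetical names chosen for illustration, not the interfaces shipped with Hazelcast. A lambda is serializable only when its target type extends `java.io.Serializable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class SerializableLambdaDemo {
    // Hypothetical serializable twin of java.util.function.Function.
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Same method reference, two target types: only the second can be shipped.
    public static final Function<String, Integer> PLAIN_LENGTH = String::length;
    public static final SerializableFunction<String, Integer> LENGTH = String::length;

    // Serializes the value to bytes and reads it back, as if it had been
    // sent to another cluster member.
    @SuppressWarnings("unchecked")
    public static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cannot serialize: " + e, e);
        }
    }

    public static boolean canSerialize(Object value) {
        try {
            roundTrip(value);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("plain lambda serializable?  " + canSerialize(PLAIN_LENGTH));
        System.out.println("serializable lambda works?  " + canSerialize(LENGTH));
        System.out.println("length after round trip:    " + roundTrip(LENGTH).apply("hazelcast"));
    }
}
```

A distributed stream API needs every stage's function to survive this round trip, which is why its method signatures would take the serializable variants rather than the plain `java.util.function` types.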
Jet Streams
What’s Hazelcast Jet?
• General purpose distributed data processing framework
• Based on a Directed Acyclic Graph (DAG) to model data flow
• Built on top of Hazelcast IMDG
• Comparable to Apache Spark or Apache Flink
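To make the DAG idea concrete, here is a minimal plain-Java sketch of modeling a data flow as vertices and edges and deriving an order in which the stages can run. The class `DagSketch` and its methods are hypothetical and are not Jet's API; the ordering uses Kahn's topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DagSketch {
    // Adjacency list: vertex name -> names of downstream vertices.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    public DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    public DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit vertices with no unprocessed inputs.
    public List<String> executionOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        edges.keySet().forEach(v -> inDegree.put(v, 0));
        edges.values().forEach(targets ->
                targets.forEach(t -> inDegree.merge(t, 1, Integer::sum)));

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, d) -> { if (d == 0) ready.add(v); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        // Any vertex left unprocessed sits on a cycle, so the graph is not a DAG.
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph has a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        DagSketch wordCount = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "count")
                .edge("count", "sink");
        System.out.println(wordCount.executionOrder());
    }
}
```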
DAG
Job Execution
Future (It’s bright!)
• Memory module for processing big data
• Higher-level streaming and batching APIs
• Reactive Streams
• Distributed Classloading
• Integrations (HDFS/Yarn/Mesos)
Your fuel, our Jet Engine
• Public release: Feb 7th.
• Developer Preview today - yay!
• http://hazelcast.org/jet-signup
• Send me a note [email protected]
• Follow @hazelcast and @gamussa (duh!!)
• Your questions #hazelcast #hazelcastjet
Conclusion
• The Java Stream API provides a very wide range of data processing tools
• War and Peace is a big (a lot of data) book!
• Now we’re pretty sure that Andrew and Pierre are the main characters