$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cascalog
Search
αλεx π
May 22, 2013
Technology
4
150
Cascalog
Short demo talk on Cascalog on Hadoop UG in Munich
αλεx π
May 22, 2013
Tweet
Share
More Decks by αλεx π
See All by αλεx π
Scalable Time Series With Cassandra
ifesdjeen
1
380
Bayesian Inference is known to make machines biased
ifesdjeen
2
370
Cassandra for Data Analytics Backends
ifesdjeen
7
440
Stream Processing and Functional Programming
ifesdjeen
1
750
PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript
ifesdjeen
0
480
Clojure - A Sweetspot for Analytics
ifesdjeen
8
2.1k
Going Off Heap
ifesdjeen
3
1.9k
Always be learning
ifesdjeen
1
150
Learn Yourself Emacs For Great Good workshop slides
ifesdjeen
3
330
Other Decks in Technology
See All in Technology
今からでも間に合う!速習Devin入門とその活用方法
ismk
1
670
Overture Maps Foundationの3年を振り返る
moritoru
0
180
Edge AI Performance on Zephyr Pico vs. Pico 2
iotengineer22
0
140
A Compass of Thought: Guiding the Future of Test Automation ( #jassttokai25 , #jassttokai )
teyamagu
PRO
1
260
新 Security HubがついにGA!仕組みや料金を深堀り #AWSreInvent #regrowth / AWS Security Hub Advanced GA
masahirokawahara
1
1.8k
Databricks向けJupyter Kernelでデータサイエンティストの開発環境をAI-Readyにする / Data+AI World Tour Tokyo After Party
genda
1
100
「Managed Instances」と「durable functions」で広がるAWS Lambdaのユースケース
lamaglama39
0
300
5分で知るMicrosoft Ignite
taiponrock
PRO
0
340
意外とあった SQL Server 関連アップデート + Database Savings Plans
stknohg
PRO
0
310
今年のデータ・ML系アップデートと気になるアプデのご紹介
nayuts
1
290
GitHub Copilotを使いこなす 実例に学ぶAIコーディング活用術
74th
3
2.7k
30分であなたをOmniのファンにしてみせます~分析画面のクリック操作をそのままコード化できるAI-ReadyなBIツール~
sagara
0
130
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
16k
Git: the NoSQL Database
bkeepers
PRO
432
66k
The Illustrated Children's Guide to Kubernetes
chrisshort
51
51k
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
1
100
Producing Creativity
orderedlist
PRO
348
40k
Practical Orchestrator
shlominoach
190
11k
Documentation Writing (for coders)
carmenintech
76
5.2k
Being A Developer After 40
akosma
91
590k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
37
2.6k
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
48
9.8k
Context Engineering - Making Every Token Count
addyosmani
9
500
Transcript
Cascalog Hassle-free MapReduce that matches your scale Thursday, May 23,
13
Thursday, May 23, 13
Setting expecations •This is not a guide •And not a
tutorial •Doesn’t claim to be complete •Mostly to give you an idea •And encourage you to explore further Thursday, May 23, 13
How much time do you spend on writing logic that
framework should take care of? Thursday, May 23, 13
How easy is it to debug your map/reduce aggragation? Thursday,
May 23, 13
Hadoop + Java composable, but too vebrose Pig, Hive too
concrete, lack of abstraction and composition Thursday, May 23, 13
Thursday, May 23, 13
• Clear, declarative syntax • Inner and outer joins •
Aggregators • Functions • Subqueries, composition • Sorting • Performant Thursday, May 23, 13
Casca-WHAT? • Built on top of Hadoop (MapReduce) • Cascading
(tuples, workflows, job execution) • Written in Clojure • Datalog (logic programming) Thursday, May 23, 13
Abstract evrthn! Thursday, May 23, 13
Source where data pours from Thursday, May 23, 13
Pipe that data flows through Thursday, May 23, 13
Filter that makes sure that only good stuff goes through
Thursday, May 23, 13
Tuple they actually flow Thursday, May 23, 13
Thursday, May 23, 13
Query anatomy Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Output Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
output vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Input input vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Logic/aggregations Thursday, May 23, 13
Sources and sinks • HDFS (go figure) • Cassandra •
MongoDB • SQL data sources • File system • Memory sources Thursday, May 23, 13
(?<- (stdout) [?person] (age ?person 25)) Exact match of second
element in a tuple Thursday, May 23, 13
(defn younger-than? [limit age] (< age limit)) (?<- (stdout) [?person
?age] (age ?person ?age) (younger-than? 32 ?age)) Predicate match, fn call Predicate Thursday, May 23, 13
(?<- (stdout) [?person ?count] (follows ?person _) (c/count ?count)) Aggregation
Thursday, May 23, 13
SHOWTIME! Thursday, May 23, 13
Benefits •Query language is same as application language •Subqueries, reusability
•Ad-hoc querying •Cascading underneath, so taps for all DBs work •Reuse application logic •Text editor integration Thursday, May 23, 13
@ifesdjeen (twitter/github) Thursday, May 23, 13