Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cascalog
Search
αλεx π
May 22, 2013
Technology
4
130
Cascalog
Short demo talk on Cascalog on Hadoop UG in Munich
αλεx π
May 22, 2013
Tweet
Share
More Decks by αλεx π
See All by αλεx π
Scalable Time Series With Cassandra
ifesdjeen
1
350
Bayesian Inference is known to make machines biased
ifesdjeen
2
360
Cassandra for Data Analytics Backends
ifesdjeen
7
410
Stream Processing and Functional Programming
ifesdjeen
1
720
PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript
ifesdjeen
0
450
Clojure - A Sweetspot for Analytics
ifesdjeen
8
2k
Going Off Heap
ifesdjeen
3
1.9k
Always be learning
ifesdjeen
1
140
Learn Yourself Emacs For Great Good workshop slides
ifesdjeen
3
320
Other Decks in Technology
See All in Technology
Oracle Cloud Infrastructure:2025年2月度サービス・アップデート
oracle4engineer
PRO
1
240
次世代KYC活動報告 / 20250219-BizDay17-KYC-nextgen
oidfj
0
270
リーダブルテストコード 〜メンテナンスしやすい テストコードを作成する方法を考える〜 #DevSumi #DevSumiB / Readable test code
nihonbuson
11
7.4k
わたしのOSS活動
kazupon
0
110
依存パッケージの更新はコツコツが勝つコツ! / phpcon_nagoya2025
blue_goheimochi
3
140
目の前の仕事と向き合うことで成長できる - 仕事とスキルを広げる / Every little bit counts
soudai
26
7.4k
Moved to https://speakerdeck.com/toshihue/presales-engineer-career-bridging-tech-biz-ja
toshihue
2
760
データマネジメントのトレードオフに立ち向かう
ikkimiyazaki
6
1k
君も受託系GISエンジニアにならないか
sudataka
2
450
エンジニアのためのドキュメント力基礎講座〜構造化思考から始めよう〜(2025/02/15jbug広島#15発表資料)
yasuoyasuo
18
6.9k
SA Night #2 FinatextのSA思想/SA Night #2 Finatext session
satoshiimai
1
150
Amazon S3 Tablesと外部分析基盤連携について / Amazon S3 Tables and External Data Analytics Platform
nttcom
0
140
Featured
See All Featured
GraphQLの誤解/rethinking-graphql
sonatard
68
10k
jQuery: Nuts, Bolts and Bling
dougneiner
63
7.6k
Building Your Own Lightsaber
phodgson
104
6.2k
The Art of Programming - Codeland 2020
erikaheidi
53
13k
Writing Fast Ruby
sferik
628
61k
Building Better People: How to give real-time feedback that sticks.
wjessup
367
19k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
10
1.3k
Making the Leap to Tech Lead
cromwellryan
133
9.1k
How GitHub (no longer) Works
holman
314
140k
How to Think Like a Performance Engineer
csswizardry
22
1.4k
Facilitating Awesome Meetings
lara
52
6.2k
Embracing the Ebb and Flow
colly
84
4.6k
Transcript
Cascalog Hassle-free MapReduce that matches your scale Thursday, May 23,
13
Thursday, May 23, 13
Setting expecations •This is not a guide •And not a
tutorial •Doesn’t claim to be complete •Mostly to give you an idea •And encourage you to explore further Thursday, May 23, 13
How much time do you spend on writing logic that
framework should take care of? Thursday, May 23, 13
How easy is it to debug your map/reduce aggragation? Thursday,
May 23, 13
Hadoop + Java composable, but too vebrose Pig, Hive too
concrete, lack of abstraction and composition Thursday, May 23, 13
Thursday, May 23, 13
• Clear, declarative syntax • Inner and outer joins •
Aggregators • Functions • Subqueries, composition • Sorting • Performant Thursday, May 23, 13
Casca-WHAT? • Built on top of Hadoop (MapReduce) • Cascading
(tuples, workflows, job execution) • Written in Clojure • Datalog (logic programming) Thursday, May 23, 13
Abstract evrthn! Thursday, May 23, 13
Source where data pours from Thursday, May 23, 13
Pipe that data flows through Thursday, May 23, 13
Filter that makes sure that only good stuff goes through
Thursday, May 23, 13
Tuple they actually flow Thursday, May 23, 13
Thursday, May 23, 13
Query anatomy Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Output Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
output vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Input input vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Logic/aggregations Thursday, May 23, 13
Sources and sinks • HDFS (go figure) • Cassandra •
MongoDB • SQL data sources • File system • Memory sources Thursday, May 23, 13
(?<- (stdout) [?person] (age ?person 25)) Exact match of second
element in a tuple Thursday, May 23, 13
(defn younger-than? [limit age] (< age limit)) (?<- (stdout) [?person
?age] (age ?person ?age) (younger-than? 32 ?age)) Predicate match, fn call Predicate Thursday, May 23, 13
(?<- (stdout) [?person ?count] (follows ?person _) (c/count ?count)) Aggregation
Thursday, May 23, 13
SHOWTIME! Thursday, May 23, 13
Benefits •Query language is same as application language •Subqueries, reusability
•Ad-hoc querying •Cascading underneath, so taps for all DBs work •Reuse application logic •Text editor integration Thursday, May 23, 13
@ifesdjeen (twitter/github) Thursday, May 23, 13