Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cascalog
Search
αλεx π
May 22, 2013
Technology
4
140
Cascalog
Short demo talk on Cascalog on Hadoop UG in Munich
αλεx π
May 22, 2013
Tweet
Share
More Decks by αλεx π
See All by αλεx π
Scalable Time Series With Cassandra
ifesdjeen
1
380
Bayesian Inference is known to make machines biased
ifesdjeen
2
370
Cassandra for Data Analytics Backends
ifesdjeen
7
440
Stream Processing and Functional Programming
ifesdjeen
1
750
PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript
ifesdjeen
0
470
Clojure - A Sweetspot for Analytics
ifesdjeen
8
2.1k
Going Off Heap
ifesdjeen
3
1.9k
Always be learning
ifesdjeen
1
150
Learn Yourself Emacs For Great Good workshop slides
ifesdjeen
3
330
Other Decks in Technology
See All in Technology
SRE視点で振り返るメルカリのアーキテクチャ変遷と普遍的な考え
foostan
2
410
Capitole du Libre 2025 - Keynote - Cloud du Coeur
ju_hnny5
0
120
AS59105におけるFreeBSD EtherIPの運用と課題
x86taka
0
210
リアーキテクティングのその先へ 〜品質と開発生産性の壁を越えるプラットフォーム戦略〜 / architecture-con2025
visional_engineering_and_design
0
3.9k
AIと自動化がもたらす業務効率化の実例: 反社チェック等の調査・業務プロセス自動化
enpipi
0
750
その意思決定、まだ続けるんですか? ~痛みを超えて未来を作る、AI時代の撤退とピボットの技術~
applism118
25
14k
身近なCSVを活用する!AWSのデータ分析基盤アーキテクチャ
koosun
0
2.2k
Building AI Applications with Java, LLMs, and Spring AI
thomasvitale
1
220
プロダクト負債と歩む持続可能なサービスを育てるための挑戦
sansantech
PRO
1
610
AI × クラウドで シイタケの収穫時期を判定してみた
lamaglama39
1
380
クレジットカードの不正を防止する技術
yutadayo
17
7.9k
ABEMAのCM配信を支えるスケーラブルな分散カウンタの実装
hono0130
4
1k
Featured
See All Featured
Building Flexible Design Systems
yeseniaperezcruz
329
39k
Thoughts on Productivity
jonyablonski
73
4.9k
Site-Speed That Sticks
csswizardry
13
970
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
31
2.7k
Making the Leap to Tech Lead
cromwellryan
135
9.6k
Rebuilding a faster, lazier Slack
samanthasiow
84
9.3k
Principles of Awesome APIs and How to Build Them.
keavy
127
17k
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
Bash Introduction
62gerente
615
210k
The World Runs on Bad Software
bkeepers
PRO
72
12k
Docker and Python
trallard
46
3.7k
Transcript
Cascalog Hassle-free MapReduce that matches your scale Thursday, May 23,
13
Thursday, May 23, 13
Setting expecations •This is not a guide •And not a
tutorial •Doesn’t claim to be complete •Mostly to give you an idea •And encourage you to explore further Thursday, May 23, 13
How much time do you spend on writing logic that
framework should take care of? Thursday, May 23, 13
How easy is it to debug your map/reduce aggragation? Thursday,
May 23, 13
Hadoop + Java composable, but too vebrose Pig, Hive too
concrete, lack of abstraction and composition Thursday, May 23, 13
Thursday, May 23, 13
• Clear, declarative syntax • Inner and outer joins •
Aggregators • Functions • Subqueries, composition • Sorting • Performant Thursday, May 23, 13
Casca-WHAT? • Built on top of Hadoop (MapReduce) • Cascading
(tuples, workflows, job execution) • Written in Clojure • Datalog (logic programming) Thursday, May 23, 13
Abstract evrthn! Thursday, May 23, 13
Source where data pours from Thursday, May 23, 13
Pipe that data flows through Thursday, May 23, 13
Filter that makes sure that only good stuff goes through
Thursday, May 23, 13
Tuple they actually flow Thursday, May 23, 13
Thursday, May 23, 13
Query anatomy Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Output Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
output vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Input input vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Logic/aggregations Thursday, May 23, 13
Sources and sinks • HDFS (go figure) • Cassandra •
MongoDB • SQL data sources • File system • Memory sources Thursday, May 23, 13
(?<- (stdout) [?person] (age ?person 25)) Exact match of second
element in a tuple Thursday, May 23, 13
(defn younger-than? [limit age] (< age limit)) (?<- (stdout) [?person
?age] (age ?person ?age) (younger-than? 32 ?age)) Predicate match, fn call Predicate Thursday, May 23, 13
(?<- (stdout) [?person ?count] (follows ?person _) (c/count ?count)) Aggregation
Thursday, May 23, 13
SHOWTIME! Thursday, May 23, 13
Benefits •Query language is same as application language •Subqueries, reusability
•Ad-hoc querying •Cascading underneath, so taps for all DBs work •Reuse application logic •Text editor integration Thursday, May 23, 13
@ifesdjeen (twitter/github) Thursday, May 23, 13