Platforms for Data Science
Deepak Singh
October 01, 2011
Technology
Talk given at the "Computing on the Brink" series
Transcript
There is no magic
There is only awesome
Deepak Singh
Platforms for data science
bioinformatics
image: Ethan Hein
3
collection
curation
analysis
what’s the big deal?
Source: http://www.nature.com/news/specials/bigdata/index.html
Image: Yael Fitzpatrick (AAAS)
lots of data
lots of people
lots of places
constant change
we want to make our data more effective
versioning
provenance
filter
aggregate
extend
mashup
human interfaces
image: Leo Reynolds
hard problem
really hard problem
so how do we get there?
information platforms
Image: Drew Conway
dataspaces
Further reading: Jeff Hammerbacher, "Information Platforms and the Rise of the Data Scientist", in Beautiful Data
the unreasonable effectiveness of data
Halevy, et al. IEEE Intelligent Systems, 24, 8-12 (2009)
accept all data formats
evolve APIs
beyond databases and the data warehouse
data as a programmable resource
data is a royal garden
compute is a fungible commodity
optimizing the most valuable resource
compute, storage, workflows, memory, transmission, algorithms, cost, …
people
Credit: Pieter Musterd, under a CC-BY-NC-ND license
Image: Chris Dagdigian
my bias
cloud services
distributed systems
scale
global
consumption models
on-demand
what is the value of your data?
Credit: Angel Pizarro, U. Penn
mapreduce for genomics
http://bowtie-bio.sourceforge.net/crossbow/index.shtml
http://contrail-bio.sourceforge.net
http://bowtie-bio.sourceforge.net/myrna/index.shtml
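To make the map/reduce idea on this slide concrete, here is a minimal single-process sketch in Python that counts k-mers across sequencing reads. It is an illustration only, not how the tools linked above work internally: the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) and the two sample reads are hypothetical, and a real job would spread the map and reduce phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the emitted values for each k-mer."""
    return {kmer: sum(values) for kmer, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


# Made-up reads, for illustration only.
reads = ["ACGTAC", "GTACGT"]
counts = count_kmers(reads, 3)
```

Because each read is mapped independently and each k-mer is reduced independently, both phases can be split across as many workers as are available, which is what makes the pattern a fit for on-demand compute.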
Bioproximity http://aws.amazon.com/solutions/case-studies/bioproximity/
30,472 cores
$1279/hr
http://cloudbiolinux.org/
http://usegalaxy.org/cloud
in summary
large scale data requires a rethink
data architecture
compute architecture
distributed, programmable infrastructure
cloud services
remove constraints
can we build data science platforms?
there is no magic there is only awesome
[email protected]
Twitter: @mndoci
http://slideshare.net/mndoci
http://mndoci.com
Inspiration and ideas from Matt Wood & Larry Lessig
Credit: Oberazzi, under a CC-BY-NC-SA license