The Current State of Genome Information Analysis Using the Cloud
Tazro Inutano Ohta
July 22, 2016
Research
Information Processing Society of Japan (IPSJ) Seminar Series 2016, Session 2: Cloud
http://www.ipsj.or.jp/event/seminar/2016/program02.html
Transcript
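The Current State of Genome Information Analysis Using the Cloud. 22 July 2016 | Information Processing Society of Japan (IPSJ) Seminar Series 2016, Session 2: Cloud
Tazro Ohta, Project Researcher, Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS)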
[email protected]
Licensed under CC-BY 4.0 ©2016 Tazro Ohta (DBCLS)
Agenda
1. What is happening in the life sciences today?
2. What kind of computers are needed today?
3. Using the cloud to solve the problems
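1. What is happening in the life sciences today?
What is happening in the life sciences today?
- Advances in laboratory instruments have increased both the size and the volume of data
  - In genomics, "next-generation DNA sequencers" have appeared
- The accumulation of data has made computational biology flourish
  - Protein 3D structure data, image data
- Making data processing and analysis more efficient is still an urgent task
  - There is no time to wait for algorithms to improve
  - In some cases the problem is solved with raw hardware performance
Example of protein 3D structure analysis. MEGADOCK: a project by Assistant Professor Ohue and colleagues, Akiyama Laboratory, Tokyo Institute of Technology (http://www.nii.ac.jp/csi/openforum2016/track/pdf/20160526AM_TOUKOUDAI_akiyama2.pdf)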
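What is happening in genome science?
- A rough analogy for the progress of laboratory instruments:
  - sea = genome, fish = genes
  - "Characterize the sea by surveying what fish live in it"
- Technical progress has improved the performance of the tools
  - Angling has turned into bottom trawling
Photo left: Tsuriba Camera @kazzwatabe https://tsuriba.camera/posts/XQeP3qmIp6A Photo right: photo by atramos https://www.flickr.com/photos/atramos/5508960637 As laboratory instruments advance, interpreting the results becomes costly
The output of earlier DNA sequencers could be checked by eye. With the output of today's DNA sequencers, nothing can be done without a computer.
← ← bottom trawl. DNA sequencers: performance comparison by model (https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/)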
Growth in the data size of a public data repository (http://www.ncbi.nlm.nih.gov/Traces/sra/)
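Data obtained from DNA sequencers
- "Decoding a genome" is easy to say, but in practice:
  - Extract DNA from a biological sample
  - Fragment the extracted DNA into short molecules
  - Analyze the fragments with a DNA sequencer
  - The output is a list of short, fragmented base sequences
- The original DNA is reconstructed from the information in a massive number of DNA fragments
  - de novo assembly
  - Reference alignment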
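The reconstruction step can be illustrated with a toy greedy overlap assembler. This is only a sketch under strong assumptions (error-free reads, no repeats, a made-up 16-base "genome"); the function names `overlap` and `greedy_assemble` are hypothetical and do not correspond to any real assembler.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# "Shred" a made-up genome into overlapping 8-base reads, in scrambled order.
genome = "ATGCGTACCTGAAGTC"
reads = [genome[i:i + 8] for i in (8, 0, 4)]
assembled = greedy_assemble(reads)
print(assembled)
```

Real sequencing data has errors, repeats, and millions of reads, so this quadratic pairwise comparison is only meant to show why reconstruction is a computational problem rather than something done by eye.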
The data output by a DNA sequencer is fragmented (https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014)
If we liken a DNA sequencer to a shredder... (https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014)
Genome assembly = restoring the book (https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014)
Reference alignment = restoring the text by lining the fragments up along a model copy (the reference) (http://www.historyofnimr.org.uk/mill-hill-essays/essays-yearly-volumes/2010-2/bringing-it-all-back-home-next-generation-sequencing-technology-and-you/)
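As a rough illustration of reference alignment, the sketch below places each read at the reference offset with the fewest mismatches. It assumes a made-up reference, ignores insertions and deletions, and uses a brute-force scan; `best_position` is a hypothetical helper, and production aligners rely on indexed data structures instead.

```python
def best_position(read: str, reference: str) -> tuple[int, int]:
    """Return (offset, mismatches) for the placement with the fewest mismatches.

    Ties are resolved in favour of the leftmost offset.
    """
    best_offset, best_mismatches = -1, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(1 for r, w in zip(read, window) if r != w)
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset, best_mismatches


reference = "ATGCGTACCTGAAGTC"  # made-up reference sequence
reads = ["GTACCTGA", "CTGTAGTC", "ATGCGTAC"]  # second read differs by one base

for read in reads:
    offset, mismatches = best_position(read, reference)
    # Indent each read under the reference so the alignment is visible.
    print(" " * offset + read, f"(offset={offset}, mismatches={mismatches})")
```

Because the reference tells each fragment where it belongs, no pairwise comparison between reads is needed, which is the practical difference from de novo assembly.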
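Data analysis software (analysis tools)
- Many analysis tools are released as open source
- The best tool depends on the nature of the target data
- Data analysts (biologists) carry out the data analysis
  - Tool developers (implementers) and users are not the same people
  - Users do not necessarily fully understand how a tool behaves
- Analysts are not doing data analysis all the time
  - Many researchers do analysis on the side of their wet-lab experiments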
Analysis is done with implementations released as open source (https://omictools.com/de-novo-genome-sequencing-category)
Analysis is done with implementations released as open source (https://omictools.com/whole-genome-resequencing-category)
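What is happening in the life sciences today? Summary
- The volume and [...] of data are growing rapidly and will keep growing
- Different tools and algorithms are used depending on the purpose
- The data analyst and the tool developer (implementer) are often different people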
2. What kind of computers are needed today?
What kind of computers are being used today?
- PC
- PC clusters
- Supercomputers at hub institutions
  - National Institute of Genetics supercomputer system
From "Next-Generation Sequencer DRY Analysis Textbook" (Cell Technology supplement): the textbook says to buy a Mac
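What kind of computers are required?
- When the target data gets larger or more numerous, an ordinary PC struggles
- Analysis data keeps piling up
  - Huge storage with fast reads and writes
- Tools crash with "Out of memory"
  - Large-scale memory
- Batch processing is run over a large number of samples
  - A distributed job scheduling system
- Demand for large shared computers is rising
  - The NIG supercomputer was introduced (2012~) => still not enough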
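The "same batch job over many samples" pattern can be sketched on a single machine with a bounded worker pool. This is only a local stand-in for a cluster job scheduler, under the assumption that each sample can be processed independently; `gc_content`, `run_batch`, and the sample names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base in "GC") / len(sequence)


def run_batch(samples: dict[str, str], max_workers: int = 4) -> dict[str, float]:
    """Apply the same analysis to every sample, at most `max_workers` at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so they can be zipped with the keys.
        results = pool.map(gc_content, samples.values())
        return dict(zip(samples.keys(), results))


samples = {"sample1": "GGCC", "sample2": "ATAT", "sample3": "ATGC"}
batch_results = run_batch(samples)
print(batch_results)
```

On a shared cluster the worker pool would be replaced by the scheduler's queue, and `max_workers` by whatever share of the nodes is free, which is exactly where the congestion described in this talk shows up.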
Research Organization of Information and Systems, National Institute of Genetics: SuperComputer Facilities of National Institute of Genetics
photo from http://sc.ddbj.nig.ac.jp/index.php/ja-gallery
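The number of users keeps growing (from presentation material by Ogasawara, DDBJ Center, National Institute of Genetics)
Disk space under pressure (https://sc.ddbj.nig.ac.jp/index.php/ja-nig-statistics)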
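What is the bottleneck in practice? (From hearings at supercomputer user meetings and elsewhere)
- Concerns of users who are not used to computers
  - It is unclear what each machine can and cannot do
  - They need large-scale computing but cannot use a CUI
- Concerns of users who are proficient with computers
  - The machines are congested and jobs cannot be run
  - Not enough budget can be put into data analysis and storage
  - Setting up the environment is costly
  - They do not want to look after the machines
People assume that "wet-lab experiments cost money, but data analysis does not cost that much" (http://trattoriainutano.tumblr.com/post/132214903857/)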
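What kind of computers are needed? Summary
- Requirements differ depending on the target data and the purpose
  - In genomics, storage and memory problems are serious
- Users' computer literacy varies widely
  - The layer a user wants depends on the user's skill level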
3. Using the cloud to solve the problems
Using the cloud to solve the problems
- Problems the cloud can solve
  - Introduction cost
  - Node congestion
  - Maintenance cost
- Challenges in using the cloud
  - Storage cost
  - Paying from research grants
  - Handling unpublished data / data containing personal information
Cloud use case (SaaS): Google Genomics (https://cloud.google.com/genomics/v1/analyze-variants)
Cloud use case (IaaS): 1000 Genomes data on AWS (https://aws.amazon.com/jp/1000genomes/)
The NIH Commons: in the United States, the funding side is promoting cloud use. "The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows." - https://datascience.nih.gov/commons
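Cloud use case (PaaS/SaaS): genome analysis pipeline on the academic intercloud
- JST CREST: research on application-centric overlay cloud technology using the intercloud (principal investigator: Prof. Aida, NII)
- An attempt at an academic intercloud
  - Link the NIG supercomputer with the NII cloud and academic clouds in other countries
- Make applications portable by Dockerizing each tool used in the analysis
- Build workflows that combine tools in advance and provide a GUI
- Launch machines with the optimal resources allocated for each analysis dataset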
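Using the cloud to solve the problems. Summary: challenges in using the cloud
- Storage cost
  - Fast I/O is required during computation
  - Low-cost storage is wanted for archiving
- (For commercial clouds) paying from research grants
- Handling data that contains personal information
  - Establishing safety: building up a track record of use
  - Formulating guidelines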
Genomic information at university hospitals, one case after another. Genomic medicine is being promoted by agencies such as AMED.
Secure cloud computing for genomic data
Datta, Somalee, Keith Bettinger, and Michael Snyder. "Secure cloud computing for genomic data." Nature Biotechnology 34.6 (2016): 588-591.
The security required to use the cloud for genomic data analysis has to be achieved through cooperation between research institutions and cloud providers.
Secure cloud computing for genomic data: security requirements (Datta et al. 2016)
- The data privacy agreement: agreement with the research institution on how data is handled
- Physical and logical security
- Data encryption: encryption of data at rest and in transit
- Authentication: user authentication
- Principle of Least Privilege
- Firewalls
- Logging and monitoring
- Training: training on security certification
- Security and privacy: protection of personal information
The relationship between handling personal information and research use. From the Nihon Keizai Shimbun (Nikkei) article "Balancing medical research and personal information" (http://www.nikkei.com/article/DGXKZO05121060S6A720C1EA1000/). Research data containing personal information is extremely important for elucidating the causes of disease and for treatment. A secure environment would be a powerful weapon for advancing research.
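Summary
- What is happening in the life sciences today?
  - The accumulation of large-scale data is raising the demand for computing
- What kind of computers are needed today?
  - In genomics, storage and memory are what matter most
  - Requirements differ in detail from user to user
- We want to keep solving the problems by using the cloud
  - Make the cloud even more convenient to use
  - Increasing the number of use cases is important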
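Reference material
About the Database Center for Life Science (DBCLS)
- Responsible for developing technology that contributes to database integration in the life sciences
- Core technology development
  - Works on technology for federated data integration using Semantic Web technology and natural language processing, and on establishing international standards
- Collaboration with DDBJ
  - Develops technology for making use of data, starting with large-scale genome data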
About the Database Center for Life Science (DBCLS), continued
- Runs its database project jointly with NBDC, a JST center
- Collaborates with DDBJ within the same organization (ROIS, which also includes NII) (http://dbcls.rois.ac.jp/about)