The Current State of Cloud-Based Genome Informatics
Tazro Inutano Ohta
July 22, 2016
IPSJ (Information Processing Society of Japan) Seminar Series 2016, Session 2: Cloud
http://www.ipsj.or.jp/event/seminar/2016/program02.html
Transcript
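The Current State of Cloud-Based Genome Informatics
22 July 2016 | IPSJ (Information Processing Society of Japan) Seminar Series 2016, Session 2: Cloud
Tazro Ohta, Project Researcher, Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Inter-University Research Institute Corporation Research Organization of Information and Systems (ROIS)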
[email protected]
Licensed under CC-BY 4.0 ©2016 Tazro Ohta (DBCLS)
Agenda
1. What is happening in the life sciences now?
2. What kind of computers are needed now?
3. Using the cloud to solve the problems
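1. What is happening in the life sciences now?
- Advances in laboratory instruments have increased both the size and the volume of data
  - In genomics, "next-generation DNA sequencers" have arrived
- The accumulation of data is driving growth in computational biology
  - Protein 3D structure data, image data
- Making data processing and analysis more efficient remains urgent
  - There is no time to wait for algorithmic advances
  - In some cases the problem is solved with hardware performance
Example of protein 3D structure analysis: MEGADOCK, a project by Assistant Professor Ohue and colleagues, Akiyama Laboratory, Tokyo Institute of Technology http://www.nii.ac.jp/csi/openforum2016/track/pdf/20160526AM_TOUKOUDAI_akiyama2.pdf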
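What is happening in genome science
- If we liken the advance of laboratory instruments to overfishing...
  - Sea = genome, fish = genes
  - "Characterize the sea by finding out which fish live in it"
- Technical progress has improved the tools
  - Line fishing has turned into bottom trawling
Photo (left): Tsuriba Camera @kazzwatabe https://tsuriba.camera/posts/XQeP3qmIp6A Photo (right): photo by atramos https://www.flickr.com/photos/atramos/5508960637
As laboratory instruments advance, interpreting the results becomes costly
Output from earlier DNA sequencers could be checked by eye. Output from today's DNA sequencers cannot be handled at all without a computer.
DNA sequencers: performance comparison by model (← bottom trawling) https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/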
Growth in the data size of a public data repository http://www.ncbi.nlm.nih.gov/Traces/sra/
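Data obtained from a DNA sequencer
- "Decoding a genome" sounds like a single step, but...
  - Extract DNA from a biological sample
  - Fragment the extracted DNA into small molecules
  - Analyze the fragments with a DNA sequencer
    - The output is a list of short, fragmented nucleotide sequences
  - Reconstruct the original DNA from the large volume of DNA fragment information
    - de novo assembly
    - Reference alignment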
The data output by a DNA sequencer is fragmented https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014
If we liken a DNA sequencer to a shredder...
Genome assembly = restoring the shredded book
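The "restoring the shredded book" picture can be sketched as a greedy overlap merge. This is only a toy illustration, not the method of any particular assembler: the function names, the `min_len` threshold, and the example reads are assumptions made here, and real assemblers must also cope with sequencing errors, repeats, and reverse-complement reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, -1, -1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Hypothetical reads cut from the toy sequence "ATGCGTACGTTAGC"
print(greedy_assemble(["GTTAGC", "ATGCGTAC", "GTACGTT"]))
```

Each pass compares every pair of fragments, so the cost grows quickly with the number of reads, which hints at why this step needs substantial memory and compute at real data sizes.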
Reference alignment = reconstruct by lining the fragments up along a model (the reference) http://www.historyofnimr.org.uk/mill-hill-essays/essays-yearly-volumes/2010-2/bringing-it-all-back-home-next-generation-sequencing-technology-and-you/
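Lining fragments up along a reference can be sketched as follows. This is a deliberately naive illustration under stated assumptions: ungapped placement by exhaustive scan, a hypothetical `max_mismatches` cutoff, and a majority-vote consensus. Production aligners use indexed search and handle insertions and deletions, none of which is shown here.

```python
from collections import Counter


def best_position(reference: str, read: str) -> tuple[int, int]:
    """Return (offset, mismatches) of the best ungapped placement of read."""
    best = (-1, len(read) + 1)
    for off in range(len(reference) - len(read) + 1):
        window = reference[off:off + len(read)]
        mismatches = sum(1 for r, q in zip(window, read) if r != q)
        if mismatches < best[1]:
            best = (off, mismatches)
    return best


def align_and_call(reference: str, reads: list[str], max_mismatches: int = 1) -> str:
    """Pile reads up on the reference and take the majority base per column."""
    columns = [Counter() for _ in reference]
    for read in reads:
        off, mismatches = best_position(reference, read)
        if off >= 0 and mismatches <= max_mismatches:
            for k, base in enumerate(read):
                columns[off + k][base] += 1
    # Columns that no read covers are reported as "N"
    return "".join(c.most_common(1)[0][0] if c else "N" for c in columns)


reference = "ATGCGTACGTTAGC"
# Hypothetical reads from a sample that differs from the reference at one base
reads = ["ATGCGTGC", "GTGCGTTA", "CGTTAGC"]
print(align_and_call(reference, reads))
```

Because every read is placed against a known sequence, the reads can be processed independently of one another, unlike the all-against-all comparison that assembly needs.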
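Data analysis software (analysis tools)
- Many analysis tools are released as open source
  - The best tool depends on the nature of the target data
- Data analysts (biologists) carry out the data analysis
  - Tool developers (implementers) and users are not the same people
  - Users do not necessarily have a full grasp of how a tool behaves
- Analysts are not doing data analysis all the time
  - Many researchers analyze data on the side of their wet-lab experiments
Analysis relies on implementations released as open source https://omictools.com/de-novo-genome-sequencing-category
Analysis relies on implementations released as open source https://omictools.com/whole-genome-resequencing-category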
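What is happening in the life sciences now: summary
- The volume and […] of data are increasing rapidly and will keep increasing
- Different tools and algorithms are used depending on the purpose
- Data analysts and tool developers (implementers) are often different people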
2. What kind of computers are needed now?
What kind of computers are in use now?
- PC
- PC clusters
- Center-operated supercomputers
  - National Institute of Genetics supercomputer system
From the textbook "Next-Generation Sequencer DRY Analysis Textbook" (a Cell Technology special issue): the textbook tells readers to buy a Mac
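What kind of computers are needed?
- An ordinary PC struggles once the target data grow in size and number
  - Analysis data keep piling up
    - Large storage with fast reads and writes
  - Tools crash with "Out of memory"
    - Large memory
  - Batch processing has to run over a large number of samples
    - A distributed job scheduling system
- Demand for large shared computers is rising
  - The National Institute of Genetics supercomputer was introduced (2012~) => still not enough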
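The "same batch job over many samples" pattern can be sketched on a single machine with a worker pool. This is only a stand-in for a distributed job scheduler, not a replacement for one; the per-sample task (GC content), the sample names, and the `workers` parameter are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence.upper() if base in "GC") / len(sequence)


def run_batch(samples: dict[str, str], workers: int = 4) -> dict[str, float]:
    """Apply the same analysis step to every sample using a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so they can be zipped back to names
        results = pool.map(gc_content, samples.values())
        return dict(zip(samples.keys(), results))


# Hypothetical sample sequences keyed by sample name
print(run_batch({"sample1": "GGCC", "sample2": "ATAT", "sample3": "ATGC"}))
```

On a shared cluster the pool would be replaced by scheduler submissions, one job per sample, but the structure (independent tasks, results collected by sample name) stays the same.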
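Inter-University Research Institute Corporation Research Organization of Information and Systems, National Institute of Genetics: SuperComputer Facilities of National Institute of Genetics
photo from http://sc.ddbj.nig.ac.jp/index.php/ja-gallery
A steadily growing user base. From presentation material by Ogasawara, DDBJ Center, National Institute of Genetics
Disk capacity under strain https://sc.ddbj.nig.ac.jp/index.php/ja-nig-statistics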
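What are the bottlenecks in the field? From interviews with the supercomputer user group and others
- Troubles of users unfamiliar with computers
  - They cannot tell what each machine can and cannot do
  - They need large-scale computing but cannot use a command-line interface (CUI)
- Troubles of users who are proficient with computers
  - The machines are congested and jobs cannot be run
  - They cannot commit enough budget to analyzing and storing data
- Setting up an environment is costly
  - Nobody wants to look after the machines
It is widely assumed that "wet-lab experiments cost money, but data analysis does not cost that much" http://trattoriainutano.tumblr.com/post/132214903857/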
What kind of computers are needed: summary
- Requirements differ with the target data and the purpose
  - In genomics, storage and memory problems are serious
- Users' computer literacy also varies widely
  - The layer a user needs depends on that user's level
3. Using the cloud to solve the problems
- Problems the cloud can solve
  - Up-front adoption cost
  - Node congestion
  - Maintenance cost
- Challenges in using the cloud
  - Storage cost
  - Paying from research funds
  - Handling unpublished data and data containing personal information
Cloud use case (SaaS): Google Genomics https://cloud.google.com/genomics/v1/analyze-variants
Cloud use case (IaaS): 1000 Genomes data on AWS https://aws.amazon.com/jp/1000genomes/
The NIH Commons: in the United States, the funding side is promoting cloud use
"The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows." - https://datascience.nih.gov/commons
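Cloud use case (PaaS/SaaS): a genome analysis pipeline on the Academic Inter-Cloud
- JST CREST: research on application-centric overlay cloud technology using the inter-cloud (principal investigator: Prof. Aida, NII)
- The Academic Inter-Cloud effort
  - Link the National Institute of Genetics supercomputer with the NII cloud and other academic clouds
  - Package each tool used in the analysis with Docker so that applications become portable
  - Build workflows from pre-combined tools and provide a GUI
  - Launch machines with the resources best suited to each analysis dataset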
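Running a Docker-packaged tool with resources sized per dataset might look roughly like the sketch below. It only builds the `docker run` argument list and does not execute anything; the image name `example/aligner:1.0`, the tool arguments, the host directory, and the resource values are hypothetical placeholders, not details of the project described above.

```python
import shlex


def docker_run_command(image: str, tool_args: list[str], data_dir: str,
                       cpus: int = 1, memory: str = "2g") -> list[str]:
    """Build a `docker run` argument list for one analysis step."""
    return [
        "docker", "run", "--rm",       # remove the container when the tool exits
        "--cpus", str(cpus),           # CPU limit chosen for this dataset
        "--memory", memory,            # memory limit chosen for this dataset
        "-v", f"{data_dir}:/data",     # mount the host data directory at /data
        image, *tool_args,
    ]


# Hypothetical image, arguments, and paths, for illustration only
cmd = docker_run_command("example/aligner:1.0", ["align", "/data/reads.fastq"],
                         "/srv/run42", cpus=4, memory="16g")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string makes it straightforward to hand to a workflow engine or to `subprocess.run` without shell quoting problems.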
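Using the cloud to solve the problems: summary of the challenges in cloud use
- Storage cost
  - Computation demands fast I/O
  - Archiving calls for low-cost storage
- (For commercial clouds) paying from research funds
- Handling data that contain personal information
  - Establishing safety: building up a track record of use
  - Drawing up guidelines
Genomic information is accumulating at university hospitals one after another: agencies such as AMED are promoting genome-related programs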
Secure cloud computing for genomic data
Datta, Somalee, Keith Bettinger, and Michael Snyder. "Secure cloud computing for genomic data." Nature Biotechnology 34.6 (2016): 588-591.
The security needed to use the cloud for genomic data analysis has to be achieved jointly by research institutions and cloud providers.
Security requirements
- The data privacy agreement: an agreement with the research institution on how data are handled
- Physical and logical security
- Encryption of data: encryption both at rest and in transit
- Authentication: user authentication
- Principle of least privilege
- Firewalls
- Logging and monitoring
- Training: training on security and authentication
- Security and privacy: protection of personal information
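Two of the requirements listed above, least privilege and logging, can be illustrated together with a deny-by-default permission check that records every decision. This is a minimal sketch under assumed names: the roles, actions, and logger name are hypothetical and are not taken from the cited paper.

```python
import logging

logger = logging.getLogger("genome_access")

# Hypothetical role table: each role is granted only the actions it needs
PERMISSIONS = {
    "analyst": {"read"},
    "curator": {"read", "write"},
}


def is_allowed(role: str, action: str) -> bool:
    """Deny by default; allow only actions explicitly granted to the role."""
    allowed = action in PERMISSIONS.get(role, set())
    # Log every decision so that access can be monitored afterwards
    logger.info("role=%s action=%s allowed=%s", role, action, allowed)
    return allowed


logging.basicConfig(level=logging.INFO)
print(is_allowed("analyst", "read"))
print(is_allowed("analyst", "write"))
print(is_allowed("guest", "read"))
```

Unknown roles fall through to an empty permission set, so a role that nobody configured gets no access rather than full access.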
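Handling of personal information and its relation to research use
From the Nikkei (Nihon Keizai Shimbun) article "Reconciling medical research and personal information" http://www.nikkei.com/article/DGXKZO05121060S6A720C1EA1000/
Research data that contain personal information are extremely important for identifying the causes of disease and for treatment. A secure environment would be a powerful asset for advancing research.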
Summary
- What is happening in the life sciences now?
  - The accumulation of large-scale data is raising the demand for computing
- What kind of computers are needed now?
  - In genomics, storage and memory matter most
  - Requirements differ in fine detail from user to user
- We want to solve these problems using the cloud
  - Keep improving the convenience of the cloud
  - Increasing the number of use cases is important
Reference material
About the Database Center for Life Science (DBCLS)
- Responsible for developing technology that contributes to database integration in the life sciences
- Core technology development
  - Develops technology for federated data integration using Semantic Web technologies and natural language processing, and works on establishing international standards
- Collaboration with DDBJ
  - Develops technology for putting data to use, starting with large-scale genome data
- Runs database projects jointly with NBDC, a center of JST
- Collaborates with DDBJ within the same organization (ROIS, which NII also belongs to)
http://dbcls.rois.ac.jp/about