Absolutely Reproducible Data Analysis with Lightweight Virtual Environments
Tazro Inutano Ohta
July 01, 2015
Poster presentation at the 4th NGS Genba no Kai (NGS現場の会)
Transcript
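I am done losing my soul to getting stuck on environment setup

Summary
- I want to improve the reproducibility of data analysis and gain a better quality of life.
- "If you solve socially what can be solved with technology, you lose." (Dr. Itoshi Nikaido)
- Integrating public data repositories with large-scale computers makes life even easier for humans.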
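Automation saves all

Tazro Ohta
Database Center for Life Science (DBCLS), Research Organization of Information and Systems
twitter.com/inut…, github.com/inutano, speakerdeck.com/inutano

Automating re-execution makes life easier for humans and leaves more time for science
Data analysis fails to reproduce because humans intervene too much

Define "reproducibility" as follows: independent of circumstances and environment, following exactly the same procedure on exactly the same input yields exactly the same output. Under this definition, the reproducibility of data analysis, not only for NGS data but across the life sciences, can be equated with the cost of assembling the necessary conditions, such as the computing environment and the knowledge and skills of the person running the analysis. In other words, "improving the reproducibility of data analysis" is achieved by removing dependence on individual people from the work an analysis requires, and by establishing a mechanism that produces the same output from a general-purpose computing environment and a few keystrokes.

"Reproducibility" is usually described as the essence of science, something to be maintained through ceaseless effort. In data analysis, however, the word has another side. Consider situations seen daily where data analysis is done: "I ran this analysis myself a while ago" or "this method was implemented in that paper". Reproducibility avoids the wasted time of "the code I wrote back then no longer runs, so I will rewrite it" or "I tried building it, it failed with a mysterious error, so I will implement it myself". Avoiding wasted time means gaining time for other work, namely science. What is presented here is a fight to free researchers from environment setup and job execution, which are not the essence of research, and to have machines do everything machines can do, so that humans can live like humans.

Infrastructure as Code
Describe all the information needed for execution in a form a computer can execute
Environment setup can be fully automated, just like batch-processing the steps of an analysis. Yes, with virtual environments.

DRY (Don't Repeat Yourself), known as one of the aesthetics or philosophies of programming, is a basic idea for preventing duplicated work. Tasks such as installing software and configuring a machine have mostly been done interactively, with a human performing them by hand each time. However, now that technologies such as grid computing and virtual environments are readily available, manually setting up as many environments as there are machines is entirely impractical. This led to the concept of Infrastructure as Code: describing every procedure needed for setup as an executable program.

On UNIX/Linux-based systems this can be achieved with simple shell scripts, but several technologies have been proposed for describing complex processing and for writing abstract, OS-independent descriptions (Fig. 2). Representative examples are Chef, Puppet, and Ansible, which have made it very easy to control large groups of machines. In addition, products such as Vagrant, which turn the launch of the virtual environment itself into a program, are now at their peak, and a variety of other software has been developed, including Jenkins, which realizes "Continuous Integration": maintaining system quality by automating and repeating the cycle of implementation → verification → operation. These tools target systems that run directly on hardware (bare metal) and hypervisor-based virtual environments.

Docker, a new virtualization technology, appeared after these. It uses what is called container-based virtualization. Unlike hypervisor-based virtualization, which emulates a computer on top of a computer, it is characterized by low virtualization overhead. On top of container-based virtualization, Docker supports a differential (layered) file system and the Dockerfile, an executable description of environment setup.

Docker on NIG Supercomputer
NGS data analysis workflows vs. container virtualization on the NIG supercomputer
Can the blue whale save the NIG supercomputer?

The National Institute of Genetics (NIG) operates a supercomputer system (the NIG supercomputer) that can be used free of charge for non-commercial purposes by applying for an account. In other fields, supercomputers are mainly used for CPU-intensive computation on architectures such as MPI and GPGPU. NGS data analysis, in contrast, frequently accesses huge sequence data and reference databases, so I/O becomes the limiting factor. Moreover, on a shared system like the NIG supercomputer, many users run entirely different analysis flows, each with its own requirements, so it is difficult to optimize the environment for individual applications.

Container-based virtualization, which unlike the hypervisor-based kind has little I/O overhead, is well suited to solving these problems. We therefore built a test environment on the NIG supercomputer that realizes Infrastructure as Code without sacrificing performance, by packaging the software used in NGS data analysis as Docker containers. We use Apache Mesos to manage the resources used by Docker containers, and we implemented a custom scheduler based on Chronos for job scheduling.

Remaining issues include: 1) there are more rules than when running on a conventional grid engine, which feels cumbersome; 2) setting up the computing environment takes considerable effort; 3) the benefit of migrating is hard to perceive; and 4) the part that describes pipeline processing is weak. We will continue to make improvements.

Fig. 1. Descriptions of CAGE-Seq data analysis in FANTOM5 (http://fantom.gsc.riken.jp/5/). (a) In papers, the analysis is described in natural language in Materials and Methods or in the Supplementary. (b) FANTOM5 publishes its protocol online, separately from the paper, which is much appreciated. (c) To re-run the analysis, however, one still ends up writing a batch script. That script is what should be attached to the paper.

Fig. 2. Products related to Infrastructure as Code. Clockwise from the left: Vagrant, which automates building hypervisor-based virtual environments; Chef, Puppet, and Ansible, which automate the setup of newly launched machines; Jenkins (a.k.a. "Jenkins ojisan"), a representative product for Continuous Integration; docker, the standard-bearer of container-based virtualization; and Terraform, which automatically brings up computer systems according to templates. If you have used or are using all of them, I would love to be friends, so please stick a real-life "Like!" sticker here!

Fig. 3. Overview of the system built on the NIG supercomputer. Nodes of the NIG supercomputer are detached from the existing system, Apache Mesos manages their resources, and containers are launched on top. Users register docker containers in the registry in advance, then register a workflow description that combines them via a REST API. If you would like to try it, please leave your contact details on a sticker around here ↓
Diagram labels: user; Dockerfiles, Workflow json, Data; Push Dockerfiles to Container Registry; Push workflow configuration to manager; Transfer Data via internet; workflow manager; Mesos Master; Mesos Slave (on each Node); Storage; Pull container, Run; Mount Data Dir to container; Pass executer.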
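The flow in Fig. 3 (register containers, describe a workflow that combines them, then pull each container and run it with the data directory mounted) can be illustrated with a minimal Python sketch. This is not the poster's scheduler. The workflow schema (`steps`, `name`, `image`, `command`, `depends_on`), the image names under `registry.example`, and the `/data` mount point are all assumptions made for illustration. Only the `docker run --rm -v host:container` invocation is standard Docker CLI usage.

```python
import shlex


def docker_run_command(image, command, data_dir, mount_point="/data"):
    """Build the argv for `docker run`, mounting data_dir into the container."""
    return ["docker", "run", "--rm",
            "-v", f"{data_dir}:{mount_point}",
            image, *shlex.split(command)]


def plan_workflow(workflow, data_dir):
    """Return one docker argv per step, ordered so dependencies run first.

    `workflow` follows a hypothetical schema, not the poster's actual format:
    {"steps": [{"name", "image", "command", "depends_on"(optional)}]}
    """
    steps = {step["name"]: step for step in workflow["steps"]}
    ordered, done = [], set()

    def visit(name, stack=()):
        if name in done:
            return
        if name in stack:
            raise ValueError(f"dependency cycle at step {name!r}")
        if name not in steps:
            raise ValueError(f"unknown step {name!r}")
        for dep in steps[name].get("depends_on", []):
            visit(dep, stack + (name,))
        done.add(name)
        ordered.append(name)

    for name in steps:
        visit(name)
    return [docker_run_command(steps[n]["image"], steps[n]["command"], data_dir)
            for n in ordered]


if __name__ == "__main__":
    # Hypothetical two-step workflow; image names and commands are placeholders.
    example = {"steps": [
        {"name": "align", "image": "registry.example/align:1.0",
         "command": "align /data/reads.fastq", "depends_on": ["qc"]},
        {"name": "qc", "image": "registry.example/qc:1.0",
         "command": "qc /data/reads.fastq"},
    ]}
    for argv in plan_workflow(example, "/home/user/data"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Because the plan is plain data (a list of argument lists), it can be reviewed or stored with the analysis before anything runs, in the spirit of describing everything needed for execution in machine-executable form.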