$30 off During Our Annual Pro Sale. View Details »

Implementation and Application of High-Performa...

Implementation and Application of High-Performance Empirical Dynamic Modeling

学際大規模情報基盤共同利用・共同研究拠点 第15回シンポジウム での発表

Keichi Takahashi

July 06, 2023
Tweet

More Decks by Keichi Takahashi

Other Decks in Science

Transcript

  1. ౦๺େֶαΠόʔαΠΤϯεηϯλʔߴڮܛஐ Gerald M. Pao, Salk Institute for Biological Studies Implementation

    and Application of High- Performance Empirical Dynamic Modeling ֶࡍେن໛৘ใج൫ڞಉར༻ɾڞಉݚڀڌ఺ ୈճγϯϙδ΢Ϝ ೥݄೔ jh220050ࠃࡍڞಉݚڀ՝୊
  2. &NQJSJDBM%ZOBNJD.PEFMJOH &%. w ඇઢܗ࣌ܥྻσʔλͷ෼ੳɾ༧ଌͷͨΊͷ৽͍͠ख๏܈ w ϞσϧϑϦʔͳख๏Ͱ͋Γɼ؍ଌσʔλͷΈʹΑΓσʔλۦಈతʹ෼ੳ w 5BLFOTͷຒΊࠐΈఆཧ< >ʹج͖ͮ஗Ԇ࠲ඪܥΛ༻͍ͯྗֶܥΛ෮ݩ 2

    ݩͷྗֶܥ ࠶ߏ੒͞Εͨྗֶܥ ؍ଌ࣌ܥྻσʔλ ඍ෼ಉ૬ ˺ہॴతͳۙ๣ؔ܎͕ҡ࣋ [1] Floris Takens, “Detecting strange attractors in turbulence,” LectureNotes in Mathematics, vol. 898, 1981, pp. 366-381. 
 [2] Ethan Deyle and George Sugihara, “Generalized theorems for nonlinear state space reconstruction,” PLoS One, vol. 6, no.3, 2011.
  3. &%.ͷద༻ྫҨ఻ࢠൃݱྔͷ࣌ؒมԽ 3 300 250 200 150 100 50 0 180

    160 140 120 100 80 60 90 80 70 60 50 40 30 20 10 0 YHP1 YHP1 CLN3 YOX1 0 10 20 30 40 50 60 70 80 90 :)1 :09 $-/ 
 Ҩ఻ࢠͷൃݱྔ 
 ૬ؔ͸ඇৗʹऑ͍ $-/ͱ:09ͷ 
 ຒΊࠐΈ $-/ͱ:)1ͷ 
 ຒΊࠐΈ :)1 :09 $-/ͷ 
 ຒΊࠐΈ Ҩ఻ࢠൃݱྔͷଞʹ΋ɼੜଶܥͷಈଶɼਆܦ׆ಈɼಓ࿏ަ௨໢ͷަ௨ྔɼͳͲɼ 
 ඇઢܗྗֶܥͱͯ͠ϞσϧԽͰ͖ΔγεςϜͷଟ͘Ͱ༗ޮͰ͋Δ͜ͱ͕ࣔ͞Ε͍ͯΔɽ
  4. 4JNQMFY1SPKFDUJPOʹΑΔ୹ظ༧ଌ 5  ؍ଌ࣌ܥྻͷ࣌ؒ஗ΕΛ ࣍ݩͷঢ়ଶۭؒʹຒΊࠐΈ    ঢ়ଶۭؒʹ͓͚Δ ͷ

    ݸͷۙ๣఺Λ୳ࡧ   ۙ๣఺ͷ εςοϓޙͷࢦ਺ॏΈ෇͖ฏۉΛܭࢉ ͍ۙۙ๣఺͸ॏΈ͕େ͖͍ E X(t) x(t) = (X(t), X(t − τ), …, X(t − (E − 1)τ)) x(t) E + 1 x(t1 ), x(t2 ), …, x(tE+1 ) Tp ̂ x(t + Tp ) = E+1 ∑ i=1 wi ∑E+1 j=1 wj ⋅ x(ti + Tp ) wi = exp { − ∥x(t) − x(ti )∥ min ∥x(t) − x(ti )∥ } εςοϓޙ Tp x(t1 + Tp ) x(t2 ) x(t1 ) x(t2 + Tp ) x(t3 ) x(t3 + Tp )
  5.  ࣌ܥྻ ͱ ΛͦΕͧΕঢ়ଶۭؒʹຒΊࠐΉ  ঢ়ଶۭؒʹ͓͚Δ ͷۙ๣఺Λ୳ࡧ   ͷۙ๣఺ͷ৘ใΛ༻͍ɼ

    ͷະདྷΛ༧ଌ 
 
 
  ͕ Λߴ͍ਫ਼౓Ͱ༧ଌͰ͖ΔͳΒɼ ͸l$$.DBVTFTz X(t) Y(t) x(t) x(t1 ), x(t2 ), …, x(tE+1 ) x(t) y(t) X(t) Y(t) Y(t) X(t) $POWFSHFOU$SPTT.BQQJOH $$. ʹΑΔҼՌ෼ੳ 6 ̂ y(t + Tp ) = E+1 ∑ i=1 wi ∑E+1 i=1 wi ⋅ y(ti + Tp ) wi = exp { − ∥x(t) − x(ti )∥ min ∥x(t) − x(ti )∥ } ˞࣮ࡍ͸࠷దͳຒΊࠐΈ࣍ݩɾ࣌ؒ஗Εͷܾఆɼσʔλ఺ͷ૿ՃʹΑͬͯ 
 ༧ଌਫ਼౓͕޲্͢Δ͔ͷऩଋੑ൑ఆͳͲ͕͞Βʹඞཁ
  6. Dataset # of time series # of time steps cppEDM

    
 512 nodes mpEDM 
 1 node mpEDM 
 512 nodes Fish1_Normo 1,450 53,053 8.5h 1,973s 20s Subject6 3,780 92,538 176h* 13,953s 101s Subject11 8,528 101,729 1,081h* 39,572s 199s ։ൃͨ͠ฒྻ෼ࢄ$$.ͷੑೳ 8 1,530x ߴ଎Խ 7,941x ࢿݯ࡟ݮ 
 (8,704 USD→1.1 USD) 'JTI@/PSNPσʔληοτͷॲཧ࣌ؒΑΓ֎ૠ [1] Wassapon Watanakeesuntorn et al., “Massively Parallel Causal Inference of Whole Brain Dynamics at Single Neuron Resolution,” 
 26th International Conference on Parallel and Distributed Systems (ICPADS 2020), Dec. 2020. ۙ๣୳ࡧΛ(16্ʹΦϑϩʔυ͠ɼෳ਺ϊʔυɾ(16ʹରԠͨ͠$$.࣮૷Λ։ൃ<> 
 ࢈૯ݚ"#$* 7ϊʔυ ͰੑೳධՁ
  7. ۙࣅL//୳ࡧΛద༻ͨ͠ࡍͷ࣮ߦ࣌ؒ 11 1×10-1 1×100 1×101 1×102 1×103 1×104 1×104 1×105

    1×106 Runtime [ms] L CPU Exact GPU Exact CPU IVF GPU IVF CPU HNSW CPU K-D Tree 1×10-1 1×100 1×101 1×102 1×103 1×104 1×104 1×105 1×106 Runtime [ms] L CPU Exact GPU Exact CPU IVF GPU IVF CPU HNSW CPU K-D Tree E = 1 E = 20 $16ຒΊࠐΈ࣍ݩ͕௿࣍ݩͷ৔߹͸LE5SFFɼߴ࣍ݩͷ৔߹͸)/48͕ߴ଎ (16࣌ܥྻ͕୹͍৔߹͸શ୳ࡧɼ௕͍৔߹͸*7'͕ߴ଎ ສεςοϓͰ΋ 
 ඵະຬ Yߴ଎Խ Yߴ଎Խ
  8. ۙࣅL//୳ࡧΛద༻ͨ͠ࡍͷ༧ଌਫ਼౓ 12 0.001 0.01 0.1 1 10 1×104 1×105 1×106

    MAPE L Exact IVF HNSW K-D Tree 0.001 0.01 0.1 1 10 1×104 1×105 1×106 MAPE L Exact IVF HNSW K-D Tree 0 0.2 0.4 0.6 0.8 1 1×104 1×105 1×106 Recall L Exact IVF HNSW K-D Tree 0 0.2 0.4 0.6 0.8 1 1×104 1×105 1×106 Recall L Exact IVF HNSW K-D Tree Lۙ๣୳ࡧͷਫ਼౓ ۙ๣఺ͷ3FDBMMͰධՁ 4JNQMFY1SPKFDUJPOͷਫ਼౓ ."1& .FBO"CTPMVUF1FSDFOUBHF&SSPS ͰධՁ E = 1 E = 20 E = 1 E = 20 L//ͷ3FDBMM͕௿Լͯ͠΋ɼ4JNQMFYͷ."1&͸ఔ౓ͷ૿ՃˠۙࣅL//͕ޮՌత ۙࣅL//͸ݫີͳUPQLͰ͸ͳͯ͘΋ɼۙ๣ͷ఺Λฦ͢Մೳੑ͕ߴ͍͔Β͔ 
 σʔλ͕ີʹ෼෍͍ͯ͠Δ͜ͱ͕લఏͳͷͰɼ࣮σʔλͰͷධՁ͕ඞཁ શ୳ࡧͱͷࠩ͸ 
  3FDBMM͸௿Լ
  9. ج਺ιʔτ 14 330 100 010 212 030 323 021 102

    002 121 330 100 010 030 212 102 002 323 021 121 -4%ج਺ιʔτ O E 100 102 002 010 212 021 121 323 330 030 002 010 021 030 323 330 100 102 121 212 330 100 010 212 030 323 021 102 002 121 010 030 021 002 100 102 121 212 330 323 002 010 021 030 100 102 121 100 102 .4%ج਺෦෼ιʔτ O E L 7&ɼ(16Ͱ͸ฒྻԽɾϕΫτϧԽ͕༰қͳج਺ιʔτͷ࢖༻͕Ұൠత 
 (16Ͱ͸ɼL͕খ͍͞৔߹ʹ͸ώʔϓιʔτΛ࢖༻
  10. 5PQL୳ࡧͷ࣮ߦ࣌ؒ 15 0.01 0.1 1 10 100 1000 1000 10000

    100000 1x106 Runtime [ms] Length of vector STL full sort STL partial sort ASL full sort LSD radix full sort MSD radix partial sort L 0.001 0.01 0.1 1 10 100 1000 1×103 1×104 1×105 1×106 Runtime [ms] Array length (N) STL partial sort (VE) MSD radix partial sort (VE) STL full sort (Xeon) STL partial sort (Xeon) LSD radix full sort (Xeon) MSD radix partial sort (Xeon) w $ 45-ιʔτ TUETPSU ɼ$ 45-෦෼ιʔτ TUEQBSUJBM@TPSU ɼ/&$"4-ιʔτ BTM@TPSU@ ɼ-4%ج਺ιʔτ /&$ͷ044 ɼ.4%ج਺෦෼ιʔτ ಠ࣮ࣗ૷ Λൺֱ w 7FDUPS&OHJOF5ZQF#ͱ*OUFM9FPO4JMWFSͷίΞಉ࢜Λൺֱ 9FPOͰͷ 
 TUEQBSUJBM@TPSU͕ 
 ৗʹ࠷଎
  11. ݚڀۀ੷Ұཡ ֶज़࿦จ ࠪಡ͋Γ  • Keichi Takahashi, Kohei Ichikawa, Joseph

    Park, Gerald M. Pao, “Scalable Empirical Dynamic Modeling with Parallel Computing and Approximate k-NN Search,” IEEE Access. (in print) ࠃࡍձٞൃද ࠪಡͳ͠  • Keichi Takahashi and Gerald M. Pao, “Challenges in Scaling Empirical Dynamic Modeling,” 34th Workshop on Sustained Simulation Performance, October 24-25, 2022. ެ։ιϑτ΢ΣΞ • kEDM: https://github.com/keichi/kEDM ͦͷଞ • Keichi Takahashi, Kohei Ichikawa and Gerald M. Pao, “Toward Scalable Empirical Dynamic Modeling,” Sustained Simulation Performance 2022, Springer Cham, 2023. (in print) 17
  12. ·ͱΊͱࠓޙͷ՝୊ w ۙࣅLۙ๣୳ࡧ͸&%.ʹ༗ޮͰ͋Γɼ3FDBMM͕௿ͯ͘΋&%.ͷ݁Ռ΁ͷӨڹগ w σʔλͷಛੑʹґଘ͢ΔՄೳੑ͕͋ΔͷͰɼ࣮σʔλΛ༻͍ͯධՁ w 7FDUPS&OHJOF্Ͱͷ5PQL୳ࡧ͸9FPOΑΓੑೳ͕௿͍ w ڑ཭ܭࢉͱ5PQL୳ࡧΛͦΕͧΕ7&ͱ$16Ͱ෼୲͢Δ w

    ۭؒॆరۂઢΛ༻͍ͨ7FDUPS&OHJOF্ͰͷۙࣅL//୳ࡧख๏<>ΛධՁ w ଟม਺ͷ4JNQMFY1SPKFDUJPOΛωοτϫʔΫ্ʹ૊Έ߹Θͤͨɼੜ੒తϞσϧ ͷߏஙͱݕূ<> 18 [1] খࣉխ࢘Β, “ۭؒॆరۂઢΛ༻͍ͨϕΫτϧϓϩηοαʹ͓͚Δkۙ๣๏ͷߴ଎Խ,” ୈ185ճHPCݚڀձ, 2022. 
 [2] Gerald M. Pao et al., “Experimentally testable whole brain manifolds that recapitulate behavior,” arXiv:2106.10627, 2021.