Programmable Interconnect Control Adaptive to Communication Pattern of Applications

Keichi Takahashi
February 04, 2019

Transcript

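  1. Importance of interconnect communication performance (Introduction)
    Most recent high-performance computing systems are based on the cluster architecture.
    ‣ A cluster consists of many computers connected by an interconnect.
    ‣ Each computer computes in parallel while exchanging data with the others over the interconnect.
    The communication performance of the interconnect strongly affects the compute performance of the whole cluster.
    [Figure: processes 0 to 5 on computers communicating over the interconnect (a high-performance network); architecture breakdown of the Top500 (June …): Cluster, MPP]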
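  2. Message Passing Interface (MPI) (Introduction)
    An inter-process communication library for HPC.
    ‣ Processes are identified by rank (process ID) instead of network address.
    ‣ Provides point-to-point and collective communication, both useful for parallel programming.
    [Figure: point-to-point communication (Send/Isend, Recv/Irecv) and collective communication (Bcast, Reduce, Alltoall, …) among processes 0 to 3; software stack labels: application, MPI library, Socket, Verbs, PSM, TCP/IP, InfiniBand, OmniPath]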
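  3. Related work (Introduction)
    ‣ Process placement that is aware of the interconnect and the communication pattern.
    ‣ Collective communication algorithms that are aware of the interconnect topology.
    ‣ Research that exploits the programmability of the interconnect is still insufficient.
    [Figure: three layers (process placement, inter-process communication pattern among processes 0 to 11, and interconnect), each labeled "dynamic change"]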
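  4. Goal: realizing a programmable interconnect control technique (Introduction)
    Goal: realize a programmable interconnect control technique that controls packet flows by taking into account the inter-process communication patterns of applications.
    [Figure: MPI applications (App A, App B, App C), each with processes 0 to 3, running on computers whose interconnect is controlled dynamically]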
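  5. Challenges (Introduction)
    Goal: realize a programmable interconnect control technique that controls packet flows by taking into account the inter-process communication patterns of applications.
    ‣ Challenge: analysis of packet flows in the interconnect.
    ‣ Challenge: accelerating communication through dynamic control of the interconnect that is aware of the communication pattern.
    ‣ Challenge: coordinating application execution with interconnect control.
    [Figure: an MPI application (MPI_Bcast(…); for (i = 0; i < N; i++) { … } … MPI_Allreduce(…);), its inter-process communication pattern among processes 0 to 5, and interconnect control instructions given as a table with columns Src, Dst, Instructions: (A, B, Output to X), (A, C, Output to Y), (B, D, Output to Z)]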
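  6. Overall structure (Introduction)
    Goal: realize a programmable interconnect control technique that controls packet flows by taking into account the inter-process communication patterns of applications.
    ‣ Chapter on a coordination mechanism for MPI communication and computation: establishes a new cluster architecture in which communication and computation operate in coordination.
    ‣ Chapter on accelerating MPI collective communication with SDN: builds a framework on which packet flow control algorithms that accelerate MPI collective communication can be deployed.
    ‣ Chapter on an analysis tool for packet flows in the interconnect: realizes a means of analyzing the packet flows that applications generate in the interconnect.
    Challenges addressed: analysis of packet flows in the interconnect; accelerating communication through dynamic control of the interconnect; coordinating application execution with interconnect control.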
  7. Overall structure (outline slide repeated)
    Chapters: coordination mechanism for MPI communication and computation; accelerating MPI collective communication with SDN; analysis tool for packet flows in the interconnect.
    Current section: Analysis tool for packet flows in the interconnect.
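  8. Research and development of programmable interconnect control techniques (Analysis tool for packet flows in the interconnect)
    The packet flows in the interconnect result from the combined effect of the cluster configuration, the communication patterns, and the job set.
    An environment for evaluating the packet flows that applications generate in the interconnect is indispensable.
    [Figure: cluster configuration (computers and interconnect, a high-performance network), communication pattern among processes 0 to 3, and job set (jobs j…)]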
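  9. Goal of this chapter (Analysis tool for packet flows in the interconnect)
    Goal: build an evaluation environment for analyzing the packet flows that arise in the interconnect when applications are run on a cluster.
    ‣ Requirement: extract the communication patterns of real applications.
    ‣ Requirement: estimate the packet flows in the interconnect quickly.
    ‣ Requirement: reproduce the process placement characteristics of the cluster.
    [Figure: communication pattern, cluster configuration (computers and interconnect), and job set (jobs j…), and their combined effect]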
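  10. PFSim: a simulator of packet flows in the interconnect (Analysis tool for packet flows in the interconnect)
    [Figure: the inputs to PFSim are the job set (jobs j…), the communication patterns, and the cluster configuration; PFSim reproduces scheduling, node allocation, process placement, and routing; the output is the volume of packet flows in the interconnect. Other labels: replay, mapping]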
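  11. Evaluation experiment (Analysis tool for packet flows in the interconnect)
    Goal: evaluate whether the proposed tool accurately estimates the packet flows in the interconnect when an application is run on a cluster.
    Simulation conditions:
    ‣ Compute nodes: … nodes (… cores in total)
    ‣ Topology: …-level fat-tree (oversubscription ratio …)
    ‣ Communication pattern: NAS CG (a benchmark with a large communication volume)
    ‣ Routing: D-mod-K
    [Figure: interconnect with switches core1, core2, edge1, edge2, edge3, edge4 and the computers attached below them]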
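
The following is a minimal sketch, not PFSim's actual code, of how a traffic matrix could be turned into per-link packet flow volumes under D-mod-K-style routing. It assumes a two-level fat-tree with 2 core and 4 edge switches (matching the switch labels in the figure), a simplified rule that picks the core switch as the destination node id modulo the number of core switches, and hypothetical names (`accumulate_link_load`, `nodes_per_edge`).

```c
#include <assert.h>
#include <stddef.h>

#define N_CORE 2 /* core1, core2 (from the figure labels) */
#define N_EDGE 4 /* edge1 .. edge4 (from the figure labels) */

/* Simplified D-mod-K rule (an assumption for this sketch):
 * the upward path is selected by the destination node id. */
static int dmodk_core(int dst_node)
{
    return dst_node % N_CORE;
}

/* traffic is an n_nodes x n_nodes row-major matrix of bytes sent from
 * node src to node dst. Adds the bytes carried by every edge->core
 * uplink (up) and core->edge downlink (down). */
void accumulate_link_load(const double *traffic, int n_nodes,
                          int nodes_per_edge,
                          double up[N_EDGE][N_CORE],
                          double down[N_CORE][N_EDGE])
{
    assert(nodes_per_edge > 0);
    assert(n_nodes <= N_EDGE * nodes_per_edge);
    for (int src = 0; src < n_nodes; src++) {
        for (int dst = 0; dst < n_nodes; dst++) {
            int src_edge = src / nodes_per_edge;
            int dst_edge = dst / nodes_per_edge;
            if (src_edge == dst_edge)
                continue; /* traffic never leaves the edge switch */
            int core = dmodk_core(dst);
            double bytes = traffic[(size_t)src * n_nodes + dst];
            up[src_edge][core] += bytes;
            down[core][dst_edge] += bytes;
        }
    }
}
```

Summing these per-link volumes over a run gives the kind of cumulative core/edge traffic that a simulator can compare against switch counters.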
  12. Evaluation results (Analysis tool for packet flows in the interconnect)
    The cumulative volume of packet flows between the core switches and the edge switches is compared between simulation and measurement.
    ‣ The error is at most …, so the volume of packet flows is estimated accurately.
    [Figure: measured versus simulated values for each core→edge and edge→core link, with the maximum error annotated; inset of the interconnect topology (core and edge switches) and the computers]
  13. Overall structure (outline slide repeated)
    Chapters: coordination mechanism for MPI communication and computation; accelerating MPI collective communication with SDN; analysis tool for packet flows in the interconnect.
    Current section: Accelerating MPI collective communication with SDN.
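  14. Basic approach of the proposal (Accelerating MPI collective communication with SDN)
    By taking into account the communication pattern of MPI collective communication functions and load-balancing the packet flows in the interconnect, contention on communication paths is reduced and communication is accelerated.
    [Figure: the communication pattern of an MPI collective function among processes 0 to 7 is the input to the interconnect controller, which dynamically controls the interconnect to balance the load across links]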
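  15. MPI collective communication framework using SDN (Accelerating MPI collective communication with SDN)
    [Figure: on the compute nodes, the MPI application runs on an MPI library extended with an SDN-MPI library, which notifies the SDN controller of the start of a collective communication and of the process placement; the SDN controller comprises an external I/F, topology discovery, switch control, and per-collective control algorithms (Allreduce, Bcast, …) that reduce contention, and it controls the interconnect]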
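  16. Packet flow control algorithm for Allreduce (Accelerating MPI collective communication with SDN)
    Goal: distribute the packet flows between computers that are generated by the inter-process communication of MPI_Allreduce across redundant paths using a greedy method.
    [Figure: the inter-process communication pattern of MPI_Allreduce among processes 0 to 3, and the topology of the interconnect]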
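
As an illustration of spreading flows over redundant paths greedily, here is a minimal sketch. It is not the algorithm from the deck: it assumes every flow may use any of `n_paths` equivalent paths, that each flow is described only by its size in bytes, and that flows are assigned one by one to whichever path currently carries the least load. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Greedy assignment: each flow, in the order given, goes to the path
 * that currently carries the least load. assignment[f] receives the
 * chosen path index; path_load[p] the resulting bytes on path p.
 * Returns the load of the most heavily used path. */
double greedy_assign(const double *flow_bytes, size_t n_flows,
                     double *path_load, size_t n_paths,
                     size_t *assignment)
{
    assert(n_paths > 0);
    for (size_t p = 0; p < n_paths; p++)
        path_load[p] = 0.0;

    for (size_t f = 0; f < n_flows; f++) {
        size_t best = 0;
        for (size_t p = 1; p < n_paths; p++) {
            if (path_load[p] < path_load[best])
                best = p; /* ties keep the lower-numbered path */
        }
        assignment[f] = best;
        path_load[best] += flow_bytes[f];
    }

    double max_load = 0.0;
    for (size_t p = 0; p < n_paths; p++) {
        if (path_load[p] > max_load)
            max_load = path_load[p];
    }
    return max_load;
}
```

With flows of 4, 3, 2, and 1 units over two paths, this sketch places them on paths 0, 1, 1, 0, so both paths end up carrying 5 units.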
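  17. Evaluation experiment (Accelerating MPI collective communication with SDN)
    Goal: evaluate whether packet flow control that takes the communication pattern into account reduces contention on communication paths and accelerates the MPI_Allreduce function.
    [Figure: execution time [s] of MPI_Allreduce versus message size [MB] for Open MPI (a representative MPI implementation) and the proposed method, on a …-level fat-tree topology with … nodes; speedup of the proposed MPI_Allreduce over Open MPI versus message size [MB], annotated with a speedup of …]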
  18. Chapter: Coordination mechanism for MPI communication and computation. Related publications
    ‣ Keichi Takahashi et al., "UnisonFlow: A Software-Defined Coordination Mechanism for Message-Passing Communication and Computation," IEEE Access, vol. …, no. …, pp. …
    ‣ Keichi Takahashi et al., "Concept and Design of SDN-enhanced MPI Framework," The fourth edition of the European Workshop on Software Defined Networks (EWSDN), pp. …, Sep. …
  19. Overall structure (outline slide repeated)
    Chapters: coordination mechanism for MPI communication and computation; accelerating MPI collective communication with SDN; analysis tool for packet flows in the interconnect.
    Current section: Coordination mechanism for MPI communication and computation.
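  20. How a real HPC application executes (Coordination mechanism for MPI communication and computation)
    Processes 0 to 3 alternate between computation and communication phases (for example MPI_Allreduce, then MPI_Alltoall, …), so the communication pattern changes on the order of milliseconds.
    [Figure: timeline of computation and communication for processes 0 to 3; the communication pattern of MPI_Allreduce and the communication pattern of MPI_Alltoall]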
  21. Goal of this chapter (Coordination mechanism for MPI communication and computation)
    Goal: realize a mechanism that coordinates application execution with packet flow control in the interconnect at low overhead.
    [Figure: an MPI program (#include <mpi.h> int main(int argc, char **argv) { MPI_Init(&argc, &argv); MPI_Bcast(buf, count, …); /* … */ MPI_Allreduce(buf, count, …); MPI_Finalize(); }) whose execution produces a time-varying communication pattern, coordinated with the control of packet flows in the interconnect; the switches hold flow tables with columns Src MAC, Dst MAC, …, Instructions: (aa:aa:aa:…, ff:ff:ff:…, Flood), (bb:bb:bb:…, aa:aa:aa:…, Output to Port X), (aa:aa:aa:…, bb:bb:bb:…, Set Dst IP to Y,…)]
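  22. Tag contents and where to embed the tag (Coordination mechanism for MPI communication and computation)
    The MPI envelope is encoded and embedded in each packet as a tag.
    ‣ Switches control packets based on the value of the tag.
    ‣ Low overhead is achieved by using the Ethernet header, which the packet processing pipeline of existing SDN switches can read.
    ‣ The MPI envelope contains the destination rank, the communicator, the type of collective communication, and so on.
    [Figure: packet layout (Ethernet header, IP header, TCP header, MPI envelope, MPI message) with layer labels L…; the headers can be processed in switch hardware (latency … μs), whereas reading the MPI envelope in the payload requires forwarding to the SDN controller (latency … ms)]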
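
To make the idea of encoding an MPI envelope as a tag concrete, here is a minimal sketch that packs three envelope fields into a 6-byte value the size of an Ethernet address field. The field choice and widths (8-bit collective type, 16-bit communicator id, 24-bit destination rank), the byte order, and every name below are assumptions for illustration; the deck does not give the actual tag layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of an MPI envelope. */
typedef struct {
    uint32_t dst_rank;  /* must fit in 24 bits in this sketch */
    uint16_t comm_id;   /* communicator identifier */
    uint8_t  coll_type; /* e.g. 0 = point-to-point, 1 = Bcast, ... */
} mpi_envelope;

/* Pack the envelope into 6 bytes, most significant field first. */
void envelope_to_mac(mpi_envelope e, uint8_t mac[6])
{
    assert(e.dst_rank < (1u << 24));
    mac[0] = e.coll_type;
    mac[1] = (uint8_t)(e.comm_id >> 8);
    mac[2] = (uint8_t)(e.comm_id & 0xff);
    mac[3] = (uint8_t)(e.dst_rank >> 16);
    mac[4] = (uint8_t)((e.dst_rank >> 8) & 0xff);
    mac[5] = (uint8_t)(e.dst_rank & 0xff);
}

/* Inverse of envelope_to_mac. */
mpi_envelope mac_to_envelope(const uint8_t mac[6])
{
    mpi_envelope e;
    e.coll_type = mac[0];
    e.comm_id = (uint16_t)((mac[1] << 8) | mac[2]);
    e.dst_rank = ((uint32_t)mac[3] << 16) |
                 ((uint32_t)mac[4] << 8) |
                 (uint32_t)mac[5];
    return e;
}
```

A switch could then match on such a field with an ordinary exact-match or masked rule. A real design would also have to respect the meaning of the low-order bits of the first octet of an Ethernet address (group and locally-administered bits), which this sketch ignores.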
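  23. UnisonFlow: a coordination mechanism for computation and communication (Coordination mechanism for MPI communication and computation)
    [Figure: in user space, the MPI application runs on a custom MPI library; in kernel space, packets pass through Socket, TCP layer, IP layer, Ethernet layer, NIC driver, and NIC, and a tagging LKM, driven from the library via ioctl, attaches the tag to MPI packets; the SDN controller controls the switches with a flow table of MPI Tag and Instructions: (A, Output to port X), (B, Output to port Y), …. The custom MPI library, the tagging LKM, and the SDN controller are the software developed in this work]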
  24. Verification of interconnect control coordinated with MPI (Coordination mechanism for MPI communication and computation)
    Test program: int main() { … MPI_Allreduce(…); … MPI_Alltoall(…); … }
    MPI_Alltoall and MPI_Allreduce are used frequently and put a high load on the interconnect.
    [Figure: bandwidth (bps) versus elapsed time (s) on core1 (port1) and core2 (port41), without and with the proposed method, with the MPI_Allreduce and MPI_Alltoall phases marked. Annotations: both core switches in use; only core… in use while Allreduce runs; only core… in use while Alltoall runs. Inset: topology with core and edge switches]
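  25. Measuring the overhead on point-to-point communication (Coordination mechanism for MPI communication and computation)
    The overhead that the proposed method adds to point-to-point communication is evaluated.
    ‣ Using the OSU Micro-Benchmark, which measures the performance of individual MPI functions, the throughput and latency of MPI_Send/Recv between … nodes are measured.
    [Figure: latency comparison, latency [μs] versus message size [B], with and without UnisonFlow; throughput comparison, throughput [MB/s] versus message size [B], with and without UnisonFlow]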
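  26. Summary of the chapters (Conclusion)
    Goal: realize a programmable interconnect control technique that controls packet flows by taking into account the inter-process communication patterns of applications.
    ‣ Coordination mechanism for MPI communication and computation: established a new cluster architecture in which communication and computation operate in coordination.
    ‣ Accelerating MPI collective communication with SDN: built a framework on which packet flow control algorithms that accelerate MPI collective communication can be deployed.
    ‣ Analysis tool for packet flows in the interconnect: realized a means of analyzing the packet flows that applications generate in the interconnect.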
  27. Overall structure (outline slide repeated)
    Chapters: coordination mechanism for MPI communication and computation; accelerating MPI collective communication with SDN; analysis tool for packet flows in the interconnect.
    Current section: Conclusion.
  28. Representation of Communication Pattern
    The communication pattern of an application is represented using its traffic matrix.
    ‣ Defined as a matrix T whose element T_ij is equal to the volume of traffic sent from rank i to rank j
    ‣ Implies that the volume of traffic between processes is treated as constant during the execution of a job
    [Figure: heat map of sent bytes (×10^8) by sender rank and receiver rank; an example obtained from running the NERSC MILC benchmark with 128 processes]
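
A traffic matrix of this kind can be accumulated from a list of send records. The sketch below is illustrative only: the `send_event` record and the function name are hypothetical, and the deck does not say how the traces are actually collected. The matrix is stored row-major, so element T_ij lives at `T[i * n_ranks + j]`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace record: one message from rank src to rank dst. */
typedef struct {
    int src;
    int dst;
    double bytes;
} send_event;

/* Fill the n_ranks x n_ranks row-major matrix T so that
 * T[i * n_ranks + j] is the total bytes sent from rank i to rank j. */
void build_traffic_matrix(const send_event *events, size_t n_events,
                          int n_ranks, double *T)
{
    assert(n_ranks > 0);
    size_t n = (size_t)n_ranks;
    for (size_t k = 0; k < n * n; k++)
        T[k] = 0.0;

    for (size_t k = 0; k < n_events; k++) {
        assert(events[k].src >= 0 && events[k].src < n_ranks);
        assert(events[k].dst >= 0 && events[k].dst < n_ranks);
        T[(size_t)events[k].src * n + (size_t)events[k].dst] +=
            events[k].bytes;
    }
}
```

Because the matrix only keeps totals, the timing of individual messages is discarded, which matches the constant-traffic reading stated on the slide.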
  29. PFSim Architecture
    [Figure: event queue holding events; event dispatch; event handlers (Job Submitted, Job Started, Job Finished, …), customized via plugins; state update; simulator state consisting of the job queue (j1, j2, j3, j4), the interconnect, and the computing nodes]
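
The event-queue architecture named on this slide can be sketched as a small discrete-event loop. This is a toy under stated assumptions, not PFSim's implementation: the queue is a fixed-size array kept sorted by time, a submitted job starts immediately, and each event carries its own `run_time`. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { JOB_SUBMITTED, JOB_STARTED, JOB_FINISHED } event_type;

typedef struct {
    double time;
    event_type type;
    int job_id;
    int n_nodes;     /* computing nodes the job occupies */
    double run_time; /* assumed to be known at submission */
} event;

#define MAX_EVENTS 64

typedef struct {
    event queue[MAX_EVENTS]; /* sorted by time, earliest first */
    size_t n_events;
    int free_nodes;          /* simulator state: idle computing nodes */
    int finished_jobs;
    double now;
} simulator;

/* Insert an event, keeping the queue ordered by time (FIFO on ties). */
void sim_schedule(simulator *s, event e)
{
    assert(s->n_events < MAX_EVENTS);
    size_t i = s->n_events++;
    while (i > 0 && s->queue[i - 1].time > e.time) {
        s->queue[i] = s->queue[i - 1];
        i--;
    }
    s->queue[i] = e;
}

/* Event handler: updates the state and may schedule follow-up events. */
static void sim_handle(simulator *s, event e)
{
    switch (e.type) {
    case JOB_SUBMITTED: /* toy policy: start right away */
        e.type = JOB_STARTED;
        sim_schedule(s, e);
        break;
    case JOB_STARTED:
        assert(s->free_nodes >= e.n_nodes);
        s->free_nodes -= e.n_nodes;
        e.type = JOB_FINISHED;
        e.time += e.run_time;
        sim_schedule(s, e);
        break;
    case JOB_FINISHED:
        s->free_nodes += e.n_nodes;
        s->finished_jobs++;
        break;
    }
}

/* Dispatch loop: pop the earliest event and hand it to the handler. */
void sim_run(simulator *s)
{
    while (s->n_events > 0) {
        event e = s->queue[0];
        for (size_t i = 1; i < s->n_events; i++)
            s->queue[i - 1] = s->queue[i];
        s->n_events--;
        s->now = e.time;
        sim_handle(s, e);
    }
}
```

In a plugin-based design, the body of each `case` would be replaced by user-supplied scheduling, placement, or routing logic.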