General-purpose hybrid storage system

Slides presented at the 4th Web System Architecture Study Group (WSA研).
https://websystemarchitecture.hatenablog.jp/entry/2019/02/26/100725

Narimichi Takamura

April 13, 2019

Transcript

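  1. Self-introduction
     • Takamura Narimichi / 高村 成道
     • @nari_ex
     • HEARTBEATS Corporation (株式会社ハートビーツ), Director and VPoE
     • The University of Electro-Communications
       • B.Eng., Department of Communication Engineering and Informatics, Faculty of Informatics and Engineering
     • Graduate School of Management, GLOBIS University
       • MBA, Management program
     2019/04/13 4th Web System Architecture Study Group (WSA研) | @nari_ex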
  2. Contents
     • Background and problems
     • Proposal
     • Implementation
     • Usage patterns of this mechanism
     • Summary
  3. Existing approach 3: extending an existing filesystem
     • Btrfs extension [1]
       • Research that builds on Btrfs's support for multiple devices
       • Data is moved at the generic block layer
     • Problems
       • Moving data puts load on shared resources (memory, CPU)
       • Filesystems other than Btrfs cannot be used
         • The widely used ext4 and xfs are not available
     [1] Hot Cold Data Tracking and Migration in btrfs.
  4. Summary of problems
     • Archiving
       • Not suited to content data
     • Enterprise products
       • Expensive
       • Vendor lock-in
       • Not suited to the cloud
     • Extending an existing filesystem
       • The filesystem cannot be chosen
       • Load during data movement affects the main workload
  5. Goals
     • High generality
       • Usable in cloud environments as well as on-premises
       • Built from OSS and runs on Linux
     • No dependency on a particular filesystem at deployment time
       • A filesystem can be chosen separately for hot storage and cold storage
     • Sufficiently low load when moving cold data
  6. Implementation policy
     1. Access routing
        • Implemented as a userspace filesystem (FUSE)
        • Easy to deploy
        • Any filesystem can be used underneath
     2. Data relocation
        • Implemented as a daemon process
  7. Multi-Temperature FileSystem (MTFS)
     • A userspace filesystem
     • Userspace daemon process: mtfsd
       • Started with a partition for hot data and a partition for cold data each specified
       • Controls the file operations requested by applications
  8. Overview of mtfsd processing
     1. The application requests file access
     2. mtfsd receives the system call through the FUSE library
     3. mtfsd queries hot storage
        • If the file does not exist there, it queries cold storage
     4. The retrieved data is returned to the application
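The lookup order in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the mtfsd implementation: it assumes both tiers are plain directories holding the same relative paths, and the names `resolve`, `read_file`, `hot_root`, and `cold_root` are hypothetical. The FUSE layer that would call such a function is omitted.

```python
import os


def resolve(relative_path, hot_root, cold_root):
    """Return the backing path for relative_path, trying the hot tier first.

    hot_root and cold_root are hypothetical mount points of the two tiers.
    """
    hot_path = os.path.join(hot_root, relative_path)
    if os.path.exists(hot_path):
        return hot_path
    # Fall back to cold storage only when the hot tier does not have the file.
    cold_path = os.path.join(cold_root, relative_path)
    if os.path.exists(cold_path):
        return cold_path
    raise FileNotFoundError(relative_path)


def read_file(relative_path, hot_root, cold_root):
    """Read a file through the two-tier lookup and return its bytes."""
    with open(resolve(relative_path, hot_root, cold_root), "rb") as f:
        return f.read()
```

Checking the hot tier first keeps the common case to a single existence check; only a miss pays for the second lookup.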
  9. Implementation problems and countermeasures
     • Controlling I/O bandwidth and throughput
       • => Control block I/O with cgroup2
         • IOPS control: limit with riops and wiops
         • Throughput control: limit with rbps and wbps
       • => Control the I/O scheduler layer with ioprio_set()
         • Specify CLASS_IDLE
     • During relocation, cold data ends up consuming the disk cache
       • => Clear the cache of the cold data being moved with posix_fadvise(POSIX_FADV_DONTNEED)
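Two of these countermeasures can be illustrated in a short sketch, assuming Linux and Python 3. The function names `io_max_line` and `copy_without_caching` are hypothetical and are not taken from the deck. The first formats one line in the syntax accepted by a cgroup v2 `io.max` file (device `MAJ:MIN` followed by `rbps`, `wbps`, `riops`, `wiops`); the second copies a file and then calls `os.posix_fadvise` with `POSIX_FADV_DONTNEED` so the copied data does not stay in the page cache. `ioprio_set()` is not shown because the Python standard library does not wrap it.

```python
import os


def io_max_line(major, minor, rbps=None, wbps=None, riops=None, wiops=None):
    """Format one line for a cgroup v2 io.max file; None means "max" (no limit)."""
    limits = {"rbps": rbps, "wbps": wbps, "riops": riops, "wiops": wiops}
    fields = " ".join(
        f"{key}={'max' if value is None else value}" for key, value in limits.items()
    )
    return f"{major}:{minor} {fields}"


def copy_without_caching(src, dst, chunk_size=1 << 20):
    """Copy src to dst, then ask the kernel to drop both files' cached pages."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
        fout.flush()
        # Dirty pages cannot be dropped, so write them out first.
        os.fsync(fout.fileno())
        for f in (fin, fout):
            # offset=0 and length=0 cover the whole file.
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
```

Writing the formatted line into the `io.max` file of the cgroup that contains the relocation process would apply the limit; that step needs root privileges and a mounted cgroup v2 hierarchy, so it is left out here.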
  10. mt-relocatord settings
      • Scheduling of the relocation process
        • Start time
        • Execution interval (in days)
      • Relocation thresholds
        • Number of accesses and number of updates per unit time
        • Time elapsed from the last access and the last update to the current time
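One way the threshold settings above could combine into a relocation decision is sketched below. The deck does not say how mt-relocatord combines them, so the rule here is an assumption: a file is treated as cold only when both rates are at or below their limits and it has been idle long enough. The function `is_cold` and all parameter names are hypothetical.

```python
def is_cold(accesses_per_day, updates_per_day, last_access, last_update, now,
            max_accesses_per_day, max_updates_per_day, min_idle_seconds):
    """Decide whether a file qualifies for relocation to cold storage.

    Timestamps are seconds since the epoch. The combination rule (all three
    conditions must hold) is an assumption made for illustration.
    """
    # Idle time is measured from the most recent of the last access and update.
    idle_seconds = now - max(last_access, last_update)
    return (
        accesses_per_day <= max_accesses_per_day
        and updates_per_day <= max_updates_per_day
        and idle_seconds >= min_idle_seconds
    )
```

In practice `last_access` and `last_update` could come from `st_atime` and `st_mtime` of `os.stat`, while the per-day rates would need a counter maintained by the filesystem layer.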
  11. Usage example
      // Create partitions
      # mkfs.xfs /dev/vda1
      # mkfs.btrfs /dev/vdb1

      // Create MTFS managed Volumes
      # mtfsctl hot-volume create hv0 /export/sda1/www/
      # mtfsctl cold-volume create cv0 /export/sdb1/www/

      // Create mtfsd
      # mtfsctl volume start hv0 cv0
      # systemctl start mtfsd

      // Start mt-relocatord
      # systemctl start mt-relocatord
  12. Usage patterns of this mechanism
      Environment                   | Hot Data Storage              | Cold Data Storage
      On-premises (Storage Device)  | SSD                           | HDD
      AWS (Block Storage)           | EBS Provisioned IOPS SSD      | Cold HDD
      AWS (Shared Storage)          | EFS (Provisioned Throughput)  | EFS (Infrequent Access Storage Class)
      Note: any filesystem can be used except in the Shared Storage case.
  13. Open issues
      • Safety and performance of cold data movement
      • Runtime overhead of FUSE
      • Processing load on mt-relocatord as the number of files grows
      • Whether mt-relocatord needs its own scheduling function