User Fairness in Virtualized Environments (仮想化環境での利用者公平性)

Takuya ASADA
November 20, 2012

Transcript

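  1. Performance measurement environment
    • Two Linux machines, each with a 10G NIC
    • 1 to 64 KVM guests, 128 netperf processes
        1 VM → 128 flows per VM
        2 VMs → 64 flows per VM
        4 VMs → 32 flows per VM
        …
        64 VMs → 2 flows per VM
    • TCP Request/Response mode: a 1-byte packet is ping-ponged
    [Diagram labels: VM host, guest, guest, test machine, 10G NIC, 10G NIC, netperf, netserver, netperf]
    Tuesday, November 20, 2012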
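The flow split on this slide is plain division of the 128 netperf processes by the guest count. The sketch below reproduces that arithmetic and prints (without running) a netperf command line for a 1-byte TCP request/response test. The function names and the host name `guest0` are illustrative assumptions, not taken from the deck, and the exact netperf options the author used are not shown in the slides.

```shell
# Total number of netperf processes in the benchmark (from the slide).
TOTAL_FLOWS=128

# Print how many flows each guest receives for a given guest count.
flows_per_vm() {
    echo $(( TOTAL_FLOWS / $1 ))
}

# Print (do not run) a netperf command for one 1-byte TCP_RR flow.
# $1 is the target host; any name passed here is a placeholder.
netperf_cmd() {
    echo "netperf -H $1 -t TCP_RR -- -r 1,1"
}

for vms in 1 2 4 8 16 32 64; do
    echo "$vms VM(s): $(flows_per_vm "$vms") flows per VM"
done
netperf_cmd guest0
```

`-t TCP_RR` selects the request/response test and `-r 1,1` sets both request and response size to 1 byte.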
  2. Hardware/software specifications

    Distribution            Ubuntu Server 12.10
    Linux Kernel            3.5.0-18-generic
    QEMU-KVM                1.2.0
    Netperf                 2.5.0
    CPU (VM host)           Intel Core i7 980 (3.33GHz)
    Memory (VM host)        24GB
    CPU (test machine)      Intel Core i7 860 (2.8GHz)
    Memory (test machine)   8GB
    NIC                     Intel 82599 (ixgbe)

    Annotation: 6 physical cores, 12 logical cores
  3. macvtap

    <interface type='direct'>
      <mac address='52:54:00:6b:28:01'/>
      <source dev='eth1' mode='vepa'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </interface>
  4. SR-IOV

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x10' function='0x2'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </hostdev>
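The `<source>` address in the hostdev element identifies the virtual function on the host. Tools such as lspci and the entries under /sys/bus/pci/devices use the compact `domain:bus:slot.function` notation instead. A small sketch of that conversion follows; the helper name `pci_bdf` is an assumption made for illustration.

```shell
# Convert libvirt-style hex PCI address fields into domain:bus:slot.function.
# Arguments: domain bus slot function (each with a 0x prefix, as in the XML).
pci_bdf() {
    printf '%04x:%02x:%02x.%x\n' "$(( $1 ))" "$(( $2 ))" "$(( $3 ))" "$(( $4 ))"
}

# Host-side address of the VF from the slide.
pci_bdf 0x0000 0x04 0x10 0x2
```

For the values on the slide this yields `0000:04:10.2`, which can be compared against the VF list on the host before assigning the device to a guest.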
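  5. Bare-metal performance measurement as a baseline for comparison
    • Two Linux machines, each with a 10G NIC
    • netserver run in a KVM guest
    • 128 netperf processes
    • TCP Request/Response mode: a 1-byte packet is ping-ponged
    [Diagram labels: VM host, test machine, 10G NIC, 10G NIC, netperf, netserver, netperf, netserver]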
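  6. Transactions per second
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: transactions/sec (0 to 5500); series: vhost-net, SR-IOV, baremetal]
    Performance is highest near the number of physical cores.
  7. Latency
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: latency value (0 to 3000); series: vhost-net, SR-IOV, baremetal]
    Latency is best near the number of physical cores.
  8. CPU load
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: vhost-net, SR-IOV, baremetal]
    The load is spread only across as many cores as there are VMs.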
  9. Skew between VMs (transactions per second)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: vhost-net, SR-IOV]
    vhost-net < 10%, SR-IOV < 30%
  10. Skew between VMs (latency)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: vhost-net, SR-IOV]
    vhost-net < 10%, SR-IOV < 30%
  11. Skew between VMs (CPU load)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: vhost-net, SR-IOV]
    vhost-net < 10%, SR-IOV < 20%
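The slides do not say how the skew percentage is computed. As a purely illustrative assumption, one plausible definition is the spread between the best and worst VM relative to the mean, expressed as a percentage. The sketch below implements that assumed metric; `skew_pct` and the sample values are hypothetical and are not measurements from the deck.

```shell
# ASSUMED metric, not confirmed by the slides:
#   skew = (max - min) / mean * 100
# Arguments: one per-VM measurement (e.g. transactions/sec) per argument.
skew_pct() {
    printf '%s\n' "$@" | awk '
        { v = $1 + 0 }                      # force numeric interpretation
        NR == 1 { min = v; max = v }
        { sum += v; if (v < min) min = v; if (v > max) max = v }
        END { if (NR > 0 && sum > 0) printf "%.1f\n", (max - min) / (sum / NR) * 100 }'
}

# Hypothetical per-VM transaction rates for three guests.
skew_pct 90 100 110
```

Other definitions (standard deviation over mean, or worst VM relative to best) would give different percentages, so numbers from this sketch are not directly comparable with the charts.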
  12. RPS

    $ echo "f" > /sys/class/net/eth1/queues/rx-0/rps_cpus
    $ echo 4096 > /sys/class/net/eth1/queues/rx-0/rps_flow_cnt
    $ echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
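The value written to `rps_cpus` is a hexadecimal CPU bitmask, so the `"f"` on the slide selects CPUs 0 to 3. The sketch below derives such a mask for the first N CPUs; the helper name `rps_mask` is an assumption, and it only covers small CPU counts (larger systems use comma-separated 32-bit groups, which this sketch does not produce). Note that `rps_flow_cnt` and `rps_sock_flow_entries` are the settings for Receive Flow Steering (RFS), which builds on RPS.

```shell
# Print the hex bitmask that selects CPUs 0 .. N-1.
# Intended for small N only (fits in one machine word).
rps_mask() {
    printf '%x\n' "$(( (1 << $1) - 1 ))"
}

rps_mask 4     # mask for CPUs 0-3, as used on the slide
rps_mask 12    # mask for all 12 logical cores of the VM host
```

The result could then be written to `/sys/class/net/eth1/queues/rx-0/rps_cpus` in the same way as on the slide.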
  13. Transactions per second (vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: transactions/sec (0 to 5500); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Slight improvement when VM <= 4.
  14. Transactions per second (vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: transactions/sec (0 to 5500); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    RPS improves performance when VM <= 4.
  15. Transactions per second (SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: transactions/sec (0 to 5500); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Slight improvement when VM <= 4.
  16. Transactions per second (SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: transactions/sec (0 to 5500); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    RPS improves performance when VM <= 4.
  17. Latency (vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: latency value (0 to 3000); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Roughly the same trend as tps.
  18. Latency (vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: latency value (0 to 3000); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Roughly the same trend as tps.
  19. Latency (SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: latency value (0 to 3000); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Roughly the same trend as tps.
  20. Latency (SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: latency value (0 to 3000); series: cpu1, cpu2, cpu4, cpu8, cpu16, baremetal]
    Roughly the same trend as tps.
  21. CPU load (vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    When VM <= 4, is the load spread better than with cpu = 1?
  22. CPU load (vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    RPS spreads the load further.
  23. CPU load (SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
  24. CPU load (SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
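  25. Skew between VMs (transactions per second / vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Skew exceeds 15% when cpu >= 8 and VM >= 16.
  26. Skew between VMs (transactions per second / vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Skew exceeds 15% when cpu >= 8 and VM >= 16.
  27. Skew between VMs (transactions per second / SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Large skew appears around the point where the total thread count exceeds the number of physical CPUs.
  28. Skew between VMs (transactions per second / SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Large skew appears around the point where the total thread count exceeds the number of physical CPUs.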
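  29. Skew between VMs (latency / vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Roughly the same trend as tps.
  30. Skew between VMs (latency / vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
  31. Skew between VMs (latency / SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Large skew appears around the point where the total thread count exceeds the number of physical CPUs.
  32. Skew between VMs (latency / SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Large skew appears around the point where the total thread count exceeds the number of physical CPUs.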
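  33. Skew between VMs (CPU load / vhost-net)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    No particularly large skew is observed.
  34. Skew between VMs (CPU load / vhost-net, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32, 64); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    No particularly large skew is observed.
  35. Skew between VMs (CPU load / SR-IOV)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Increases slightly as the number of VMs rises.
  36. Skew between VMs (CPU load / SR-IOV, RPS)
    [Chart: x-axis: number of VMs (1, 2, 4, 8, 16, 32); y-axis: percentage (0 to 100); series: cpu1, cpu2, cpu4, cpu8, cpu16]
    Increases slightly as the number of VMs rises.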
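  37. Comparison results
    • Virtual machines reached more than half of bare-metal performance only when the number of VMs was very small and the number of vCPUs was fairly large
      → RPS yields a performance gain only under this condition
    • Increasing the total number of vCPUs lowers performance less than expected
    • With SR-IOV, performance becomes considerably skewed around the point where the number of VFs exceeds the number of physical cores
    • Presumably this happens because there are not enough cores to assign to receive processing for the queues on the NIC
    • With vhost-net no such effect is observed, and processing appears to remain comparatively fair even as the total number of vCPUs grows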
  38. $ cd /sys/fs/cgroup/cpu
    $ mkdir grp_a                        # create a group
    $ echo 12254 > grp_a/tasks           # add a process to the group (by PID)
    $ echo 2184 > grp_a/tasks
    $ echo 512 > grp_a/cpu.shares        # relative share of CPU time available to the tasks in the group
    $ mkdir grp_b
    $ echo 9012 > grp_b/tasks
    $ echo 1024 > grp_b/cpu.shares
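Because `cpu.shares` is a relative weight, the values 512 and 1024 only matter in proportion to each other, and only when the groups compete for the same CPUs. The sketch below computes the approximate percentage a group would get under full contention, assuming grp_a and grp_b are the only busy sibling groups. The helper name `share_pct` is an assumption, and the result is truncated to a whole percent.

```shell
# Approximate CPU percentage for one group under full contention.
# $1 = this group's cpu.shares, remaining args = shares of competing siblings.
# ASSUMPTION: all listed groups are runnable and compete at the same level.
share_pct() {
    mine=$1
    shift
    total=$mine
    for s in "$@"; do
        total=$(( total + s ))
    done
    echo $(( mine * 100 / total ))
}

echo "grp_a: $(share_pct 512 1024)%"
echo "grp_b: $(share_pct 1024 512)%"
```

On a real host other cgroups at the same level, such as the ones libvirt creates, also take part in the split, so the actual fraction may be lower.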
  39. virsh and cgroup

    $ virsh schedinfo vm0
    Scheduler      : posix
    cpu_shares     : 1024
    vcpu_period    : 100000
    vcpu_quota     : -1
    $ virsh schedinfo --set cpu_shares=512 vm0
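When scripting against this output, a single scheduler parameter can be pulled out of the `key : value` lines. The sketch below does that with awk on standard input; `get_sched_param` is a hypothetical helper name, and the sample text is copied from the slide rather than produced by a live `virsh` call.

```shell
# Read "key : value" lines on stdin and print the value for the given key.
get_sched_param() {
    awk -v key="$1" -F':' '
        { k = $1; v = $2; gsub(/[ \t]/, "", k); gsub(/[ \t]/, "", v) }
        k == key { print v }'
}

# Sample output as shown on the slide.
get_sched_param cpu_shares <<'EOF'
Scheduler      : posix
cpu_shares     : 1024
vcpu_period    : 100000
vcpu_quota     : -1
EOF
```

Against a running guest the same helper would be used as `virsh schedinfo vm0 | get_sched_param cpu_shares`.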
  40. How to create a group

    $ virsh start <VM name>
    $ mkdir /sys/fs/cgroup/cpu/grp_a
    $ cat /sys/fs/cgroup/cpu/libvirt/qemu/<VM name>/tasks > tmp
    $ cat /sys/fs/cgroup/cpu/libvirt/qemu/<VM name>/vcpu*/tasks >> tmp
    $ cat tmp > /sys/fs/cgroup/cpu/grp_a/tasks
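The cgroup v1 `tasks` file accepts one PID per write, so copying a multi-line file into it in a single `cat` may not move every task. A more defensive variant of the steps above writes the PIDs one at a time. This is a sketch under that assumption; `move_tasks` is a hypothetical helper, and the paths in the usage comment follow the slide.

```shell
# Append every PID found in the source files to the destination tasks file,
# one write per PID.
# $1 = destination tasks file, remaining args = source tasks files.
move_tasks() {
    dest=$1
    shift
    for src in "$@"; do
        while read -r pid; do
            if [ -n "$pid" ]; then
                echo "$pid" >> "$dest"
            fi
        done < "$src"
    done
}

# Usage on a host with the layout from the slide (not executed here):
#   base=/sys/fs/cgroup/cpu/libvirt/qemu/<VM name>
#   move_tasks /sys/fs/cgroup/cpu/grp_a/tasks "$base"/tasks "$base"/vcpu*/tasks
```

Threads created after the copy are not covered, so the move would need to be repeated if the guest's thread set changes.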
  41. net_cls

    $ tc class add dev virbr0 parent 10: classid 10:1 htb rate 24Mbit
    $ echo 0x100001 > /sys/fs/cgroup/net_cls/grp_a/net_cls.classid

    Tagging the packets sent by a process with a tc class makes it possible to limit their bandwidth.
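The value `0x100001` is the tc class `10:1` in the `0xAAAABBBB` form that `net_cls.classid` expects: the major number in the upper 16 bits and the minor number in the lower 16 bits, both hexadecimal. The sketch below performs that conversion; the helper name `tc_to_classid` is an assumption.

```shell
# Convert a tc handle "major:minor" (hex digits) to the net_cls.classid form.
tc_to_classid() {
    major=${1%%:*}
    minor=${1##*:}
    printf '0x%x\n' "$(( (0x$major << 16) | 0x$minor ))"
}

tc_to_classid 10:1
```

For `10:1` this prints `0x100001`, matching the value written on the slide.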