Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
#29 “I’m Not Dead Yet! The Role of the Operatin...
Search
cafenero_777
June 19, 2023
Technology
0
150
#29 “I’m Not Dead Yet! The Role of the Operating System in a Kernel-Bypass Era”
HotOS '19
https://dl.acm.org/doi/10.1145/3317550.3321422
cafenero_777
June 19, 2023
Tweet
Share
More Decks by cafenero_777
See All by cafenero_777
#51 “Empowering Azure Storage with RDMA”
cafenero_777
3
530
#49 “Gray Failure: The Achilles’ Heel of Cloud-Scale Systems”
cafenero_777
2
120
#50 “Scalable Hierarchical Aggregation Protocol (SHArP): A Hardware Architecture for Efficient Data Reduction”
cafenero_777
0
140
#33 “Destroying networks for fun (and profit)”
cafenero_777
0
100
#34 “MTPSA: Multi-Tenant Programmable Switches”
cafenero_777
0
75
#37 “Bluebird: High-performance SDN for Bare-metal Cloud Services”
cafenero_777
1
150
#39 “Profiling a warehouse-scale computer”
cafenero_777
0
57
#23 “VFP: A Virtual Switch Platform for Host SDN in the Public Cloud”
cafenero_777
0
260
#24 “Ananta: Cloud Scale Load Balancing”
cafenero_777
0
310
Other Decks in Technology
See All in Technology
Redshift認可、アップデートでどう変わった?
handy
1
120
20251203_AIxIoTビジネス共創ラボ_第4回勉強会_BP山崎.pdf
iotcomjpadmin
0
170
1万人を変え日本を変える!!多層構造型ふりかえりの大規模組織変革 / 20260108 Kazuki Mori
shift_evolve
PRO
3
380
AWSの新機能をフル活用した「re:Inventエージェント」開発秘話
minorun365
2
530
Kiro を用いたペアプロのススメ
taikis
4
2.2k
2025年 山梨の技術コミュニティを振り返る
yuukis
0
140
Keynoteから見るAWSの頭の中
nrinetcom
PRO
1
150
まだ間に合う! Agentic AI on AWSの現在地をやさしく一挙おさらい
minorun365
19
3.4k
ESXi のAIOps だ!2025冬
unnowataru
0
460
SES向け、生成AI時代におけるエンジニアリングとセキュリティ
longbowxxx
0
280
Bedrock AgentCore Evaluationsで学ぶLLM as a judge入門
shichijoyuhi
2
310
AWS re:Inventre:cap ~AmazonNova 2 Omniのワークショップを体験してきた~
nrinetcom
PRO
0
120
Featured
See All Featured
Documentation Writing (for coders)
carmenintech
77
5.2k
Leveraging LLMs for student feedback in introductory data science courses - posit::conf(2025)
minecr
0
97
Un-Boring Meetings
codingconduct
0
170
ラッコキーワード サービス紹介資料
rakko
0
1.9M
Lessons Learnt from Crawling 1000+ Websites
charlesmeaden
PRO
0
980
Designing for Performance
lara
610
70k
Crafting Experiences
bethany
0
25
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
196
70k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
Marketing to machines
jonoalderson
1
4.5k
A better future with KSS
kneath
240
18k
How to train your dragon (web standard)
notwaldorf
97
6.5k
Transcript
Research Paper Introduction #29 “I’m Not Dead Yet! The Role
of the Operating System in a Kernel-Bypass Era” ௨ࢉ#84 @cafenero_777 2021/10/14 1
Agenda • ରจ • ֓ཁͱಡ͏ͱͨ͠ཧ༝ 1. Introduction 2. Kernel-Bypass Accelerators
in the Datacenter 3. Evolving the Datacenter OS for Kernel Bypass 4. The Demikernel 5. Future Work 6. Related Work 7. CONCLUSION 2
ରจ • I’m Not Dead Yet! The Role of the
Operating System in a Kernel-Bypass Era • Irene Zhang, Jing Liu, Amanda Austin, Michael Lowell Roberts, Anirudh Badam • Microsoft Research, University of Wisconsin, University of Texas • HotOS '19 • https://dl.acm.org/doi/10.1145/3317550.3321422 3
֓ཁͱಡ͏ͱͨ͠ཧ༝ • ֓ཁ • DCNW༻్ͰͷOS”ऴᖼ (demise)”͍ͯ͠Δʁʂ • RDMA/DPDKߴ͕ͩநԽΛࡴ͢ • ৽͍͠I/OநԽ:
DemikernelͷఏҊ • ಡ͏ͱͨ͠ཧ༝ • NWߴԽͷͲͷํʁ • ۙະདྷͷ: library OS? • ΩϟονʔͳtitleͩͬͨͷͰ 4
1. Introduction 5 • աڈ10ͷαʔόI/OߴԽ V.S. CPUੑೳ • TCP-o ff
l oad, SmartNIC/SR-IOV, Comp./Enc./ML on FPGA • kernel bypassٕज़ͰI/OΦʔόʔϔουΛݮ • ػೳఏڙ͢Δ͕ɺநԽϨΠϠʔ͕ແ͍ • ʢྫɿsocket, fi le, pipeʣ • ࢄϝϞϦɾࢄετϨʔδ w/ RDMA • systemΛHWʹ߹ΘͤͯΧελϚΠζ -> େมʂ • OSΛͲ͏ม͑Δ͖͔ɻ৽OSΞʔΩςΫνϟDemikernelͰઃܭٞ͠
2. Kernel-Bypass Accelerators in the Datacenter 6 • Kernel Bypass
• KernelΦʔόʔϔουۃখͰ࠷ͷύέοτసૹΛࢦ͢ • I/Fػೳଘࡏ͠ͳ͍ • ϓϩάϥϚ͕OSಉͷػೳՃɾσόΠεຖʹػೳՃ • ྫɿ • DPDK: جຊతͳI/OσόΠεػೳΛنఆ • Arrakis: HWԾԽٕज़(SR-IOV)Ͱ࣮ • RDMA: verbs I/Frdmacm I/F (~socket)༷͋Δ͕ɾɾɾ • FPGA: ԿͰͰ͖Δ͕࣮༻ੑuse case࣍ୈɾɾɾ https://www.dpdk.org/wp-content/uploads/sites/35/2017/04/DPDK-India2017-RamiaJain-ArchitectureRoadmap.pdf
3. Evolving the Datacenter OS for Kernel Bypass 7 •
UserۭؒͰͷIʗOॲཧ࠷దԽ • طଘLibrary OSʹڞ༗ɾଟॏԽͷΈ͕͋Δʢ͕ॏ͍ʣ • ಁաతϝϞϦ֬อʢi.e. DDIO, NIC<->LLCʣ͕ແ͍ͷͰ࠶࣮ • ޮతͳநԽ • I/Oॲཧ͕͔ͬͨࠒͷઃܭʢͷ໊ʣV.S. ݱɿRedisreadͰ2us • நԽͱੑೳͷڱؒ • طଘPOSIX APIҡ࣋ߋʹΦʔόʔϔου͕͔͔Δ • طଘLibrary OSͱͷػೳͷҧ͍ • طଘɿkernel I/FσόΠεػೳ͕ۉҰͰ͋Δલఏ • ࠓճɿKernel-Bypass framework (HWͱͷSW/kernelͷ”ྑ͍ͱ͜औΓ”తͳʣΛೖΕ͍ͨ https://www.dpdk.org/wp-content/uploads/sites/35/2017/04/DPDK-India2017-RamiaJain-ArchitectureRoadmap.pdf
4. The Demikernel (1/3) 8 • Architecture: C/D pathͷ •
C: network/ fi le open, ͯ͘ྑ͍ -> طଘKernel • D: network/storage/memoryͷread/write -> LibOS + accelerator • I/O queueͱͯ͠நԽ • HWී௨queueΛར༻ -> ͜ΕΛͦͷ··நԽ • atomic data unitͱͯ͠ѻ͑Δʢ༨ͳͪൃੜ͠ͳ͍ʣ • σόΠεʹґଘ͠ͳ͍ߴϨϕϧநԽKernel-BypassϨΠϠʔ
4. The Demikernel (2/3) 9 • Syscall interface • C:
socket(): queue descriptorΛฦ͢ (not fi le descriptor) • C: packet typeͰ fi lter(): BPF frameworkΛఆ • C: merge(): I/OΩϡʔͷϚʔδ • C: sort(): ༏ઌʹԠͯ͡I/OΩϡʔΛ͏ • C: map(): P4తͳෳࡶͳpktॲཧ࣮Ͱ͖ͦ͏ • D: push/pop • ૢ࡞ൣғͷࢦఆ • non-blockingॲཧ. wait_*()Ͱfetch
4. The Demikernel (3/3) 10 • qtoken: Ұͭͷqૢ࡞ຖʹݻ༗ • epollΛվળͰ͖Δ
• wait_*()͕σʔλΛฦ͢->ଞͷsyscallݺͣʹʢۭৼΓʣࡁΉ • pop completion: pop͕ྃͨ͠Βthread͕ى͖Δɻbusy pollingཁΒͳ͍ • zero copy: • 1. ಁաతϝϞϦ֬อɿLibOS͕IOMMUϝϞϦొΛߦ͏ • 2. ΞϓϦέʔγϣϯͱI/OσόΠεؒͰͷڞ༗ϝϞϦͷௐΛͳΔ͘ݮΒ͢ • Free protect: ΞϓϦόοϑΝ։์໋ྩ -> LibOS͕I/Oऴྃ·Ͱ͔ͬͯΒ։์ • ʢैདྷಉ༷ʣॻ͖ࠐΈอޢແ͠ -> όοϑΝมߋʢwriteʣI/Oͭඞཁ͋Γ • DCར༻Ͱແ͠ɺͱ͍͏ओுɻྫɿRedisͰput requestຖʹόοϑΝׂޙɺσʔλߏମͰͦͷϙΠϯλʹࢦఆ
5. Future Work • OS Design • ಛఆΞΫηϥϨʔλͷෆػೳΛLibOSͰิʢDPDKͳΒNWελοΫશൠʣ • ΞΫηϥϨʔλͷछྨ͕ଟ͍߹શ෦LibOS͕ίʔυΛ࣋ͭʢʂʣLibOSͱɾɾɾʁ
• Network Protocols • I/OͷΑΓ൚༻తͳdata unitׂΛࢦ͢ɻ • طଘͷϑϨʔϜϫʔΫʢTCPHTTPSͳͲʣͳΒड৴ଆͰ࠶ߏͰ͖Δ͕ɺ൚༻ੑ੍͕ݶ͞Εͯ͠·͏ • File System and Storage • طଘFS (ext4ͳͲ)ΛLibOS (γϯάϧΞϓϦέʔγϣϯ)Ͱ͏ʹΦʔόʔϔου͕େ͖͗͢ • ΞΫηϥϨʔλʹదͨ͠FS? 11
6. Related Work • OS • Arraakis, IXΑΓநͷߴ͍I/F • ϢʔβϨϕϧͷOS֦ுͰHWӅṭ
-> NWελοΫͳͲͷOSػೳແ͍ • I/O Accelerated System • POSIX I/Fʹҡ࣋ͰඇޮԽɻྫɿmTCPͩͱDPDKΑΓlatency͔͔Δ • NW/TCPॲཧΛPMD/NICͰΔɺ੍͘͠ޚΛOS͔Β֎͢ํʢQUICͳͲʣ • I/O Accelerated Application • RDMAΛͬͨϦϞʔτϝϞϦͷϨΠςϯγʔΞϓϦέγϣʔϯ 12
7. Conclusion • I/Oੑೳେ෯্ʹ͍ͭͨ͘ΊʹKernel-bypass acceleratorsΛ͏ • Kernel-bypassͷͨΊOS/kernelͷػೳ͕͑ͳ͍ɺI/OநԽ͕ग़དྷͳ͍ • ্هͷΪϟοϓΛຒΊΒΕΔLibOSΛઃܭ͠ɺI/OநԽΛٞ 13
ࡾߦ·ͱΊ 14 • I/Oੑೳେ෯্ʹ͍ͭͨ͘ΊʹKernel-bypass acceleratorsΛ͏ • Kernel-bypassͷͨΊOS/kernelͷػೳ͕͑ͳ͍ɺI/OநԽ͕ग़དྷͳ͍ • ্هͷΪϟοϓΛຒΊΒΕΔLibOSΛઃܭ͠ɺI/OநԽΛٞ
EoP 15