The ML Platform Behind Mercari's Marketplace Integrity Measures (メルカリのマーケット健全化施策を支えるML基盤)
Hirofumi Nakagawa/中河 宏文
May 23, 2018
Transcript
The ML Platform Behind Mercari's Marketplace Integrity Measures
Mercari ML Ops Night Vol.1
hnakagawa
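Self-introduction
• Hirofumi Nakagawa (hnakagawa)
• Joined the company in July 2017
• Member of the SRE team
• A generalist who does everything from device driver development to front-end development
• NOT an ML engineer
• https://github.com/hnakagawa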
My work
• ML Platform development
• Bridging the skill gap between ML engineers and SREs
• ML Reliability, SysML?, MLOps?
• Automating ML systems from an SRE standpoint
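ML Platform
• An in-house ML Platform
• Kubernetes-based
• Abstracts away the differences between the local environment and the cluster environment
• A set of convenience APIs
• Builds on existing ML frameworks to provide an environment where Training/Serving is easy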
Planned for release as OSS eventually (probably)
Case study: real-time monitoring system
• Nicknamed Lovemachine
• Implemented on top of the ML Platform
[Diagram labels: ML Platform training cluster, Lovemachine, GCS, GKE, PubSub, ML Platform serving cluster, Lovemachine]
Model Training & Serving Workflow
Workflow for Production
[Diagram labels: ML Platform training cluster, CI, Job, Job, ..., Model Registry, ML Platform serving cluster for test, REST API, Streaming, TF Serving, ...]
Training Workflow
[Diagram labels: ML Platform training cluster, CI, Job, Job, ..., Model Registry]
1. A push to GitHub triggers training
2. The trained Model is uploaded to the Model Registry
Serving Workflow
[Diagram labels: Model Registry, ML Platform serving cluster for test, REST API, Streaming, TF Serving, ...]
1. The Model Registry is watched and new Models are served automatically
2. When Serving & Test succeeds, a k8s manifest for production is output
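The two steps above can be sketched roughly as follows. This is only an illustration under loud assumptions: the registry is modeled as a local directory with one subdirectory per model version, `smoke_test` is a caller-supplied stand-in for the Serving & Test step, and the image name and manifest fields are hypothetical. None of these names come from the talk, and the real ML Platform may work quite differently.

```python
from pathlib import Path


def find_new_versions(registry_dir, seen):
    """Return model versions present in the registry directory but not yet handled."""
    current = {p.name for p in Path(registry_dir).iterdir() if p.is_dir()}
    return sorted(current - seen)


def render_manifest(model_name, version, image="example/tf-serving:latest"):
    """Build a minimal Kubernetes Deployment as a plain dict.

    The image name and label keys are hypothetical placeholders.
    """
    labels = {"app": model_name, "model-version": version}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{model_name}-{version}", "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "serving", "image": image}]},
            },
        },
    }


def promote(registry_dir, seen, model_name, smoke_test):
    """Test each new version and emit a manifest only when the test passes."""
    manifests = []
    for version in find_new_versions(registry_dir, seen):
        seen.add(version)  # remember it so the next poll skips this version
        if smoke_test(version):
            manifests.append(render_manifest(model_name, version))
    return manifests
```

A real watcher would call `promote` on a timer or in response to registry events, and would serialize each returned dict to YAML before handing it to the production deploy step.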
Example Model Serving API architecture
[Diagram labels: Mercari API, REST, Flask, SK Model (x3), gRPC, TensorFlow Serving, TF Model (x2)]
Flask handles the preprocessing and forwards requests to the TensorFlow Serving instance behind it
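A minimal sketch of that front-end/back-end split, assuming a JSON request with a `text` field and a toy tokenizing preprocessor (both hypothetical). The call to the model server is injected as a `predict` callable so the sketch runs without Flask, gRPC, or TensorFlow Serving; in the architecture on the slide, this function would be the body of a Flask route and `predict` would wrap the gRPC client.

```python
import json


def preprocess(text):
    """Hypothetical preprocessing step: lowercase the text and split it into tokens."""
    return text.lower().split()


def handle_request(body, predict):
    """Parse a JSON request, preprocess it, and delegate inference to the backend.

    `predict` stands in for the client of the model server that sits behind
    the front end; it is injected so this sketch has no external dependencies.
    """
    payload = json.loads(body)
    features = preprocess(payload["text"])
    score = predict(features)
    return json.dumps({"score": score})


if __name__ == "__main__":
    # A fake backend that just counts tokens, used only for demonstration.
    print(handle_request('{"text": "Hello World"}', lambda tokens: len(tokens)))
```

Keeping the backend behind a plain callable also makes the preprocessing testable on its own, separately from the model server.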
Example Model Serving API architecture (Streaming ver.)
[Diagram labels: PubSub, ML Platform Framework or Apache Beam, SK Model (x3), gRPC, TensorFlow Serving, TF Model (x2)]
TensorFlow Serving
• A Serving environment provided by the TensorFlow project
• Can serve TF models without going through a Python runtime
• The standard implementation exposes its API over gRPC
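Models and container images
• Should a huge ML Model be included in the container image or not?
• If not, where should it be placed?
• It is a trade-off between portability and load time
• If you have a good idea, please let me know…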
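How it differs from an ordinary API
• The resources handled, Model size in particular, are often large (hundreds of MB to GBs)
• CPU and memory consumption is heavy
• In some cases GPUs are used as well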
Memory consumption
• The Python part of Lovemachine consumes 2 GB of memory at runtime → expected to grow further
• The TF-IDF preprocessing written in scikit-learn often becomes large
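Python and parallelism
• Naturally, threads are not an option (because of the GIL)
• Loading the Model in every process inflates the memory required → this gets in the way of Blue-Green Deploys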
Honestly, Serving in Python is often painful from an infrastructure standpoint…
Using memory wisely
• Load the model before calling fork so that Copy on Write takes effect
• This deliberately breaks the k8s "one process per container" convention
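The load-before-fork idea can be sketched with the standard library alone. This is a simplified illustration, not Lovemachine's actual server: `load_model` and `serve` are hypothetical stand-ins, the "model" is a small in-memory list, and `os.fork` requires a Unix-like OS.

```python
import os


def load_model():
    """Stand-in for an expensive model load (a real model could be hundreds of MB)."""
    return {"weights": list(range(100_000))}


def serve(model, worker_id):
    """Stand-in for a worker's request loop: it only reads the inherited model."""
    return sum(model["weights"][:10]) + worker_id


def run_workers(num_workers=2):
    # Load once in the parent, BEFORE forking, so every child inherits the
    # already-populated memory pages instead of loading its own copy.
    model = load_model()

    pids = []
    for worker_id in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child process: use the inherited model, then exit immediately
            # without running the parent's cleanup handlers.
            try:
                serve(model, worker_id)
                os._exit(0)
            except Exception:
                os._exit(1)
        pids.append(pid)

    # Parent: wait for every worker and collect its exit code.
    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes


if __name__ == "__main__":
    print(run_workers())
```

One caveat worth keeping in mind: CPython updates reference counts even when objects are only read, so some shared pages still end up copied over time and the savings are smaller than the ideal case suggests.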
Copy On Write refresher
[Diagram: 1. the process allocates Page A in memory; 2. after fork, the process and the child process reference the same region]
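When a process rewrites the memory contents…
[Diagram: the OS allocates a separate region (Page B) and copies the original data into it; the process and the child process now reference different regions, Page A and Page B]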
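The visible effect of this behavior can be demonstrated in a few lines on a Unix-like OS. The sketch below shows only the observable semantics (a write in the child never reaches the parent); the page copy itself happens inside the kernel and is not directly visible from Python. The function name and the 4 KiB buffer size are illustrative choices.

```python
import os


def child_write_is_private():
    data = bytearray(b"A" * 4096)  # 4 KiB buffer, a typical page size on Linux
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write makes the OS give the child its own copy of the
        # affected page, so only the child's view of `data` changes.
        os.close(read_fd)
        data[0:1] = b"B"
        os.write(write_fd, bytes(data[0:1]))
        os.close(write_fd)
        os._exit(0)

    # Parent: read what the child saw, then compare with our own copy.
    os.close(write_fd)
    child_saw = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_saw, bytes(data[0:1])


if __name__ == "__main__":
    print(child_write_is_private())
```

The child reports `b"B"` while the parent still sees `b"A"`, which is why a model loaded before fork stays shared only for as long as no process writes to its pages.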
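Current Issues
• Because we are dealing with human behavior, data trends shift easily and unexpected problems crop up, so the models need continual attention
→ The burden on ML Model authors never lets up
→ As SREs, we want to solve this with mechanisms that include automation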
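In Progress
• Turning the implementation that builds Embeddings from in-house data into components
• Automating, to some extent, the construction of models that solve specific problems
→ Something like a dedicated AutoML specialized for solving our own problems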
AutoFlow (tentative name)
[Diagram labels: Feature Extraction Components, Classification Components, Concatenation Components, Model Builder Components, Registry]
Automatically builds models and tunes hyperparameters on the cluster
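Summary
• ML needs infrastructure that differs somewhat from the usual → best practices are not yet clear
• Running ML features in production in earnest does not go well without substantial automation and systematization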
Thank you for listening!!
We are Hiring!!
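SRE ML Reliability
• SysML? MLOps? A new Job description
• SRE skills + basic knowledge of ML
• People who will drive the automation and systematization of ML infrastructure
• Of course, we are actively hiring for other roles too!!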
Details here: https://careers.mercari.com/