Yuichiro Someya
November 22, 2016
Build Image Classification service with Amazon ECS and GPU instances
Transcript
Build Image Classification service with AWS ECS and GPU instances
Yuichiro Someya @ Cookpad
echo `whoami`
• Yuichiro Someya
• Master's degree in computer science, Graduate School of the University of Tokyo
• Joined Cookpad as a new graduate in 2016
• github.com/ayemos
• twitter.com/kumasan_com
Agenda
We operate an automatic food-photo collection service using:
• a food image recognition model built on CaffeNet [1]
• a data flow built on Amazon SQS/S3
• Amazon ECS (GPU instances)
[1] https://github.com/BVLC/caffe
Cookpad
• Recipes: more than 2.5 million
• Monthly users: more than 60 million
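Ryori Kiroku (料理きろく, "cooking log")
• Automatically collects only the food photos from the photos on a user's smartphone
• Currently available to a limited group of users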
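CookpadNet
• A model made by fine-tuning CaffeNet for food / non-food classification
• Models trained with Caffe [1] are loaded with Chainer's Caffe emulator (ref: http://docs.chainer.org/en/stable/reference/caffe.html)
• The classification categories were changed to food / non-food, and the model was trained on food photos from Cookpad
[1] http://caffe.berkeleyvision.org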
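To make the food / non-food output concrete, here is a minimal sketch of turning a pair of raw class scores into an `is_food` boolean with a softmax. The function name, the class order (non-food first, food second), and the 0.5 threshold are assumptions for illustration; the deck does not show how CookpadNet post-processes its output, and the model itself is not reproduced here.

```python
import math


def is_food(scores, threshold=0.5):
    """Return True when the softmax probability of the 'food' class exceeds threshold.

    `scores` is assumed to be (non_food_score, food_score); this ordering is
    hypothetical and not taken from the deck.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    p_food = exps[1] / sum(exps)
    return p_food > threshold
```

A caller would pass the two values of the network's final layer, for example `is_food((0.2, 3.1))`.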
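Data flow / workflow
• Where should CookpadNet run classification, and where and how should the result be delivered?
• Option: put the classification model on the client and classify there
• The model is large (100 MB and up), so this is not practical
• (A smaller model is being researched)
• Option: put the component that runs classification outside the client
• An HTTP server?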
[Diagram: synchronous design] Client (Android / iOS) → API Server (ruby) → Classification Server (python, chainer)
• Client → API Server: POST /classify {photo: binary}
• API Server → Classification Server: POST /classify {photo: binary}
• Each hop returns result {is_food: bool}
• Image upload, image processing, and classification: 300~500 ms
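Data flow / workflow
• Image processing and model inference take a fair amount of time (300~500 ms)
• The API server cannot call the classifier synchronously (the Unicorn workers would be exhausted)
• Hence an asynchronous classification workflow using Amazon S3 and SQS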
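The worker-exhaustion argument can be checked with simple arithmetic: if every request blocks a worker for the full classification call, throughput is capped at workers divided by seconds per request. The sketch below assumes a pool of 16 workers, which is a made-up figure; only the 300~500 ms range comes from the slides.

```python
def max_sync_throughput(workers, seconds_per_request):
    """Upper bound on requests/second when each worker blocks for the full call."""
    if workers <= 0 or seconds_per_request <= 0:
        raise ValueError("workers and seconds_per_request must be positive")
    return workers / seconds_per_request


# Hypothetical pool of 16 workers; 0.3 s and 0.5 s are the bounds from the slide.
best = max_sync_throughput(16, 0.3)
worst = max_sync_throughput(16, 0.5)
print(f"{worst:.0f} to {best:.0f} requests/second at most")
```

Under these assumptions the whole API tops out at a few dozen requests per second, and every other endpoint shares the same pool, which is why moving classification off the request path is attractive.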
[Diagram: asynchronous design] Components: Client (Android / iOS), API Server (ruby), Classification Worker (python, chainer), Amazon S3 (Storage), Amazon SQS (Queue), DB. Labels, in the order the slides reveal them:
• [Upload photo to classify]
• enqueue {key_on_s3: string}
• dequeue {key_on_s3: string}
• [Download Image]
• POST /result {key_on_s3: string, result: {is_food: bool}}
• POST /is_photo {key_on_s3: string}
• result {is_food: bool}
Classification is processed asynchronously.
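The asynchronous flow can be sketched with in-memory stand-ins: a dict for Amazon S3, a `queue.Queue` for Amazon SQS, and a dict for the DB. All function names and the toy `classify` rule are hypothetical; a real worker would use an AWS client library and the actual model, neither of which is shown in the deck.

```python
import queue

storage = {}            # stands in for Amazon S3: key -> image bytes
jobs = queue.Queue()    # stands in for Amazon SQS
results = {}            # stands in for the DB: key -> is_food


def api_accept(key_on_s3):
    """API server side: enqueue the key and return immediately."""
    jobs.put({"key_on_s3": key_on_s3})


def classify(image_bytes):
    """Placeholder for the real model; a toy rule for the demo only."""
    return image_bytes.startswith(b"food")


def worker_step():
    """Worker side: dequeue one job, download the image, record the result."""
    job = jobs.get()
    image = storage[job["key_on_s3"]]
    results[job["key_on_s3"]] = classify(image)
    jobs.task_done()


# Client uploads two photos, the API enqueues them, the worker drains the queue.
storage["photos/1.jpg"] = b"food-pixels"
storage["photos/2.jpg"] = b"cat-pixels"
for key in storage:
    api_accept(key)
while not jobs.empty():
    worker_step()
```

Only the storage key travels through the queue, so messages stay small and the API request returns without waiting for inference.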
Agenda
We operate an automatic food-photo collection service using:
• a food image recognition model built on CaffeNet [1]
• a data flow built on Amazon SQS/S3
• Amazon ECS
[1] https://github.com/BVLC/caffe
ECS, GPU, Docker, and…
• ECS: Amazon EC2 Container Service
• Places Docker containers (Tasks) on a cluster made up of EC2 instances
• github.com/eagletmt/hako
• Manages the ECS configuration in YAML files

[Diagram: deployment inside an AWS VPC] A developer runs docker push to the Docker Registry and hako deploy to ECS; ECS runs docker pull and starts the Task cookpadnet-worker. The Dockerized worker is deployed and its configuration is managed with hako.

    # cookpadnet-worker.yml
    scheduler:
      type: ecs
      region: ap-northeast-1
      cluster: hako-production-g2
      desired_count: 1
    app:
      image: cookpadnet-worker-gpu
      cpu: 128
      memory: 3072
      memory_reservation: 2048
      env:
        AWS_REGION: ap-northeast-1
        COOKPADNET_ENV: production
    ...
GPU
• The worker uses a GPU
• A […]-fold performance difference compared with CPU instances in the same price range
• Docker with GPU
Virtualization vs. kernel
• Drivers are required
• The nvidia-driver kernel module
• User-level drivers of the same version
• To operate GPU devices from a Docker container, the container needs appropriate Linux capability settings
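The version-matching requirement can be checked at container start. The sketch below parses a version string shaped like the contents of `/proc/driver/nvidia/version` and compares it with the version of the user-level driver baked into the image. The exact file format is an assumption based on typical NVIDIA Linux driver output, and the sample text and version numbers are invented for illustration.

```python
import re


def kernel_module_version(proc_text):
    """Extract the driver version from text shaped like /proc/driver/nvidia/version."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", proc_text)
    return match.group(1) if match else None


def versions_match(proc_text, user_level_version):
    """True when the host kernel module and the container's driver agree."""
    return kernel_module_version(proc_text) == user_level_version


# Invented sample; on a real host this text would be read from the proc file.
sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  367.57  Mon Oct  3 20:37:01 PDT 2016"
```

Failing fast on a mismatch gives a clearer error than the GPU library failing later inside the worker.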
[Diagram] Layers: ubuntu, docker container, user-level driver, kernel modules, physical NVIDIA GPU. Note: the driver path differs depending on the OS.
NVIDIA Docker
• A thin wrapper around the Docker CLI
• Automatically mounts the required volumes at `docker run` time
• Not supported on Amazon ECS
[Diagram] Layers: ubuntu, docker container, user-level driver, kernel modules (same version), physical NVIDIA GPU. Result: mostly solved.
Virtualization vs. kernel
• GPU devices exist as special files
• Accessing them requires specific capability settings
    docker run \
      --device /dev/nvidia0:/dev/nvidia0 \
      --device /dev/nvidia-uvm:/dev/nvidia-uvm \
      gpu-worker
• The device option is not supported in ECS Task definitions
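Where plain Docker is available, the per-device flags can be generated rather than typed by hand. This is a small sketch of building the `docker run` argument list with one `--device` flag per device file; the helper name is hypothetical, and the device paths and image name in the usage note are examples rather than values confirmed by the deck.

```python
def docker_run_args(image, devices):
    """Build the argv for `docker run` with one --device flag per device file."""
    args = ["docker", "run"]
    for path in devices:
        # Map each host device file to the same path inside the container.
        args += ["--device", f"{path}:{path}"]
    args.append(image)
    return args
```

For example, `docker_run_args("gpu-worker", ["/dev/nvidia0", "/dev/nvidia-uvm"])` returns a list that can be passed to `subprocess.run`.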
Virtualization vs. kernel (continued)
    docker run --privileged gpu-worker
• Opens up every capability
• The process is root inside a container that runs under a dockerd which itself runs as root, so it can do all sorts of things, for example:
    docker run --privileged alpine:latest date -s …
• So the worker is run as a user other than root
• In the Dockerfile: `USER runner`
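As a complement to `USER runner`, the worker can refuse to start when it finds itself running as root, which guards against the image being launched with an overridden user. This is a minimal sketch; the function name and error message are hypothetical, and `os.geteuid` is available on Unix-like systems only.

```python
import os


def ensure_not_root(euid=None):
    """Raise if the effective user is root (uid 0); euid can be injected for tests."""
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        raise RuntimeError("refusing to run as root; set a non-root USER in the Dockerfile")
    return euid
```

Calling `ensure_not_root()` at the top of the worker's entry point keeps the check in one place.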