20180701_CVPR2018_reading_YoheiKIKUTA
yoppe
July 01, 2018
Science
Event HP:
https://kantocv.connpass.com/event/88613
Transcript
MobileNetV2: Inverted Residuals and Linear Bottlenecks
The 46th Computer Vision Study Group @ Kanto, 2018-07-01
Yohei Kikuta (@yohei_kikuta)
Event URL: https://kantocv.connpass.com/event/88613/, paper: https://arxiv.org/abs/1801.04381

Self-introduction
name: Yohei KIKUTA
company: Cookpad Inc.
twitter: @yohei_kikuta
GitHub: yoheikikuta
resume: https://github.com/yoheikikuta/resume
blog: 原理的には可能 ("Possible in Principle") https://yoheikikuta.github.io/
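
Summary
1. A model developed from MobileNetV1
2. Proposes a building block that expands the number of channels between linear layers and places a separable convolution there
   → a structure that follows from examining the (non)linearity and expressive power of ReLU
3. In experiments, faster than NASNet with comparable or better results
4. Actually ran it on mobile using ML Kit → Cookpad developer blog
Blog URL: https://techlife.cookpad.com/entry/2018/07/05/090000

Food / non-food classification with MobileNetV2 using ML Kit
https://techlife.cookpad.com/entry/2018/07/05/090000
GIF file: https://i.imgur.com/DRHVejp.gifv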
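
Background

We want lightweight models that fit on mobile
Machine learning looks set to keep moving onto mobile devices
Momentum is building with the arrival of ML Kit, Create ML, and similar tools
This matters from the viewpoints of speed, privacy, and scalability
In deep learning, it is one direction of architecture search
ML Kit: https://developers.google.com/ml-kit/, Create ML: https://developer.apple.com/documentation/create_ml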
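
Directions in modeling for mobile (1)
For architectures, one trend is to separate the handling of the spatial direction from that of the channel direction
→ {separable, group, shuffle} convolution are representative examples
Other techniques for making models lighter exist but are outside the scope of this paper (they can be combined with it)
- Reducing data size by lowering numerical precision
- Reducing data size through quantization and coding
- Transferring to a smaller model, for example by distillation

Directions in modeling for mobile (2)
Architecture search usually targets general-purpose designs
(although, after experimental validation, they end up somewhat tuned to other factors)
An architecture that relies on some other factor, in this paper the activation function, is specialized in that sense but can be more efficient
This paper proposes a building block based on the properties of ReLU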

The path to MobileNetV2 and related work

Evolution of architectures
A partial list of CNN models that introduced a distinctive structure
- Network In Network: Global average pooling
- VGG: Stacking of 3×3 convolution
- ResNet: Residual connection
- Inception(V3): Inception module
- SqueezeNet: Fire module
- ENet: Early stage down sampling
- DenseNet: Dense convolution
- Xception: Separable convolution
- SENet: Squeeze and excitation block
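
Automatic architecture search (1)
Prepare several patterns of basic operations such as convolution
Combine them to search for the optimal building block
- NASNet: optimized within a reinforcement learning framework
- AmoebaNet: optimized within an evolutionary computation framework
- DARTS: optimized with a gradient-based method via continuous relaxation
NASNet: https://arxiv.org/abs/1707.07012, AmoebaNet: https://arxiv.org/abs/1802.01548, DARTS: https://arxiv.org/abs/1806.09055

Automatic architecture search (2)
Experimental comparison (ImageNet classification)
Figure quoted from https://arxiv.org/abs/1806.09055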
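
Searching for lightweight architectures
Automatic search is powerful, but it only combines operations that were fixed in advance
→ architectures that get by with computationally shallow layers are easy to tune
Building a model that outperforms these requires a new idea
MobileNetV2 examines a structure that focuses on the characteristics of ReLU
→ it devises a new building block specialized for ReLU
One senses an attitude of "it may not be general-purpose, but so be it!"
It seems to have been reached as a result of thinking about how to improve MobileNetV1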

Characteristics of ReLU

Definitions of ReLU and ReLU6
ReLU: ReLU(x) = max(0, x)
ReLU6: ReLU6(x) = min(max(0, x), 6)
Figure created with https://www.desmos.com/calculator/865rohnewg
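
The two activation functions can be written out directly. The following is a minimal sketch using their standard definitions; the function names `relu` and `relu6` are chosen here for illustration and do not come from the slides.

```python
def relu(x: float) -> float:
    """Standard ReLU: negative inputs are clipped to zero."""
    return max(0.0, x)


def relu6(x: float) -> float:
    """ReLU6: like ReLU, but the output is also capped at 6."""
    return min(max(0.0, x), 6.0)


if __name__ == "__main__":
    for value in (-2.0, 3.5, 7.0):
        print(value, relu(value), relu6(value))
```

The two functions differ only for inputs above 6, where ReLU6 saturates.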
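
Expressive power of ReLU
Consider the case where a non-zero volume S remains after the transformation ReLU(Bx)
Points mapped into the interior of S are given by the linear transformation Bx itself
→ in the non-zero region of the output, the expressive power is that of a linear transformation
Some regions are collapsed by ReLU, so information is lost in general, but the size of the collapsed region can be bounded
Proof: see the proof of Theorem 1 in Appendix A of the original paper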
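
If the number of channels around ReLU is large enough, no information is lost
Map a low-dimensional manifold into dim = m dimensions, apply ReLU, and map back
Information is lost when m is small, but not when m is large
On the other hand, when m is too large, strongly deformed parts also appear
→ there is an appropriate value for the channel expansion (6× in the paper)
Figure quoted from https://arxiv.org/abs/1801.04381. Note that 6× corresponds to dim=12 in this figure (it feels like that case should have been plotted as well)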
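
The effect of the embedding dimension m can be probed with a small experiment. This is only a rough sketch under stated assumptions, not the procedure used for the paper's figure: 2-D points are sent through a random m×2 Gaussian matrix followed by ReLU, and a point is counted as unrecoverable when fewer than 2 output channels stay active, since at least 2 independent linear measurements are needed to pin down a 2-D input. The function name `fraction_unrecoverable` and all parameter defaults are hypothetical.

```python
import random


def fraction_unrecoverable(m: int, n_points: int = 2000, seed: int = 0) -> float:
    """Fraction of random 2-D points that keep fewer than 2 active channels
    after x -> ReLU(T x), with T a random m x 2 Gaussian matrix.

    Assumption: "fewer than 2 active channels" is used as a simple proxy
    for information loss; it is not the measure used in the paper.
    """
    rng = random.Random(seed)
    # Rows of the random embedding matrix T (shape m x 2).
    rows = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(m)]
    lost = 0
    for _ in range(n_points):
        x0, x1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # A channel survives ReLU only if its pre-activation is positive.
        active = sum(1 for a, b in rows if a * x0 + b * x1 > 0.0)
        if active < 2:
            lost += 1
    return lost / n_points


if __name__ == "__main__":
    for m in (2, 3, 5, 15, 30):
        print(m, fraction_unrecoverable(m))
```

With m = 2 a point survives only if both channels are positive, so most points lose information; with larger m almost every point keeps enough active channels. The sketch does not capture the deformation that appears when m is very large.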
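
Key points for structures built on ReLU
Looking back at the observations so far, two points matter
- The region that stays non-zero after the ReLU transformation corresponds to a linear transformation
  → inserting a linear layer to extract features looks like a good idea
- Information loss caused by ReLU can be prevented by increasing the number of channels after the transformation
  → the opposite pattern to the channel changes in an ordinary residual block
These are used to propose a new building block

Structure of MobileNetV2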
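
Inverted residuals and linear bottlenecks (1)
The hatched layers, where the residual connection is attached, use a linear activation
Width represents the number of channels; the block is designed so that it grows in the intermediate layers
The intermediate layers use separable convolution
Figure quoted from https://arxiv.org/abs/1801.04381

Inverted residuals and linear bottlenecks (2)
The skip connection is added only when the stride s=1; there is none when s≠1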
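
The block structure and the skip-connection rule can be traced as tensor shapes. The sketch below follows the three-layer layout described in the MobileNetV2 paper (1×1 expansion with ReLU6, 3×3 depthwise convolution with ReLU6, linear 1×1 projection). Two points are assumptions rather than statements from the slides: "same" padding is assumed for the strided layer, and the skip connection is additionally required to have matching input and output channel counts, as common implementations do. The name `bottleneck_shapes` is hypothetical.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def bottleneck_shapes(
    h: int, w: int, c_in: int, c_out: int, t: int, s: int
) -> Tuple[List[Tuple[str, Shape]], bool]:
    """Return the output shape of each layer in one bottleneck block,
    plus whether a skip connection is used."""
    expanded = t * c_in          # channel expansion by factor t
    h_out = -(-h // s)           # ceiling division, assuming "same" padding
    w_out = -(-w // s)
    layers = [
        ("1x1 conv + ReLU6 (expand)", (h, w, expanded)),
        ("3x3 depthwise conv, stride s + ReLU6", (h_out, w_out, expanded)),
        ("1x1 conv, linear (project)", (h_out, w_out, c_out)),
    ]
    # Slides: skip only when s == 1. Assumption: channels must also match
    # so that input and output can be added elementwise.
    use_skip = s == 1 and c_in == c_out
    return layers, use_skip


if __name__ == "__main__":
    for stride in (1, 2):
        layers, skip = bottleneck_shapes(56, 56, 24, 24, t=6, s=stride)
        print("stride", stride, "skip", skip)
        for name, shape in layers:
            print("  ", name, shape)
```

The trace makes the "inverted" shape visible: narrow at both ends and wide in the middle, the reverse of a classic residual bottleneck.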
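
Inverted residuals and linear bottlenecks (bottleneck): computational cost (1)
Consider a k×k convolution with s=1 that maps h × w × d_i to h × w × d_j
normal convolution: h · w · d_i · d_j · k^2
separable convolution: h · w · d_i · (k^2 + d_j)
→ bottleneck (expansion factor t, d' input channels, d'' output channels): h · w · d' · t · (d' + k^2 + d'')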
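
The multiply-add counts can be compared numerically. This sketch uses the cost expressions given in the MobileNetV2 paper for an h×w feature map, a k×k kernel, and stride 1; the function names are hypothetical and the example sizes are arbitrary.

```python
def madds_normal_conv(h: int, w: int, d_in: int, d_out: int, k: int) -> int:
    """Standard convolution: h * w * d_in * d_out * k^2."""
    return h * w * d_in * d_out * k * k


def madds_separable_conv(h: int, w: int, d_in: int, d_out: int, k: int) -> int:
    """Depthwise separable convolution: h * w * d_in * (k^2 + d_out)."""
    return h * w * d_in * (k * k + d_out)


def madds_bottleneck(h: int, w: int, d_in: int, d_out: int, k: int, t: int) -> int:
    """Bottleneck block with expansion factor t:
    h * w * d_in * t * (d_in + k^2 + d_out)."""
    return h * w * d_in * t * (d_in + k * k + d_out)


if __name__ == "__main__":
    # Arbitrary example sizes, chosen only for illustration.
    h = w = 56
    print("normal    ", madds_normal_conv(h, w, 24, 24, 3))
    print("separable ", madds_separable_conv(h, w, 24, 24, 3))
    print("bottleneck", madds_bottleneck(h, w, 24, 24, 3, t=6))
```

For equal input and output channel counts, the bottleneck costs more than a single separable convolution because of the expansion, so any saving has to come from being able to use fewer input and output channels.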
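
Inverted residuals and linear bottlenecks (bottleneck): computational cost (2)
The cost is not automatically lower than that of an ordinary residual block
The number of input channels of the bottleneck can be made relatively small, but the intermediate layers expand the channels, so the outcome is non-trivial in general
(being able to shrink the input rests on the assumption that the information lives on a low-dimensional manifold)
→ as a result, the computational cost in multiply-adds (MAdd) can be reduced
My impression is that, by tuning parameters while designing the architecture, they managed to obtain models that are accurate while keeping the cost down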
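
Inverted residuals and linear bottlenecks (bottleneck): memory efficiency
Being able to reduce the number of input channels is an advantage for memory
The maximum memory needed at inference time is determined by the following expression from the paper:
M(G) = max over op in G of [ sum of |A| over inputs A of op + sum of |B| over outputs B of op + |op| ]
Input and output dimensions are what matter, so MobileNetV2 has the advantage
(the intermediate layers are throwaway and can be processed channel by channel)
If computed in parallel, I would expect information from several channels to be needed at once, but CPU computation is assumed, so perhaps that is the idea?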
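
The memory argument can be illustrated by taking the maximum, over operations, of the combined input and output tensor sizes. This is a simplified sketch: it drops the internal-storage term |op| from the paper's expression, and treating the whole bottleneck block as one operation is an idealization of the channel-by-channel processing mentioned above. The names `tensor_size` and `peak_memory` are hypothetical.

```python
from typing import Sequence, Tuple

# One operation = (sizes of its input tensors, sizes of its output tensors).
Op = Tuple[Sequence[int], Sequence[int]]


def tensor_size(h: int, w: int, c: int) -> int:
    """Number of elements in an h x w x c activation tensor."""
    return h * w * c


def peak_memory(ops: Sequence[Op]) -> int:
    """Largest combined input + output size over all operations.
    Simplification: the internal storage of each op is ignored."""
    return max(sum(inputs) + sum(outputs) for inputs, outputs in ops)


if __name__ == "__main__":
    h = w = 4
    d, t = 8, 6
    narrow = tensor_size(h, w, d)      # bottleneck input / output
    wide = tensor_size(h, w, t * d)    # expanded intermediate tensor

    # Every layer materialized as its own operation.
    unfused = [([narrow], [wide]), ([wide], [wide]), ([wide], [narrow])]
    # Whole block treated as a single operation (idealized).
    fused = [([narrow], [narrow])]

    print("unfused peak:", peak_memory(unfused))
    print("fused peak:  ", peak_memory(fused))
```

When the wide intermediate tensors never have to be fully materialized, the peak is set by the narrow bottleneck tensors alone.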
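
MobileNetV2 model architecture
The overall structure of the default model is shown below (n is the number of repetitions)
The number of channels can be changed with the width multiplier
Figure quoted from https://arxiv.org/abs/1801.04381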
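
The width multiplier scales every layer's channel count by a common factor. The sketch below applies it to the bottleneck output channels listed in Table 2 of the MobileNetV2 paper. Rounding the result to a multiple of 8 is an assumption that mirrors common reference implementations; it is not stated in the slides, and the names `make_divisible` and `scale_channels` are hypothetical.

```python
from typing import List

# Output channels of the bottleneck stages in the default model
# (taken from Table 2 of the MobileNetV2 paper).
BASE_CHANNELS = [16, 24, 32, 64, 96, 160, 320]


def make_divisible(value: float, divisor: int = 8) -> int:
    """Round to the nearest multiple of `divisor` (assumed convention),
    never going below `divisor` or more than 10% below `value`."""
    rounded = max(divisor, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def scale_channels(channels: List[int], alpha: float) -> List[int]:
    """Apply width multiplier `alpha` to each channel count."""
    return [make_divisible(c * alpha) for c in channels]


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.4):
        print(alpha, scale_channels(BASE_CHANNELS, alpha))
```

A multiplier of 1.0 leaves the default channel counts unchanged, while smaller values trade accuracy for a lighter model.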

Memory required at inference time with MobileNetV2
Comparison of the maximum number of channels / memory [kb]
Considerably smaller than comparable models
Figure quoted from https://arxiv.org/abs/1801.04381

Performance evaluation

ImageNet classification results
Comparable or better performance than NASNet and ShuffleNet, and faster
Figure quoted from https://arxiv.org/abs/1801.04381. The 1.0 and 1.4 for V2 are width multipliers
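
Object detection results on the COCO dataset
Respectable results for a fairly light model
(although the comparison with anything other than MobileNetV1 is insufficient...)
Figure quoted from https://arxiv.org/abs/1801.04381

Semantic segmentation results on the PASCAL VOC 2012 dataset
Respectable results for a fairly light model
(although the comparison with anything other than MobileNetV1 is insufficient...)
Figure quoted from https://arxiv.org/abs/1801.04381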
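
Experiments on the importance of linearity and on where to attach skip connections (ImageNet classification)
Adding a non-linearity to the linear bottleneck lowers accuracy
Skip connections work best when attached between bottlenecks
(the former agrees with the analysis of ReLU; the latter is purely an experimental result)
Figure quoted from https://arxiv.org/abs/1801.04381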
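
Implementation with ML Kit

What is ML Kit
An SDK for building machine learning features into mobile apps
Provided as a Firebase feature
At present, running a custom model you prepared yourself, rather than a sample, is quite a lot of work...
I got it working with some effort: "Classifying food / non-food images with a self-made custom model on Firebase ML Kit" (blog post)
ML Kit: https://developers.google.com/ml-kit/, Firebase: https://firebase.google.com/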
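
Summary

Summary (repeated)
1. A model developed from MobileNetV1
2. Proposes a building block that expands the number of channels between linear layers and places a separable convolution there
   → a structure that follows from examining the (non)linearity and expressive power of ReLU
3. In experiments, faster than NASNet with comparable or better results
4. Actually ran it on mobile using ML Kit → Cookpad developer blog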