Food Image Object Detection and Classification
Leszek Rybicki
February 16, 2017
Part 1: Detection
Transcript
Food Image Object Detection and Classification: Challenges and Solutions
Part 1: Detection
About me
• Leszek Rybicki
• From Poland
• At Cookpad since 2016
• github: lunardog
Warning! This presentation contains images that may cause severe drooling
and stomach grumbling. @cookpad
History
ImageNet
http://image-net.org
ImageNet Large Scale Visual Recognition Challenge
http://www.image-net.org/challenges/LSVRC/
ILSVRC 2010 task
Classification
For each image, algorithms will produce a list of at most 5 object categories in the descending order of confidence.
http://www.image-net.org/challenges/LSVRC/
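The classification task above can be illustrated with a minimal sketch. It assumes a classifier that returns a confidence score per label, and a limit of five labels per image; the `scores` dictionary, the food labels, and both function names are hypothetical and are not taken from the slides.

```python
def top_k_labels(scores, k=5):
    """Return at most k labels, sorted by descending confidence."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def is_top_k_hit(scores, true_label, k=5):
    """A prediction counts as correct if the true label is among the top k."""
    return true_label in top_k_labels(scores, k)


# Hypothetical classifier output for one image (illustrative values only).
scores = {
    "ramen": 0.41,
    "udon": 0.30,
    "soba": 0.12,
    "salad": 0.08,
    "curry": 0.05,
    "pizza": 0.04,
}

predicted = top_k_labels(scores)
print(predicted)
print(is_top_k_hit(scores, "soba"))
```

Scoring against a list of guesses rather than a single guess is forgiving when an image contains several plausible objects.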
ILSVRC 2011 tasks
1. Classification
2. *Classification with localization
*taster task
Classification + Localization
http://cs231n.stanford.edu/syllabus.html
ILSVRC 2012 tasks
1. Classification
2. Classification with localization
3. Fine-grained classification
Fine-grained classification
http://www.image-net.org/challenges/LSVRC/
AlexNet
ImageNet classification with deep convolutional neural networks
A. Krizhevsky, I. Sutskever, G. E. Hinton. Advances in Neural Information Processing Systems
ILSVRC 2013 tasks
1. Detection
2. Classification
3. Classification with localization
ILSVRC 2014 tasks
1. Detection
2. Classification
3. Classification with localization
Object Detection
http://cs231n.stanford.edu/syllabus.html
Deep Learning
https://devblogs.nvidia.com
ILSVRC 2015 tasks
1. Object detection
2. Object localization
3. *Object detection from video
4. *Scene classification
ILSVRC 2016 tasks
1. Object localization
2. Object detection
3. Object detection from video
4. Scene classification
5. Scene parsing
Cookpad 2016
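Image dataset
Since 1997
Recipes: about 2.6 million in Japan + international + tsukurepo (users' cooking reports) + step-by-step photos
17 languages, 60 countries
Note: figures are as of February 2017.
Research interests in image analysis
• Is this food?
• Which dish is it?
• Where is the food?
• ...
Part 2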
Where is the food?
Goal
Find food in the image, draw a bounding box around the food item, including the dish if visible.
Goal
If there are multiple items, draw a bounding box around each one.
ground truth, bounding box, IoU > 0.9
We count a detection as positive if its Intersection over Union (IoU) ratio with a ground truth box is greater than 0.9.
precision = number of true positives / number of generated boxes
recall = number of true positives / number of ground truth boxes
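The evaluation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code used in the talk: it assumes boxes are `(x1, y1, x2, y2)` tuples, that each ground truth box can be matched by at most one generated box, and that matching is greedy in input order. The function names and the sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(generated, ground_truth, threshold=0.9):
    """Count a generated box as a true positive if IoU > threshold."""
    matched = set()  # indices of ground truth boxes already claimed (assumption)
    true_positives = 0
    for box in generated:
        best_index, best_iou = None, threshold
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:  # strictly greater than the threshold
                best_index, best_iou = i, overlap
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(generated) if generated else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Illustrative boxes only: one close match, one miss, one stray detection.
ground_truth = [(0, 0, 100, 100), (200, 200, 300, 300)]
generated = [(0, 0, 100, 98), (50, 50, 150, 150), (400, 400, 450, 450)]
precision, recall = precision_recall(generated, ground_truth)
print(precision, recall)
```

With a threshold as strict as 0.9, a box that is only slightly too large or too small no longer counts, so precision and recall both reward tight boxes.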
Methods
Idea
1. Build a classifier
2. Pick Regions of Interest
3. Run classifier on each region
4. Remove duplicate detections
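The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the classifier is any callable returning a food confidence for a region, the regions are supplied by the caller, and duplicates are removed with greedy non-maximum suppression, a common choice that the slides do not name. The function names, thresholds, and toy scores are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Step 4: keep the best-scoring box of each group of overlapping boxes."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


def detect(image, regions, classifier, score_threshold=0.5, iou_threshold=0.5):
    """Steps 2-4: score each region of interest, then drop duplicates."""
    scored = [(box, classifier(image, box)) for box in regions]       # step 3
    confident = [(box, s) for box, s in scored if s >= score_threshold]
    return non_max_suppression(confident, iou_threshold)              # step 4


# Toy stand-in for step 1: a lookup table instead of a trained classifier.
toy_scores = {
    (10, 10, 110, 110): 0.9,
    (12, 12, 112, 112): 0.8,   # near-duplicate of the first region
    (300, 300, 400, 400): 0.3,  # low confidence, filtered out
}
regions = list(toy_scores)


def toy_classifier(image, box):
    return toy_scores[box]


detections = detect(None, regions, toy_classifier)
print(detections)
```

Note that the classifier runs once per region, so the cost of this approach grows with the number of regions proposed in step 2.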
Fast, Faster R-CNN
Rich feature hierarchies for accurate object detection and semantic segmentation
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
Fast R-CNN
Ross Girshick
Problems
1. Computational cost
2. Context is important
3. ...but context can be confusing.
Image labels: hand, food, grass, food
http://pixabay.com
Single Shot Detector
SSD: Single Shot MultiBox Detector
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
Joseph Redmon
"Either The Least Or Most Employable Person Ever" (The Huffington Post)
github.com/pjreddie
pjreddie.com/darknet
www.kaggle.com/pjreddie
You Only Look Once
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
Dec
YOLO9000: Better, Faster, Stronger
Joseph Redmon, Ali Farhadi
YOLO in Context
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi