ABEJA
March 04, 2019

Object Detector: WHY Do You Need and HOW Can You Own

SIX 2019 dev-e-2
Yaping Sun @ABEJA, Inc.


An object detector discovers the presence and location of objects within an image or video frame. Its applications range from autonomous driving to video surveillance. Thanks to the rapid evolution of Deep Learning, recent techniques make it easier to train a customized object detector while achieving strong performance.


Transcript

  1. DAY 1 “技” (Technique) Developer Day Object Detector: WHY Do You

    Need and HOW Can You Own Yaping Sun, ABEJA, Inc.
  2. Self-Introduction Yaping Sun http://muchuanyun.github.io/ • Majored in Computer Engineering and

    Microelectronics • Data Engineer @ABEJA, Inc. • Interested in real-world applications of Deep Learning
  3. Object Detection and Applications Object Detection in Machine Learning Experience

    with ABEJA Platform Datasets Evaluation Criteria Representative Architectures
  4. Object Detection in Machine Learning Experience with ABEJA Platform Datasets

    Evaluation Criteria Representative Architectures Object Detection and Applications
  5. Computer Vision Tasks Classification (‘CAT’) → Object Detection (‘DOG,

    DOG, CAT’) → Instance Segmentation (‘DOG, DOG, CAT’) [Figure from http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture11.pdf]
  6. Object Detection • A basic concept from Human Intelligence •

    Cornerstone of true AI • Initial step for tracking, identification, human-computer interaction etc. https://github.com/tensorflow/models/tree/master/research/object_detection
  7. • Face Detection • People Counting • Self-Driving Cars •

    Pedestrian/Vehicle detection • Video Surveillance • Anomaly Detection • … Why Do You Need?
  8. Case 1: Visual Search • Users upload photo to discover

    similar-looking products • Usually multiple objects exist in one image • Object Detection reduces computational cost and improves accuracy in visual search system. • Wide application in Fashion business https://labs.pinterest.com/assets/paper/visual_search_at_pinterest.pdf
  9. Case 2: Analysis of Drone Imagery • Remote monitoring of

    a housing construction project via drone • Routine inspection of solar farms • Early plant disease detection in agriculture https://medium.com/nanonets/how-we-flew-a-drone-to-monitor-construction-projects-in-africa-using-deep-learning-b792f5c9c471
  10. Case 3: Behavior Observation • Analysis of users’ behavior helps

    improve the product • Use object detection to track the movement of items in a kitchen • About 1/100 of the time cost of manual work https://six2018.abejainc.com/docs/b3_six2018.pdf
  11. Practice: Deconstruct a Problem Example: Unmanned Store • Required Functions

    • e.g. track what customer picks from a shelf • e.g. checkout within shopping carts • Possible Approach • e.g. track hands • e.g. detect products • Feasibility Evaluation • cameras (resolution, position, …) • accuracy expectation • cost vs. RFID?
  12. Object Detection in Machine Learning Experience with ABEJA Platform Datasets

    Evaluation Criteria Representative Architectures Object Detection and Applications
  13. Object Detection: Problem Definition Input: • Image (RGB) Output: a

    list of detections (class cj, bounding box bj = (x, y, w, h), confidence pj):
    • class 0, (x1, y1, w1, h1), p1
    • class 0, (x2, y2, w2, h2), p2
    • class 1, (x3, y3, w3, h3), p3
    • …
    [Figure: a ‘cat’ with bounding box at top-left (x, y), width w, height h. This image is CC0 public domain.]
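A detection like those above is naturally modeled as a small record; a minimal Python sketch (the `Detection` type and the example values are illustrative, not from any particular framework):

```python
from typing import NamedTuple, Tuple

class Detection(NamedTuple):
    """One detector output: class id cj, box bj = (x, y, w, h), confidence pj."""
    class_id: int
    box: Tuple[int, int, int, int]  # top-left x, y, width, height
    score: float

# A detector returns a variable-length list of such records per image:
detections = [
    Detection(class_id=0, box=(48, 240, 195, 371), score=0.92),  # e.g. class 0 = 'cat'
    Detection(class_id=1, box=(8, 12, 352, 498), score=0.81),
]
```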
  14. Famous Challenges

    |                   | PASCAL VOC (2007) | ImageNet ILSVRC (2013)          | MS COCO (2015)          | Open Images (2018)       |
    |-------------------|-------------------|---------------------------------|-------------------------|--------------------------|
    | # Classes         | 20                | 200                             | 80                      | 500                      |
    | # Training Images | 11K               | 476K                            | 200K                    | 1.7M                     |
    | # Objects         | 27K               | 534K                            | 1.5M                    | 12M                      |
    | Note              | standard          | scaled-up version of PASCAL VOC | more difficult than VOC | broader range of classes |

    http://host.robots.ox.ac.uk/pascal/VOC/ http://www.image-net.org/challenges/LSVRC/ http://cocodataset.org/#home https://www.kaggle.com/c/google-ai-open-images-object-detection-track
  15. Object Detection: Evaluation (1)

    • AP (Average Precision): average of the maximum precisions at different recall values
    • mAP (mean Average Precision): mean of AP over all categories
    • AP@IoU: average precision over a range of IoU thresholds [0.5:0.05:0.95]
    • AP@Scales: average precision for different object sizes [small, medium, large]
    • AR (Average Recall): averaged maximum recall given a fixed number of detections per image
  16. Object Detection: Evaluation (2) Intersection over Union:

    IoU = area of overlap / area of union
    • TP (True Positive): correct class and IoU > 0.5
    • FP (False Positive): wrong class or IoU < 0.5
    • FN (False Negative): missed object
    Precision = TP / (TP + FP)
    Recall = TP / (TP + FN)
    AP = (1/|R|) · Σ_{r ∈ R} p_interp(r), where R is a set of recall levels in [0, 1] and p_interp(r) is the maximum precision achieved at any recall ≥ r; that is, AP is the average of the maximum precision at each recall level.
    [Figure: ground-truth box vs. prediction box]
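The IoU, precision, and recall definitions above translate directly to code; a minimal sketch, assuming boxes are given as (x, y, w, h) with a top-left origin (the function names are mine):

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # width/height of the overlap rectangle (0 if the boxes are disjoint)
    ov_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ov_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    overlap = ov_w * ov_h
    union = aw * ah + bw * bh - overlap
    return overlap / union if union > 0 else 0.0

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)
```

For two 2×2 boxes offset by (1, 1), the overlap area is 1 and the union is 7, so IoU = 1/7 ≈ 0.14.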
  17. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10 (ranked by confidence)

    | Rank | Correct? | Precision | Recall |
    |------|----------|-----------|--------|
    | 1    | TRUE     | 1.0       | 0.2    |

    TP=1, FP=0, FN=4 → Precision = 1/1 = 1.0, Recall = 1/5 = 0.2
  18. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10

    | Rank | Correct? | Precision | Recall |
    |------|----------|-----------|--------|
    | 1    | TRUE     | 1.0       | 0.2    |
    | 2    | TRUE     | 1.0       | 0.4    |

    TP=2, FP=0, FN=3 → Precision = 2/2 = 1.0, Recall = 2/5 = 0.4
  19. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10

    | Rank | Correct? | Precision | Recall |
    |------|----------|-----------|--------|
    | 1    | TRUE     | 1.0       | 0.2    |
    | 2    | TRUE     | 1.0       | 0.4    |
    | 3    | FALSE    | 0.67      | 0.4    |

    TP=2, FP=1, FN=3 → Precision = 2/3 ≈ 0.67, Recall = 2/5 = 0.4
  20. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10

    | Rank | Correct? | Precision | Recall |
    |------|----------|-----------|--------|
    | 1    | TRUE     | 1.0       | 0.2    |
    | 2    | TRUE     | 1.0       | 0.4    |
    | 3    | FALSE    | 0.67      | 0.4    |
    | 4    | FALSE    | 0.5       | 0.4    |
    | 5    | FALSE    | 0.4       | 0.4    |
    | 6    | TRUE     | 0.5       | 0.6    |
    | 7    | TRUE     | 0.57      | 0.8    |
    | 8    | FALSE    | 0.5       | 0.8    |
    | 9    | FALSE    | 0.44      | 0.8    |
    | 10   | TRUE     | 0.5       | 1.0    |

    Precision = TP / (TP + FP), Recall = TP / (TP + FN). AP: average of maximum precision at all recall levels.
  21. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10
    [Plot: precision vs. recall for the ten ranked predictions, with the interpolated precision marked at recall levels 0.0, 0.1, …, 1.0]
  22. Object Detection: Evaluation (3) Example for category ‘cat’: # Ground

    truth = 5, # Predictions = 10. Interpolated precision at the 11 recall levels:

    | Recall*    | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5  | 0.6  | 0.7  | 0.8  | 0.9 | 1.0 |
    |------------|-----|-----|-----|-----|-----|------|------|------|------|-----|-----|
    | Precision* | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.57 | 0.57 | 0.57 | 0.57 | 0.5 | 0.5 |

    AP = (5×1.0 + 4×0.57 + 2×0.5) / 11 ≈ 0.75
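The computation above is the PASCAL VOC style 11-point interpolated AP; a small sketch that reproduces the ‘cat’ example (the precision/recall lists are copied from the ranked-prediction table):

```python
def eleven_point_ap(precisions, recalls):
    """11-point interpolated AP: average, over recall levels 0.0, 0.1, ..., 1.0,
    of the maximum precision achieved at any recall >= that level."""
    ap = 0.0
    for level in [i / 10 for i in range(11)]:
        ap += max((p for p, r in zip(precisions, recalls) if r >= level), default=0.0)
    return ap / 11

# Precision/recall at each ranked prediction, from the 'cat' example:
precisions = [1.0, 1.0, 0.67, 0.5, 0.4, 0.5, 0.57, 0.5, 0.44, 0.5]
recalls    = [0.2, 0.4, 0.4, 0.4, 0.4, 0.6, 0.8, 0.8, 0.8, 1.0]
ap = eleven_point_ap(precisions, recalls)  # (5*1.0 + 4*0.57 + 2*0.5) / 11, about 0.75
```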
  23. Object Detection in Machine Learning Experience with ABEJA Platform Datasets

    Evaluation Criteria Representative Architectures Object Detection and Applications
  24. Think Intuitively… • Use a sliding window to go over

    the full image • Crop each window and run a classifier • Repeat for different window sizes But… • Returns multiple overlapping detections per object • Too slow
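To see why the naive approach is too slow, it helps to count the crops it generates; a quick sketch (the window sizes and stride are arbitrary choices for illustration):

```python
def sliding_windows(img_w, img_h, win_sizes, stride):
    """Yield every (x, y, w, h) crop of each window size, stepping by `stride`."""
    for w, h in win_sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)

# Even a 640x480 image with just two window sizes yields ~1800 crops,
# each of which would need a full classifier forward pass:
n_crops = sum(1 for _ in sliding_windows(640, 480, [(64, 64), (128, 128)], stride=16))
```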
  25. Non-Maximum-Suppression (NMS) • Start with detection with highest confidence score

    • Measure its IoUs with other detections • Remove detections with IoU > threshold (e.g. 0.5) • Repeat the steps with the remaining detections
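The steps above can be sketched as a greedy reference implementation, with detections as (score, box) pairs and boxes as (x, y, w, h) (the helper `iou` and all names are mine, not from a specific library):

```python
def iou(box_a, box_b):
    """IoU of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ov_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ov_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    overlap = ov_w * ov_h
    union = aw * ah + bw * bh - overlap
    return overlap / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Greedy Non-Maximum Suppression over (score, box) pairs."""
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # detection with the highest confidence
        kept.append(best)
        # drop every remaining detection that overlaps `best` too much
        remaining = [d for d in remaining if iou(best[1], d[1]) <= iou_threshold]
    return kept
```

For example, if the two highest-scoring boxes overlap with IoU ≈ 0.68, only the higher-scoring one survives, while a disjoint third box is kept.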
  26. Milestones of Object Detection • Before 2012: Handcrafted features •

    After 2012: benefit from DCNNs https://arxiv.org/abs/1809.02165
  27. Representative Object Detection Architectures • Two-Stage Detector • RCNN series

    • R-FCN • One-Stage Detector • YOLO series • SSD
  28. RCNN / Fast RCNN / Faster RCNN Highlights • Region

    proposal (‘blob-like’) • CNN based classifier • SOTA of 2014 Problems • Multi-stage pipeline • Training is too heavy • Detection is slow (47s/image on GPU) https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1311.2524
  29. RCNN / Fast RCNN / Faster RCNN Highlights • Feature

    is calculated only once • Multi-task loss of classification and regression • Faster than RCNN Problems • Region proposal is still the bottleneck. https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1504.08083
  30. RCNN / Fast RCNN / Faster RCNN Highlights • Use

    CNN to do region proposal (RPN), other parts are just like Fast RCNN • Introduce Anchors • Joint training Problems: • Still slow https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1506.01497
  31. R-FCN (Region-based Fully Convolutional Network) Highlights • Shared RoI subnet

    • Position sensitive RoI pooling • Faster than Faster RCNN Problems: • More computational cost than single stage detector https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1605.06409
  32. YOLO (You-Only-Look-Once) Highlights • Super fast • Uses features from

    the entire image Problems: • Weak on small objects • Many localization errors https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1506.02640
  33. SSD (Single-Shot-Detector) Highlights • Use multiple CONV feature maps •

    Competitive accuracy with Faster RCNN • Faster than YOLO-v1 Problems • Poor performance on small objects https://arxiv.org/abs/1809.02165 https://arxiv.org/abs/1512.02325
  34. Which is the best? Given the application and platform: a tradeoff of speed, memory,

    and accuracy Examples: • Mobile device: small memory footprint • Realtime applications: test-time inference speed • Server-side system: accuracy (subject to throughput constraints)
  35. Before getting hands dirty… • Prepare a proper dataset •

    Collect good quality images • Annotation work is necessary • Understand the data • Clarify the deployment environment • Edge device / Local machine / Cloud • Real-time? • Pick a model
  36. Object Detection in Machine Learning Experience with ABEJA Platform Datasets

    Evaluation Criteria Representative Architectures Object Detection and Applications
  37. • Data • Accumulation • Management • Annotation • ML/DL

    Model • Training • Deployment • Serving and Inference • Version Management A Glimpse into ABEJA Platform
  38. Technical Tutorials • Sample code for classification, object detection, semantic

    segmentation https://github.com/abeja-inc/abeja-platform-samples • Tech Blogs on ABEJA Platform https://qiita.com/advent-calendar/2018/abejaplatform • ABEJA’s General Tech Blog: https://tech-blog.abeja.asia/
  39. Object Detection in Machine Learning Experience with ABEJA Platform Datasets

    Evaluation Criteria Representative Architectures Object Detection and Applications
  40. After the lecture is over, we will be waiting at the

    ‘Ask the Speaker’ corner of the 3F Hall exhibition area. If you have any questions, please come by after the session ends. See you there!
    [Floor map: 3F Hall booth layout, with the ABEJA and ‘Ask the Speaker’ booths marked]
  41. The contents introduced today, and the products and services that

    support them behind the scenes, are presented at our booth in the 3F exhibition hall. Please drop by between sessions.
    [Floor maps: 1F–3F, Rooms A–E and Hall, with reception desks, restrooms, and elevators, and the booth location marked ‘Here’]
  42. Tomorrow, many sessions will show how the technology

    introduced today is actually used by clients. Please come to Day 2! GO Day2 !! - for ABEJA Platform
  43. Please give us feedback on this session.

    ID of this session: dev-e-2, “Object Detector: WHY Do You Need and HOW Can You Own”. Feedback will be used to improve our products and deliver more information. https://goo.gl/forms/erEBAsrQK4XKEv352