
The evolution of CNN

yoppe
June 27, 2017


1. Deep Learning & CNN: # of papers

   arXiv papers including "deep learning" ("CNN") in titles or abstracts in Computer Science (source: https://arxiv.org/ on 2017-06-26):

   year             2010  2011  2012  2013  2014  2015  2016   2017
   "deep learning"     1     1     0    13    74   293   653    476
   "CNN"               6     7    22    89   188   651  1,304  1,147
2. The evolution of CNN (*this is a very limited chronology)

   • 1980  Neocognitron: http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf
   • 1998  LeNet-5: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
   • 2012  AlexNet: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks
   • 2013  Network in Network: https://arxiv.org/abs/1312.4400
   • 2014  VGG: https://arxiv.org/abs/1409.1556
   • 2015  Inception(V3): https://arxiv.org/abs/1512.00567
   • 2015  ResNet: https://arxiv.org/abs/1512.03385
   • 2016  SqueezeNet: https://arxiv.org/abs/1602.07360
   • 2016  ENet: https://arxiv.org/abs/1606.02147
   • 2017  Deep Complex Networks: https://arxiv.org/abs/1705.09792
3. Neocognitron

   source: http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf

   • Prototype of the CNN
   • Hierarchical structure
     • S-cells (convolution): feature extraction
     • C-cells (avg. pooling): robustness to positional deviation
   • Trained by a self-organizing procedure, NOT by backpropagation
4. LeNet-5

   source: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

   • Convolution: feature extraction
   • Subsampling (avg. pooling): positional invariance, size reduction (see the sketch below)
   • Non-linearity: sigmoid, tanh
   [figure: a 4×4 feature map reduced to 2×2 by average-pooling subsampling]
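To make these ingredients concrete, here is a minimal LeNet-5-style sketch in PyTorch: convolution, average-pooling subsampling, and tanh, with layer sizes roughly following the paper's C1/S2/C3/S4/C5/F6 scheme for 1×32×32 inputs. It omits details such as the trainable subsampling coefficients and the RBF output layer.

```python
import torch
import torch.nn as nn

# Minimal LeNet-5-style network (sketch): conv -> avg pool -> tanh, twice,
# then fully connected layers. Assumes 1x32x32 inputs as in the paper.
model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),   # C1: 6 feature maps, 28x28
    nn.AvgPool2d(2),                  # S2: subsampling to 14x14
    nn.Tanh(),
    nn.Conv2d(6, 16, kernel_size=5),  # C3: 16 feature maps, 10x10
    nn.AvgPool2d(2),                  # S4: subsampling to 5x5
    nn.Tanh(),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120),       # C5
    nn.Tanh(),
    nn.Linear(120, 84),               # F6
    nn.Tanh(),
    nn.Linear(84, 10),                # 10 digit classes
)

x = torch.randn(1, 1, 32, 32)
print(model(x).shape)  # torch.Size([1, 10])
```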
5. AlexNet

   source: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks

   • ReLU: keeps the gradient alive
   • Max pooling: works better than average pooling (reproduced below)
   • Dropout: improves generalization
   • GPU computation: accelerates training
   [figure: the same 4×4 feature map reduced to 2×2 by max pooling: [[2, 8, -2, 3], [1, 7, 2, 1], [-1, 2, 9, 2], [-2, 0, 3, 3]] → [[8, 3], [2, 9]]]
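The slide's 4×4 max-pooling example is easy to check; a minimal PyTorch comparison of max vs. average pooling:

```python
import torch
import torch.nn.functional as F

# The 4x4 feature map from the slide, shaped (batch, channels, H, W).
x = torch.tensor([[ 2.,  8., -2.,  3.],
                  [ 1.,  7.,  2.,  1.],
                  [-1.,  2.,  9.,  2.],
                  [-2.,  0.,  3.,  3.]]).reshape(1, 1, 4, 4)

print(F.max_pool2d(x, kernel_size=2))  # [[8., 3.], [2., 9.]] as on the slide
print(F.avg_pool2d(x, kernel_size=2))  # the mean of each 2x2 window instead
```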
6. Network In Network

   source: https://arxiv.org/abs/1312.4400

   • MLP after convolution: efficient non-linear combinations of feature maps
   • Global Average Pooling: one feature map for each class, no Fully Connected layers (see the sketch below)
   • Small model size: 29 MB for ImageNet
   [figure: stacked mlpconv (MLP) blocks followed by Global Average Pooling and softmax]
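"MLP after convolution" amounts to stacking 1×1 convolutions on top of an ordinary convolution, so a small MLP is applied at every spatial position; Global Average Pooling then collapses each per-class feature map to a single score. A minimal sketch, with illustrative channel counts (not the paper's):

```python
import torch
import torch.nn as nn

num_classes = 10  # illustrative

# mlpconv block: a spatial convolution followed by 1x1 convolutions,
# i.e. a per-pixel MLP over the feature maps.
mlpconv = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=1),           # per-pixel MLP layer
    nn.ReLU(),
    nn.Conv2d(64, num_classes, kernel_size=1),  # one feature map per class
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)
maps = mlpconv(x)               # (1, num_classes, 32, 32)
logits = maps.mean(dim=(2, 3))  # Global Average Pooling: no FC layers needed
print(logits.shape)             # torch.Size([1, 10])
```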
7. VGG

   source: https://arxiv.org/abs/1409.1556

   • Deep model built from basic building blocks
     • convolution
     • max pooling
     • activation (ReLU)
     • Fully Connected
     • softmax
   • Sequences of small convolutions: 3×3 spatial convolutions (see the sketch below)
   • Relatively many parameters
     • large channel counts at the early stages
     • many Fully Connected layers
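Two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution while using fewer weights (2·3² = 18 vs. 5² = 25 per channel pair). A sketch of one VGG-style block, with illustrative channel counts:

```python
import torch
import torch.nn as nn

def vgg_block(in_ch, out_ch):
    # Two 3x3 convolutions (same receptive field as one 5x5 conv,
    # fewer weights, one extra non-linearity), then pooling.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),  # halves spatial resolution
    )

# Stacking blocks doubles channels while halving resolution, as in VGG.
features = nn.Sequential(vgg_block(3, 64), vgg_block(64, 128))

x = torch.randn(1, 3, 32, 32)
print(features(x).shape)  # torch.Size([1, 128, 8, 8])
```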
8. InceptionV3

   source: https://arxiv.org/abs/1512.00567

   • Inception module (sketched below)
     • parallel operations and concatenation: capture different features efficiently
     • mainly 3×3 convolutions: coming from the VGG architecture
     • 1×1 convolutions: reduce the number of channels
   • Global Average Pooling and Fully Connected: balance accuracy and model size
   • Good performance!
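A heavily simplified inception-style module, just to show the parallel-branches-then-concatenate pattern and the 1×1 channel reduction. The branch widths here are made up, and the real InceptionV3 modules are more elaborate (e.g. factorized 1×7/7×1 convolutions):

```python
import torch
import torch.nn as nn

class SimpleInception(nn.Module):
    """Parallel branches whose outputs are concatenated along channels."""
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 32, kernel_size=1)  # 1x1 only
        self.branch2 = nn.Sequential(                       # 1x1 reduce, then 3x3
            nn.Conv2d(in_ch, 48, kernel_size=1), nn.ReLU(),
            nn.Conv2d(48, 64, kernel_size=3, padding=1),
        )
        self.branch3 = nn.Sequential(                       # pool, then 1x1
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, 32, kernel_size=1),
        )

    def forward(self, x):
        # Each branch sees the same input; concatenation mixes their features.
        return torch.cat([self.branch1(x), self.branch2(x), self.branch3(x)],
                         dim=1)

x = torch.randn(1, 192, 28, 28)
print(SimpleInception(192)(x).shape)  # torch.Size([1, 128, 28, 28])
```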
9. ResNet

   source: https://arxiv.org/abs/1512.03385

   • Residual structure (sketched below)
     • shortcut (by-pass) connections: keep the gradient alive
     • 1×1 convolutions: reduce the number of channels
   • Very deep models: 152 layers in total; variants with more than 1,000 layers
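A minimal bottleneck-style residual block: the 1×1 convolutions squeeze and then restore the channel count, and the shortcut adds the input back so gradients can flow through the identity path. Channel sizes are illustrative:

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Residual bottleneck: 1x1 reduce -> 3x3 -> 1x1 restore, plus shortcut."""
    def __init__(self, ch, mid):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(),
            nn.Conv2d(mid, mid, 3, padding=1), nn.BatchNorm2d(mid), nn.ReLU(),
            nn.Conv2d(mid, ch, 1), nn.BatchNorm2d(ch),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        # The identity shortcut keeps the gradient alive even when the body
        # contributes little; this is what makes very deep stacks trainable.
        return self.relu(self.body(x) + x)

x = torch.randn(1, 256, 14, 14)
print(Bottleneck(256, 64)(x).shape)  # torch.Size([1, 256, 14, 14])
```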
10. SqueezeNet

   source: https://arxiv.org/abs/1602.07360

   • Fire module: squeeze channels to reduce computational costs (see the sketch below)
     [figure: 55×55×96 → 1×1 squeeze → 55×55×16 → 1×1 expand (55×55×64) and 3×3 expand (55×55×64) → concat → 55×55×128]
   • Deep compression: lightens the model via sparse weights, weight quantization, and Huffman coding
   • Small model: with 6-bit data, the model size is 0.47 MB!
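A sketch of the fire module using the slide's channel sizes (96 → squeeze to 16 → expand to 64 + 64): the 1×1 squeeze layer cuts the channel count before the more expensive expand layers, whose 1×1 and 3×3 outputs are concatenated.

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """SqueezeNet fire module: 1x1 squeeze, then parallel 1x1/3x3 expand."""
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.expand3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3,
                                 padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        s = self.relu(self.squeeze(x))  # few channels -> cheap expand layers
        return torch.cat([self.relu(self.expand1(s)),
                          self.relu(self.expand3(s))], dim=1)

x = torch.randn(1, 96, 55, 55)
print(Fire(96, 16, 64)(x).shape)  # torch.Size([1, 128, 55, 55])
```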
11. ENet

   source: https://arxiv.org/abs/1606.02147

   • Real-time segmentation model (input: 3 × 512 × 512)
     • downsampling at the early stages
     • asymmetric encoder-decoder structure (see the sketch below)
     • PReLU
     • small model, ~1 MB
   • The encoder can be used as a CNN on its own
   • Global Max Pooling
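A toy sketch of the asymmetric idea only, NOT ENet's actual bottleneck design: a heavier encoder that downsamples aggressively at the early stages, and a much lighter decoder that only upsamples back to full resolution, with PReLU activations. num_classes is illustrative:

```python
import torch
import torch.nn as nn

num_classes = 12  # illustrative segmentation label count

# Heavier encoder: early downsampling keeps all later layers cheap.
encoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.PReLU(),    # /2
    nn.Conv2d(16, 64, 3, stride=2, padding=1), nn.PReLU(),   # /4
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.PReLU(),  # /8
)

# Much lighter decoder: just upsamples the encoder features back.
decoder = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2), nn.PReLU(),
    nn.ConvTranspose2d(64, 16, kernel_size=2, stride=2), nn.PReLU(),
    nn.ConvTranspose2d(16, num_classes, kernel_size=2, stride=2),
)

x = torch.randn(1, 3, 512, 512)   # input 3 x 512 x 512 as on the slide
print(decoder(encoder(x)).shape)  # torch.Size([1, 12, 512, 512])
```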
12. Deep Complex Networks

   source: https://arxiv.org/abs/1705.09792

   • Complex-valued structure (sketched below)
     • complex convolution
     • complex batch normalization
   • Advantages of complex values
     • biological & signal-processing aspects: can express both firing rate & relative timing; detailed description of objects
     • parameter efficient: up to 2^(depth) times more efficient than real-valued networks
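Complex convolution can be expressed with two real-valued convolutions via the complex multiplication rule (W_r + i W_i)(x_r + i x_i) = (W_r x_r - W_i x_i) + i (W_r x_i + W_i x_r). A minimal sketch of that rule; the paper's complex weight initialization and complex batch normalization are omitted:

```python
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    """Complex convolution built from two real convolutions (sketch)."""
    def __init__(self, in_ch, out_ch, kernel_size, padding=0):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)  # real part of W
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)  # imaginary part of W

    def forward(self, x_r, x_i):
        # (W_r + i W_i)(x_r + i x_i) = (W_r x_r - W_i x_i) + i (W_r x_i + W_i x_r)
        out_r = self.conv_r(x_r) - self.conv_i(x_i)
        out_i = self.conv_r(x_i) + self.conv_i(x_r)
        return out_r, out_i

x_r, x_i = torch.randn(1, 3, 8, 8), torch.randn(1, 3, 8, 8)
out_r, out_i = ComplexConv2d(3, 8, kernel_size=3, padding=1)(x_r, x_i)
print(out_r.shape, out_i.shape)  # torch.Size([1, 8, 8, 8]) twice
```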
13. [Review Papers]
   • On the Origin of Deep Learning: https://arxiv.org/abs/1702.07800
   • Recent Advances in Convolutional Neural Networks: https://arxiv.org/abs/1512.07108
   • Understanding Convolutional Neural Networks: http://davidstutz.de/wordpress/wp-content/uploads/2014/07/seminar.pdf
   • An Analysis of Deep Neural Network Models for Practical Applications: https://arxiv.org/abs/1605.07678

   [Slides & Web pages]
   • Recent Progress on CNNs for Object Detection & Image Compression: https://berkeley-deep-learning.github.io/cs294-131-s17/slides/sukthankar-UC-Berkeley-InvitedTalk-2017-02.pdf
   • CS231n: Convolutional Neural Networks for Visual Recognition: http://cs231n.github.io/convolutional-networks/

   [Blog posts]
   • Training ENet on ImageNet: https://culurciello.github.io/tech/2016/06/20/training-enet.html
   • Neural Network Architectures: https://medium.com/towards-data-science/neural-network-architectures-156e5bad51ba