Lock in $30 Savings on PRO—Offer Ends Soon! ⏳

DeepCluster 論文の紹介

Yuki Ishikawa
August 08, 2018

DeepCluster 論文の紹介

Facebook AI Research による論文「Deep Clustering
 for Unsupervised Learning
 of Visual Features」の解説資料
https://arxiv.org/abs/1807.05520

Yuki Ishikawa

August 08, 2018
Tweet

More Decks by Yuki Ishikawa

Other Decks in Science

Transcript

  1. ࠓճͷ࿦จɿ
 Deep Clustering for Unsupervised Learning
 of Visual Features •

    https://arxiv.org/abs/1807.05520 • Facebook AI Research • Accepted at ECCV 2018
  2. ֓ཁ • CNN Ͱը૾ͷΫϥελϦϯάΛ͢Δख๏ • CNN ͷग़ྗΛ k-means ͰΫϥελϦϯάͨ݁͠ՌΛ
 “ِϥϕϧ”

    ͱͯ͠ѻ͍ɺωοτϫʔΫͷॏΈΛߋ৽͢Δ • ͱͯ΋ྑ͍ੑೳ͕ग़ͨ • Pascal VOC ʹΑΔධՁͰଞͷΞϧΰϦζϜΛ௒͑Δੑೳ • ؤ݈ੑ͕͋Δ • σʔληοτΛม͑ͯ΋େৎ෉ (ImageNet → YFCC100) • ωοτϫʔΫߏ଄Λม͑ͯ΋େৎ෉ (AlexNet → VGG16) • k-means Ҏ֎ͷΫϥελϦϯάΞϧΰϦζϜͰ΋େৎ෉
  3. ڭࢣ (͋Γ|ͳ͠) ֶश • ڭࢣ͋Γֶश (Supervised Learning) • ֶशσʔλʹڭࢣϥϕϧ͕෇͍͍ͯΔ •

    ෼ྨ΍ճؼͳͲ • ڭࢣͳֶ͠श (Unsupervised Learning) • ֶशσʔλʹڭࢣϥϕϧ͕෇͍͍ͯͳ͍ • ΫϥελϦϯά • AutoEncoder
  4. epoch ͷྲྀΕ 3. k-means ͰΫϥελϦϯά 1. ೖྗΛ CNN ͰϑΥϫʔυ 4.

    ΫϥελϦϯά݁ՌΛ
 “ِϥϕϧ” ͱͯ͠ޡࠩΛܭࢉ 5. ωοτϫʔΫͷॏΈΛߋ৽ 2. CNN ͷग़ྗ݁ՌΛ PCA Ͱѹॖ
  5. ผͷݚڀ [26] • ύϥϝʔλ͕ϥϯμϜͳ AlexNet ʹ
 ImageNet σʔληοτͰ෼ྨΛߦͬͨ • ग़ྗ΋ϥϯμϜʹͳΔͱ͢Ε͹ɺ


    (ImageNet ͸1000Ϋϥε෼ྨͳͷͰ)
 ਫ਼౓ͷظ଴஋͸ 0.1 %ͱͳΔ • ͔࣮͠͠ࡍʹ͸ɺظ଴஋Λང͔ʹ௒͑Δ
 12 %ͷਫ਼౓Λग़ͨ͠
  6. ࣮૷ͷৄࡉ (1/2) • CNN ʹ͸ඪ४తͳ AlexNet Λ༻͍ͨ • Local Response

    Normalization ૚͸
 Batch Normalization ૚ʹೖΕସ͑ͨ • ৭৘ใΛͦͷ··ѻ͏ͷ͕೉͍͠ • Sobel filter (※ ྠֲநग़) ʹجͮ͘ઢܗม׵ʹΑͬͯ
 ৭Λ࡟আ͠ίϯτϥετΛڧௐ͍ͯ͠Δ • ImageNet ͷը૾͸ Data Augmentation ͯ͠ೖྗͨ͠ • mini batch size ͸ 256 ʹͨ͠
  7. ࣮૷ͷৄࡉ (2/2) • 500 epoch Λֶश͢Δͷʹ P100 GPU Λ࢖ͬͯ
 12

    ೔͔͔ͬͨ • ࣮ߦ࣌ؒશମͷ 1/3 ͸ k-means ͷॲཧ࣌ؒ • ΫϥελϦϯά͢ΔલʹશσʔλΛ Forward ͢Δඞཁ͕
 ͋ΔͷͰͲ͏ͯ͠΋͕͔͔࣌ؒΔ
  8. ิ଍ɿਖ਼نԽ૬ޓ৘ใྔ (NMI) • Normalized Mutual Information • ͋ΔΫϥελϦϯά݁Ռ A ͱ


    ผͷΫϥελϦϯά݁Ռ B ͕
 ͲΕ͚ͩࣅ௨͍ͬͯΔ͔ΛදݱͰ͖Δ
  9. Ϋϥελͷ҆ఆੑ • ͋Δը૾͕ɺ࣍ͷ epoch Ͱ΋ಉ͡Ϋϥελʹ
 ׂΓ౰ͯΒΕΔׂ߹ (= ҆ఆੑ) • epoch

    ͕ਐΉʹͭΕ
 ҆ఆੑ͕૿͢ • 0.8 ҎԼͰ๞࿨͢Δ • ͦΕҎ্ͷֶश͸
 ҙຯ͕ͳ͍
  10. Ϋϥελ਺ʹΑΔੑೳͷҧ͍ • mAP ͱ͍͏ํ๏ (ʁ) Ͱ෼ྨੑೳΛܭଌͨ͠ • k = 10,000

    Ͱ࠷΋ੑೳ͕ྑ͔ͬͨ • ImageNet Ͱ͋Ε͹ k = 1,000 ͕
 ྑ͍ͷͰ͸ͳ͍͔ͱߟ͕͕͑ͪͩɺ
 ա৒ͳηάϝϯςʔγϣϯͷ
 ΄͏͕͍͍݁ՌΛग़ͨ͠
  11. CNN ͷ֤૚͝ͱͷߟ࡯ • Լͷը૾͸ɺ֤૚Ͱ࠷΋൓Ԡͷྑ͔ͬͨը૾ TOP 9 • ਂ͍૚ʹͳΔ΄Ͳେ͖ͳύλʔϯΛೝ͍ࣝͯ͠Δ (༧૝௨Γ) •

    ࠷ޙͷ૚ (conv5) ͸ɺલͷ૚·ͰͰೝࣝͨ͜͠ͱΛ
 ࠶౓ೝࣝ͠௚͍ͯ͠ΔΑ͏ʹ΋ݟ͑Δ • (AlexNet ʹ͓͍ͯ) ࠷ޙͷ૚ (conv5) ͸ଞͷ૚ͱ͸
 ಛ௃͕ҟͳΔͱ͍͏ผͷݚڀ݁Ռ [43] Λཪ෇͚͍ͯΔ
  12. ֤૚ͷ෼ྨੑೳ (3/3) • DeepCluster ͸ߴ͍ϨΠϠͰͷੑೳ͕ྑ͍ • conv3 ͷੑೳ͕ͱͯ΋ྑ͍ • ͳΜͱ

    conv5 ΑΓ΋ྑ͍ • ҰํͰ conv1 ͷੑೳ͕શ͘ྑ͘ͳ͍ • DeepCluster Ͱ͸ɺconv3-conv4 Ͱ ImageNet ͷ
 ϥϕϧʹ૬౰͢Δ΋ͷΛೝ͍ࣝͯ͠ΔͷͰ͸ͳ͍͔
  13. Pascal VOC ʹΑΔධՁ (1/3) • Pascal VOC: ෼ྨɾ෺ମݕग़ɾϥϕϧ෇͚ Λߦ͏ίϯϖ •

    DeepCluster Λ࢖ͬͯ໰୊Λղ͘͜ͱͰੑೳΛධՁ͢Δ • ෺ମݕग़ͷ࣮૷ʹ͸ Fast R-CNN Λ༻͍ͨ
  14. Pascal VOC ʹΑΔධՁ (3/3) • ෼ྨɾ෺ମݕग़ɾϥϕϧ෇͚ ͢΂ͯʹ͓͍ͯੑೳ͕ྑ͍ • ڵຯਂ͍఺ͱͯ͠ɺfine-tuned (?)

    ͳϥϯμϜωοτϫʔΫ͸ ͦΕͳΓͷਫ਼౓Λग़͕͢ɺશ݁߹૚ 6-8 ͷΈΛֶशͨ͠৔߹ ͷੑೳ͸͔ͳΓ௿͘ͳΔ • ͜ΕΒͷλεΫ͸ fine-tuning Ͱ͖ͳ͍৔߹Ͱݱ࣮ͷ
 ΞϓϦέʔγϣϯͱۙ͘ͳΔ • ͦͷ৔߹ɺ࠷৽ͷख๏ͱͷࠩ͸ߋʹେ͖͘ͳΔͩΖ͏ (෼ྨͰ࠷େ 9%) ( ˘ω˘) .oO ( ͪΐͬͱԿݴͬͯΔ͔Θ͔ΒΜ͔ͬͨ )
  15. ֓ཁ (࠶ܝ) • CNN Ͱը૾ͷΫϥελϦϯάΛ͢Δख๏ • CNN ͷग़ྗΛ k-means ͰΫϥελϦϯάͨ݁͠ՌΛ


    “ِϥϕϧ” ͱͯ͠ѻ͍ɺωοτϫʔΫͷॏΈΛߋ৽͢Δ • ͱͯ΋ྑ͍ੑೳ͕ग़ͨ • Pascal VOC ʹΑΔධՁͰଞͷΞϧΰϦζϜΛ௒͑Δੑೳ • ؤ݈ੑ͕͋Δ • σʔληοτΛม͑ͯ΋େৎ෉ (ImageNet → YFCC100) • ωοτϫʔΫߏ଄Λม͑ͯ΋େৎ෉ (AlexNet → VGG16) • k-means Ҏ֎ͷΫϥελϦϯάΞϧΰϦζϜͰ΋େৎ෉