Introduction to CBoW

Kento Nozawa
April 21, 2016

Material for the machine learning study group held on April 22, 2016.
Introductory slides on Continuous Bag of Words.

Transcript

  1. Today's topics
    • Multilayer perceptron (MLP)
    • Continuous Bag of Words
      • One of the two models in word2vec
    • Speed-up techniques and NG are not covered
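  2. A concrete multilayer perceptron example
    • Consider a world with only 4 words: [jobs, mac, win8, ms]
    • Input: a document
    • Output: probabilities (whether the input document is "mac" or "windows")
    [Figure: input layer (jobs, mac, win8, ms), hidden layer (h1, h2, h3), output layer (softmax) with p(mac)=0.2 and p(win)=0.8]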
  3. Worked example: input layer
    The frequency of each word is the input value of the input layer.
    • doc0: [win8, win8, ms, ms, ms, jobs] -> ms
    • doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
    [Figure: input layer for doc0 (jobs=1, mac=0, win8=2, ms=3) and for doc1 (jobs=1, mac=6, win8=0, ms=0)]
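The word-frequency input described on this slide can be sketched in a few lines of Python. The vocabulary order [jobs, mac, win8, ms] is taken from the slides; the function name `count_vector` is only an illustrative choice.

```python
# Vocabulary order follows the slides: [jobs, mac, win8, ms].
VOCAB = ["jobs", "mac", "win8", "ms"]


def count_vector(doc, vocab=VOCAB):
    """Return how often each vocabulary word occurs in `doc`, in vocabulary order."""
    return [doc.count(word) for word in vocab]


doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
doc1 = ["jobs"] + ["mac"] * 6

print(count_vector(doc0))  # [1, 0, 2, 3]
print(count_vector(doc1))  # [1, 6, 0, 0]
```

Each position of the resulting list feeds one input unit, so the input layer has as many units as there are vocabulary words.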
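  4. Worked example: hidden layer
    The input-to-hidden weight matrix W is a 3x4 matrix. The hidden layer receives h, the weighted sum (input-layer outputs) x (weights). For doc0:
    $$W x = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 7 \\ 9 \\ 6 \end{bmatrix} = h$$
    [Figure: the doc0 network with inputs jobs=1, mac=0, win8=2, ms=3]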
  5. Worked example: hidden layer (continued)
    The input-to-hidden weight matrix W is a 3x4 matrix, and the hidden layer receives the weighted sum h = W x. For doc0, with x = (1, 0, 2, 3), this gives h = (7, 9, 6).
    [Figure: the doc0 network with inputs jobs=1, mac=0, win8=2, ms=3]
  6. Worked example: hidden layer
    The hidden layer emits its output through an activation function f(x). Example function: the sigmoid function.
    [Figure: the doc0 network, in which every hidden unit outputs about 0.99, next to a plot of the sigmoid function. Plot by Chrislb, created by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]
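As a rough check of the hidden-layer step, the sketch below multiplies the weight matrix by the doc0 count vector and applies the sigmoid. The entries of `W` are the values as they read in the slide transcript, which is an assumption; with these entries the third pre-activation works out to 6. The helper names are illustrative.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(z):
    """Logistic sigmoid, the example activation function named on the slide."""
    return 1.0 / (1.0 + math.exp(-z))


# Input-to-hidden weights (3x4) as read from the transcript (assumption).
W = [[1, 2, 3, 0],
     [1, 2, 1, 2],
     [1, 1, 1, 1]]
x = [1, 0, 2, 3]  # doc0 counts for [jobs, mac, win8, ms]

h = matvec(W, x)                       # weighted sums received by the hidden layer
activations = [sigmoid(v) for v in h]  # hidden-layer outputs f(h)

print(h)  # [7, 9, 6]
print([round(a, 4) for a in activations])
```

Because all three weighted sums are large and positive, the sigmoid saturates and every hidden unit outputs a value just below 1, which is why the figure labels each of them as roughly 0.99.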
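  7. Worked example: output layer
    The hidden-to-output weight matrix W' is a 2x3 matrix. The output layer receives the weighted sum (hidden-layer outputs) x (weights):
    $$W' f(h) = u_o, \qquad f(h) = (0.99,\ 0.99,\ 0.99)^\top$$
    [Figure: the doc0 network, in which the two output units receive -0.1 and 0.1. Each row of W' on the slide has entries of magnitude 1, 1 and 1.01; their signs did not survive in this transcript.]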
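  8. Activation function of the output layer
    Output-layer activation: the softmax function, which outputs probability values.
    $$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_n e^{x_n}}$$
    doc0 (= [win8, win8, ms, ms, ms, jobs]) is classified as a win document with probability of about 0.55:
    $$p(\mathrm{win}) = \frac{e^{0.1}}{e^{0.1} + e^{-0.1}} \approx 0.55, \qquad p(\mathrm{mac}) = \frac{e^{-0.1}}{e^{0.1} + e^{-0.1}} \approx 0.45$$
    [Figure: the doc0 network with output-layer inputs -0.1 and 0.1 and the resulting p(mac) and p(win)]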
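The softmax step can be reproduced with a short sketch. The inputs -0.1 and 0.1 are the output-layer values shown in the figure; subtracting the maximum before exponentiating is a common numerical-stability habit added here, not something the slides mention.

```python
import math


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max leaves the result unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


p_mac, p_win = softmax([-0.1, 0.1])
print(round(p_mac, 2), round(p_win, 2))  # 0.45 0.55
```

With scores this close together the two probabilities stay near 0.5, so the untrained network only slightly prefers "win" for doc0.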
  9. Training
    • Use backpropagation to adjust the weights W and W' so that the probability of doc0 being win increases
    • For doc0, the error is computed against the correct label [0, 1]
    [Figure: the doc0 network with its current outputs p(mac) and p(win)]
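To make the training idea concrete, here is a minimal sketch of a single gradient step on the output weights only. Several things are assumptions rather than content from the slides: the loss is taken to be cross-entropy (so the error at the output is y - t), the starting values of `W_out` and the learning rate are hypothetical, and a full backpropagation pass would also update W.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


hidden = [0.99, 0.99, 0.99]  # hidden-layer outputs for doc0, as in the figure
target = [0.0, 1.0]          # correct label for doc0, ordered [mac, win]

# Hypothetical starting weights (not from the slides); they give scores near -0.1 and 0.1.
W_out = [[0.3, -0.2, -0.2],
         [-0.3, 0.2, 0.2]]
learning_rate = 0.5          # hypothetical value


def predict(weights):
    """Forward pass through the output layer only."""
    return softmax(matvec(weights, hidden))


before = predict(W_out)

# With softmax and cross-entropy, the error at output unit k is y_k - t_k,
# and the gradient for weight W_out[k][j] is that error times hidden[j].
delta = [y - t for y, t in zip(before, target)]
W_out = [[w - learning_rate * d * a for w, a in zip(row, hidden)]
         for row, d in zip(W_out, delta)]

after = predict(W_out)
print("p(win) before:", round(before[1], 3))
print("p(win) after: ", round(after[1], 3))
```

After the update, p(win) for doc0 is higher than before, which is the behaviour the slide describes.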
  10. Continuous Bag of Words: input layer
    The input layer of the MLP corresponds to one box of the input layer in the figure.
    [Figure: the doc0 network from the previous slides, labeled MLP, shown beside the CBoW architecture]
  11. Continuous Bag of Words: input layer
    • Each box receives a one-hot representation
    • For "I drink coffee everyday" with w(t) = coffee, drink = [0, 1, 0, 0] is the value taken by the part highlighted in red
  12. Continuous Bag of Words: input layer
    • I = [1, 0, 0, 0]
    • drink = [0, 1, 0, 0]
    • everyday = [0, 0, 0, 1]
    The word to predict is w(t) = coffee.
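The one-hot inputs for the context words can be generated as in the sketch below. The vocabulary order [I, drink, coffee, everyday] is an assumption inferred from the vectors given for drink and everyday; the function name is illustrative.

```python
# Assumed vocabulary order, inferred from drink = [0, 1, 0, 0] and everyday = [0, 0, 0, 1].
VOCAB = ["I", "drink", "coffee", "everyday"]


def one_hot(word, vocab=VOCAB):
    """Return a vector with a single 1 at the position of `word` in the vocabulary."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


sentence = ["I", "drink", "coffee", "everyday"]
target = "coffee"  # w(t), the word CBoW tries to predict
context = [w for w in sentence if w != target]

for word in context:
    print(word, one_hot(word))
```

Each context word fills one input box of the CBoW network; the target word itself is never fed in as input.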
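  13. Continuous Bag of Words: input-to-hidden weights
    • There is one weight matrix per arrow
    • This weight matrix is shared across all arrows
    With $W_{N \times V}$ and the one-hot vector for drink:
    $$W x = \begin{bmatrix} 1 & 2 & 3 & 0 \\ 1 & 2 & 1 & 2 \\ 1 & -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ -1 \end{bmatrix} = u_{t-1}$$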
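  14. Continuous Bag of Words: hidden layer
    • The average of the word vectors is the input to the hidden layer (an N-dimensional vector)
    • There is no activation function
    $$h = \frac{u_{t-2} + u_{t-1} + u_{t+1}}{3} = \frac{1}{3}\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 2 \\ -1 \end{bmatrix} + \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1.67 \\ 0.33 \end{bmatrix}$$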
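  15. Continuous Bag of Words: hidden-to-output layer
    The output layer receives the product of the weight matrix $W'_{V \times N}$ and the hidden-layer output (the average vector):
    $$W' h = \begin{bmatrix} 1 & 2 & -1 \\ -1 & 2 & -1 \\ 1 & 2 & 2 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} 1.00 \\ 1.67 \\ 0.33 \end{bmatrix} = \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} = u_o$$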
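  16. Continuous Bag of Words: output layer
    The goal is to predict a single word.
    • Number of output units = vocabulary size = V
    • Activation function: the softmax function
    $$y = \mathrm{softmax}(u_o) = \mathrm{softmax}\left( \begin{bmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{bmatrix} \right) = \begin{bmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{bmatrix}$$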
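The CBoW forward pass from the last few slides (shared projection, averaging, output scores, softmax) can be sketched end to end as follows. This is a sketch under assumptions: the vocabulary order is taken to be [I, drink, coffee, everyday], and the minus signs in `W_in` and `W_out` are reconstructed so that the arithmetic reproduces the values shown on the slides.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


VOCAB = ["I", "drink", "coffee", "everyday"]  # assumed order

# Shared input-to-hidden weights, N x V = 3 x 4 (signs are an assumption).
W_in = [[1, 2, 3, 0],
        [1, 2, 1, 2],
        [1, -1, 1, 1]]
# Hidden-to-output weights, V x N = 4 x 3 (signs are an assumption).
W_out = [[1, 2, -1],
         [-1, 2, -1],
         [1, 2, 2],
         [0, 2, 0]]

# One-hot context vectors for "I", "drink" and "everyday"; the target is "coffee".
context = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

projections = [matvec(W_in, x) for x in context]  # one word vector per context word
h = [sum(column) / len(projections) for column in zip(*projections)]  # average, no activation
u_o = matvec(W_out, h)                            # one score per vocabulary word
y = softmax(u_o)

print([round(v, 2) for v in h])  # [1.0, 1.67, 0.33]
print([round(v, 2) for v in y])  # [0.23, 0.03, 0.62, 0.12]
print(VOCAB[y.index(max(y))])    # coffee
```

The scores here are computed from the unrounded average, so they differ from the slide's 4.01, 2.01, 5.00, 3.34 in the second decimal place, but the resulting probabilities round to the same values. Multiplying `W_in` by a one-hot vector simply selects one column, which is why those columns serve as the word vectors.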
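  17. Useful properties of word vectors
    • Analogy
      • king - man + woman = queen
      • Japan - Tokyo + Paris = France
      • eats - eat + run = runs
    • Word features
      • Initial values for deep learning
      • Similarity computation
      • nzw's first paper was on this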
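The analogy property can be illustrated with a toy sketch. The two-dimensional vectors below are hypothetical, hand-picked so that the arithmetic works out exactly; real word vectors are learned and have many more dimensions. Using cosine similarity to find the nearest word is a common choice, assumed here rather than taken from the slides.

```python
import math

# Hypothetical toy vectors: first axis ~ "royalty", second axis ~ "gender".
VECTORS = {
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


print(analogy("king", "man", "woman"))  # queen
```

The query words are excluded from the candidates because the nearest vector to the query is otherwise often one of the inputs themselves.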
  18. References and resources
    • gensim: https://radimrehurek.com/gensim/
      • Python; convenient, with many functions
    • chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec
      • Python; an example implementation as a neural network
    • word2vec: https://code.google.com/archive/p/word2vec/
      • C; the original implementation
    • word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf
      • English; an easy-to-follow explanation
    • Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf
      • English; the original CBoW paper. The CBoW figure in these slides comes from it
    • 深層学習 Deep Learning. 人工知能学会 (Japanese Society for Artificial Intelligence).
      • Japanese; book