Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
CBoW入門
Search
Kento Nozawa
April 21, 2016
Research
4
3.6k
CBoW入門
2016年4月22日の機械学習勉強会の資料
Continuous Bag of Wordsの入門スライドです
Kento Nozawa
April 21, 2016
Tweet
Share
More Decks by Kento Nozawa
See All by Kento Nozawa
Analysis on Negative Sample Size in Contrastive Unsupervised Representation Learning
nzw0301
0
120
[IJCAI-ECAI 2022] Evaluation Methods for Representation Learning: A Survey
nzw0301
0
570
[NeurIPS Japan meetup 2021 talk] Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
nzw0301
0
160
[IBIS2021] 対照的自己教師付き表現学習おける負例数の解析
nzw0301
0
150
Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
nzw0301
0
450
Introduction of PAC-Bayes and its Application for Contrastive Unsupervised Representation Learning
nzw0301
2
760
NLP Tutorial; word representation learning
nzw0301
0
180
Analyzing Centralities of Embedded Nodes
nzw0301
0
140
Paper Reading: Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics
nzw0301
2
1.1k
Other Decks in Research
See All in Research
Weekly AI Agents News! 12月号 プロダクト/ニュースのアーカイブ
masatoto
0
210
Whoisの闇
hirachan
3
210
Zipf 白色化:タイプとトークンの区別がもたらす良質な埋め込み空間と損失関数
eumesy
PRO
8
1.2k
研究の進め方 ランダムネスとの付き合い方について
joisino
PRO
58
23k
[ECCV2024読み会] 衛星画像からの地上画像生成
elith
1
980
Optimal and Diffusion Transports in Machine Learning
gpeyre
0
730
メールからの名刺情報抽出におけるLLM活用 / Use of LLM in extracting business card information from e-mails
sansan_randd
2
340
20240918 交通くまもとーく 未来の鉄道網編(太田恒平)
trafficbrain
0
420
Weekly AI Agents News!
masatoto
30
44k
[依頼講演] 適応的実験計画法に基づく効率的無線システム設計
k_sato
0
210
2024/10/30 産総研AIセミナー発表資料
keisuke198619
1
400
Weekly AI Agents News! 11月号 論文のアーカイブ
masatoto
0
260
Featured
See All Featured
RailsConf 2023
tenderlove
29
970
Designing for humans not robots
tammielis
250
25k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
19
2.3k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
127
18k
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
Fashionably flexible responsive web design (full day workshop)
malarkey
406
66k
The Cost Of JavaScript in 2023
addyosmani
46
7.2k
VelocityConf: Rendering Performance Case Studies
addyosmani
327
24k
GitHub's CSS Performance
jonrohan
1030
460k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
A Modern Web Designer's Workflow
chriscoyier
693
190k
Side Projects
sachag
452
42k
Transcript
Continuous Bag of Wordsೖ @ػցֶशษڧձ 201604݄22ʢۚʣ M1
ࠓ͢͜ͱ • ଟύʔηϓτϩϯ (MLP) • Continuous Bag of Words •
word2vecʹ͋ΔยํͷϞσϧ • ߴԽNGʹ͍ͭͯݴٴ͠·ͤΜ
ଟύʔηϓτϩϯͷ͓͞Β͍ • ؙɿ1ͭͷΛड͚ͯɼؔΛద༻ͯ͠1ͭͷΛग़ྗ ʢؙ1ͭΛϢχοτɼؔΛ׆ੑԽؔʣ • ҹɿϢχοτͷग़ྗͱॏΈʢʣͷੵΛ࣍ͷʹ Ͱ͖Δ͚ͩਖ਼ղ͢ΔΑ͏ͳॏΈΛٻΊΔ Input layer hidden
layer output layer (soft max) x1 h3 h1 h2 x2 x3 x4 0.2 0.5 0.3
ଟύʔηϓτϩϯͷ۩ମྫ • 4୯ޠ͔͠ͳ͍ੈքΛߟ͑Δ • [jobs, mac, win8, ms] • ೖྗɿจॻ
• ग़ྗɿ֬ʢೖྗจॻ͕”mac”͔”windowns”ʣ Input layer hidden layer output layer (softmax) jobs h3 h1 h2 mac win8 ms p(mac)=0.2 p(win)=0.8
۩ମྫɿೖྗ ͦΕͧΕ୯ޠͷස͕ೖྗͷೖྗ • doc0: [win8, win8, ms, ms, ms, jobs]
-> ms • doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac Input layer hidden layer output layer (softmax) jobs=1 h3 h1 h2 mac=0 win8=2 ms=3 Input layer hidden layer output layer (softmax) jobs=1 h3 h1 h2 mac=6 win8=0 ms=0 doc0 doc1
۩ମྫɿӅΕ ೖྗ-ӅΕؒͷॏΈߦྻWɼ3x4ͷߦྻ ӅΕɼ(ೖྗͷग़ྗ)x(ॏΈ)ͷhΛड͚औΔ doc0 2 4 1 2 3 0
1 2 1 2 1 1 1 1 3 5 2 6 6 4 1 0 2 3 3 7 7 5 = 2 4 7 9 5 3 5 Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 Wx = h
۩ମྫɿӅΕ ೖྗ-ӅΕؒͷॏΈߦྻWɼ3x4ͷߦྻ ӅΕɼ(ೖྗͷग़ྗ)x(ॏΈ)ͷhΛड͚औΔ doc0 2 4 1 2 3 0
1 2 1 2 1 1 1 1 3 5 2 6 6 4 1 0 2 3 3 7 7 5 = 2 4 7 9 5 3 5 Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3
۩ମྫɿӅΕ ׆ੑԽؔ f(x) Λ௨ͯ͠ӅΕ͔Βग़ྗ doc0 Input layer hidden layer output
layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 By Chrislb - created by Chrislb, CC දࣔ-ܧঝ 3.0, https://commons.wikimedia.org/w/index.php?curid=223990 ؔྫɿγάϞΠυؔ
۩ମྫɿग़ྗ ӅΕ-ग़ྗͷॏΈW’ɼ2x3ͷߦྻ ग़ྗɼ(ӅΕͷग़ྗ)x(ॏΈ)ͷΛड͚औΔ doc0 Input layer hidden layer output layer
(softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 1 1 1.01 1 1 1.01 2 4 0.99 0.99 0.99 3 5 = 1.0 1.0 W0f(h) = u o
ग़ྗͷ׆ੑԽؔ ग़ྗͷ׆ੑԽؔɿ֬Λग़ྗ͢Δsoftmaxؔ doc0(=[win8, win8, ms, ms, ms, jobs])0.54Ͱwinͷจॻ Input layer
hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 p(mac)=0.46 p(win)=0.54 exi P n exn e0.1 e0.1 + e 0.1 = 0.54 e 0.1 e0.1 + e 0.1 = 0.46
ֶश • ޡࠩٯ๏ΛͬͯॏΈW, W’ Λௐઅ͠ɼdoc0͕win ʹͳΔ֬ΛߴΊΔΑ͏ʹֶश • doc0ͱ͖ɼޡࠩͷݩʹͳΔͷਖ਼ղϥϕϧ [0, 1]
Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 p(mac)=0.46 p(win)=0.54
CBoWͷΞϧΰϦζϜ MLP͕Θ͔Εָͳͣɽɽɽɽ
one—hotදݱ • ୯ޠΛޠኮ࣍ݩVͷϕΫτϧͰදݱ • ରԠ͢Δ࣍ݩ͚ͩ1ɼΓ0 ྫɿ͠{I, drink, coffee, everyday} ͳΒ
I = [1, 0, 0, 0] drink = [0, 1, 0, 0] coffee = [0, 0, 1, 0] everyday = [0, 0, 0, 1]
จ຺૭෯ ͋Δจʹ͓͍ͯ͢Δ1୯ޠͷपғn୯ޠΛѻ͏ ͜ͷͱ͖ɼnΛจ຺૭෯ͱ͍͏ Q. I drink coffee everydayͰจ຺૭෯2ҎԼʹग़ݱ͢Δ Bog of
Wordsʁ A. [I, drink, everyday]
Continuous Bag of Wordsɿ֓ཁ • 3ͷχϡʔϥϧωοτ • ೖྗɿจ຺૭෯ҎԼͰڞى͢Δ୯ޠ • ग़ྗɿ1୯ޠͷ֬
Continuous Bag of Wordsɿೖྗ MLPͷೖྗ͕ਤͷೖྗͷശ1ͭʹ૬ Input layer hidden layer output
layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 MLP
Continuous Bag of Wordsɿೖྗ • ശ1ͭone-hotදݱΛड͚औΔ • I drink coffee
everyday Ͱw(t)=coffee drink= [0, 1, 0, 0] ͕͍෦ͷͱΔ coffee
Continuous Bag of Wordsɿೖྗ I = [0, 1, 0, 0]
drink= [0, 1, 0, 0] everyday = [0, 0, 0, 1] coffee
Continuous Bag of Wordsɿೖྗ-ӅΕͷॏΈ • ҹ1ͭʹରͯ͠ɼॏΈߦྻ • ͜ͷॏΈߦྻڞ༗ WN⇥V 2
4 1 2 3 0 1 2 1 2 1 1 1 1 3 5 2 6 6 4 0 1 0 0 3 7 7 5 = 2 4 2 2 1 3 5 Wx = ut 1
Continuous Bag of Wordsɿೖྗ-ӅΕͷॏΈ • ҹ1ͭʹରͯ͠ɼॏΈߦྻ • ͜ͷॏΈߦྻڞ༗ • ೖྗone–hotΑΓɼ୯ޠϕΫτϧ͕ӅΕʹ
WN⇥V 2 4 1 2 3 0 1 2 1 2 1 1 1 1 3 5 2 6 6 4 0 1 0 0 3 7 7 5 = 2 4 2 2 1 3 5 Wx = ut 1
Continuous Bag of WordsɿӅΕ • ୯ޠϕΫτϧͷฏۉ͕ӅΕͷೖྗʢN࣍ݩϕΫτϧʣ • ׆ੑԽؔͳ͠ ut 2
+ ut 1 + ut+1 3 = h 1 3 0 @ 2 4 1 1 1 3 5 + 2 4 2 2 1 3 5 + 2 4 0 2 1 3 5 1 A = 2 4 1 1.67 0.33 3 5
Continuous Bag of WordsɿӅΕ-ग़ྗ ॏΈߦྻ ͱӅΕͷग़ྗʢฏۉϕΫτϧʣͷੵ W0V ⇥N 2 6
6 4 1 2 1 1 2 1 1 2 2 0 2 0 3 7 7 5 2 4 1.00 1.67 0.33 3 5 = 2 6 6 4 4.01 2.01 5.00 3.34 3 7 7 5 W0h = u o
Continuous Bag of Wordsɿग़ྗ 1୯ޠͷ༧ଌΛ͍ͨ͠ • ग़ྗͷϢχοτ = ޠኮ =
V • ׆ੑԽؔɿsoftmaxؔ softmax (u o ) = y softmax 0 B B @ 2 6 6 4 4 . 01 2 . 01 5 . 00 3 . 34 3 7 7 5 1 C C A = 2 6 6 4 0 . 23 0 . 03 0 . 62 0 . 12 3 7 7 5
Continuous Bag of Wordsɿग़ྗ I, drink, everydayΛೖΕͯಘΒΕͨ୯ޠͷ֬ 2 6 6
4 0.23 0.03 0.62 0.12 3 7 7 5 coffeeͷ֬
ֶश݁Ռͷ୯ޠϕΫτϧ • ೖྗͱӅΕؒͷॏΈߦྻ͕୯ޠϕΫτϧͷू߹ • 1୯ޠɿ100࣍ݩͱ͔200࣍ݩͰີͳϕΫτϧ
୯ޠϕΫτϧͷخ͍͠ಛੑ • analogy • king-man+woman=queen • Japan-Tokyo+Paris=France • eats-eat+run=runs •
୯ޠͷಛྔ • ਂֶशͷॳظ • ྨࣅܭࢉ • nzwͷ࠷ॳͷจ͜Ε
ࢀߟจݙͳͲ • gensim : https://radimrehurek.com/gensim/ • pythonɼ͕͍ؔΖ͍Ζ͋ͬͯศར • chainer :
https://github.com/pfnet/chainer/tree/master/examples/word2vec • PythonɼχϡʔϥϧωοτͰͷ࣮ྫ • word2vec : https://code.google.com/archive/p/word2vec/ • CɼΦϦδφϧ • word2vec Parameter Learning Explained : http://arxiv.org/pdf/1411.2738v3.pdf • ӳޠɼΘ͔Γ͍͢ղઆ • Efficient Estimation of Word Representations in Vector Spaceɿhttp://arxiv.org/pdf/ 1301.3781.pdf • ӳޠɼCBoWͷͱจɽεϥΠυͷਤͷCBoWͪ͜Β͔Β • ਂֶश Deep Learning. ਓೳֶձ. • ຊޠɼॻ੶