Introduction to CBoW
Kento Nozawa
April 21, 2016
Slides for the machine learning study group held on April 22, 2016.
An introductory slide deck on Continuous Bag of Words.
Transcript
Introduction to Continuous Bag of Words @ Machine Learning Study Group, Friday, April 22, 2016, M1 (first-year master's student)
Today's topics
• Multilayer perceptron (MLP)
• Continuous Bag of Words
  • One of the two models in word2vec
  • Speed-up techniques and NG are not covered
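Review of the multilayer perceptron
• Circle: receives one value, applies a function to it, and outputs one value (a circle is called a unit; the function is called the activation function)
• Arrow: passes the product of a unit's output and a weight (a value) to the next layer
• Goal: find weights that give the correct answer as often as possible
[Figure: input layer (x1, x2, x3, x4), hidden layer (h1, h2, h3), output layer (softmax) with outputs 0.2, 0.5, 0.3.]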
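A concrete MLP example
• Consider a world with only four words: [jobs, mac, win8, ms]
• Input: a document
• Output: probabilities (whether the input document is about "mac" or "windows")
[Figure: input layer (jobs, mac, win8, ms), hidden layer (h1, h2, h3), output layer (softmax) with p(mac)=0.2, p(win)=0.8.]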
Concrete example: input layer
The frequency of each word is the input to the input layer.
• doc0: [win8, win8, ms, ms, ms, jobs] -> ms
• doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac
[Figure: doc0 enters the network as jobs=1, mac=0, win8=2, ms=3; doc1 as jobs=1, mac=6, win8=0, ms=0.]
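The word-frequency input described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the helper name `to_count_vector` is hypothetical, and the vocabulary order [jobs, mac, win8, ms] is taken from the slide.

```python
from collections import Counter

# The four-word vocabulary from the slide, in the order used by the figure.
VOCAB = ["jobs", "mac", "win8", "ms"]

def to_count_vector(doc, vocab=VOCAB):
    """Return the per-word frequencies of `doc`, ordered like `vocab`."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

doc0 = ["win8", "win8", "ms", "ms", "ms", "jobs"]
doc1 = ["jobs"] + ["mac"] * 6

x0 = to_count_vector(doc0)
x1 = to_count_vector(doc1)
print(x0)  # [1, 0, 2, 3]
print(x1)  # [1, 6, 0, 0]
```

Each position of the resulting vector feeds one input unit of the MLP.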
Concrete example: hidden layer
• The input-to-hidden weight matrix W is a 3x4 matrix.
• The hidden layer receives h, the sum of (input-layer output) x (weight).
\[ Wx = h, \qquad x = (1, 0, 2, 3)^\top \ \text{(doc0)}, \qquad h = (7, 9, 5)^\top \]
[The entries of W appear in the transcript as rows (1 2 3 0), (1 2 1 2), (1 1 1 1). The third row as printed gives 6 rather than 5, so an entry or a sign may have been lost in extraction.]
[Figure: input layer (jobs=1, mac=0, win8=2, ms=3), hidden layer (f(5)=0.99, f(7)=0.99, f(9)=0.99), output layer (softmax).]
Concrete example: hidden layer
The hidden layer produces its output by passing its input through the activation function f(x).
Example function: the sigmoid function.
[Figure: sigmoid curve. By Chrislb - created by Chrislb, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=223990]
[Figure: doc0 network with hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99.]
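As a small sketch of the activation step, the sigmoid can be evaluated at the three hidden-layer inputs shown in the figure (5, 7, and 9). The function name `sigmoid` and the variable names are illustrative choices, not taken from the slides.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Inputs to the three hidden units, as labelled in the slide's figure.
hidden_inputs = [5, 7, 9]
hidden_outputs = [sigmoid(z) for z in hidden_inputs]

for z, out in zip(hidden_inputs, hidden_outputs):
    print(f"f({z}) = {out:.4f}")
```

All three values lie above 0.99, which is why the figure shows 0.99 for every hidden unit: the sigmoid saturates for inputs of this size.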
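Concrete example: output layer
• The hidden-to-output weight matrix W' is a 2x3 matrix.
• The output layer receives the sum of (hidden-layer output) x (weight).
\[ W' f(h) = u_o, \qquad f(h) \approx (0.99, 0.99, 0.99)^\top \]
[The entries of W' appear in the transcript as rows (1 1 1.01), (1 1 1.01), and the product as (1.0, 1.0), without signs. The figure labels the two output units -0.1 and 0.1.]
[Figure: doc0 network with hidden outputs f(5)=0.99, f(7)=0.99, f(9)=0.99 and output units -0.1 and 0.1.]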
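Activation function of the output layer
• The output layer uses the softmax function, which outputs probabilities.
• doc0 (= [win8, win8, ms, ms, ms, jobs]) is a "win" document with probability of about 0.55.
\[ \mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_n e^{x_n}}, \qquad \frac{e^{0.1}}{e^{0.1} + e^{-0.1}} \approx 0.55, \qquad \frac{e^{-0.1}}{e^{0.1} + e^{-0.1}} \approx 0.45 \]
[Figure: doc0 network with output units -0.1 (mac) and 0.1 (win), giving p(mac) of about 0.45 and p(win) of about 0.55.]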
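The softmax step for the two output units can be checked numerically. The sketch below assumes the output-layer inputs are -0.1 (mac) and 0.1 (win), as labelled in the figure; subtracting the maximum before exponentiating is a common numerical-stability habit and not something the slides mention.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # shift by the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

u_o = [-0.1, 0.1]  # output-layer inputs for (mac, win)
p_mac, p_win = softmax(u_o)
print(f"p(mac) = {p_mac:.4f}, p(win) = {p_win:.4f}")
```

With these inputs the two probabilities come out at roughly 0.45 and 0.55, so the network only weakly prefers "win" for doc0 before any training.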
Learning
• Use backpropagation to adjust the weights W and W' so that the probability of doc0 being "win" increases.
• For doc0, the error is measured against the correct label [0, 1].
[Figure: the same doc0 network, with softmax outputs p(mac) and p(win).]
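To make "the error is measured against the correct label" concrete, here is a hedged sketch. The slides do not name the loss function, so cross-entropy is an assumption here (it is the usual partner of softmax). The output probabilities are rounded values for softmax(-0.1, 0.1), and all names are illustrative.

```python
import math

def cross_entropy(y, t):
    """Cross-entropy between predicted probabilities y and a one-hot label t."""
    return -sum(t_i * math.log(y_i) for y_i, t_i in zip(y, t) if t_i > 0)

y = [0.45, 0.55]  # approximate softmax output (p(mac), p(win)) for doc0
t = [0, 1]        # correct label for doc0: "win"

loss = cross_entropy(y, t)

# For softmax combined with cross-entropy, the gradient of the loss with
# respect to the output-layer inputs is simply y - t.
delta = [y_i - t_i for y_i, t_i in zip(y, t)]

print(f"loss  = {loss:.4f}")
print(f"delta = {delta}")
```

Backpropagation pushes this `delta` back through W' and W. Its negative second component means a gradient-descent step raises the "win" score, so p(win) rises on the next pass.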
The CBoW algorithm
It should be easy once you understand the MLP...
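One-hot representation
• A word is represented as a vector whose dimension is the vocabulary size V.
• Only the dimension corresponding to the word is 1; the rest are 0.
Example: if the vocabulary is {I, drink, coffee, everyday}, then
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
coffee = [0, 0, 1, 0]
everyday = [0, 0, 0, 1]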
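A one-hot encoder for this four-word vocabulary might look like the following sketch; the function name `one_hot` is a hypothetical helper, not something from the slides.

```python
def one_hot(word, vocab):
    """Return a list of len(vocab) zeros with a 1 at the index of `word`."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1  # raises ValueError for out-of-vocabulary words
    return vec

vocab = ["I", "drink", "coffee", "everyday"]
encoded = {word: one_hot(word, vocab) for word in vocab}

for word, vec in encoded.items():
    print(word, "=", vec)
```

Every vector has exactly one non-zero entry, so V grows with the vocabulary while the information per vector stays at "which word is this".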
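Context window size
For one target word in a sentence, we work with the n words surrounding it. This n is called the context window size.
Q. In "I drink coffee everyday" (with "coffee" as the target word), which Bag of Words appears within a context window size of 2?
A. [I, drink, everyday]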
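The question above can be reproduced with a short slicing sketch. The helper name `context_words` is hypothetical, and the target word is assumed to be "coffee", which is what the answer implies.

```python
def context_words(tokens, target_index, window):
    """Return the words within `window` positions of the target, excluding it."""
    left = tokens[max(0, target_index - window):target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return left + right

tokens = "I drink coffee everyday".split()
ctx = context_words(tokens, tokens.index("coffee"), window=2)
print(ctx)  # ['I', 'drink', 'everyday']
```

Only one word follows "coffee", so the right-hand side of the window contributes a single word even though the window size is 2.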
Continuous Bag of Words: overview
• A three-layer neural network
• Input: the words that co-occur within the context window
• Output: the probability of one word
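Continuous Bag of Words: input layer
The whole input layer of the MLP corresponds to a single box in the input layer of the CBoW figure.
[Figure: the MLP from the earlier example (input layer, hidden layer, softmax output layer) shown next to the CBoW architecture.]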
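Continuous Bag of Words: input layer
• Each box receives a one-hot representation.
• In "I drink coffee everyday", w(t) = coffee.
• The highlighted box takes drink = [0, 1, 0, 0].
With all context words filled in, the input boxes are:
I = [1, 0, 0, 0]
drink = [0, 1, 0, 0]
everyday = [0, 0, 0, 1]
and the word to predict is coffee.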
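Continuous Bag of Words: input-to-hidden weights
• Each arrow carries a weight matrix \( W_{N \times V} \).
• This weight matrix is shared across all input boxes.
• Because the input is one-hot, the product simply passes that word's vector (a column of W) to the hidden layer.
\[ W x = u_{t-1}, \qquad x = (0, 1, 0, 0)^\top \ \text{(drink)}, \qquad u_{t-1} = (2, 2, 1)^\top \]
[The entries of W appear in the transcript as rows (1 2 3 0), (1 2 1 2), (1 1 1 1); minus signs may have been lost in extraction.]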
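The remark that a one-hot input just picks out a word vector can be verified directly. The sketch below uses the matrix entries as they appear in the transcript (any lost signs would change the numbers but not the column-selection behaviour); `matvec` is a hypothetical helper name.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# N x V weight matrix (N = 3 hidden units, V = 4 words), entries as printed.
W = [
    [1, 2, 3, 0],
    [1, 2, 1, 2],
    [1, 1, 1, 1],
]

x_drink = [0, 1, 0, 0]              # one-hot vector for "drink" (index 1)
u = matvec(W, x_drink)              # full matrix-vector product
column = [row[1] for row in W]      # second column of W, read off directly

print(u)       # [2, 2, 1]
print(column)  # [2, 2, 1]
```

Because the two results are identical, practical implementations usually skip the multiplication and look up the column by word index.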
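Continuous Bag of Words: hidden layer
• The input to the hidden layer is the average of the context word vectors (an N-dimensional vector).
• There is no activation function.
\[ h = \frac{u_{t-2} + u_{t-1} + u_{t+1}}{3} = (1, 1.67, 0.33)^\top \]
[The three word vectors appear in the transcript as (1 1 1), (2 2 1), (0 2 1); minus signs were lost in extraction, which is why the third components as printed do not average to 0.33.]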
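Continuous Bag of Words: hidden-to-output weights
The output layer receives the product of the weight matrix \( W'_{V \times N} \) and the hidden-layer output (the average vector).
\[ W' h = u_o, \qquad h = (1.00, 1.67, 0.33)^\top, \qquad u_o = (4.01, 2.01, 5.00, 3.34)^\top \]
[The entries of W' appear in the transcript as rows (1 2 1), (1 2 1), (1 2 2), (0 2 0); minus signs were lost in extraction.]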
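Continuous Bag of Words: output layer
We want to predict one word.
• Number of output units = vocabulary size = V
• Activation function: the softmax function
\[ \mathrm{softmax}(u_o) = y, \qquad \mathrm{softmax}\!\begin{pmatrix} 4.01 \\ 2.01 \\ 5.00 \\ 3.34 \end{pmatrix} = \begin{pmatrix} 0.23 \\ 0.03 \\ 0.62 \\ 0.12 \end{pmatrix} \]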
Continuous Bag of Words: output layer
The word probabilities obtained by feeding in I, drink, and everyday:
\[ y = (0.23, 0.03, 0.62, 0.12)^\top \]
The third entry, 0.62, is the probability of coffee.
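The final step, from output-layer inputs to a predicted word, can be sketched as follows. The scores (4.01, 2.01, 5.00, 3.34) and the vocabulary order {I, drink, coffee, everyday} come from the slides; the variable names are illustrative.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # shift by the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["I", "drink", "coffee", "everyday"]
u_o = [4.01, 2.01, 5.00, 3.34]  # output-layer inputs from the slide

probs = dict(zip(vocab, softmax(u_o)))
predicted = max(probs, key=probs.get)

for word, p in probs.items():
    print(f"p({word}) = {p:.2f}")
print("predicted:", predicted)
```

The highest probability falls on "coffee", the word that was left out of the context, which is the behaviour training is meant to reinforce.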
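Word vectors as the result of training
• The weight matrix between the input layer and the hidden layer is the set of word vectors.
• Each word: a dense vector of around 100 or 200 dimensions.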
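Nice properties of word vectors
• Analogy
  • king - man + woman = queen
  • Japan - Tokyo + Paris = France
  • eats - eat + run = runs
• Features for words
  • Initial values for deep learning
  • Similarity computation
  • nzw's first paper was on this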
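The analogy property can be illustrated with a toy example. The two-dimensional vectors below are invented for illustration only (real word2vec vectors are learned and have 100 or more dimensions), and `analogy` is a hypothetical helper that returns the word closest, by cosine similarity, to vec(a) - vec(b) + vec(c).

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors: axis 0 ~ "royalty", axis 1 ~ "gender".
vectors = {
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def analogy(a, b, c, vectors):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(query, candidates[w]))

result = analogy("king", "man", "woman", vectors)
print(result)  # queen
```

Excluding the three query words from the candidates follows common practice, since otherwise one of the inputs is often the nearest neighbour.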
References and resources
• gensim: https://radimrehurek.com/gensim/
  • Python; has many handy functions
• chainer: https://github.com/pfnet/chainer/tree/master/examples/word2vec
  • Python; an example implementation as a neural network
• word2vec: https://code.google.com/archive/p/word2vec/
  • C; the original implementation
• word2vec Parameter Learning Explained: http://arxiv.org/pdf/1411.2738v3.pdf
  • English; an easy-to-follow explanation
• Efficient Estimation of Word Representations in Vector Space: http://arxiv.org/pdf/1301.3781.pdf
  • English; the original CBoW paper. The CBoW figures in these slides are taken from it.
• 深層学習 Deep Learning. Japanese Society for Artificial Intelligence.
  • Japanese; a book