Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
CBoW入門
Search
Kento Nozawa
April 21, 2016
Research
4
3.6k
CBoW入門
2016年4月22日の機械学習勉強会の資料
Continuous Bag of Wordsの入門スライドです
Kento Nozawa
April 21, 2016
Tweet
Share
More Decks by Kento Nozawa
See All by Kento Nozawa
Analysis on Negative Sample Size in Contrastive Unsupervised Representation Learning
nzw0301
0
170
[IJCAI-ECAI 2022] Evaluation Methods for Representation Learning: A Survey
nzw0301
0
610
[NeurIPS Japan meetup 2021 talk] Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
nzw0301
0
200
[IBIS2021] 対照的自己教師付き表現学習おける負例数の解析
nzw0301
0
180
Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
nzw0301
0
490
Introduction of PAC-Bayes and its Application for Contrastive Unsupervised Representation Learning
nzw0301
2
820
NLP Tutorial; word representation learning
nzw0301
0
220
Analyzing Centralities of Embedded Nodes
nzw0301
0
170
Paper Reading: Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics
nzw0301
2
1.2k
Other Decks in Research
See All in Research
最適化と機械学習による問題解決
mickey_kubo
0
170
問いを起点に、社会と共鳴する知を育む場へ
matsumoto_r
PRO
0
610
引力・斥力を制御可能なランダム部分集合の確率分布
wasyro
0
240
Delta Airlines® Customer Care in the U.S.: How to Reach Them Now
bookingcomcustomersupportusa
0
110
AIによる画像認識技術の進化 -25年の技術変遷を振り返る-
hf149
7
4k
まずはここから:Overleaf共同執筆・CopilotでAIコーディング入門・Codespacesで独立環境
matsui_528
2
470
EOGS: Gaussian Splatting for Efficient Satellite Image Photogrammetry
satai
4
500
[RSJ25] Enhancing VLA Performance in Understanding and Executing Free-form Instructions via Visual Prompt-based Paraphrasing
keio_smilab
PRO
0
100
数理最適化と機械学習の融合
mickey_kubo
16
9.3k
Google Agent Development Kit (ADK) 入門 🚀
mickey_kubo
2
1.7k
NLP Colloquium
junokim
1
200
最適決定木を用いた処方的価格最適化
mickey_kubo
4
1.9k
Featured
See All Featured
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.9k
Building Better People: How to give real-time feedback that sticks.
wjessup
368
19k
Building a Scalable Design System with Sketch
lauravandoore
462
33k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
BBQ
matthewcrist
89
9.8k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
44
2.5k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
111
20k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Designing Experiences People Love
moore
142
24k
What's in a price? How to price your products and services
michaelherold
246
12k
Raft: Consensus for Rubyists
vanstee
140
7.1k
Transcript
Continuous Bag of Wordsೖ @ػցֶशษڧձ 201604݄22ʢۚʣ M1
ࠓ͢͜ͱ • ଟύʔηϓτϩϯ (MLP) • Continuous Bag of Words •
word2vecʹ͋ΔยํͷϞσϧ • ߴԽNGʹ͍ͭͯݴٴ͠·ͤΜ
ଟύʔηϓτϩϯͷ͓͞Β͍ • ؙɿ1ͭͷΛड͚ͯɼؔΛద༻ͯ͠1ͭͷΛग़ྗ ʢؙ1ͭΛϢχοτɼؔΛ׆ੑԽؔʣ • ҹɿϢχοτͷग़ྗͱॏΈʢʣͷੵΛ࣍ͷʹ Ͱ͖Δ͚ͩਖ਼ղ͢ΔΑ͏ͳॏΈΛٻΊΔ Input layer hidden
layer output layer (soft max) x1 h3 h1 h2 x2 x3 x4 0.2 0.5 0.3
ଟύʔηϓτϩϯͷ۩ମྫ • 4୯ޠ͔͠ͳ͍ੈքΛߟ͑Δ • [jobs, mac, win8, ms] • ೖྗɿจॻ
• ग़ྗɿ֬ʢೖྗจॻ͕”mac”͔”windowns”ʣ Input layer hidden layer output layer (softmax) jobs h3 h1 h2 mac win8 ms p(mac)=0.2 p(win)=0.8
۩ମྫɿೖྗ ͦΕͧΕ୯ޠͷස͕ೖྗͷೖྗ • doc0: [win8, win8, ms, ms, ms, jobs]
-> ms • doc1: [jobs, mac, mac, mac, mac, mac, mac] -> mac Input layer hidden layer output layer (softmax) jobs=1 h3 h1 h2 mac=0 win8=2 ms=3 Input layer hidden layer output layer (softmax) jobs=1 h3 h1 h2 mac=6 win8=0 ms=0 doc0 doc1
۩ମྫɿӅΕ ೖྗ-ӅΕؒͷॏΈߦྻWɼ3x4ͷߦྻ ӅΕɼ(ೖྗͷग़ྗ)x(ॏΈ)ͷhΛड͚औΔ doc0 2 4 1 2 3 0
1 2 1 2 1 1 1 1 3 5 2 6 6 4 1 0 2 3 3 7 7 5 = 2 4 7 9 5 3 5 Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 Wx = h
۩ମྫɿӅΕ ೖྗ-ӅΕؒͷॏΈߦྻWɼ3x4ͷߦྻ ӅΕɼ(ೖྗͷग़ྗ)x(ॏΈ)ͷhΛड͚औΔ doc0 2 4 1 2 3 0
1 2 1 2 1 1 1 1 3 5 2 6 6 4 1 0 2 3 3 7 7 5 = 2 4 7 9 5 3 5 Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3
۩ମྫɿӅΕ ׆ੑԽؔ f(x) Λ௨ͯ͠ӅΕ͔Βग़ྗ doc0 Input layer hidden layer output
layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 By Chrislb - created by Chrislb, CC දࣔ-ܧঝ 3.0, https://commons.wikimedia.org/w/index.php?curid=223990 ؔྫɿγάϞΠυؔ
۩ମྫɿग़ྗ ӅΕ-ग़ྗͷॏΈW’ɼ2x3ͷߦྻ ग़ྗɼ(ӅΕͷग़ྗ)x(ॏΈ)ͷΛड͚औΔ doc0 Input layer hidden layer output layer
(softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 1 1 1.01 1 1 1.01 2 4 0.99 0.99 0.99 3 5 = 1.0 1.0 W0f(h) = u o
ग़ྗͷ׆ੑԽؔ ग़ྗͷ׆ੑԽؔɿ֬Λग़ྗ͢Δsoftmaxؔ doc0(=[win8, win8, ms, ms, ms, jobs])0.54Ͱwinͷจॻ Input layer
hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 p(mac)=0.46 p(win)=0.54 exi P n exn e0.1 e0.1 + e 0.1 = 0.54 e 0.1 e0.1 + e 0.1 = 0.46
ֶश • ޡࠩٯ๏ΛͬͯॏΈW, W’ Λௐઅ͠ɼdoc0͕win ʹͳΔ֬ΛߴΊΔΑ͏ʹֶश • doc0ͱ͖ɼޡࠩͷݩʹͳΔͷਖ਼ղϥϕϧ [0, 1]
Input layer hidden layer output layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 -0.1 0.1 p(mac)=0.46 p(win)=0.54
CBoWͷΞϧΰϦζϜ MLP͕Θ͔Εָͳͣɽɽɽɽ
one—hotදݱ • ୯ޠΛޠኮ࣍ݩVͷϕΫτϧͰදݱ • ରԠ͢Δ࣍ݩ͚ͩ1ɼΓ0 ྫɿ͠{I, drink, coffee, everyday} ͳΒ
I = [1, 0, 0, 0] drink = [0, 1, 0, 0] coffee = [0, 0, 1, 0] everyday = [0, 0, 0, 1]
จ຺૭෯ ͋Δจʹ͓͍ͯ͢Δ1୯ޠͷपғn୯ޠΛѻ͏ ͜ͷͱ͖ɼnΛจ຺૭෯ͱ͍͏ Q. I drink coffee everydayͰจ຺૭෯2ҎԼʹग़ݱ͢Δ Bog of
Wordsʁ A. [I, drink, everyday]
Continuous Bag of Wordsɿ֓ཁ • 3ͷχϡʔϥϧωοτ • ೖྗɿจ຺૭෯ҎԼͰڞى͢Δ୯ޠ • ग़ྗɿ1୯ޠͷ֬
Continuous Bag of Wordsɿೖྗ MLPͷೖྗ͕ਤͷೖྗͷശ1ͭʹ૬ Input layer hidden layer output
layer (softmax) jobs=1 f(5)=0.99 f(7)=0.99 f(9)=0.99 mac=0 win8=2 ms=3 MLP
Continuous Bag of Wordsɿೖྗ • ശ1ͭone-hotදݱΛड͚औΔ • I drink coffee
everyday Ͱw(t)=coffee drink= [0, 1, 0, 0] ͕͍෦ͷͱΔ coffee
Continuous Bag of Wordsɿೖྗ I = [0, 1, 0, 0]
drink= [0, 1, 0, 0] everyday = [0, 0, 0, 1] coffee
Continuous Bag of Wordsɿೖྗ-ӅΕͷॏΈ • ҹ1ͭʹରͯ͠ɼॏΈߦྻ • ͜ͷॏΈߦྻڞ༗ WN⇥V 2
4 1 2 3 0 1 2 1 2 1 1 1 1 3 5 2 6 6 4 0 1 0 0 3 7 7 5 = 2 4 2 2 1 3 5 Wx = ut 1
Continuous Bag of Wordsɿೖྗ-ӅΕͷॏΈ • ҹ1ͭʹରͯ͠ɼॏΈߦྻ • ͜ͷॏΈߦྻڞ༗ • ೖྗone–hotΑΓɼ୯ޠϕΫτϧ͕ӅΕʹ
WN⇥V 2 4 1 2 3 0 1 2 1 2 1 1 1 1 3 5 2 6 6 4 0 1 0 0 3 7 7 5 = 2 4 2 2 1 3 5 Wx = ut 1
Continuous Bag of WordsɿӅΕ • ୯ޠϕΫτϧͷฏۉ͕ӅΕͷೖྗʢN࣍ݩϕΫτϧʣ • ׆ੑԽؔͳ͠ ut 2
+ ut 1 + ut+1 3 = h 1 3 0 @ 2 4 1 1 1 3 5 + 2 4 2 2 1 3 5 + 2 4 0 2 1 3 5 1 A = 2 4 1 1.67 0.33 3 5
Continuous Bag of WordsɿӅΕ-ग़ྗ ॏΈߦྻ ͱӅΕͷग़ྗʢฏۉϕΫτϧʣͷੵ W0V ⇥N 2 6
6 4 1 2 1 1 2 1 1 2 2 0 2 0 3 7 7 5 2 4 1.00 1.67 0.33 3 5 = 2 6 6 4 4.01 2.01 5.00 3.34 3 7 7 5 W0h = u o
Continuous Bag of Wordsɿग़ྗ 1୯ޠͷ༧ଌΛ͍ͨ͠ • ग़ྗͷϢχοτ = ޠኮ =
V • ׆ੑԽؔɿsoftmaxؔ softmax (u o ) = y softmax 0 B B @ 2 6 6 4 4 . 01 2 . 01 5 . 00 3 . 34 3 7 7 5 1 C C A = 2 6 6 4 0 . 23 0 . 03 0 . 62 0 . 12 3 7 7 5
Continuous Bag of Wordsɿग़ྗ I, drink, everydayΛೖΕͯಘΒΕͨ୯ޠͷ֬ 2 6 6
4 0.23 0.03 0.62 0.12 3 7 7 5 coffeeͷ֬
ֶश݁Ռͷ୯ޠϕΫτϧ • ೖྗͱӅΕؒͷॏΈߦྻ͕୯ޠϕΫτϧͷू߹ • 1୯ޠɿ100࣍ݩͱ͔200࣍ݩͰີͳϕΫτϧ
୯ޠϕΫτϧͷخ͍͠ಛੑ • analogy • king-man+woman=queen • Japan-Tokyo+Paris=France • eats-eat+run=runs •
୯ޠͷಛྔ • ਂֶशͷॳظ • ྨࣅܭࢉ • nzwͷ࠷ॳͷจ͜Ε
ࢀߟจݙͳͲ • gensim : https://radimrehurek.com/gensim/ • pythonɼ͕͍ؔΖ͍Ζ͋ͬͯศར • chainer :
https://github.com/pfnet/chainer/tree/master/examples/word2vec • PythonɼχϡʔϥϧωοτͰͷ࣮ྫ • word2vec : https://code.google.com/archive/p/word2vec/ • CɼΦϦδφϧ • word2vec Parameter Learning Explained : http://arxiv.org/pdf/1411.2738v3.pdf • ӳޠɼΘ͔Γ͍͢ղઆ • Efficient Estimation of Word Representations in Vector Spaceɿhttp://arxiv.org/pdf/ 1301.3781.pdf • ӳޠɼCBoWͷͱจɽεϥΠυͷਤͷCBoWͪ͜Β͔Β • ਂֶश Deep Learning. ਓೳֶձ. • ຊޠɼॻ੶