PRML(ニューラルネット編)
gucchi
September 20, 2019
Science
Transcript
A Seminar for Understanding Machine Learning in Depth Through PRML [Neural Networks]
坂口 諒輔
1 / 38
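0. About this seminar

This seminar focuses on neural networks, the subject of Chapter 5 of PRML.

It does not follow the route often taken in deep learning books: representing a neural network as a graph of nodes and edges, discussing matrix operations and error functions, and then moving on to backpropagation.

Instead, neural networks are introduced as an extension of the linear regression model (PRML Chapter 3) and logistic regression (PRML Chapter 4) (Section 2 of these slides). After that, we discuss the symmetries of the network weights (Section 3 of these slides) and loss functions and regularization (Section 4 of these slides).

The linear regression model and logistic regression are therefore assumed to be known.

Note that the equation numbers in these slides differ from the equation numbers in PRML.
2 / 38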
Contents

1. Introduction
2. Neural network functions (PRML 5.1)
3. Weight-space symmetries (PRML 5.1.1)
4. Loss functions and regularization (PRML 5.2, 1.2.5)
3 / 38
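1. Introduction

Throughout these slides, the set of training input vectors is written $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, where each input vector $\mathbf{x}_n$ is $D$-dimensional.

The set of target vectors corresponding to those inputs is written $\{\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_N\}$, where each $\mathbf{t}_n$ is $K$-dimensional.

Our goal in supervised machine learning (not only for neural networks) is to use the prepared training data to build a function $\mathbf{y}(\mathbf{x})$ that predicts the target vector from the input, and then to predict the target vector $\mathbf{t}$ of unseen data $\mathbf{x}$ with $\mathbf{y}(\mathbf{x})$.
4 / 38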
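1. Introduction

In practice, however, we do not build the prediction function $y(\mathbf{x})$ from scratch out of the training data.

In Chapter 3 of PRML (linear regression), with $K = 1$, the discussion was restricted to functions $y(\mathbf{x}, \mathbf{w})$ of the form

$$y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{1.1}$$

Here $\mathbf{w} = (w_0, w_1, \ldots, w_{M-1})^T$ is the parameter vector.

Instead of building $y(\mathbf{x})$ from scratch, we use the training data to tune the parameter vector $\mathbf{w}$ (to $\mathbf{w} = \mathbf{w}^\star$) and use $y(\mathbf{x}, \mathbf{w} = \mathbf{w}^\star)$ as the prediction function for the target variable.
5 / 38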
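1. Introduction

The vector function $\boldsymbol{\phi}(\mathbf{x})$, called the feature vector, is defined as $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \ldots, \phi_{M-1}(\mathbf{x}))^T$, where $\phi_0(\mathbf{x}) = 1$ and the remaining $\phi_j(\mathbf{x})$ ($j = 1, \ldots, M-1$) are some nonlinear functions (basis functions).

One example of a basis function is the Gaussian basis function:

$$\phi_j(x) = \exp\left\{ -\frac{(x - \mu_j)^2}{2s^2} \right\} \tag{1.2}$$

This basis function is centered at $x = \mu_j$ and has a spread governed by the variance $s^2$.
6 / 38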
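As a rough illustration of (1.1) and (1.2), the sketch below builds the feature vector from Gaussian basis functions for a scalar input and evaluates the linear model. The centers, the width `s`, and the parameter vector `w` are made-up values chosen only for the example; they do not come from the slides.

```python
import math

def gaussian_basis(x, mu, s):
    """Gaussian basis function of eq. (1.2) for a scalar input x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * s ** 2))

def features(x, centers, s):
    """Feature vector phi(x): phi_0 = 1, then one Gaussian per center."""
    return [1.0] + [gaussian_basis(x, mu, s) for mu in centers]

def linear_model(x, w, centers, s):
    """y(x, w) = w^T phi(x), eq. (1.1)."""
    phi = features(x, centers, s)
    return sum(wj * pj for wj, pj in zip(w, phi))

centers = [-1.0, 0.0, 1.0]   # hypothetical mu_1..mu_3, so M = 4
s = 0.5                      # hypothetical width
w = [0.1, 1.0, -2.0, 0.5]    # hypothetical parameter vector
phi = features(0.0, centers, s)
y = linear_model(0.0, w, centers, s)
```

Only `w` would be fitted to data in linear regression; `centers` and `s` stay fixed by hand.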
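1. Introduction

In logistic regression, discussed in Chapter 4 of PRML, with $K = 1$, the discussion was restricted to functions $y(\mathbf{x}, \mathbf{w})$ of the form

$$y(\mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})) \tag{1.3}$$

Here $\sigma(x)$ is called the logistic sigmoid function and is defined by

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{1.4}$$

[Figure: plot of the logistic sigmoid function]
7 / 38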
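Since the plot of the sigmoid does not survive in this transcript, a few sample values may help. The sketch below evaluates (1.4); splitting on the sign of the argument is an implementation choice made here to avoid overflow for large negative inputs, not something taken from the slides.

```python
import math

def sigmoid(a):
    """Logistic sigmoid of eq. (1.4), written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)          # a < 0, so this cannot overflow
    return e / (1.0 + e)

# The function rises monotonically from 0 to 1 and passes through 0.5 at a = 0.
values = [sigmoid(a) for a in (-1000.0, -2.0, 0.0, 2.0, 1000.0)]
```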
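1. Introduction

In regression, the prediction function $y(\mathbf{x})$ can be used directly as the predicted value of the target variable. In logistic regression, which is a classification method, an input vector $\mathbf{x}$ is assigned to class 1 ($t = 1$) if $y(\mathbf{x}) \ge 0.5$, that is, if $\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \ge 0$, and to class 2 ($t = 0$) otherwise.

To summarize, in both linear regression and logistic regression we assumed that the prediction function has the specific form

$$y(\mathbf{x}, \mathbf{w}) = f(\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})) \tag{1.5}$$

and produced the prediction function by tuning the parameters $\mathbf{w}$ with the training data.

Here $f(\cdot)$ is an activation function that may be nonlinear. (Linear regression uses the identity function; logistic regression uses the logistic sigmoid function.)

Choosing a particular form for the vector function $\boldsymbol{\phi}(\mathbf{x})$ turns this model into a neural network model.
8 / 38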
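2. Neural network functions

So far we have explained that linear regression and logistic regression assume a prediction function $y(\mathbf{x}, \mathbf{w})$ of the form

$$y(\mathbf{x}, \mathbf{w}) = f(\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})) \tag{2.1}$$

As a concrete example, $\boldsymbol{\phi}(\mathbf{x})$ is defined as $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \ldots, \phi_{M-1}(\mathbf{x}))^T$ with $\phi_0(\mathbf{x}) = 1$, and the remaining $\phi_j(\mathbf{x})$ ($j = 1, \ldots, M-1$) can be taken to be Gaussian basis functions:

$$\phi_j(x) = \exp\left\{ -\frac{(x - \mu_j)^2}{2s^2} \right\} \tag{2.2}$$

Unlike the parameters $\mathbf{w}$, which are tuned with the training data, the parameters $\mu_j$ ($j = 1, \ldots, M-1$) and $s^2$ of the Gaussian basis functions are hyperparameters chosen by hand when fixing the form of $y(\mathbf{x}, \mathbf{w})$. (If they were learned parameters, the model would no longer be "linear" regression.)
9 / 38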
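2. Neural network functions

In a neural network, the feature vector $\boldsymbol{\phi}(\mathbf{x})$ is itself chosen to depend on learned parameters.

Just as the Gaussian basis functions had parameters $\mu_j$ ($j = 1, \ldots, M-1$) and $s^2$, we give each basis function $\phi_j(\mathbf{x})$ ($j = 0, \ldots, M-1$) its own independent parameter vector $\mathbf{w}^{(1)}_j$.

We then transpose these parameter vectors $\mathbf{w}^{(1)}_j$ (column vectors) and stack them vertically to form the matrix $\mathbf{W}^{(1)}$:

$$\mathbf{W}^{(1)} = \left( \mathbf{w}^{(1)}_0, \mathbf{w}^{(1)}_1, \ldots, \mathbf{w}^{(1)}_{M-1} \right)^T \tag{2.3}$$

The feature vector $\boldsymbol{\phi}(\mathbf{x})$ depends on the matrix $\mathbf{W}^{(1)}$, so we write it as $\boldsymbol{\phi}(\mathbf{x}; \mathbf{W}^{(1)})$.
10 / 38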
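2. Neural network functions

Using the vector function $\boldsymbol{\phi}(\mathbf{x}; \mathbf{W}^{(1)})$, which depends on learned parameters, the prediction function $y(\mathbf{x}, \mathbf{w})$ becomes

$$y(\mathbf{x}, \mathbf{w}) = f\left( \mathbf{w}^{(2)T} \boldsymbol{\phi}(\mathbf{x}; \mathbf{W}^{(1)}) \right) \tag{2.4}$$

Here $\mathbf{w}$ denotes all parameters, that is, the parameter vector $\mathbf{w}^{(2)}$ together with $\mathbf{W}^{(1)}$. In other words, $\mathbf{w}^{(2)}$ consists of the parameters in $\mathbf{w}$ other than $\mathbf{W}^{(1)}$.

We now restrict the feature vector $\boldsymbol{\phi}(\mathbf{x}; \mathbf{W}^{(1)})$ to the following form, where $h(x)$ is some nonlinear function:

$$\boldsymbol{\phi}(\mathbf{x}; \mathbf{W}^{(1)}) = h\left( \mathbf{W}^{(1)} \mathbf{x} \right) = \left( h\Big( \sum_{i=0}^{D} w^{(1)}_{0i} x_i \Big), h\Big( \sum_{i=0}^{D} w^{(1)}_{1i} x_i \Big), \ldots, h\Big( \sum_{i=0}^{D} w^{(1)}_{M-1,i} x_i \Big) \right)^T \tag{2.5}$$

The $(j, i)$ element of the matrix $\mathbf{W}^{(1)}$ is written $w^{(1)}_{ji}$. The sums start at $i = 0$, so $\mathbf{x}$ is presumably extended with a fixed input $x_0 = 1$ that absorbs the bias, as in PRML.
11 / 38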
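2. Neural network functions

Here, when the function $h(x)$, which takes a scalar argument, is given a vector argument, it is applied element by element and returns a vector of the same dimension as its argument:

$$h(\mathbf{a}) = (h(a_1), h(a_2), \ldots, h(a_D))^T \tag{2.6}$$

With the vector function restricted as in (2.5), the prediction function $y(\mathbf{x}, \mathbf{w})$ is a neural network function with one hidden layer and one output unit, whose hidden-layer and output-layer activation functions are $h$ and $f$ respectively:

$$y(\mathbf{x}, \mathbf{w}) = f\left( \mathbf{w}^{(2)T} h\left( \mathbf{W}^{(1)} \mathbf{x} \right) \right) \tag{2.7}$$
12 / 38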
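2. Neural network functions

As a further generalization, we extend the prediction function $y(\mathbf{x}, \mathbf{w})$ to a vector prediction function $\mathbf{y}(\mathbf{x}, \mathbf{w})$ with $K$ components, and write its $k$-th component as $y_k(\mathbf{x}, \mathbf{w})$.

This generalization corresponds to increasing the number of output units of the network from 1 to $K$.

If the weight vector $\mathbf{w}^{(2)}$ in (2.7) is replaced by an independent parameter vector $\mathbf{w}^{(2)}_k$ for each component of $\mathbf{y}(\mathbf{x}, \mathbf{w})$, the $k$-th component becomes

$$y_k(\mathbf{x}, \mathbf{w}) = f\left( \mathbf{w}^{(2)T}_k h\left( \mathbf{W}^{(1)} \mathbf{x} \right) \right) \tag{2.8}$$
13 / 38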
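2. Neural network functions

As with $\mathbf{W}^{(1)}$, we transpose the vectors $\mathbf{w}^{(2)}_k$ (column vectors) and stack them vertically to form the matrix

$$\mathbf{W}^{(2)} = \left( \mathbf{w}^{(2)}_1, \mathbf{w}^{(2)}_2, \ldots, \mathbf{w}^{(2)}_K \right)^T \tag{2.9}$$

The vector function $\mathbf{y}(\mathbf{x}, \mathbf{w})$ then takes the following form, which is a neural network function with one hidden layer and $K$ output units:

$$\mathbf{y}(\mathbf{x}, \mathbf{w}) = f\left( \mathbf{W}^{(2)} h\left( \mathbf{W}^{(1)} \mathbf{x} \right) \right) \tag{2.10}$$

Writing the $(j, i)$ element of $\mathbf{W}^{(1)}$ as $w^{(1)}_{ji}$ and the $(k, j)$ element of $\mathbf{W}^{(2)}$ as $w^{(2)}_{kj}$, the prediction function $y_k(\mathbf{x}, \mathbf{w})$ takes the (familiar) form

$$y_k(\mathbf{x}, \mathbf{w}) = f\left( \sum_{j=0}^{M-1} w^{(2)}_{kj} \, h\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{2.11}$$
14 / 38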
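The network function (2.11) can be sketched in a few lines. The version below is a minimal illustration, assuming plain nested lists for the weight matrices, `tanh` for $h$ and the logistic sigmoid for $f$; the sizes and weight values are invented for the example, and the first input is taken to be the fixed value $x_0 = 1$.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W1, W2, h=math.tanh, f=sigmoid):
    """y_k = f(sum_j W2[k][j] * h(sum_i W1[j][i] * x[i])), eq. (2.11)."""
    hidden = [h(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [f(sum(w * z for w, z in zip(row, hidden))) for row in W2]

# Hypothetical sizes: D = 2 inputs plus x_0 = 1, M = 3 hidden units, K = 2 outputs.
x = [1.0, 0.5, -1.5]
W1 = [[0.1, 0.4, -0.3], [-0.2, 0.7, 0.5], [0.3, -0.6, 0.2]]   # M rows, D + 1 columns
W2 = [[0.5, -1.0, 0.8], [-0.4, 0.3, 0.9]]                     # K rows, M columns
y = forward(x, W1, W2)
```

Passing `f=lambda a: a` gives the identity output activation used for regression.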
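3. Weight-space symmetries

Next we explain the symmetries in the space of weight parameters.

We take the activation functions $f$ and $h$ of the neural network function to be the logistic sigmoid function and the hyperbolic tangent function respectively, and consider the function

$$\mathbf{y}(\mathbf{x}, \mathbf{w}) = \sigma\left( \mathbf{W}^{(2)} \tanh\left( \mathbf{W}^{(1)} \mathbf{x} \right) \right) \tag{3.1}$$

The hyperbolic tangent function is defined as

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{3.2}$$
15 / 38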
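3. Weight-space symmetries

An important property of the hyperbolic tangent is that it is an odd function:

$$\tanh(-x) = \frac{e^{-x} - e^{-(-x)}}{e^{-x} + e^{-(-x)}} = -\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = -\tanh(x) \tag{3.3}$$

Written without matrices, the $k$-th component $y_k(\mathbf{x}, \mathbf{w})$ of $\mathbf{y}(\mathbf{x}, \mathbf{w})$ is

$$y_k(\mathbf{x}, \mathbf{w}) = \sigma\left( \sum_{j=0}^{M-1} w^{(2)}_{kj} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{3.4}$$
16 / 38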
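3. Weight-space symmetries

On the right-hand side of (3.4), apply the sign-flip transformation $w^{(1)}_{1i} \to -w^{(1)}_{1i}$ for all $i$ at $j = 1$.

By the oddness of $\tanh$ (3.3), the right-hand side of (3.4) changes as follows:

$$\begin{aligned}
y_k(\mathbf{x}, \mathbf{w}) &= \sigma\left( \sum_{j=0}^{M-1} w^{(2)}_{kj} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \\
&= \sigma\left( w^{(2)}_{k0} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{0i} x_i \Big) + w^{(2)}_{k1} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{1i} x_i \Big) + \cdots \right) \\
&\to \sigma\left( w^{(2)}_{k0} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{0i} x_i \Big) - w^{(2)}_{k1} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{1i} x_i \Big) + \cdots \right)
\end{aligned} \tag{3.5}$$

Therefore, if we apply $w^{(1)}_{1i} \to -w^{(1)}_{1i}$ for all $i$ and at the same time apply $w^{(2)}_{k1} \to -w^{(2)}_{k1}$ for all $k$, the function $y_k(\mathbf{x}, \mathbf{w})$ is left unchanged.
17 / 38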
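3. Weight-space symmetries

Since $j$ takes the $M$ values $j = 0, 1, \ldots, M-1$, there are $M$ transformations of the form $\{(w^{(1)}_{ji}, w^{(2)}_{kj})\}_{i,k} \to \{(-w^{(1)}_{ji}, -w^{(2)}_{kj})\}_{i,k}$, one for each $j$, that leave the function $y_k(\mathbf{x}, \mathbf{w})$ unchanged.

Each of these $M$ sign flips can be applied or not independently of the others. Hence, once optimized weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ have been obtained by training, there are $2^M$ weight settings, including $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ themselves, that give an equivalent output $y_k(\mathbf{x}, \mathbf{w})$ for any input.
18 / 38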
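The sign-flip symmetry is easy to check numerically. The sketch below is only an illustration under assumed values: it negates every weight going into and out of one hidden unit and compares the outputs of (3.4) before and after. The helper name `flip_unit` and all weight values are hypothetical.

```python
import math

def forward(x, W1, W2):
    """Eq. (3.4): sigmoid outputs over tanh hidden units."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [1.0 / (1.0 + math.exp(-sum(w * z for w, z in zip(row, hidden))))
            for row in W2]

def flip_unit(W1, W2, j):
    """Negate all weights into (row j of W1) and out of (column j of W2) hidden unit j."""
    W1f = [[-w for w in row] if r == j else list(row) for r, row in enumerate(W1)]
    W2f = [[-w if c == j else w for c, w in enumerate(row)] for row in W2]
    return W1f, W2f

x = [1.0, 0.5, -1.5]                                          # hypothetical input, x_0 = 1
W1 = [[0.1, 0.4, -0.3], [-0.2, 0.7, 0.5], [0.3, -0.6, 0.2]]   # hypothetical weights
W2 = [[0.5, -1.0, 0.8], [-0.4, 0.3, 0.9]]
W1f, W2f = flip_unit(W1, W2, 1)
y = forward(x, W1, W2)
y_flipped = forward(x, W1f, W2f)   # agrees with y up to rounding
```

Flipping only `W1` (or only `W2`) for that unit does change the output, which is why both layers must be transformed together.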
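3. Weight-space symmetries

There is a second kind of symmetry. In the function

$$y_k(\mathbf{x}, \mathbf{w}) = \sigma\left( \sum_{j=0}^{M-1} w^{(2)}_{kj} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{3.6}$$

swapping the set of weights $\{(w^{(1)}_{j_1 i}, w^{(2)}_{k j_1})\}_{i,k}$ for some $j = j_1$ with the set of weights $\{(w^{(1)}_{j_2 i}, w^{(2)}_{k j_2})\}_{i,k}$ for $j = j_2$ does not change the output $y_k(\mathbf{x}, \mathbf{w})$ for any input $\mathbf{x}$ (interchange symmetry).

This amounts to changing the order of the sum over $j$ on the right-hand side of (3.6).

Hence, once optimized weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ have been obtained by training, this interchange invariance means that there are $M!$ weight settings, including $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ themselves, that give an equivalent output $y_k(\mathbf{x}, \mathbf{w})$ for any input.
19 / 38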
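3. Weight-space symmetries

Combining the sign-flip symmetry and the interchange symmetry, once optimized weights $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ have been obtained by training, there are $2^M \cdot M!$ weight settings, including $\mathbf{W}^{(1)}, \mathbf{W}^{(2)}$ themselves, that give an equivalent output $y_k(\mathbf{x}, \mathbf{w})$ for any input.
20 / 38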
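To make the count concrete, the following sketch enumerates every combination of hidden-unit permutation and sign flip for a small network with $M = 3$ and checks that all of them produce the same output. The generator name `equivalent_weights` and the weight values are assumptions made for this example; generic (distinct, nonzero) weights are needed for the settings to be distinct.

```python
import itertools
import math

def forward(x, W1, W2):
    """Sigmoid outputs over tanh hidden units, as in eq. (3.6)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [1.0 / (1.0 + math.exp(-sum(w * z for w, z in zip(row, hidden))))
            for row in W2]

def equivalent_weights(W1, W2):
    """Yield every (W1', W2') obtained by permuting hidden units and flipping signs."""
    M = len(W1)
    for perm in itertools.permutations(range(M)):
        for signs in itertools.product((1.0, -1.0), repeat=M):
            # New hidden unit n is old unit perm[n], with sign signs[n].
            W1p = [[signs[n] * w for w in W1[j]] for n, j in enumerate(perm)]
            W2p = [[signs[n] * row[j] for n, j in enumerate(perm)] for row in W2]
            yield W1p, W2p

x = [1.0, 0.5, -1.5]                                          # hypothetical input
W1 = [[0.1, 0.4, -0.3], [-0.2, 0.7, 0.5], [0.3, -0.6, 0.2]]   # hypothetical weights, M = 3
W2 = [[0.5, -1.0, 0.8], [-0.4, 0.3, 0.9]]

variants = list(equivalent_weights(W1, W2))
y = forward(x, W1, W2)
max_diff = max(abs(a - b)
               for W1p, W2p in variants
               for a, b in zip(y, forward(x, W1p, W2p)))
```

Here `len(variants)` equals $2^3 \cdot 3!$, and `max_diff` stays at rounding-error level.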
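4. Loss functions and regularization

We have seen that, in general, the output of the $k$-th output unit of a neural network with one hidden layer is given by

$$y_k(\mathbf{x}, \mathbf{w}) = f\left( \sum_{j=0}^{M-1} w^{(2)}_{kj} \, h\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{4.1}$$

Here the functions $h$ and $f$ are called activation functions and are typically nonlinear, and $w^{(1)}_{ji}$ and $w^{(2)}_{kj}$ are the weights of each layer.

Writing the set of training input vectors as $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ and the corresponding set of target vectors as $\{\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_N\}$, a common way to optimize the parameters in a regression problem is to choose them so as to minimize the sum-of-squares error

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\| \mathbf{y}(\mathbf{x}_n, \mathbf{w}) - \mathbf{t}_n \right\|^2 \tag{4.2}$$

where $\mathbf{y}(\mathbf{x}, \mathbf{w}) = (y_1(\mathbf{x}, \mathbf{w}), y_2(\mathbf{x}, \mathbf{w}), \ldots, y_K(\mathbf{x}, \mathbf{w}))^T$.
21 / 38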
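As a small worked instance of (4.2), the sketch below computes the sum-of-squares error from lists of predicted and target vectors. The numbers are invented; in practice the predictions would come from the network function (4.1).

```python
def sum_of_squares_error(predictions, targets):
    """E(w) = 1/2 * sum_n ||y_n - t_n||^2, eq. (4.2)."""
    return 0.5 * sum(
        sum((y - t) ** 2 for y, t in zip(y_n, t_n))
        for y_n, t_n in zip(predictions, targets)
    )

# Hypothetical data: N = 2 examples, K = 2 outputs each.
predictions = [[1.0, 2.0], [0.0, -1.0]]
targets = [[1.0, 0.0], [1.0, -1.0]]
E = sum_of_squares_error(predictions, targets)   # 0.5 * ((0 + 4) + (1 + 0)) = 2.5
```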
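4. Loss functions and regularization

If the network output $y_k(\mathbf{x}, \mathbf{w})$ is interpreted probabilistically, minimizing the sum-of-squares error turns out to be the result of maximum likelihood estimation.

For simplicity, we consider the case where the network has a single output unit:

$$y(\mathbf{x}, \mathbf{w}) = f\left( \sum_{j=0}^{M-1} w^{(2)}_{j} \, h\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{4.3}$$

We start with regression problems, in which each of the target variables $\{t_1, t_2, \ldots, t_N\}$ takes a continuous value.

For regression, we take the activation functions $f$ and $h$ to be the identity function and the hyperbolic tangent function respectively:

$$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w^{(2)}_{j} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \tag{4.4}$$
22 / 38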
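4. Loss functions and regularization

First, we assume that the training inputs $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ are generated in some way (sampling methods are discussed in PRML Chapter 11), and that the corresponding target variables $\{t_1, t_2, \ldots, t_N\}$ are each generated independently from the following Gaussian distribution, whose mean is the output $y(\mathbf{x}, \mathbf{w})$:

$$p(t \mid \mathbf{x}, \mathbf{w}, \beta) = \mathcal{N}(t \mid y(\mathbf{x}, \mathbf{w}), \beta^{-1}) \tag{4.5}$$

Here $\mathbf{w}$ and $\beta$ are the parameters tuned by training.
23 / 38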
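4. Loss functions and regularization

The Gaussian distribution is defined as follows. (It has two parameters, the mean $\mu$ and the variance $\sigma^2$.)

$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\} \tag{4.6}$$

In a regression problem the random variable is continuous, so the Gaussian assumption is natural as far as the range of possible values is concerned. (For classification problems a different distribution is assumed.)
24 / 38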
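4. Loss functions and regularization

Since the training data are generated independently from (4.5), the likelihood function can be written as a product over the data points:

$$p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{4.7}$$

We look for the $\mathbf{w}$ and $\beta$ that maximize this likelihood function (maximum likelihood estimation).

Instead of maximizing $p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$ directly, we look for the parameters that maximize the logarithm of the likelihood function. Because the logarithm is monotonically increasing, both problems have the same maximizer.
25 / 38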
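4. Loss functions and regularization

First, from

$$\begin{aligned}
\ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) &= \ln\left[ \frac{\beta^{1/2}}{(2\pi)^{1/2}} \exp\left\{ -\frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right\} \right] \\
&= \frac{1}{2} \ln \beta - \frac{1}{2} \ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2
\end{aligned} \tag{4.8}$$

the log likelihood $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$ becomes

$$\begin{aligned}
\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) &= \sum_{n=1}^{N} \ln \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \\
&= \sum_{n=1}^{N} \left[ \frac{1}{2} \ln \beta - \frac{1}{2} \ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right] \\
&= \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi) - \frac{\beta}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2
\end{aligned} \tag{4.9}$$
26 / 38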
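The step from (4.8) to (4.9) can be checked numerically: summing the per-example log densities should agree with the closed form. The sketch below does this for made-up targets, predictions and precision `beta`; none of these values come from the slides.

```python
import math

def log_gaussian(t, mean, beta):
    """ln N(t | mean, 1/beta), eq. (4.8)."""
    return (0.5 * math.log(beta) - 0.5 * math.log(2.0 * math.pi)
            - 0.5 * beta * (t - mean) ** 2)

def log_likelihood(ts, ys, beta):
    """Closed form of eq. (4.9)."""
    N = len(ts)
    sq = sum((t - y) ** 2 for t, y in zip(ts, ys))
    return 0.5 * N * math.log(beta) - 0.5 * N * math.log(2.0 * math.pi) - 0.5 * beta * sq

ts = [0.9, 2.1, 2.9]   # hypothetical targets t_n
ys = [1.0, 2.0, 3.0]   # hypothetical network outputs y(x_n, w)
beta = 4.0             # hypothetical noise precision
term_by_term = sum(log_gaussian(t, y, beta) for t, y in zip(ts, ys))
closed_form = log_likelihood(ts, ys, beta)
```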
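4. Loss functions and regularization

Defining the sum-of-squares error $E(\mathbf{w})$ as

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{4.10}$$

the log likelihood $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$ becomes

$$\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi) - \beta E(\mathbf{w}) \tag{4.11}$$

To find the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}, \beta_{\mathrm{ML}}$, we take the gradient of the log likelihood $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta)$.

The gradient with respect to $\mathbf{w}$ depends on $\beta$ only through a positive overall factor, so the maximizing $\mathbf{w}$ does not depend on $\beta$. We can therefore find $\mathbf{w}_{\mathrm{ML}}$ first and then use $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta)$ to find $\beta_{\mathrm{ML}}$.
27 / 38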
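4. Loss functions and regularization

First consider maximizing the log likelihood (4.11) with respect to $\mathbf{w}$. The first and second terms on the right-hand side of (4.11) do not depend on $\mathbf{w}$, so this is equivalent to maximizing the third term, $-\beta E(\mathbf{w})$.

Since $\beta > 0$, maximizing the log likelihood (4.11) with respect to $\mathbf{w}$ is equivalent to minimizing the sum-of-squares error $E(\mathbf{w})$ of (4.10) with respect to $\mathbf{w}$.

This shows that, in probabilistic terms, minimizing the sum-of-squares error is the result of maximum likelihood estimation when the likelihood function is assumed to be Gaussian.

In practice, the minimization is carried out iteratively using (as you know) backpropagation and similar methods.
28 / 38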
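The slides leave the second step, finding $\beta_{\mathrm{ML}}$, implicit. Setting the derivative of the log likelihood with respect to $\beta$ to zero gives $1/\beta_{\mathrm{ML}} = \frac{1}{N} \sum_n (t_n - y(\mathbf{x}_n, \mathbf{w}_{\mathrm{ML}}))^2$, which is the standard result for this model. The sketch below evaluates it for made-up residuals and checks that nearby values of $\beta$ give a lower log likelihood; the residual values are hypothetical.

```python
import math

def log_likelihood(residuals, beta):
    """ln p(t | X, w, beta) for residuals r_n = t_n - y(x_n, w)."""
    N = len(residuals)
    sq = sum(r * r for r in residuals)
    return 0.5 * N * math.log(beta) - 0.5 * N * math.log(2.0 * math.pi) - 0.5 * beta * sq

def beta_ml(residuals):
    """Maximizer over beta: N / sum_n r_n^2 (inverse of the mean squared residual)."""
    return len(residuals) / sum(r * r for r in residuals)

residuals = [0.1, -0.1, 0.2, -0.2]   # hypothetical t_n - y(x_n, w_ML)
b = beta_ml(residuals)               # 4 / 0.1, approximately 40
best = log_likelihood(residuals, b)
```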
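4. Loss functions and regularization

Next we turn to classification problems, in which the target variables $\{t_1, t_2, \ldots, t_N\}$ take discrete values, here one of the two values 0 and 1.

For classification, we take the activation functions $f$ and $h$ to be the logistic sigmoid function and the hyperbolic tangent function respectively:

$$y(\mathbf{x}, \mathbf{w}) = \sigma\left( \sum_{j=0}^{M-1} w^{(2)}_{j} \tanh\Big( \sum_{i=0}^{D} w^{(1)}_{ji} x_i \Big) \right) \tag{4.12}$$

Because the output-layer activation function is the logistic sigmoid, $y(\mathbf{x}, \mathbf{w})$ takes values in the range $0 < y(\mathbf{x}, \mathbf{w}) < 1$.
29 / 38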
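4. Loss functions and regularization

For classification too, we assume that the training inputs $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ are generated in some way (sampling methods are discussed in PRML Chapter 11), and that the corresponding target variables $\{t_1, t_2, \ldots, t_N\}$ are each generated independently from the following Bernoulli distribution:

$$p(t \mid \mathbf{x}, \mathbf{w}) = y(\mathbf{x}, \mathbf{w})^{t} \left( 1 - y(\mathbf{x}, \mathbf{w}) \right)^{1-t} \tag{4.13}$$

Here $\mathbf{w}$ is the parameter tuned by training.

The probability of $t = 1$ is $y(\mathbf{x}, \mathbf{w})$ and the probability of $t = 0$ is $1 - y(\mathbf{x}, \mathbf{w})$.

Since $0 < y(\mathbf{x}, \mathbf{w}) < 1$, both satisfy the condition on the range of values a probability can take.
30 / 38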
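4. Loss functions and regularization

Since the training data are generated independently from (4.13), the likelihood function can be written as a product over the data points:

$$p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}) = \prod_{n=1}^{N} y(\mathbf{x}_n, \mathbf{w})^{t_n} \left( 1 - y(\mathbf{x}_n, \mathbf{w}) \right)^{1-t_n} \tag{4.14}$$

We look for the $\mathbf{w}$ that maximizes this likelihood function.

Instead of maximizing $p(\mathbf{t} \mid \mathbf{X}, \mathbf{w})$ directly, we look for the parameters that maximize the logarithm of the likelihood function.
31 / 38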
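4. Loss functions and regularization

The log likelihood $\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w})$ is

$$\begin{aligned}
\ln p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}) &= \sum_{n=1}^{N} \ln\left\{ y(\mathbf{x}_n, \mathbf{w})^{t_n} \left( 1 - y(\mathbf{x}_n, \mathbf{w}) \right)^{1-t_n} \right\} \\
&= \sum_{n=1}^{N} \left\{ t_n \ln y(\mathbf{x}_n, \mathbf{w}) + (1 - t_n) \ln\left( 1 - y(\mathbf{x}_n, \mathbf{w}) \right) \right\} \\
&= -E(\mathbf{w})
\end{aligned} \tag{4.15}$$

Here $E(\mathbf{w})$ is the cross-entropy error:

$$E(\mathbf{w}) = -\sum_{n=1}^{N} \left\{ t_n \ln y(\mathbf{x}_n, \mathbf{w}) + (1 - t_n) \ln\left( 1 - y(\mathbf{x}_n, \mathbf{w}) \right) \right\} \tag{4.16}$$

This shows that, in probabilistic terms, minimizing the cross-entropy error is the result of maximum likelihood estimation when the likelihood function is assumed to be Bernoulli.
32 / 38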
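The relation between (4.14) and (4.16) can be verified on a toy example: the cross-entropy error should equal the negative logarithm of the Bernoulli likelihood. The outputs and targets below are invented for illustration.

```python
import math

def cross_entropy(ys, ts):
    """E(w) of eq. (4.16) for outputs ys in (0, 1) and targets ts in {0, 1}."""
    return -sum(t * math.log(y) + (1 - t) * math.log(1.0 - y)
                for y, t in zip(ys, ts))

def bernoulli_likelihood(ys, ts):
    """Product form of eq. (4.14)."""
    p = 1.0
    for y, t in zip(ys, ts):
        p *= y ** t * (1.0 - y) ** (1 - t)
    return p

ys = [0.9, 0.2, 0.7]   # hypothetical network outputs y(x_n, w)
ts = [1, 0, 1]         # hypothetical binary targets
E = cross_entropy(ys, ts)
L = bernoulli_likelihood(ys, ts)   # 0.9 * 0.8 * 0.7
```

For long data sets the product in `bernoulli_likelihood` underflows quickly, which is one practical reason to work with the logarithm.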
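4. Loss functions and regularization

Returning to regression problems, recall that the parameters $\mathbf{w}$ were chosen to minimize the sum-of-squares error

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{4.17}$$

A well-known phenomenon, called overfitting, is that when a complex model such as a neural network is trained on little data, the parameters fit the training data too closely.

When overfitting occurs, the absolute values of the parameters generally tend to become large. To prevent overfitting, training is therefore often carried out with a regularized sum-of-squares error, obtained by adding the following term to the sum-of-squares error:

$$E(\mathbf{w}; \lambda) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2 \tag{4.18}$$
33 / 38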
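To see how the penalty in (4.18) shrinks the weights, consider a deliberately simple stand-in model $y(x, w) = w x$ with one parameter (an assumption made for this example; it is not the network from the slides). Setting the derivative of $E(w; \lambda)$ to zero gives $w = \sum_n x_n t_n / (\sum_n x_n^2 + \lambda)$, so a larger $\lambda$ pulls $w$ toward zero.

```python
def regularized_error(w, xs, ts, lam):
    """E(w; lambda) of eq. (4.18) for the one-parameter model y = w * x."""
    data_term = 0.5 * sum((t - w * x) ** 2 for x, t in zip(xs, ts))
    return data_term + 0.5 * lam * w * w

def regularized_weight(xs, ts, lam):
    """Minimizer of E(w; lambda) for y = w * x: sum(x t) / (sum(x^2) + lambda)."""
    return sum(x * t for x, t in zip(xs, ts)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0]   # hypothetical inputs
ts = [2.0, 4.0, 6.0]   # hypothetical targets, exactly t = 2 x
w_plain = regularized_weight(xs, ts, 0.0)    # 28 / 14 = 2.0
w_shrunk = regularized_weight(xs, ts, 14.0)  # 28 / 28 = 1.0
```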
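4. Loss functions and regularization

Here $\lambda$ is a positive hyperparameter, not a learned parameter.

Because $\lambda$ is positive, adding the regularization term prevents the absolute values of the parameters from becoming large. (See PRML 1.1 for details.)

Finally, we show that, in the probabilistic treatment, this regularization term appears as a result of MAP (maximum a posteriori) estimation.

To that end, we briefly explain posterior probabilities and Bayesian inference. (See PRML 1.2.3 for details.)
34 / 38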
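4. Loss functions and regularization

So far (in maximum likelihood estimation) we have estimated the parameters $\mathbf{w}$ that maximize the likelihood function.

In Bayesian inference, we use the training data to obtain a probability distribution over the parameters $\mathbf{w}$ (a distribution with a spread rather than a single point, called the posterior distribution).

Using that posterior distribution, we obtain the predictive distribution $p(t \mid \mathbf{x}, \mathbf{t}, \mathbf{X})$ of the output $t$ given the input $\mathbf{x}$ of unseen data. (See PRML equation 1.68.)

The "posterior" in posterior distribution means the probability distribution of the parameters $\mathbf{w}$ after the training data have been observed, that is, the conditional probability

$$p(\mathbf{w} \mid \mathbf{t}, \mathbf{X}) \tag{4.19}$$
35 / 38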
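4. Loss functions and regularization

On the other hand, by the product rule of probability (PRML equation 1.11), the posterior distribution is proportional to the product of the likelihood function $p(\mathbf{t} \mid \mathbf{X}, \mathbf{w})$ and the prior distribution $p(\mathbf{w})$ (Bayes' theorem):

$$p(\mathbf{w} \mid \mathbf{t}, \mathbf{X}) \propto p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}) \, p(\mathbf{w}) \tag{4.20}$$

The likelihood function for regression was given by

$$p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{4.21}$$

so to obtain the posterior distribution we need to assume a prior distribution $p(\mathbf{w})$.
36 / 38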
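4. Loss functions and regularization

Here we assume as the prior a Gaussian distribution with mean $\mathbf{0}$ and covariance $\alpha^{-1} \mathbf{I}$:

$$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1} \mathbf{I}) \tag{4.22}$$

From these results, the posterior distribution $p(\mathbf{w} \mid \mathbf{t}, \mathbf{X})$ becomes

$$\begin{aligned}
p(\mathbf{w} \mid \mathbf{t}, \mathbf{X}) &\propto p(\mathbf{t} \mid \mathbf{X}, \mathbf{w}, \beta) \, p(\mathbf{w}) \\
&\propto \exp\left( -\frac{\beta}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right) \cdot \exp\left( -\frac{\alpha}{2} \|\mathbf{w}\|^2 \right) \\
&= \exp\left( -\beta \, E(\mathbf{w}; \alpha/\beta) \right)
\end{aligned} \tag{4.23}$$

Here $E(\mathbf{w}; \lambda)$ is the regularized error function defined in (4.18). The last line follows because $\beta E(\mathbf{w}; \alpha/\beta) = \frac{\beta}{2} \sum_{n} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 + \frac{\alpha}{2} \|\mathbf{w}\|^2$.
37 / 38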
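The identification of the log posterior with the regularized error can be checked numerically. In the sketch below a two-parameter linear function stands in for the network $y(\mathbf{x}, \mathbf{w})$ (a simplification chosen for brevity; the identity holds for any model), and the $\mathbf{w}$-dependent part of the log posterior is compared with $-\beta E(\mathbf{w}; \alpha/\beta)$, using $E$ as defined in (4.18). All data and parameter values are made up.

```python
def predict(w, x):
    """Stand-in model y(x, w) = w[0] + w[1] * x (hypothetical, not the network)."""
    return w[0] + w[1] * x

def regularized_error(w, xs, ts, lam):
    """E(w; lambda) of eq. (4.18)."""
    data_term = 0.5 * sum((t - predict(w, x)) ** 2 for x, t in zip(xs, ts))
    return data_term + 0.5 * lam * sum(wi * wi for wi in w)

def log_posterior_terms(w, xs, ts, alpha, beta):
    """w-dependent part of ln p(t | X, w, beta) + ln p(w)."""
    log_likelihood = -0.5 * beta * sum((t - predict(w, x)) ** 2 for x, t in zip(xs, ts))
    log_prior = -0.5 * alpha * sum(wi * wi for wi in w)
    return log_likelihood + log_prior

xs = [0.0, 1.0, 2.0]      # hypothetical inputs
ts = [0.1, 0.9, 2.2]      # hypothetical targets
alpha, beta = 0.5, 4.0    # hypothetical prior and noise precisions
w = [0.2, 0.8]            # hypothetical parameter vector

lhs = log_posterior_terms(w, xs, ts, alpha, beta)
rhs = -beta * regularized_error(w, xs, ts, alpha / beta)
```

Because the two quantities agree for every `w`, the parameters that maximize the posterior are those that minimize the regularized error with $\lambda = \alpha/\beta$.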
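4. Loss functions and regularization

Hence the parameters $\mathbf{w}$ that maximize the posterior distribution are the parameters $\mathbf{w}$ that minimize the regularized error function $E(\mathbf{w}; \lambda)$ with $\lambda = \alpha/\beta$.
38 / 38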