
PRML Seminar

gucchi
February 25, 2019


Transcript

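  1. 0. About this seminar

    This seminar focuses on the linear regression models of Chapter 3 of PRML and the linear classification models of Chapter 4, logistic regression in particular.
    It also covers the background needed to explain these topics (corresponding to Chapters 1 and 2 of PRML), so in effect it covers the most important topics of the first volume of PRML.
    Note that the equation numbers in these slides differ from those in PRML.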
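  2. Contents

    1. Preliminaries
       1-1. A simple regression example
       1-2. Probability theory and probability distributions
       1-3. Maximum likelihood estimation and Bayesian estimation
    2. Regression
       2-1. Linear basis function models
       2-2. Maximum likelihood estimation for linear basis function models
       2-3. Bayesian linear regression
    3. Classification
       3-1. Logistic regression
       3-2. Maximum likelihood estimation for logistic regression
       3-3. Bayesian logistic regression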
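  3. 1. Preliminaries

    In machine learning, and in supervised learning in particular, we first prepare a set of input data $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding set of target vectors $\{\mathbf{t}_1, \mathbf{t}_2, \cdots, \mathbf{t}_N\}$ (the training data, also called teacher data).
    Using the training data, we build a function $\mathbf{y}(\mathbf{x})$ that predicts the target vector from the input (learning).
    After learning, the target vector of unseen data $\mathbf{x}$ is predicted by $\mathbf{y}(\mathbf{x})$.
    Assigning each input vector to one of a finite number of discrete categories (for example, handwritten digit recognition) is called classification; when the output is a continuous variable the task is called regression.
    We start with a simple regression example.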
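  4. 1-1. A simple regression example

    As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \cdots, x_N)^T$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$. (Because this is regression, each output $t_n$ takes a continuous value.)
    Each $t_n$ is $\sin(2\pi x_n)$ plus random noise $\epsilon$ drawn from a Gaussian distribution (explained later; see Appendix A):
    $$ t_n = \sin(2\pi x_n) + \epsilon \tag{1.1} $$
    The goal of regression is to use the training data $(\mathbf{x}, \mathbf{t})$ to find the output $\hat{t}$ for a new input $\hat{x}$.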
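  5. 1-1. A simple regression example

    The figure on this slide shows an example with $N = 10$ training points (blue circles). The green curve is $\sin(2\pi x)$.
    Our goal is to reproduce the green curve from the training data as accurately as possible.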
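  6. 1-1. A simple regression example

    We now use the training data to predict the output for an unseen input.
    Because the training data are finite ($N$ points), the prediction $\hat{t}$ is uncertain; a framework that expresses this uncertainty quantitatively is introduced later.
    For now, in this section we fit the following polynomial and use it for prediction:
    $$ y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{1.2} $$
    The polynomial parameters $\mathbf{w} = (w_0, w_1, \cdots, w_M)^T$ are tuned using the training data $(\mathbf{x}, \mathbf{t})$.
    To do so, we look for the $\mathbf{w}$ $(= \mathbf{w}^\star)$ that minimises the error function $E(\mathbf{w})$ below. (Using probability theory, we will see later that minimising this error function is the result of maximum likelihood estimation.)
    $$ E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 \tag{1.3} $$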
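  7. 1-1. A simple regression example

    The figure on this slide shows the fits for polynomial orders $M = 0, 1, 3, 9$ (green: $\sin(2\pi x)$, red: $y(x, \mathbf{w}^\star)$).
    Among these, $M = 3$ appears to fit $\sin(2\pi x)$ best.
    For $M = 9$ we get $E(\mathbf{w}^\star) = 0$, yet the curve does not match $\sin(2\pi x)$ (overfitting).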
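  8. 1-1. A simple regression example

    To use a complex model (for example $M = 9$) with a limited amount of training data (for example $N = 10$) without overfitting, we apply regularisation.
    When overfitting occurs, the components of $\mathbf{w}^\star$ tend to become large positive and negative numbers, so we consider the following error function:
    $$ E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2 \tag{1.4} $$
    Here the norm is $\|\mathbf{w}\|^2 = \mathbf{w}^T \mathbf{w} = w_0^2 + w_1^2 + \cdots + w_M^2$, and $\lambda$ is a positive parameter that controls the relative importance of the regularisation term and the sum-of-squares term.
    With this error function, the error grows as the parameters grow in absolute value, so the fit keeps the parameters from becoming large.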
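
The effect of the penalty in (1.4) can be sketched numerically. The following is a minimal NumPy sketch, not the author's code: the noise level (0.3), the random seed, the evenly spaced inputs, and the helper names `fit_polynomial` and `predict` are all assumptions made for illustration.

```python
import numpy as np

def fit_polynomial(x, t, M, lam=0.0):
    """Return w minimising (1.4) for a degree-M polynomial; lam=0 gives (1.3)."""
    X = np.vander(x, M + 1, increasing=True)  # columns x^0, x^1, ..., x^M
    # Stacking sqrt(lam)*I under X turns the penalised problem into an
    # ordinary least-squares problem: sum (y - t)^2 + lam * ||w||^2.
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(M + 1)])
    t_aug = np.concatenate([t, np.zeros(M + 1)])
    w, *_ = np.linalg.lstsq(X_aug, t_aug, rcond=None)
    return w

def predict(x, w):
    """Evaluate the polynomial (1.2) at the points x."""
    return np.vander(x, w.size, increasing=True) @ w

rng = np.random.default_rng(0)            # assumed seed
x_train = np.linspace(0.0, 1.0, 10)       # N = 10 inputs (assumed evenly spaced)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.3, size=10)

w_plain = fit_polynomial(x_train, t_train, M=9)                   # lambda = 0
w_small = fit_polynomial(x_train, t_train, M=9, lam=np.exp(-18))  # mild penalty
w_large = fit_polynomial(x_train, t_train, M=9, lam=1.0)          # strong penalty

for name, w in [("lam=0", w_plain), ("lam=e^-18", w_small), ("lam=1", w_large)]:
    print(name, "||w|| =", np.linalg.norm(w))
```

With $\lambda = 0$ the degree-9 polynomial passes through all ten points, and the printed norms show how increasing $\lambda$ shrinks the coefficients.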
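  9. 1-1. A simple regression example

    The figure on this slide shows the fits obtained with $M = 9$ and the error function (1.4), for $\lambda = e^{-18}$ and $\lambda = 1$.
    With $\lambda = e^{-18}$, overfitting is suppressed compared with $\lambda = 0$.
    With $\lambda = 1$, the second term on the right-hand side of (1.4) is weighted too heavily and the components of $\mathbf{w}^\star$ are pushed too close to 0.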
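  10. 1-2. Probability theory and probability distributions

    Uncertainty plays an important role in pattern recognition, and probability theory is introduced to evaluate it quantitatively.
    Consider random variables $X$ and $Y$ that take the values $X = x_i$ $(i = 1, 2, \cdots, M)$ and $Y = y_j$ $(j = 1, 2, \cdots, L)$. The probability that $X = x_i$ and $Y = y_j$ (the joint probability) is written $p(X = x_i, Y = y_j)$.
    The probability $p(X = x_i)$ can be written in terms of the joint probability (sum rule):
    $$ p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j) \tag{1.5} $$
    Writing $p(Y = y_j | X = x_i)$ for the probability that $Y = y_j$ given $X = x_i$ (the conditional probability), the following relation holds (product rule):
    $$ p(X = x_i, Y = y_j) = p(Y = y_j | X = x_i)\, p(X = x_i) \tag{1.6} $$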
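
As a small numerical check of the sum rule (1.5) and the product rule (1.6), here is a sketch using a made-up joint probability table; the numbers are hypothetical and chosen only so that they sum to 1.

```python
import numpy as np

# Hypothetical joint table p(X = x_i, Y = y_j): rows index i, columns index j.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.05, 0.30]])

# Sum rule (1.5): marginalise over Y by summing along the columns.
p_x = joint.sum(axis=1)

# Conditional probability p(Y = y_j | X = x_i): each row divided by p(X = x_i).
p_y_given_x = joint / p_x[:, None]

# Product rule (1.6): conditional times marginal recovers the joint table.
reconstructed = p_y_given_x * p_x[:, None]

print("p(X):", p_x)
print("p(Y|X):")
print(p_y_given_x)
```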
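  11. 1-2. Probability theory and probability distributions

    Bayes' theorem follows from the product rule and the symmetry of the joint probability, $p(X, Y) = p(Y, X)$:
    $$ p(Y | X) = \frac{p(X | Y)\, p(Y)}{p(X)} \tag{1.7} $$
    Here $p(Y)$ is called the prior probability (the probability before $X$ is given) and $p(Y | X)$ the posterior probability (the probability after $X$ is given).
    Bayes' theorem says that multiplying the prior $p(Y)$ by the likelihood $p(X | Y)$ gives the posterior $p(Y | X)$, where $p(X)$ is the normalisation constant that guarantees $p(Y | X)$ is normalised with respect to $Y$.
    Furthermore, $X$ and $Y$ are said to be independent when the joint distribution $p(X, Y)$ factorises into the product of the marginals:
    $$ p(X, Y) = p(X)\, p(Y) \tag{1.8} $$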
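  12. 1-2. Probability theory and probability distributions

    So far we have considered discrete random variables. Next we consider distributions of continuous random variables.
    If the probability that a random variable $x$ falls in the interval $(x, x + \delta x)$ is given by $p(x)\,\delta x$ as $\delta x \to 0$, then $p(x)$ is called the probability density.
    The probability that $x$ lies in the interval $(a, b)$ is then
    $$ p(x \in (a, b)) = \int_a^b p(x)\, dx \tag{1.9} $$
    Because probabilities are non-negative and normalised, $p(x)$ satisfies
    $$ p(x) \ge 0 \tag{1.10} $$
    $$ \int_{-\infty}^{\infty} p(x)\, dx = 1 \tag{1.11} $$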
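  13. 1-2. Probability theory and probability distributions

    An important operation in probability theory is the weighted average.
    For a continuous random variable $x$, the average of a function $f(x)$ under the distribution $p(x)$ is
    $$ \mathbb{E}[f] = \int p(x) f(x)\, dx \tag{1.12} $$
    As a notational convention, a subscript indicates which variable is summed (or integrated) over. For example, the following quantity is summed (or integrated) over $x$:
    $$ \mathbb{E}_x[f(x, y)] \tag{1.13} $$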
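  14. 1-2. Probability theory and probability distributions

    The variance of a function $f(x)$ under the distribution $p(x)$ measures how much $f(x)$ varies around its mean $\mathbb{E}[f(x)]$:
    $$ \mathrm{var}[f] = \mathbb{E}\left[ (f(x) - \mathbb{E}[f(x)])^2 \right] \tag{1.14} $$
    In particular, for $f(x) = x$:
    $$ \mathrm{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 \tag{1.15} $$
    The covariance between two random variables $x$ and $y$, which expresses how they depend on each other, is defined as
    $$ \mathrm{cov}[x, y] = \mathbb{E}_{x,y}\left[ \{x - \mathbb{E}[x]\}\{y - \mathbb{E}[y]\} \right] = \mathbb{E}_{x,y}[xy] - \mathbb{E}[x]\,\mathbb{E}[y] \tag{1.16} $$
    When $x$ and $y$ are independent, $\mathrm{cov}[x, y] = 0$.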
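
The identities (1.14) to (1.16) can be checked on a small example. The sketch below uses a discrete distribution (sums instead of the integrals in the slides) with made-up values and probabilities; `expect` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical discrete distribution over x.
x_values = np.array([0.0, 1.0, 2.0, 3.0])
p_x = np.array([0.1, 0.2, 0.3, 0.4])

def expect(f_values, p):
    """Weighted average sum_i p_i * f_i, the discrete analogue of (1.12)."""
    return float(np.sum(p * f_values))

mean_x = expect(x_values, p_x)
var_definition = expect((x_values - mean_x) ** 2, p_x)     # (1.14) with f(x) = x
var_shortcut = expect(x_values ** 2, p_x) - mean_x ** 2    # (1.15)

# An independent second variable y: the joint table is the outer product (1.8).
y_values = np.array([-1.0, 1.0])
p_y = np.array([0.5, 0.5])
joint = np.outer(p_x, p_y)
mean_y = expect(y_values, p_y)

# Covariance (1.16) computed from the joint table.
cov_xy = float(np.sum(joint * np.outer(x_values - mean_x, y_values - mean_y)))

print(mean_x, var_definition, var_shortcut, cov_xy)
```

Both variance expressions agree, and the covariance of the independent pair vanishes up to rounding.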
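  15. 1-2. Probability theory and probability distributions

    An important property of the Gaussian distribution is that the mean and variance of $x$ are given by $\mu$ and $\sigma^2$ respectively:
    $$ \mathbb{E}[x] = \int_{-\infty}^{\infty} \mathcal{N}(x | \mu, \sigma^2)\, x\, dx = \mu \tag{1.18} $$
    $$ \mathrm{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 = \sigma^2 \tag{1.19} $$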
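  16. 1-2. Probability theory and probability distributions

    Next we introduce the multivariate Gaussian distribution for a $D$-dimensional vector $\mathbf{x}$:
    $$ \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\boldsymbol{\Sigma}|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\} \tag{1.20} $$
    Here $\boldsymbol{\mu}$ is the $D$-dimensional mean vector and $\boldsymbol{\Sigma}$ is the $D \times D$ covariance matrix.
    In this case too, the mean and covariance satisfy
    $$ \mathbb{E}[\mathbf{x}] = \int \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Sigma})\, \mathbf{x}\, d\mathbf{x} = \boldsymbol{\mu} \tag{1.21} $$
    $$ \mathrm{cov}[\mathbf{x}] = \mathbb{E}\left[ (\mathbf{x} - \mathbb{E}[\mathbf{x}])(\mathbf{x} - \mathbb{E}[\mathbf{x}])^T \right] = \boldsymbol{\Sigma} \tag{1.22} $$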
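  17. 1-2. Probability theory and probability distributions

    We now state Gaussian formulas that are used often in what follows.
    Suppose the marginal $p(\mathbf{x})$ and the conditional $p(\mathbf{y} | \mathbf{x})$ are given by
    $$ p(\mathbf{x}) = \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1}) \tag{1.23} $$
    $$ p(\mathbf{y} | \mathbf{x}) = \mathcal{N}(\mathbf{y} | \mathbf{A}\mathbf{x} + \mathbf{b}, \mathbf{L}^{-1}) \tag{1.24} $$
    where $\boldsymbol{\mu}$, $\mathbf{A}$, $\mathbf{b}$ are parameters governing the means and $\boldsymbol{\Lambda}$, $\mathbf{L}$ are precision matrices.
    Then the marginal $p(\mathbf{y})$ and the conditional $p(\mathbf{x} | \mathbf{y})$ are
    $$ p(\mathbf{y}) = \mathcal{N}(\mathbf{y} | \mathbf{A}\boldsymbol{\mu} + \mathbf{b}, \mathbf{L}^{-1} + \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^T) \tag{1.25} $$
    $$ p(\mathbf{x} | \mathbf{y}) = \mathcal{N}(\mathbf{x} | \boldsymbol{\Sigma}\{\mathbf{A}^T \mathbf{L} (\mathbf{y} - \mathbf{b}) + \boldsymbol{\Lambda}\boldsymbol{\mu}\}, \boldsymbol{\Sigma}) \tag{1.26} $$
    where $\boldsymbol{\Sigma}$ is defined by
    $$ \boldsymbol{\Sigma} = (\boldsymbol{\Lambda} + \mathbf{A}^T \mathbf{L} \mathbf{A})^{-1} \tag{1.27} $$
    (See PRML Section 2.3.3 for the detailed derivation.)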
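
One way to gain confidence in (1.25) to (1.27) is to compare them with the textbook rule for conditioning a joint Gaussian, whose cross-covariance between $\mathbf{x}$ and $\mathbf{y}$ is $\boldsymbol{\Lambda}^{-1}\mathbf{A}^T$. The sketch below does this for arbitrary, made-up values of $\boldsymbol{\mu}$, $\boldsymbol{\Lambda}$, $\mathbf{A}$, $\mathbf{b}$, $\mathbf{L}$ and an observed $\mathbf{y}$; it is an illustrative check, not a derivation.

```python
import numpy as np

# Hypothetical parameters: x is 2-dimensional, y is 3-dimensional.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.5],
                [0.5, 1.0]])                 # precision of p(x), positive definite
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
b = np.array([0.5, 0.0, -0.5])
L = np.diag([4.0, 2.0, 1.0])                 # precision of p(y|x)
y = np.array([0.3, -0.2, 1.1])               # an observed value of y

Lam_inv = np.linalg.inv(Lam)
L_inv = np.linalg.inv(L)

# Marginal p(y), equation (1.25).
mean_y = A @ mu + b
cov_y = L_inv + A @ Lam_inv @ A.T

# Conditional p(x|y), equations (1.26) and (1.27).
Sigma = np.linalg.inv(Lam + A.T @ L @ A)
mean_x_given_y = Sigma @ (A.T @ L @ (y - b) + Lam @ mu)

# Cross-check with the standard conditioning rule for a joint Gaussian.
cov_xy = Lam_inv @ A.T
gain = cov_xy @ np.linalg.inv(cov_y)
mean_check = mu + gain @ (y - mean_y)
cov_check = Lam_inv - gain @ cov_xy.T

print(mean_x_given_y, mean_check)
```

The two routes give the same conditional mean and covariance, which is the matrix inversion lemma at work.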
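  18. 1-2. Probability theory and probability distributions

    Suppose also that the joint distribution $p(\mathbf{x}_a, \mathbf{x}_b)$ is given by
    $$ p(\mathbf{x}_a, \mathbf{x}_b) = \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Sigma}) \tag{1.28} $$
    where $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)^T$.
    The marginal $p(\mathbf{x}_a)$ is then known to be the following Gaussian (see PRML Section 2.3.2 for the derivation):
    $$ p(\mathbf{x}_a) = \int p(\mathbf{x}_a, \mathbf{x}_b)\, d\mathbf{x}_b = \mathcal{N}(\mathbf{x}_a | \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_{aa}) \tag{1.29} $$
    where $\boldsymbol{\mu}_a$ and $\boldsymbol{\Sigma}_{aa}$ are defined by the partition
    $$ \boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{pmatrix} \tag{1.30} $$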
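  19. 1-3. Maximum likelihood estimation and Bayesian estimation

    We explain Bayesian estimation using polynomial curve fitting as the example.
    In the Bayesian interpretation of probability, before observing any data we encode our assumptions about the parameters $\mathbf{w}$ as a prior probability $p(\mathbf{w})$.
    Using the actual input data $\mathbf{x} = (x_1, x_2, \cdots, x_N)^T$ and target variables $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$, we obtain the likelihood function $p(\mathbf{t} | \mathbf{x}, \mathbf{w})$.
    Bayes' theorem then gives the posterior probability $p(\mathbf{w} | \mathbf{t}, \mathbf{x})$:
    $$ p(\mathbf{w} | \mathbf{t}, \mathbf{x}) = \frac{p(\mathbf{t} | \mathbf{x}, \mathbf{w})\, p(\mathbf{w})}{p(\mathbf{t})} \tag{1.31} $$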
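  20. 1-3. Maximum likelihood estimation and Bayesian estimation

    In Bayesian estimation, given the training data $\mathbf{x}, \mathbf{t}$ and an unseen input $x$, the probability of the prediction $t$, $p(t | x, \mathbf{t}, \mathbf{x})$, is obtained as
    $$ p(t | x, \mathbf{t}, \mathbf{x}) = \int p(t | x, \mathbf{w})\, p(\mathbf{w} | \mathbf{t}, \mathbf{x})\, d\mathbf{w} \tag{1.32} $$
    (The derivation of this predictive distribution is summarised in the following Qiita article. Please take a look, and please give it a like.)
    https://qiita.com/gucchi0403/items/bfffd2586272a4c05a73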
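  21. 1-3. Maximum likelihood estimation and Bayesian estimation

    The role of the likelihood function $p(\mathcal{D} | \mathbf{w})$ differs between the frequentist and the Bayesian interpretations of probability.
    In the frequentist interpretation, $\mathbf{w}$ is regarded as a fixed parameter, and the $\mathbf{w}$ that maximises the likelihood $p(\mathcal{D} | \mathbf{w})$ is taken as the estimator (a single value of $\mathbf{w}$ is determined).
    In the Bayesian interpretation, the likelihood is used to update the prior distribution into the posterior distribution using the observed data $\mathcal{D}$ (the posterior $p(\mathbf{w} | \mathcal{D})$ is a probability distribution over $\mathbf{w}$, so $\mathbf{w}$ carries uncertainty).
    A concrete example of the latter use of the likelihood is shown later.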
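  22. 2-1. Linear basis function models

    The simple regression model described at the beginning took the output $y(x, \mathbf{w})$ to be a polynomial in the input variable $x$:
    $$ y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{2.1} $$
    where $\mathbf{w} = (w_0, w_1, \cdots, w_M)^T$ is the parameter vector.
    In this chapter we generalise: the input is a vector $\mathbf{x}$, and the function $y(\mathbf{x}, \mathbf{w})$ is expanded in nonlinear basis functions $\phi_j(\mathbf{x})$ $(j = 1, \cdots, M-1)$:
    $$ y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{M-1} w_j \phi_j(\mathbf{x}) \tag{2.2} $$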
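  23. 2-1. Linear basis function models

    To shorten the expression, set $\phi_0(\mathbf{x}) = 1$ and define $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \cdots, \phi_{M-1}(\mathbf{x}))^T$. Then (2.2) can be written as
    $$ y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{2.3} $$
    One example of a basis function $\phi_j(x)$ is the Gaussian basis function
    $$ \phi_j(x) = \exp\left\{ -\frac{(x - \mu_j)^2}{2 s^2} \right\} \tag{2.4} $$
    which is centred at $x = \mu_j$ and has a width governed by $s^2$.
    From here on the discussion uses general basis functions $\phi_j(\mathbf{x})$.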
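
A minimal sketch of (2.3) with the Gaussian basis (2.4) for scalar inputs follows. The function names, the centres, and the width `s` are assumptions for illustration; the slides do not specify them.

```python
import numpy as np

def gaussian_features(x, centres, s):
    """Rows are phi(x_n)^T: a constant phi_0 = 1 followed by Gaussian bumps (2.4)."""
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Broadcasting builds one column per centre mu_j.
    bumps = np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2.0 * s ** 2))
    return np.hstack([np.ones((x.size, 1)), bumps])

def model_output(x, w, centres, s):
    """y(x, w) = w^T phi(x), equation (2.3), evaluated for every input in x."""
    return gaussian_features(x, centres, s) @ w

# Hypothetical settings: two centres, width 0.5, three weights (bias + 2 bumps).
centres = [0.0, 1.0]
features = gaussian_features([0.0, 0.5], centres, s=0.5)
outputs = model_output([0.0, 0.5], np.array([1.0, 2.0, -1.0]), centres, s=0.5)
print(features)
print(outputs)
```

Note that the model stays linear in $\mathbf{w}$ even though each $\phi_j$ is nonlinear in $x$, which is what makes the closed-form solutions in the following sections possible.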
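  24. 2-2. Maximum likelihood estimation for linear basis function models

    In the regression problem of the first chapter, the data points were fitted with a polynomial by minimising the sum-of-squares error.
    This time we assume the target variable $t$ is the sum of a deterministic function $y(\mathbf{x}, \mathbf{w})$ and a noise term $\epsilon$ drawn from a Gaussian $\mathcal{N}(\epsilon | 0, \beta^{-1})$ with zero mean and precision $\beta > 0$:
    $$ t = y(\mathbf{x}, \mathbf{w}) + \epsilon \tag{2.5} $$
    Since $\epsilon = t - y(\mathbf{x}, \mathbf{w})$, the target variable $t$ is also Gaussian:
    $$ p(t | \mathbf{x}, \mathbf{w}, \beta) = \mathcal{N}(t - y(\mathbf{x}, \mathbf{w}) | 0, \beta^{-1}) = \mathcal{N}(t | y(\mathbf{x}, \mathbf{w}), \beta^{-1}) \tag{2.6} $$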
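  25. 2-2. Maximum likelihood estimation for linear basis function models

    We prepare a set of inputs $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding target variables $\{t_1, t_2, \cdots, t_N\}$, and stack the targets into a column vector $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$.
    If the observations $\{t_1, t_2, \cdots, t_N\}$ are generated independently from the distribution (2.6), the likelihood function is the product of the distributions of the individual data points:
    $$ p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n | y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.7} $$
    where the Gaussian distribution $\mathcal{N}(x | \mu, \sigma^2)$ is
    $$ \mathcal{N}(x | \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\} \tag{2.8} $$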
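  26. 2-2. Maximum likelihood estimation for linear basis function models

    Instead of finding the parameters that maximise the likelihood (2.7), we find the parameters that maximise its logarithm.
    First,
    $$ \ln\left\{ \mathcal{N}(t_n | y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \right\} = \ln\left[ \frac{\beta^{1/2}}{(2\pi)^{1/2}} \exp\left\{ -\frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right\} \right] = \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{2.9} $$
    so $\ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta)$ becomes
    $$ \ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta) = \sum_{n=1}^{N} \ln \mathcal{N}(t_n | y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) = \sum_{n=1}^{N} \left[ \frac{1}{2}\ln\beta - \frac{1}{2}\ln(2\pi) - \frac{\beta}{2} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \right] = \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \frac{\beta}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{2.10} $$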
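  27. 2-2. Maximum likelihood estimation for linear basis function models

    Defining the sum-of-squares error $E_D(\mathbf{w})$ as
    $$ E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{2.11} $$
    the log likelihood becomes
    $$ \ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta) = \frac{N}{2}\ln\beta - \frac{N}{2}\ln(2\pi) - \beta E_D(\mathbf{w}) \tag{2.12} $$
    To find the maximum likelihood solutions $\mathbf{w}_{\mathrm{ML}}$ and $\beta_{\mathrm{ML}}$ we take the gradient of the log likelihood.
    The gradient with respect to $\mathbf{w}$ depends on $\beta$ only through an overall positive factor, so the point where it vanishes does not depend on $\beta$. We can therefore find $\mathbf{w}_{\mathrm{ML}}$ first and then use $\ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta)$ to find $\beta_{\mathrm{ML}}$.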
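  28. 2-2. Maximum likelihood estimation for linear basis function models

    Consider maximising the log likelihood (2.12) with respect to $\mathbf{w}$. The first and second terms on the right-hand side of (2.12) do not depend on $\mathbf{w}$, so this is equivalent to maximising the third term, $-\beta E_D(\mathbf{w})$.
    Since $\beta > 0$, maximising the log likelihood (2.12) with respect to $\mathbf{w}$ is equivalent to minimising the sum-of-squares error $E_D(\mathbf{w})$ of (2.11) with respect to $\mathbf{w}$.
    In 1-1 we minimised the sum-of-squares error (1.3) heuristically. In probabilistic terms, that minimisation is the result of maximum likelihood estimation when the likelihood is assumed to be Gaussian.
    We now compute the gradient of the log likelihood with respect to $\mathbf{w}$.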
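  29. 2-2. Maximum likelihood estimation for linear basis function models

    Using $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$, the gradient of the log likelihood with respect to $\mathbf{w}$ is
    $$ \frac{\partial}{\partial \mathbf{w}} \ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta) = -\beta \frac{\partial}{\partial \mathbf{w}} E_D(\mathbf{w}) = -\frac{\beta}{2} \sum_{n=1}^{N} \frac{\partial}{\partial \mathbf{w}} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 = \beta \sum_{n=1}^{N} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))\, \boldsymbol{\phi}(\mathbf{x}_n) = \beta \left\{ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w} \right\} \tag{2.13} $$
    so the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ satisfies
    $$ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w}_{\mathrm{ML}} = 0 \tag{2.14} $$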
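  30. 2-2. Maximum likelihood estimation for linear basis function models

    Define the design matrix $\boldsymbol{\Phi}$ as
    $$ \boldsymbol{\Phi} = \begin{pmatrix} \phi_0(\mathbf{x}_1) & \phi_1(\mathbf{x}_1) & \cdots & \phi_{M-1}(\mathbf{x}_1) \\ \phi_0(\mathbf{x}_2) & \phi_1(\mathbf{x}_2) & \cdots & \phi_{M-1}(\mathbf{x}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_0(\mathbf{x}_N) & \phi_1(\mathbf{x}_N) & \cdots & \phi_{M-1}(\mathbf{x}_N) \end{pmatrix} = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}_1)^T \\ \boldsymbol{\phi}(\mathbf{x}_2)^T \\ \vdots \\ \boldsymbol{\phi}(\mathbf{x}_N)^T \end{pmatrix} \tag{2.15} $$
    Then the following identities hold:
    $$ \boldsymbol{\Phi}^T \boldsymbol{\Phi} = \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \tag{2.16} $$
    $$ \boldsymbol{\Phi}^T \mathbf{t} = \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{2.17} $$
    and (2.14) becomes
    $$ \boldsymbol{\Phi}^T \mathbf{t} - \boldsymbol{\Phi}^T \boldsymbol{\Phi} \mathbf{w}_{\mathrm{ML}} = 0 \tag{2.18} $$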
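  31. 2-2. Maximum likelihood estimation for linear basis function models

    Hence the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ is
    $$ \mathbf{w}_{\mathrm{ML}} = (\boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{t} \tag{2.19} $$
    Next, substituting $\mathbf{w}_{\mathrm{ML}}$ and differentiating $\ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta)$ with respect to $\beta$ gives
    $$ \frac{\partial}{\partial \beta} \ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}_{\mathrm{ML}}, \beta) = \frac{N}{2} \frac{1}{\beta} - E_D(\mathbf{w}_{\mathrm{ML}}) \tag{2.20} $$
    Setting this to zero, the inverse of the maximum likelihood solution $\beta_{\mathrm{ML}}$ is
    $$ \frac{1}{\beta_{\mathrm{ML}}} = \frac{2}{N} E_D(\mathbf{w}_{\mathrm{ML}}) = \frac{1}{N} \sum_{n=1}^{N} (t_n - \mathbf{w}_{\mathrm{ML}}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 \tag{2.21} $$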
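  32. 2-2. Maximum likelihood estimation for linear basis function models

    Consequently, the predictive distribution $p(t | \mathbf{x}, \mathbf{w}_{\mathrm{ML}}, \beta_{\mathrm{ML}})$ of the target variable $t$ for a new input vector $\mathbf{x}$ is
    $$ p(t | \mathbf{x}, \mathbf{w}_{\mathrm{ML}}, \beta_{\mathrm{ML}}) = \mathcal{N}(t | y(\mathbf{x}, \mathbf{w}_{\mathrm{ML}}), \beta_{\mathrm{ML}}^{-1}) \tag{2.22} $$
    where $\mathbf{w}_{\mathrm{ML}}$ and $\beta_{\mathrm{ML}}$ are given by (2.19) and (2.21).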
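
The closed-form estimates (2.19) and (2.21) and the plug-in predictive distribution (2.22) can be sketched as follows. The toy data (a noisy straight line), the seed, and the function names are assumptions for illustration. `np.linalg.lstsq` is used instead of forming $(\boldsymbol{\Phi}^T\boldsymbol{\Phi})^{-1}$ explicitly, which gives the same solution when $\boldsymbol{\Phi}$ has full column rank.

```python
import numpy as np

def fit_maximum_likelihood(Phi, t):
    """Return (w_ML, beta_ML) for design matrix Phi and targets t."""
    # (2.19): least-squares solution of Phi w = t.
    w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    residual = t - Phi @ w_ml
    # (2.21): 1 / beta_ML is the mean squared residual.
    beta_ml = 1.0 / np.mean(residual ** 2)
    return w_ml, beta_ml

# Hypothetical toy data: a straight line with Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
t = -0.3 + 0.5 * x + rng.normal(scale=0.2, size=x.size)
Phi = np.column_stack([np.ones_like(x), x])   # phi(x) = (1, x)^T

w_ml, beta_ml = fit_maximum_likelihood(Phi, t)

def predictive(x_new):
    """Mean and variance of the Gaussian (2.22) at a new scalar input."""
    return w_ml[0] + w_ml[1] * x_new, 1.0 / beta_ml

print("w_ML =", w_ml, "beta_ML =", beta_ml, "prediction at 0.5:", predictive(0.5))
```

The predictive variance here is the same at every input, because the plug-in approach ignores the uncertainty in $\mathbf{w}$.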
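  33. 2-3. Bayesian linear regression

    Next we treat the linear regression model in a Bayesian way.
    We assume the following prior distribution with mean $\mathbf{m}_0$ and covariance $\mathbf{S}_0$:
    $$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w} | \mathbf{m}_0, \mathbf{S}_0) \tag{2.23} $$
    The likelihood function is
    $$ p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n | y(\mathbf{x}_n, \mathbf{w}), \beta^{-1}) \tag{2.24} $$
    so by Bayes' theorem the posterior distribution $p(\mathbf{w} | \mathbf{t})$ is
    $$ p(\mathbf{w} | \mathbf{t}) \propto p(\mathbf{t} | \mathbf{X}, \mathbf{w}, \beta)\, p(\mathbf{w}) \propto \exp\left( -\frac{\beta}{2} \sum_{n=1}^{N} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 \right) \times \exp\left( -\frac{1}{2} (\mathbf{w} - \mathbf{m}_0)^T \mathbf{S}_0^{-1} (\mathbf{w} - \mathbf{m}_0) \right) \tag{2.25} $$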
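  34. 2-3. Bayesian linear regression

    From (2.25), the exponent is quadratic in $\mathbf{w}$, so $p(\mathbf{w} | \mathbf{t})$ is a Gaussian distribution.
    Concretely (see PRML Exercise 3.7):
    $$ p(\mathbf{w} | \mathbf{t}) = \mathcal{N}(\mathbf{w} | \mathbf{m}_N, \mathbf{S}_N) \tag{2.26} $$
    where $\mathbf{m}_N$ and $\mathbf{S}_N$ are
    $$ \mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1} \mathbf{m}_0 + \beta \boldsymbol{\Phi}^T \mathbf{t}) \tag{2.27} $$
    $$ \mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \tag{2.28} $$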
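  35. 2-3. Bayesian linear regression

    We now examine the relation between the maximum likelihood solution $\mathbf{w}_{\mathrm{ML}}$ of (2.19), the mode $\mathbf{w}_{\mathrm{MAP}}$ of the posterior $p(\mathbf{w} | \mathbf{t})$ (the $\mathbf{w}$ that maximises $p(\mathbf{w} | \mathbf{t})$), and the posterior mean $\mathbf{m}_N$.
    First, the mode of a Gaussian coincides with its mean (see PRML Exercise 1.9), so $\mathbf{w}_{\mathrm{MAP}} = \mathbf{m}_N$.
    Next, consider an infinitely broad prior $\mathbf{S}_0 = \alpha^{-1}\mathbf{I}$ with $\alpha \to 0$. Then
    $$ \mathbf{S}_N^{-1} = \mathbf{S}_0^{-1} + \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \to \beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} \tag{2.29} $$
    and
    $$ \mathbf{m}_N = \mathbf{S}_N (\mathbf{S}_0^{-1} \mathbf{m}_0 + \beta \boldsymbol{\Phi}^T \mathbf{t}) \to (\boldsymbol{\Phi}^T \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^T \mathbf{t} \tag{2.30} $$
    so in this limit $\mathbf{w}_{\mathrm{MAP}} = \mathbf{m}_N = \mathbf{w}_{\mathrm{ML}}$.
    In other words, with a prior that carries no information (an infinitely broad one), the parameters that maximise the posterior coincide with the parameters that maximise the likelihood.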
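  36. 2-3. Bayesian linear regression

    Earlier we said that, in the Bayesian treatment, the likelihood is what updates the distribution over $\mathbf{w}$. We now look at that updating through an example.
    As the setup, the mean of the likelihood is $y(x, \mathbf{w}) = w_0 + w_1 x$.
    For the training data, the inputs $x_n$ are drawn from the uniform distribution on $[-1, 1]$ and the corresponding targets $t_n$ are generated with Gaussian noise $\epsilon$ of mean 0 and standard deviation 0.2:
    $$ t_n = f(x_n, a_0 = -0.3, a_1 = 0.5) + \epsilon \tag{2.31} $$
    where
    $$ f(x, a_0, a_1) = a_0 + a_1 x \tag{2.32} $$
    The goal here is for the parameters $w_0, w_1$ to recover $a_0 = -0.3, a_1 = 0.5$ from the training data.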
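  37. 2-3. Bayesian linear regression

    The precision of the likelihood is assumed known, $\beta = (1/0.2)^2 = 25$, and the prior is the following isotropic Gaussian with parameter $\alpha = 2.0$:
    $$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w} | \mathbf{0}, \alpha^{-1}\mathbf{I}) \tag{2.33} $$
    With this setup we look at how the posterior is updated as the training data grow.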
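
The update described here can be sketched by applying (2.27) and (2.28) with $\mathbf{m}_0 = \mathbf{0}$ and $\mathbf{S}_0 = \alpha^{-1}\mathbf{I}$ to growing subsets of the data. This is an illustrative sketch: the seed and the helper name `posterior` are assumptions, and the printed numbers depend on the random draw.

```python
import numpy as np

alpha, beta = 2.0, 25.0          # values from the setup above
a0, a1 = -0.3, 0.5               # parameters of the generating line (2.32)

rng = np.random.default_rng(1)   # assumed seed
x_all = rng.uniform(-1.0, 1.0, size=20)
t_all = a0 + a1 * x_all + rng.normal(scale=0.2, size=20)

def posterior(x, t, alpha, beta):
    """Posterior mean m_N and covariance S_N for the prior N(w | 0, I / alpha)."""
    Phi = np.column_stack([np.ones_like(x), x])           # phi(x) = (1, x)^T
    S_N_inv = alpha * np.eye(2) + beta * Phi.T @ Phi      # (2.28)
    S_N = np.linalg.inv(S_N_inv)
    m_N = beta * S_N @ Phi.T @ t                          # (2.27) with m_0 = 0
    return m_N, S_N

determinants = []
for n in (0, 1, 2, 20):
    m_N, S_N = posterior(x_all[:n], t_all[:n], alpha, beta)
    determinants.append(np.linalg.det(S_N))
    print(f"N = {n:2d}  m_N = {m_N}  det(S_N) = {determinants[-1]:.3e}")
```

With $N = 0$ the result is just the prior. The determinant of $\mathbf{S}_N$ shrinks with every added data point, which is the numerical counterpart of the posterior becoming sharper.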
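  38. 2-3. Bayesian linear regression

    First, the graphs for the stage before any training data are observed.
    The left graph is the prior $p(\mathbf{w})$. The right graph draws the function $y(x, \mathbf{w}) = w_0 + w_1 x$ six times, each with parameters $w_0, w_1$ sampled at random from the prior.
    Naturally, with no training data, the six functions are scattered, and none is close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.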
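  39. 2-3. Bayesian linear regression

    Next, the graphs after one training point has been observed.
    The left graph plots the likelihood $p(t | x, \mathbf{w})$ of this data point as a function of $\mathbf{w}$. The white cross marks the point $a_0 = -0.3, a_1 = 0.5$.
    The middle graph is the posterior, that is, the prior $p(\mathbf{w})$ multiplied by the likelihood $p(t | x, \mathbf{w})$ and normalised. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.
    Because there is still little training data, the six functions are scattered and none is close to $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$, but note that every line passes near the data point (blue circle).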
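  40. 2-3. Bayesian linear regression

    Next, the graphs after the second training point has been observed.
    The left graph is again the likelihood $p(t | x, \mathbf{w})$ of this data point.
    The middle graph takes the posterior from the one-point case as the new prior and multiplies it by the likelihood. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.
    The posterior now peaks near $a_0 = -0.3, a_1 = 0.5$, the six functions start to gather around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$, and every line passes near both data points (blue circles).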
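  41. 2-3. Bayesian linear regression

    Finally, the graphs after 20 data points have been observed.
    The left graph is the likelihood $p(t | x, \mathbf{w})$ of the 20th data point.
    The middle graph is the posterior that incorporates all 20 data points. The right graph draws $y(x, \mathbf{w}) = w_0 + w_1 x$ six times with parameters $w_0, w_1$ sampled at random from that posterior.
    The posterior is sharply concentrated near $a_0 = -0.3, a_1 = 0.5$, and the six functions cluster around $f(x, a_0 = -0.3, a_1 = 0.5) = -0.3 + 0.5x$.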
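  42. 2-3. Bayesian linear regression

    Next, using the posterior $p(\mathbf{w} | \mathbf{t}, \alpha, \beta)$ and the likelihood $p(t | \mathbf{x}, \mathbf{w}, \beta)$, we find the probability distribution of the prediction $t$ for an unseen input vector $\mathbf{x}$. (The hyperparameters $\alpha, \beta$ are written explicitly again as arguments of the posterior.)
    From (1.32), the predictive distribution $p(t | \mathbf{t}, \alpha, \beta)$ is
    $$ p(t | \mathbf{t}, \alpha, \beta) = \int p(t | \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} | \mathbf{t}, \alpha, \beta)\, d\mathbf{w} \tag{2.34} $$
    Using (2.6), (2.26) and (1.25), this becomes (see PRML Exercise 3.10)
    $$ p(t | \mathbf{t}, \alpha, \beta) = \mathcal{N}(t | \mathbf{m}_N^T \boldsymbol{\phi}(\mathbf{x}), \sigma_N^2(\mathbf{x})) \tag{2.35} $$
    where the variance $\sigma_N^2(\mathbf{x})$ of the predictive distribution is
    $$ \sigma_N^2(\mathbf{x}) = \frac{1}{\beta} + \boldsymbol{\phi}(\mathbf{x})^T \mathbf{S}_N \boldsymbol{\phi}(\mathbf{x}) \tag{2.36} $$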
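
A sketch of the predictive mean and variance in (2.35) and (2.36) is given below, assuming the zero-mean isotropic prior (2.33) and a simple feature map $\boldsymbol{\phi}(x) = (1, x)^T$. The data, the seed, the query point, and the function names are hypothetical.

```python
import numpy as np

def posterior(Phi, t, alpha, beta):
    """m_N and S_N from (2.27), (2.28) with m_0 = 0 and S_0 = I / alpha."""
    S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
    return beta * S_N @ Phi.T @ t, S_N

def predictive(phi_x, m_N, S_N, beta):
    """Mean and variance of the predictive Gaussian (2.35), (2.36)."""
    mean = m_N @ phi_x
    variance = 1.0 / beta + phi_x @ S_N @ phi_x
    return mean, variance

def features(x):
    """Hypothetical feature map phi(x) = (1, x)^T for scalar inputs."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.column_stack([np.ones_like(x), x])

alpha, beta = 2.0, 25.0
rng = np.random.default_rng(2)                      # assumed seed
x_train = rng.uniform(-1.0, 1.0, size=25)
t_train = -0.3 + 0.5 * x_train + rng.normal(scale=0.2, size=25)

phi_query = features(0.8)[0]                        # query point x = 0.8
variances = []
for n in (1, 2, 4, 25):
    m_N, S_N = posterior(features(x_train[:n]), t_train[:n], alpha, beta)
    mean, var = predictive(phi_query, m_N, S_N, beta)
    variances.append(var)
    print(f"N = {n:2d}  mean = {mean:+.3f}  variance = {var:.4f}")
```

The variance never drops below the noise floor $1/\beta$, and it does not increase as nested subsets of the data grow.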
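  43. 2-3. Bayesian linear regression

    The first term of $\sigma_N^2(\mathbf{x})$, $1/\beta$, is the variance of the likelihood: the scatter (noise) of the output for a given input.
    The second term, $\boldsymbol{\phi}(\mathbf{x})^T \mathbf{S}_N \boldsymbol{\phi}(\mathbf{x})$, comes from the uncertainty in $\mathbf{w}$ (the variance of the posterior). It is specific to Bayesian estimation, which does not reduce the parameters to a point estimate.
    This second term does not increase when a new training point is added ($N \to N + 1$), that is, $\sigma_{N+1}^2(\mathbf{x}) \le \sigma_N^2(\mathbf{x})$ (see PRML Exercise 3.11).
    This expresses that the prediction becomes more certain as the training data grow.
    Finally, we use an example to see the predictive uncertainty shrink as training data are added.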
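  44. 2-3. Bayesian linear regression

    The example is the sinusoidal one used in the simple regression section.
    As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \cdots, x_N)^T$ and the $N$ corresponding target variables $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$.
    Each $t_n$ is $\sin(2\pi x_n)$ plus Gaussian random noise $\epsilon$:
    $$ t_n = \sin(2\pi x_n) + \epsilon \tag{2.37} $$
    The mean of the likelihood, $y(x, \mathbf{w})$, is expanded in the Gaussian basis functions (2.4).
    The graphs for $N = 1, 2, 4, 25$ training points are shown on the next slide.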
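  45. 2-3. Bayesian linear regression

    Blue circles are the training data, the yellow-green line is the true sine function, the red line is the mean of the predictive distribution $\mathbf{m}_N^T \boldsymbol{\phi}(x)$, and the light red band is the region of $\pm\sigma_N(x)$ around the prediction.
    As the training data grow, the red line approaches the yellow-green line and the light red band shrinks.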
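  46. 3. Classification

    So far we have dealt with regression. From here on we deal with classification.
    As in regression, the training data consist of a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding target variables $\{t_1, t_2, \cdots, t_N\}$, but in classification (strictly, binary classification) each target $t_n$ takes the discrete value 0 or 1.
    As a practical example, the input $\mathbf{x}$ could be an image, with $t = 0$ for a picture of a dog and $t = 1$ for a picture of a cat.
    As the classification model we introduce logistic regression. (Despite the word "regression" in its name, it is a classification model.)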
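  47. 3-1. Logistic regression

    As notation, the class with $t = 1$ is called $C_1$ and the class with $t = 0$ is called $C_2$.
    As in regression, we consider the following function of the input $\mathbf{x}$ and the parameters $\mathbf{w}$:
    $$ y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{3.1} $$
    where $\boldsymbol{\phi}(\mathbf{x})$ is the feature vector.
    In regression, the likelihood $p(t | \mathbf{x}, \mathbf{w}, \beta)$ was taken, as in (2.6), to be a Gaussian with mean $y(\mathbf{x}, \mathbf{w})$ and variance $\beta^{-1}$, and maximum likelihood estimation was carried out.
    Classification problems are sometimes treated with that kind of model as well (see PRML Section 4.1.3).
    Here, however, we use logistic regression.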
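  48. 3-1. Logistic regression

    Using the logistic sigmoid function $\sigma(a) = 1 / (1 + \exp(-a))$, the probability $p(C_1 | \mathbf{x}, \mathbf{w})$ (the likelihood $p(t | \mathbf{x}, \mathbf{w})$ with $t = 1$) is defined as
    $$ p(C_1 | \mathbf{x}, \mathbf{w}) = \sigma(y(\mathbf{x}, \mathbf{w})) \tag{3.3} $$
    Because this is binary classification, normalisation gives $p(C_2 | \mathbf{x}, \mathbf{w})$ in terms of $p(C_1 | \mathbf{x}, \mathbf{w})$:
    $$ p(C_2 | \mathbf{x}, \mathbf{w}) = 1 - p(C_1 | \mathbf{x}, \mathbf{w}) \tag{3.4} $$
    Since $C_1$ is the class with $t = 1$ and $C_2$ the class with $t = 0$, the likelihood $p(t | \mathbf{x}, \mathbf{w})$ is
    $$ p(t | \mathbf{x}, \mathbf{w}) = \sigma(y(\mathbf{x}, \mathbf{w}))^{t}\, (1 - \sigma(y(\mathbf{x}, \mathbf{w})))^{1-t} \tag{3.5} $$
    (A distribution of this form is called a Bernoulli distribution.)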
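
The relations (3.3) to (3.5) can be sketched with the standard logistic sigmoid $\sigma(a) = 1/(1 + e^{-a})$. The function names below are hypothetical, and the branch on the sign of `a` is only a common trick to avoid overflow in `exp`.

```python
import math

def sigmoid(a):
    """Logistic sigmoid 1 / (1 + exp(-a)), written to avoid overflow."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def bernoulli_likelihood(t, a):
    """p(t | x, w) from (3.5), where a = y(x, w) = w^T phi(x) and t is 0 or 1."""
    y = sigmoid(a)
    return y ** t * (1.0 - y) ** (1 - t)

a = 0.7  # hypothetical value of w^T phi(x)
p_c1 = bernoulli_likelihood(1, a)   # equals sigma(a), equation (3.3)
p_c2 = bernoulli_likelihood(0, a)   # equals 1 - sigma(a), equation (3.4)
print(p_c1, p_c2, p_c1 + p_c2)
```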
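  49. 3-2. Maximum likelihood estimation for logistic regression

    Next we carry out maximum likelihood estimation.
    As usual, the training data are a set of inputs $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding target variables $\{t_1, t_2, \cdots, t_N\}$.
    If each training point is generated independently from the distribution (3.5), the likelihood $p(\mathbf{t} | \mathbf{X}, \mathbf{w})$ is
    $$ p(\mathbf{t} | \mathbf{X}, \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n} \tag{3.6} $$
    where $y_n$ is defined by
    $$ y_n = \sigma(y(\mathbf{x}_n, \mathbf{w})) \tag{3.7} $$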
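  50. 3-2. Maximum likelihood estimation for logistic regression

    Finding the $\mathbf{w}$ that maximises the likelihood (3.6) is equivalent to finding the $\mathbf{w}$ that minimises the negative log likelihood:
    $$ E(\mathbf{w}) = -\ln p(\mathbf{t} | \mathbf{X}, \mathbf{w}) = -\sum_{n=1}^{N} \ln\left\{ y_n^{t_n} (1 - y_n)^{1 - t_n} \right\} = -\sum_{n=1}^{N} \left\{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \right\} \tag{3.8} $$
    This is the cross-entropy error, an error function widely used for classification.
    Just as with the sum-of-squares error in regression, minimising the cross-entropy error in classification is, in probabilistic terms, the result of maximum likelihood estimation when the likelihood is assumed to be the Bernoulli distribution (3.5).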
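  51. 3-2. Maximum likelihood estimation for logistic regression

    To find the $\mathbf{w}$ that minimises the negative log likelihood (3.8), we compute its gradient with respect to $\mathbf{w}$ (see PRML Exercise 4.13):
    $$ \nabla E(\mathbf{w}) = \sum_{n=1}^{N} (y_n - t_n)\, \boldsymbol{\phi}(\mathbf{x}_n) \tag{3.9} $$
    The gradient is a sum over data points of the difference between the prediction $y_n$ and the true label $t_n$ (the error) multiplied by the basis function vector $\boldsymbol{\phi}(\mathbf{x}_n)$.
    This has the same form as the gradient of the log likelihood (2.13) for regression in the previous chapter (see PRML Section 4.3.6).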
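  52. 3-2. Maximum likelihood estimation for logistic regression

    The $\mathbf{w}$ that makes the gradient $\nabla E(\mathbf{w})$ zero cannot be found analytically.
    The reason is that the prediction $y = \sigma(\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}))$ has the logistic function as its activation function.
    In regression, the prediction $y = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$ has the identity as its activation function, which is why (2.13) could be solved analytically.
    When the zero of $\nabla E(\mathbf{w})$ cannot be found analytically, gradient descent is one option. (It is also widely used for neural networks.)
    In gradient descent, we start from a randomly chosen initial parameter value $\mathbf{w}^{(0)}$ and update the parameters using the gradient of the error function:
    $$ \mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \eta \nabla E(\mathbf{w}^{(0)}) \tag{3.10} $$
    Here $\eta > 0$ is called the learning parameter (learning rate).
    Repeating this update moves the parameters in the direction in which $E(\mathbf{w})$ decreases, and for a suitably small $\eta$ they converge to the parameters that minimise $E(\mathbf{w})$.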
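
The update (3.10) with the gradient (3.9) can be sketched as below. The eight data points, the feature map $\boldsymbol{\phi}(x) = (1, x)^T$, the learning rate, and the step count are assumptions for illustration. The data are deliberately not linearly separable so that a finite minimiser exists, and the sketch starts from zeros rather than a random initial value to keep the output reproducible.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cross_entropy(w, Phi, t):
    """Negative log likelihood E(w), equation (3.8)."""
    y = sigmoid(Phi @ w)
    return -np.sum(t * np.log(y) + (1.0 - t) * np.log(1.0 - y))

def gradient(w, Phi, t):
    """Gradient (3.9): sum over n of (y_n - t_n) * phi(x_n)."""
    return Phi.T @ (sigmoid(Phi @ w) - t)

def gradient_descent(Phi, t, eta=0.1, steps=2000):
    """Repeat the update (3.10), starting from w = 0 (an assumption)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(steps):
        w = w - eta * gradient(w, Phi, t)
    return w

# Hypothetical, overlapping 1-D data with phi(x) = (1, x)^T.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
t = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
Phi = np.column_stack([np.ones_like(x), x])

w_hat = gradient_descent(Phi, t)
print("w =", w_hat)
print("E(0) =", cross_entropy(np.zeros(2), Phi, t), " E(w) =", cross_entropy(w_hat, Phi, t))
print("gradient norm =", np.linalg.norm(gradient(w_hat, Phi, t)))
```

If $\eta$ is chosen too large the iteration can oscillate or diverge, so in practice the learning rate is tuned or replaced by a line search or a second-order method.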
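  53. 3-3. Bayesian logistic regression

    Next we treat logistic regression in a Bayesian way.
    As discussed for regression, the aim of Bayesian estimation is the predictive distribution $p(t | \mathbf{x}, \mathbf{t}, \mathbf{X})$ of the output $t$ for an unseen input $\mathbf{x}$.
    From (1.32), this predictive distribution can be written as an integral over the weights $\mathbf{w}$:
    $$ p(t | \mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(t | \mathbf{x}, \mathbf{w})\, p(\mathbf{w} | \mathbf{t}, \mathbf{X})\, d\mathbf{w} \tag{3.11} $$
    where $p(t | \mathbf{x}, \mathbf{w})$ is the likelihood and $p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ is the posterior over the parameters.
    Since we are considering binary classification, we compute only
    $$ p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(C_1 | \mathbf{x}, \mathbf{w})\, p(\mathbf{w} | \mathbf{t}, \mathbf{X})\, d\mathbf{w} \tag{3.12} $$
    by integration, and obtain $p(C_2 | \mathbf{x}, \mathbf{t}, \mathbf{X})$ from normalisation:
    $$ p(C_2 | \mathbf{x}, \mathbf{t}, \mathbf{X}) = 1 - p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) \tag{3.13} $$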
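  54. 3-3. Bayesian logistic regression

    Unlike in regression, the integral (3.12) cannot be evaluated analytically. (This too is a consequence of the logistic sigmoid function.)
    We therefore evaluate the integral approximately.
    Here, as in PRML, the Laplace approximation is used.
    Concretely, the Laplace approximation is applied to the parameter posterior $p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ in (3.12) to approximate it by a Gaussian.
    We first give a brief explanation of the Laplace approximation.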
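  55. 3-3. Bayesian logistic regression

    First consider a one-dimensional variable $z$ with the probability distribution
    $$ p(z) = \frac{1}{Z} f(z) \tag{3.14} $$
    where $Z$ is the normalisation constant
    $$ Z = \int f(z)\, dz \tag{3.15} $$
    The aim of the Laplace approximation is to approximate $p(z)$ by a Gaussian centred on a mode of $p(z)$ (a point where $dp(z)/dz = 0$).
    First we find a mode $z = z_0$. From (3.14), the mode is the $z_0$ that satisfies
    $$ \left. \frac{df(z)}{dz} \right|_{z = z_0} = 0 \tag{3.16} $$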
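  56. 3-3. Bayesian logistic regression

    Once the mode is found, $\ln f(z)$ is approximated by its Taylor expansion around $z = z_0$ up to second order:
    $$ \ln f(z) \simeq \ln f(z_0) - \frac{1}{2} A (z - z_0)^2 \tag{3.17} $$
    where
    $$ A = -\left. \frac{d^2}{dz^2} \ln f(z) \right|_{z = z_0} \tag{3.18} $$
    Because of (3.16), there is no first-order term on the right-hand side of (3.17).
    Exponentiating both sides of (3.17) gives
    $$ f(z) \simeq f(z_0) \exp\left\{ -\frac{1}{2} A (z - z_0)^2 \right\} \tag{3.19} $$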
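  57. 3-3. Bayesian logistic regression

    After normalisation, the distribution $p(z)$ is approximated by
    $$ p(z) \simeq \left( \frac{A}{2\pi} \right)^{1/2} \exp\left\{ -\frac{1}{2} A (z - z_0)^2 \right\} \tag{3.20} $$
    This is the Laplace approximation.
    Note that the Gaussian is only well defined if $A > 0$.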
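
The one-dimensional procedure in (3.16) to (3.20) can be sketched numerically. The sketch below is an illustration under stated assumptions: derivatives are taken by central finite differences, the mode is located with Newton steps on $\ln f$, and the test function $f(z) = z^4 e^{-z}$ (for $z > 0$) is a hypothetical example whose exact answer is known ($z_0 = 4$, $A = 1/4$).

```python
import math

def second_derivative(log_f, z, h):
    """Central finite-difference estimate of d^2/dz^2 ln f(z)."""
    return (log_f(z + h) - 2.0 * log_f(z) + log_f(z - h)) / h ** 2

def laplace_1d(log_f, z_init, h=1e-4, iterations=50):
    """Return (z0, variance) of the Gaussian N(z | z0, 1/A) in (3.20)."""
    z = z_init
    for _ in range(iterations):
        first = (log_f(z + h) - log_f(z - h)) / (2.0 * h)
        # Newton step towards d/dz ln f = 0, i.e. the mode condition (3.16).
        z = z - first / second_derivative(log_f, z, h)
    A = -second_derivative(log_f, z, h)          # (3.18)
    if A <= 0:
        raise ValueError("A must be positive for the Gaussian to be defined")
    return z, 1.0 / A

# Hypothetical unnormalised density f(z) = z^4 exp(-z) on z > 0.
def log_f(z):
    return 4.0 * math.log(z) - z

z0, variance = laplace_1d(log_f, z_init=3.0)
print("mode z0 =", z0, " variance 1/A =", variance)
```

For this $f$, $\ln f = 4\ln z - z$ gives $z_0 = 4$ and $A = 4 / z_0^2 = 1/4$, so the approximating Gaussian is $\mathcal{N}(z | 4, 4)$.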
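  58. 3-3. Bayesian logistic regression

    Next we extend from a one-dimensional random variable to a vector.
    That is, we define the probability distribution $p(\mathbf{z})$ as
    $$ p(\mathbf{z}) = \frac{1}{Z} f(\mathbf{z}) \tag{3.21} $$
    where
    $$ Z = \int f(\mathbf{z})\, d\mathbf{z} \tag{3.22} $$
    As in the one-dimensional case, we find a point $\mathbf{z}_0$ where the gradient $\nabla f(\mathbf{z})$ vanishes.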
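  59. 3-3. Bayesian logistic regression

    Once the mode is found, $\ln f(\mathbf{z})$ is approximated by its Taylor expansion around $\mathbf{z}_0$:
    $$ \ln f(\mathbf{z}) \simeq \ln f(\mathbf{z}_0) - \frac{1}{2} (\mathbf{z} - \mathbf{z}_0)^T \mathbf{A} (\mathbf{z} - \mathbf{z}_0) \tag{3.23} $$
    where $\mathbf{A}$ is the $M \times M$ Hessian matrix defined by
    $$ \mathbf{A} = -\left. \nabla\nabla \ln f(\mathbf{z}) \right|_{\mathbf{z} = \mathbf{z}_0} \tag{3.24} $$
    Exponentiating both sides of (3.23) gives
    $$ f(\mathbf{z}) \simeq f(\mathbf{z}_0) \exp\left\{ -\frac{1}{2} (\mathbf{z} - \mathbf{z}_0)^T \mathbf{A} (\mathbf{z} - \mathbf{z}_0) \right\} \tag{3.25} $$
    and after normalisation the distribution $p(\mathbf{z})$ is approximated by the Gaussian
    $$ p(\mathbf{z}) \simeq \frac{|\mathbf{A}|^{1/2}}{(2\pi)^{M/2}} \exp\left\{ -\frac{1}{2} (\mathbf{z} - \mathbf{z}_0)^T \mathbf{A} (\mathbf{z} - \mathbf{z}_0) \right\} = \mathcal{N}(\mathbf{z} | \mathbf{z}_0, \mathbf{A}^{-1}) \tag{3.26} $$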
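  60. 3-3. Bayesian logistic regression

    We now use the Laplace approximation described above to approximate the integral (3.12):
    $$ p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) = \int p(C_1 | \mathbf{x}, \mathbf{w})\, p(\mathbf{w} | \mathbf{t}, \mathbf{X})\, d\mathbf{w} \tag{3.27} $$
    To obtain the posterior $p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ we first introduce a prior:
    $$ p(\mathbf{w}) = \mathcal{N}(\mathbf{w} | \mathbf{m}_0, \mathbf{S}_0) \tag{3.28} $$
    From (3.6), the likelihood $p(\mathbf{t} | \mathbf{X}, \mathbf{w})$ was
    $$ p(\mathbf{t} | \mathbf{X}, \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n} \tag{3.29} $$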
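  61. 3-3. Bayesian logistic regression

    By Bayes' theorem, the posterior $p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ satisfies
    $$ p(\mathbf{w} | \mathbf{t}, \mathbf{X}) \propto p(\mathbf{w})\, p(\mathbf{t} | \mathbf{X}, \mathbf{w}) \tag{3.30} $$
    so $\ln p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ is
    $$ \ln p(\mathbf{w} | \mathbf{t}, \mathbf{X}) = -\frac{1}{2} (\mathbf{w} - \mathbf{m}_0)^T \mathbf{S}_0^{-1} (\mathbf{w} - \mathbf{m}_0) + \sum_{n=1}^{N} \left\{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \right\} + \mathrm{const.} \tag{3.31} $$
    We find the parameters $\mathbf{w}_{\mathrm{MAP}}$ that maximise this log posterior (for example by gradient descent) and evaluate the Hessian at $\mathbf{w}_{\mathrm{MAP}}$:
    $$ \mathbf{S}_N^{-1} = -\left. \nabla\nabla \ln p(\mathbf{w} | \mathbf{t}, \mathbf{X}) \right|_{\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}} = \mathbf{S}_0^{-1} + \left. \sum_{n=1}^{N} y_n (1 - y_n)\, \boldsymbol{\phi}_n \boldsymbol{\phi}_n^T \right|_{\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}} \tag{3.32} $$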
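  62. 3-3. Bayesian logistic regression

    Hence, with the Laplace approximation, the posterior $p(\mathbf{w} | \mathbf{t}, \mathbf{X})$ is approximated by
    $$ p(\mathbf{w} | \mathbf{t}, \mathbf{X}) \simeq \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N) \tag{3.33} $$
    and the integral (3.27) is approximated by
    $$ p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) \simeq \int \sigma(\mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} \tag{3.34} $$
    where we used $p(C_1 | \mathbf{x}, \mathbf{w}) = \sigma(\mathbf{w}^T \boldsymbol{\phi})$.
    Next, the logistic sigmoid function is rewritten as
    $$ \sigma(\mathbf{w}^T \boldsymbol{\phi}) = \int \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \sigma(a)\, da \tag{3.35} $$
    where $\delta(\cdot)$ is the Dirac delta function.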
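  63. 3-3. Bayesian logistic regression

    With this, (3.34) can be rewritten as
    $$ \int \sigma(\mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} = \int\!\!\int \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \sigma(a)\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, da\, d\mathbf{w} = \int \left[ \int \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} \right] \sigma(a)\, da = \int p(a)\, \sigma(a)\, da \tag{3.36} $$
    where
    $$ p(a) = \int \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} \tag{3.37} $$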
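  64. 3-3. Bayesian logistic regression

    In the integral (3.37), integrating over the component of $\mathbf{w}$ parallel to $\boldsymbol{\phi}$ imposes a linear constraint on that component, while integrating over all directions of $\mathbf{w}$ orthogonal to $\boldsymbol{\phi}$ marginalises the Gaussian $\mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)$.
    For example, take $\mathbf{w} = (w_1, w_2)^T$ and $\boldsymbol{\phi} = (\phi, 0)^T$. Then the integral (3.37) can be written as
    $$ p(a) = \int\!\!\int \delta(a - w_1 \phi)\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, dw_1\, dw_2 = \int \delta(a - w_1 \phi) \left[ \int \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, dw_2 \right] dw_1 \tag{3.38} $$
    From (1.29), the marginal of a Gaussian is again a Gaussian, so the integral over the $w_2$ direction, which is orthogonal to $\boldsymbol{\phi}$, marginalises the Gaussian, and the resulting marginal is $\mathcal{N}(w_1 | (\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11})$.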
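  65. 3-3. Bayesian logistic regression

    Carrying out the $w_1$ integral, the integral (3.37) becomes
    $$ p(a) = \int \delta(a - w_1 \phi)\, \mathcal{N}(w_1 | (\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11})\, dw_1 = \frac{1}{|\phi|}\, \mathcal{N}(a/\phi \,|\, (\mathbf{w}_{\mathrm{MAP}})_1, (\mathbf{S}_N)_{11}) = \mathcal{N}(a \,|\, \phi (\mathbf{w}_{\mathrm{MAP}})_1, \phi^2 (\mathbf{S}_N)_{11}) \tag{3.39} $$
    That is, the integral over the $w_1$ direction, which is parallel to $\boldsymbol{\phi}$, imposes the linear constraint $w_1 = a/\phi$ on $w_1$.
    It follows that $p(a)$ is a Gaussian distribution in the variable $a$.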
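  66. 3-3. Bayesian logistic regression

    A Gaussian is fully determined by its mean and variance. The mean $\mu_a$ and variance $\sigma_a^2$ are
    $$ \mu_a = \int p(a)\, a\, da = \int\!\!\int a\, \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w}\, da = \int \mathbf{w}^T \boldsymbol{\phi}\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} = \mathbf{w}_{\mathrm{MAP}}^T \boldsymbol{\phi} \tag{3.40} $$
    $$ \sigma_a^2 = \int p(a) (a^2 - \mu_a^2)\, da = \int\!\!\int (a^2 - \mu_a^2)\, \delta(a - \mathbf{w}^T \boldsymbol{\phi})\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w}\, da = \int \left( (\mathbf{w}^T \boldsymbol{\phi})^2 - (\mathbf{w}_{\mathrm{MAP}}^T \boldsymbol{\phi})^2 \right) \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} = \boldsymbol{\phi}^T \left[ \int (\mathbf{w}\mathbf{w}^T - \mathbf{w}_{\mathrm{MAP}} \mathbf{w}_{\mathrm{MAP}}^T)\, \mathcal{N}(\mathbf{w} | \mathbf{w}_{\mathrm{MAP}}, \mathbf{S}_N)\, d\mathbf{w} \right] \boldsymbol{\phi} = \boldsymbol{\phi}^T \mathbf{S}_N \boldsymbol{\phi} \tag{3.41} $$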
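  67. 3-3. Bayesian logistic regression

    From (3.36), the predictive distribution $p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X})$ is then
    $$ p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) \simeq \int \sigma(a)\, \mathcal{N}(a | \mu_a, \sigma_a^2)\, da \tag{3.42} $$
    where $\mu_a$ and $\sigma_a^2$ are the mean and variance computed in (3.40) and (3.41).
    The integral (3.42) cannot be evaluated analytically either.
    We therefore introduce the inverse probit function $\Phi(a)$:
    $$ \Phi(a) = \frac{1}{2} \left\{ 1 + \mathrm{erf}\left( \frac{a}{\sqrt{2}} \right) \right\} \tag{3.43} $$
    where the error function $\mathrm{erf}(a)$ is defined by
    $$ \mathrm{erf}(a) = \frac{2}{\sqrt{\pi}} \int_0^a \exp(-\theta^2)\, d\theta \tag{3.44} $$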
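  68. 3-3. Bayesian logistic regression

    The logistic sigmoid function $\sigma(a)$ can be approximated by the rescaled inverse probit function $\Phi\left( \sqrt{\pi/8}\, a \right)$.
    The figure on this slide compares the logistic sigmoid $\sigma(a)$ (red solid line) with the inverse probit function $\Phi\left( \sqrt{\pi/8}\, a \right)$ (blue dotted line).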
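  69. 3-3. Bayesian logistic regression

    The inverse probit function also has the following property (see PRML Exercise 4.26):
    $$ \int \Phi(\lambda a)\, \mathcal{N}(a | \mu, \sigma^2)\, da = \Phi\left( \frac{\mu}{(\lambda^{-2} + \sigma^2)^{1/2}} \right) \tag{3.45} $$
    Using these properties, the integral (3.42) is approximated as follows:
    $$ p(C_1 | \mathbf{x}, \mathbf{t}, \mathbf{X}) \simeq \int \sigma(a)\, \mathcal{N}(a | \mu_a, \sigma_a^2)\, da \simeq \int \Phi\left( \sqrt{\frac{\pi}{8}}\, a \right) \mathcal{N}(a | \mu_a, \sigma_a^2)\, da = \Phi\left( \frac{\mu_a}{(8/\pi + \sigma_a^2)^{1/2}} \right) \simeq \sigma\left( \sqrt{\frac{8}{\pi}}\, \frac{\mu_a}{(8/\pi + \sigma_a^2)^{1/2}} \right) = \sigma\left( \frac{\mu_a}{(1 + \pi\sigma_a^2/8)^{1/2}} \right) \tag{3.46} $$
    where $\mu_a$ and $\sigma_a^2$ are given by (3.40) and (3.41).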
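
To get a feel for how good the closed form (3.46) is, it can be compared with a direct numerical evaluation of the integral (3.42). The sketch below uses only the standard library; the midpoint-rule integrator, the integration range of ten standard deviations, and the test values of $\mu_a$ and $\sigma_a^2$ are assumptions made for illustration.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def approx_predictive(mu_a, var_a):
    """Closed-form approximation (3.46): sigma(mu_a / sqrt(1 + pi * var_a / 8))."""
    kappa = 1.0 / math.sqrt(1.0 + math.pi * var_a / 8.0)
    return sigmoid(kappa * mu_a)

def numeric_predictive(mu_a, var_a, n=20000):
    """Midpoint-rule estimate of the integral (3.42); requires var_a > 0."""
    sd = math.sqrt(var_a)
    lo, hi = mu_a - 10.0 * sd, mu_a + 10.0 * sd
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        a = lo + (i + 0.5) * h
        density = math.exp(-0.5 * ((a - mu_a) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += sigmoid(a) * density * h
    return total

for mu_a, var_a in [(1.0, 2.0), (-2.0, 0.5), (3.0, 10.0)]:   # hypothetical values
    print(mu_a, var_a, approx_predictive(mu_a, var_a), numeric_predictive(mu_a, var_a))
```

A larger $\sigma_a^2$ pulls the predicted probability towards 0.5, which is how the uncertainty in $\mathbf{w}$ shows up in the classifier's output; with $\sigma_a^2 = 0$ the formula reduces to the plug-in prediction $\sigma(\mu_a)$.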