
PRML Chapter 6

gucchi
January 21, 2019



Transcript

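  1. Chapters 3 and 4 treated linear parametric models for regression and classification. In Chapter 3, for example, we considered a parametric model whose output $y(\mathbf{x}, \mathbf{w})$ has the form

    $$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}) \tag{3.3}$$

    Here $\mathbf{x} = (x_1, x_2, \dots, x_D)^{\mathrm{T}}$ is a $D$-dimensional input vector, and $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \dots, \phi_{M-1}(\mathbf{x}))^{\mathrm{T}}$ is a vector-valued function that maps the input vector $\mathbf{x}$ into an $M$-dimensional feature space. $\mathbf{w} = (w_0, w_1, \dots, w_{M-1})^{\mathrm{T}}$ is the $M$-dimensional parameter vector.

    In Chapter 3 we prepared a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ together with the corresponding set of targets $\{t_1, t_2, \dots, t_N\}$, and used least squares to choose the parameters $\mathbf{w}$ so that the output $y(\mathbf{x}_n, \mathbf{w})$ for input $\mathbf{x}_n$ reproduces $t_n$.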
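  2. Roughly speaking, in Chapters 3 and 4 the starting point of model building was choosing the form of the vector function $\boldsymbol{\phi}(\mathbf{x})$ (for example, Gaussian basis functions). (In the neural networks of Chapter 5, $\boldsymbol{\phi}(\mathbf{x})$ itself was also made to depend on learned parameters.)

    Many linear parametric models can be rewritten in a dual representation, in which the learned parameters $\mathbf{w}_{\mathrm{ML}}$ and the resulting output $y(\mathbf{x}, \mathbf{w}_{\mathrm{ML}})$ depend on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel function

    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$

    (explained in detail in 6.1).

    We will also see that, once linear parametric models for regression and classification are treated probabilistically, they turn out to be examples of Gaussian processes (explained in detail in 6.4).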
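  3. 6.1 Dual representation

    Consider a parametric model whose output is

    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x})$$

    and consider minimizing the regularized sum-of-squares error

    $$J(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\{ \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}_n) - t_n \right\}^2 + \frac{\lambda}{2} \mathbf{w}^{\mathrm{T}} \mathbf{w} \tag{6.2}$$

    Here the set of inputs is $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and the set of targets is $\{t_1, t_2, \dots, t_N\}$.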
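  4. 6.1 Dual representation

    The stationarity condition $\partial J(\mathbf{w}) / \partial \mathbf{w} = 0$ can be rearranged as follows (see 3.1.1):

    $$\mathbf{w} = \sum_{n=1}^{N} a_n \boldsymbol{\phi}(\mathbf{x}_n) = \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{a} \tag{6.3}$$

    where

    $$a_n = -\frac{1}{\lambda} \left\{ \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}_n) - t_n \right\} \tag{6.4}$$

    with $\mathbf{a} = (a_1, \dots, a_N)^{\mathrm{T}}$, and $\boldsymbol{\Phi} = (\boldsymbol{\phi}(\mathbf{x}_1), \dots, \boldsymbol{\phi}(\mathbf{x}_N))^{\mathrm{T}}$ is the design matrix (3.16).

    Substituting $\mathbf{w} = \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{a}$ rewrites $J(\mathbf{w})$ as a function of the parameter vector $\mathbf{a}$:

    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^{\mathrm{T}} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{a} - \mathbf{a}^{\mathrm{T}} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{t} + \frac{1}{2} \mathbf{t}^{\mathrm{T}} \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^{\mathrm{T}} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{a} \tag{6.5}$$

    where $\mathbf{t} = (t_1, \dots, t_N)^{\mathrm{T}}$.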
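  5. 6.1 Dual representation

    Define the Gram matrix $\mathbf{K} = \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}}$. Its elements $K_{nm}$ can be written with the kernel:

    $$K_{nm} = \boldsymbol{\phi}(\mathbf{x}_n)^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}_m) = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.6}$$

    Using the Gram matrix $\mathbf{K}$, the function $J(\mathbf{a})$ of (6.5) becomes

    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^{\mathrm{T}} \mathbf{K} \mathbf{K} \mathbf{a} - \mathbf{a}^{\mathrm{T}} \mathbf{K} \mathbf{t} + \frac{1}{2} \mathbf{t}^{\mathrm{T}} \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^{\mathrm{T}} \mathbf{K} \mathbf{a} \tag{6.7}$$

    The least-squares algorithm can thus be expressed in terms of the parameters $\mathbf{a}$ instead of $\mathbf{w}$; this formulation is called the dual representation.

    In the dual representation, $J(\mathbf{a})$ depends on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel (6.6); there is no bare dependence on $\boldsymbol{\phi}(\mathbf{x})$.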
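  6. 6.1 Dual representation

    The $\mathbf{a}$ that minimizes $J(\mathbf{a})$ is given below. (It can be found either directly, by setting the gradient of $J(\mathbf{a})$ with respect to $\mathbf{a}$ to zero, or as in the book, by combining the stationary point (6.3) of $J(\mathbf{w})$ with the relation (6.4) between $\mathbf{a}$ and $\mathbf{w}$.)

    $$\mathbf{a} = (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t} \tag{6.8}$$

    Here $\mathbf{I}_N$ is the $N \times N$ identity matrix.

    Combining this solution with $\mathbf{w} = \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{a}$ and $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x})$, the prediction $y(\mathbf{x})$ for a new input $\mathbf{x}$ is

    $$y(\mathbf{x}) = \mathbf{a}^{\mathrm{T}} \boldsymbol{\Phi} \boldsymbol{\phi}(\mathbf{x}) = \mathbf{k}(\mathbf{x})^{\mathrm{T}} (\mathbf{K} + \lambda \mathbf{I}_N)^{-1} \mathbf{t} \tag{6.9}$$

    where $\mathbf{k}(\mathbf{x}) = (k(\mathbf{x}_1, \mathbf{x}), k(\mathbf{x}_2, \mathbf{x}), \dots, k(\mathbf{x}_N, \mathbf{x}))^{\mathrm{T}}$.

    Hence the prediction $y(\mathbf{x})$ is also expressed entirely in terms of the kernel function.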
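  7. 6.1 Dual representation

    In the dual representation, finding the solution $\mathbf{a}$ requires inverting an $N \times N$ matrix, by (6.8). ($N$ is the number of training points.)

    In the primal representation (here, the formulation in terms of the parameters $\mathbf{w}$), the solution is

    $$\mathbf{w} = \left( \lambda \mathbf{I}_M + \boldsymbol{\Phi}^{\mathrm{T}} \boldsymbol{\Phi} \right)^{-1} \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{t} \tag{3.28}$$

    so an $M \times M$ matrix must be inverted. ($M$ is the dimension of the feature space.)

    When $N \gg M$ (which is the most common case), the primal representation is the cheaper way to obtain the solution.

    On the other hand, the dual representation can handle feature spaces with infinite $M$. (Section 6.2 gives an example of such a feature space.)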
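The equivalence of the primal solution (3.28) and the dual solution (6.8), (6.9) can be checked numerically. The following is a minimal sketch, not part of the slides: the polynomial feature map `phi`, the toy data, and the value of `lam` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Hypothetical feature map (1, x, x^2) for scalar inputs."""
    return np.stack([np.ones_like(x), x, x**2], axis=-1)

# Toy data (made up for illustration): N = 20 points, M = 3 features.
x_train = rng.uniform(-1.0, 1.0, size=20)
t = np.sin(np.pi * x_train) + 0.1 * rng.standard_normal(20)
lam = 0.1

Phi = phi(x_train)                      # design matrix, shape (N, M)
N, M = Phi.shape

# Primal solution (3.28): solve an M x M linear system.
w = np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Dual solution (6.8): solve an N x N linear system built from the Gram matrix.
K = Phi @ Phi.T                         # K_nm = phi(x_n)^T phi(x_m), (6.6)
a = np.linalg.solve(K + lam * np.eye(N), t)

x_new = np.array([-0.5, 0.0, 0.7])
y_primal = phi(x_new) @ w               # y(x) = w^T phi(x)
k_new = Phi @ phi(x_new).T              # column j holds k(x_n, x_new[j])
y_dual = k_new.T @ a                    # (6.9)
```

Both routes give the same predictions up to rounding; the only difference is whether the linear system is $M \times M$ or $N \times N$.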
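  8. 6.2 Constructing kernels

    This section uses the definition of a kernel function to introduce a variety of kernels.

    A function $k(\mathbf{x}, \mathbf{x}')$ is a kernel if a suitable mapping $\boldsymbol{\phi}$ from the input $\mathbf{x}$ into an $M$-dimensional feature space can be defined such that

    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}') \tag{6.1}$$

    A simple example of a kernel function is

    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathrm{T}} \mathbf{z})^2 \tag{6.11}$$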
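  9. 6.2 Constructing kernels

    For example, take $\mathbf{x} = (x_1, x_2)^{\mathrm{T}}$ and the mapping $\boldsymbol{\phi}(\mathbf{x}) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)^{\mathrm{T}}$. Then

    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathrm{T}} \mathbf{z})^2 = \boldsymbol{\phi}(\mathbf{x})^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{z}) \tag{6.12}$$

    so $k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\mathrm{T}} \mathbf{z})^2$ is a kernel function.

    There is in fact a second definition of a kernel besides (6.1): the Gram matrix $\mathbf{K}$ with elements $K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)$ must be positive semidefinite.

    (I proved that these two definitions are equivalent in the article below. Please have a look, and give it a like if you find it useful.) https://qiita.com/gucchi0403/items/544065345f91144524c4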
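Both characterizations of the kernel $(\mathbf{x}^{\mathrm{T}}\mathbf{z})^2$ can be checked numerically. This is an illustrative sketch only; the six random 2-D points are arbitrary and the helper names are my own.

```python
import numpy as np

def phi(x):
    """Feature map from the slide: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2.0) * x1 * x2, x2**2])

def k(x, z):
    """Kernel (6.11): squared inner product."""
    return float(np.dot(x, z)) ** 2

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2))        # six arbitrary 2-D points

# Gram matrix computed from the kernel and from explicit features.
K_kernel = np.array([[k(xn, xm) for xm in X] for xn in X])
F = np.array([phi(xn) for xn in X])
K_feature = F @ F.T

# Smallest eigenvalue of the symmetric Gram matrix (zero up to rounding,
# because the feature space has only 3 dimensions but there are 6 points).
min_eig = np.linalg.eigvalsh(K_kernel).min()
```

The two Gram matrices agree, which is (6.12), and no eigenvalue is negative beyond rounding error, which is the positive-semidefinite condition.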
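  10. 6.2 Constructing kernels

    Next, this slide lists ways of building a new kernel function $k(\mathbf{x}, \mathbf{x}')$ from functions already known to be kernels (PRML (6.13) to (6.22)).

    The symbols used in those rules are: $k_1(\cdot, \cdot)$ and $k_2(\cdot, \cdot)$ are kernel functions, $c > 0$ is a constant, $f(\cdot)$ is any function, $q(\cdot)$ is a polynomial with nonnegative coefficients, $\boldsymbol{\phi}(\cdot)$ is an $M$-dimensional vector function, $k_3(\cdot, \cdot)$ is a kernel defined on the $M$-dimensional vector space, $\mathbf{A}$ is a symmetric positive semidefinite matrix, $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)$, and $k_a(\cdot, \cdot)$, $k_b(\cdot, \cdot)$ are kernel functions.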
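For reference, the constructions that PRML gives as (6.13) to (6.22) read as follows, using the symbols defined on this slide. Given valid kernels $k_1$, $k_2$, $k_3$, $k_a$, $k_b$, each right-hand side is again a valid kernel:

```latex
\begin{align}
k(\mathbf{x}, \mathbf{x}') &= c\, k_1(\mathbf{x}, \mathbf{x}') \tag{6.13}\\
k(\mathbf{x}, \mathbf{x}') &= f(\mathbf{x})\, k_1(\mathbf{x}, \mathbf{x}')\, f(\mathbf{x}') \tag{6.14}\\
k(\mathbf{x}, \mathbf{x}') &= q\bigl(k_1(\mathbf{x}, \mathbf{x}')\bigr) \tag{6.15}\\
k(\mathbf{x}, \mathbf{x}') &= \exp\bigl(k_1(\mathbf{x}, \mathbf{x}')\bigr) \tag{6.16}\\
k(\mathbf{x}, \mathbf{x}') &= k_1(\mathbf{x}, \mathbf{x}') + k_2(\mathbf{x}, \mathbf{x}') \tag{6.17}\\
k(\mathbf{x}, \mathbf{x}') &= k_1(\mathbf{x}, \mathbf{x}')\, k_2(\mathbf{x}, \mathbf{x}') \tag{6.18}\\
k(\mathbf{x}, \mathbf{x}') &= k_3\bigl(\boldsymbol{\phi}(\mathbf{x}), \boldsymbol{\phi}(\mathbf{x}')\bigr) \tag{6.19}\\
k(\mathbf{x}, \mathbf{x}') &= \mathbf{x}^{\mathrm{T}} \mathbf{A}\, \mathbf{x}' \tag{6.20}\\
k(\mathbf{x}, \mathbf{x}') &= k_a(\mathbf{x}_a, \mathbf{x}_a') + k_b(\mathbf{x}_b, \mathbf{x}_b') \tag{6.21}\\
k(\mathbf{x}, \mathbf{x}') &= k_a(\mathbf{x}_a, \mathbf{x}_a')\, k_b(\mathbf{x}_b, \mathbf{x}_b') \tag{6.22}
\end{align}
```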
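  11. 6.2 Constructing kernels

    Using these construction rules, one can show, for example, that the following function is a kernel:

    $$k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^{\mathrm{T}} \mathbf{x}' + c)^M$$

    where $c \ge 0$ is a constant and $M$ is any natural number.

    One can also construct the very important kernel known as the Gaussian kernel:

    $$k(\mathbf{x}, \mathbf{x}') = \exp\left( -\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\sigma^2} \right) \tag{6.23}$$

    where $\sigma^2$ is any positive constant.

    Note that the feature vector corresponding to the Gaussian kernel is infinite-dimensional (see Exercise 6.11).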
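As a short worked step, here is how the Gaussian kernel follows from the construction rules, along the lines of PRML (6.24) and (6.25). Expanding the squared distance and factoring the exponential gives:

```latex
\begin{align}
\|\mathbf{x} - \mathbf{x}'\|^2
  &= \mathbf{x}^{\mathrm{T}}\mathbf{x} - 2\,\mathbf{x}^{\mathrm{T}}\mathbf{x}' + \mathbf{x}'^{\mathrm{T}}\mathbf{x}' \\
\exp\!\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\sigma^2}\right)
  &= \underbrace{\exp\!\left(-\frac{\mathbf{x}^{\mathrm{T}}\mathbf{x}}{2\sigma^2}\right)}_{f(\mathbf{x})}
     \;\exp\!\left(\frac{\mathbf{x}^{\mathrm{T}}\mathbf{x}'}{\sigma^2}\right)\;
     \underbrace{\exp\!\left(-\frac{\mathbf{x}'^{\mathrm{T}}\mathbf{x}'}{2\sigma^2}\right)}_{f(\mathbf{x}')}
\end{align}
```

The linear kernel $\mathbf{x}^{\mathrm{T}}\mathbf{x}'$ scaled by $1/\sigma^2$ is a kernel by (6.13), its exponential is a kernel by (6.16), and multiplying by $f(\mathbf{x})$ and $f(\mathbf{x}')$ keeps it a kernel by (6.14).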
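  12. 6.3 RBF networks

    This section discusses radial basis functions (RBFs), a widely used family of basis functions. An RBF is a basis function that depends only on the distance from a center $\boldsymbol{\mu}_j$, and has the form $\phi_j(\mathbf{x}) = h(\|\mathbf{x} - \boldsymbol{\mu}_j\|)$.

    RBFs arise, for example, when the input variables are noisy. If the noise $\boldsymbol{\xi}$ has probability distribution $\nu(\boldsymbol{\xi})$, the sum-of-squares error becomes

    $$E = \frac{1}{2} \sum_{n=1}^{N} \int \left\{ y(\mathbf{x}_n + \boldsymbol{\xi}) - t_n \right\}^2 \nu(\boldsymbol{\xi}) \, d\boldsymbol{\xi} \tag{6.39}$$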
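  13. 6.3 RBF networks

    Using the calculus of variations, the function $y(\mathbf{x})$ that minimizes this sum-of-squares error is found to be

    $$y(\mathbf{x}) = \sum_{n=1}^{N} t_n \, h(\mathbf{x} - \mathbf{x}_n) \tag{6.40}$$

    where $h(\mathbf{x} - \mathbf{x}_n)$ is given by

    $$h(\mathbf{x} - \mathbf{x}_n) = \frac{\nu(\mathbf{x} - \mathbf{x}_n)}{\sum_{m=1}^{N} \nu(\mathbf{x} - \mathbf{x}_m)} \tag{6.41}$$

    A model of this form is called the Nadaraya-Watson model.

    If the noise is isotropic, that is, if $\nu(\boldsymbol{\xi})$ depends only on $\|\boldsymbol{\xi}\|$, then the basis functions in (6.41) are isotropic as well, taking the form $h(\|\mathbf{x} - \mathbf{x}_n\|)$, and are therefore RBFs.

    (The Nadaraya-Watson model of 6.3.1 is not used in later chapters, so I skip it here.)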
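The predictor (6.40) with weights (6.41) can be sketched in a few lines. This is an illustration under assumptions the slides do not make: 1-D inputs, an isotropic Gaussian noise density $\nu$ with a hypothetical width `sigma`, and made-up data.

```python
import numpy as np

def nadaraya_watson(x_query, x_train, t_train, sigma=0.3):
    """Prediction (6.40) with weights (6.41), assuming Gaussian noise nu."""
    # Pairwise squared distances between query points and training points.
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    # Unnormalized nu(x - x_n); the Gaussian normalizer cancels in (6.41).
    nu = np.exp(-0.5 * d2 / sigma**2)
    h = nu / nu.sum(axis=1, keepdims=True)     # (6.41): each row sums to one
    return h @ t_train, h

x_train = np.linspace(0.0, 1.0, 11)
t_train = np.sin(2.0 * np.pi * x_train)
x_query = np.array([0.05, 0.5, 0.93])
y, h = nadaraya_watson(x_query, x_train, t_train)
```

Because the weights are nonnegative and sum to one, each prediction is a weighted average of the training targets.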
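  14. 6.4 Gaussian processes

    In 6.1 we saw that, for a non-probabilistic model of linear regression (one that uses the output $y(\mathbf{x}, \mathbf{w})$ directly as the prediction), rewriting the model in the dual representation makes a kernel appear.

    We now turn to a probabilistic model of linear regression (one that derives a probability distribution over the output $y(\mathbf{x}, \mathbf{w})$) and confirm that a kernel arises naturally here too.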
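  15. 6.4.1 Linear regression revisited

    As in 6.1, we start from the output for a given input $\mathbf{x}$:

    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}) \tag{6.49}$$

    Next we assume a prior distribution over the parameter vector $\mathbf{w}$:

    $$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1} \mathbf{I}) \tag{6.50}$$

    Once $\mathbf{w}$ is given, (6.49) determines a particular function $y(\mathbf{x})$ of $\mathbf{x}$. In other words, a probability distribution over $\mathbf{w}$ induces a probability distribution over functions $y(\mathbf{x})$.

    In practice, given a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ of the function values follows from the distribution over $\mathbf{w}$ and (6.49). (More precisely, with $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, we consider $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N) \mid \mathbf{X})$.)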
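  16. 6.4.1 Linear regression revisited

    Defining $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^{\mathrm{T}}$, (6.49) gives

    $$\mathbf{y} = \boldsymbol{\Phi} \mathbf{w} \tag{6.51}$$

    where $\boldsymbol{\Phi}$ is the design matrix.

    Since $\mathbf{y}$ is a linear transformation of $\mathbf{w}$, which follows the Gaussian distribution (6.50), $\mathbf{y}$ is also Gaussian. The distribution is therefore fully determined by its mean and covariance:

    $$\mathbb{E}[\mathbf{y}] = \boldsymbol{\Phi}\, \mathbb{E}[\mathbf{w}] = \mathbf{0} \tag{6.52}$$

    $$\operatorname{cov}[\mathbf{y}] = \mathbb{E}[\mathbf{y} \mathbf{y}^{\mathrm{T}}] = \boldsymbol{\Phi}\, \mathbb{E}[\mathbf{w} \mathbf{w}^{\mathrm{T}}]\, \boldsymbol{\Phi}^{\mathrm{T}} = \frac{1}{\alpha} \boldsymbol{\Phi} \boldsymbol{\Phi}^{\mathrm{T}} = \mathbf{K} \tag{6.53}$$

    Here $\mathbf{K}$ is the Gram matrix whose elements are given by the kernel function:

    $$K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) = \frac{1}{\alpha} \boldsymbol{\phi}(\mathbf{x}_n)^{\mathrm{T}} \boldsymbol{\phi}(\mathbf{x}_m) \tag{6.54}$$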
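The mean (6.52) and covariance (6.53) can be sanity-checked by sampling. The sketch below is an illustration only: the four inputs, the polynomial features, and the value of `alpha` are arbitrary choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.0

# Hypothetical design matrix: N = 4 inputs, M = 3 polynomial features.
x = np.array([-1.0, -0.3, 0.4, 1.0])
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)

K = Phi @ Phi.T / alpha                       # (6.53), (6.54)

# Draw many weight vectors from the prior (6.50) and map them through (6.51).
W = rng.standard_normal((200_000, 3)) / np.sqrt(alpha)
Y = W @ Phi.T                                 # each row is one sample of y
K_empirical = Y.T @ Y / len(Y)                # Monte Carlo estimate of E[y y^T]
```

With this many samples, `K_empirical` matches `K` to about two decimal places, and the sample mean of `Y` is close to zero.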
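  17. 6.4.1 Linear regression revisited

    The linear regression model described above is an example of a Gaussian process.

    A Gaussian process assumes that, given a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$, the joint distribution $p(\mathbf{y}) = p(y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))$ of the function values is Gaussian.

    Its mean is usually taken to be zero, and its covariance is specified by a kernel:

    $$\mathbb{E}[y(\mathbf{x}_n)\, y(\mathbf{x}_m)] = k(\mathbf{x}_n, \mathbf{x}_m) \tag{6.55}$$

    Comparing with (6.52) to (6.54), the linear regression model above is indeed an example of a Gaussian process.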
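  18. 6.4.2 Gaussian processes for regression

    Here we apply Gaussian processes to regression.

    Assume that the target variable $t_n$ follows a Gaussian distribution whose mean is the function value $y_n = y(\mathbf{x}_n)$:

    $$p(t_n \mid y_n) = \mathcal{N}(t_n \mid y_n, \beta^{-1}) \tag{6.58}$$

    where $\beta$ is a hyperparameter giving the precision of the noise.

    By independence, the distribution of $\mathbf{t} = (t_1, \dots, t_N)^{\mathrm{T}}$ given $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \dots, y(\mathbf{x}_N))^{\mathrm{T}}$ is

    $$p(\mathbf{t} \mid \mathbf{y}) = \mathcal{N}(\mathbf{t} \mid \mathbf{y}, \beta^{-1} \mathbf{I}_N) \tag{6.59}$$

    By the Gaussian process assumption, the marginal distribution $p(\mathbf{y})$ is Gaussian with zero mean and covariance given by the Gram matrix $\mathbf{K}$:

    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, \mathbf{K}) \tag{6.60}$$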
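  19. 6.4.2 Gaussian processes for regression

    Using $p(\mathbf{t} \mid \mathbf{y})$ from (6.59) and $p(\mathbf{y})$ from (6.60), the distribution $p(\mathbf{t})$ of the targets given $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ can be computed as

    $$p(\mathbf{t}) = \int p(\mathbf{t} \mid \mathbf{y})\, p(\mathbf{y})\, d\mathbf{y} = \mathcal{N}(\mathbf{t} \mid \mathbf{0}, \mathbf{C}) \tag{6.61}$$

    where the elements $C_{nm}$ of the covariance $\mathbf{C}$ are

    $$C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1} \delta_{nm} \tag{6.62}$$

    (This uses equations (2.113) to (2.115).)

    A kernel commonly used in the covariance $\mathbf{C}$ is

    $$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{ -\frac{\theta_1}{2} \|\mathbf{x}_n - \mathbf{x}_m\|^2 \right\} + \theta_2 + \theta_3\, \mathbf{x}_n^{\mathrm{T}} \mathbf{x}_m \tag{6.63}$$

    where $\theta_0, \dots, \theta_3$ are hyperparameters.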
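  20. 6.4.2 Gaussian processes for regression

    What we want is the distribution of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$, given the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\{t_1, t_2, \dots, t_N\}$. That is, with $\mathbf{t}_N = (t_1, \dots, t_N)^{\mathrm{T}}$, we want $p(t_{N+1} \mid \mathbf{t}_N)$. (The dependence on the input variables is omitted from the notation.)

    To find $p(t_{N+1} \mid \mathbf{t}_N)$, we first find the joint distribution $p(\mathbf{t}_{N+1})$, where $\mathbf{t}_{N+1} = (t_1, \dots, t_{N+1})^{\mathrm{T}}$.

    Applying the result (6.61), $p(\mathbf{t}_{N+1})$ is

    $$p(\mathbf{t}_{N+1}) = \mathcal{N}(\mathbf{t}_{N+1} \mid \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.64}$$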
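  21. 6.4.2 Gaussian processes for regression

    Here the covariance matrix $\mathbf{C}_{N+1}$ is

    $$\mathbf{C}_{N+1} = \begin{pmatrix} \mathbf{C}_N & \mathbf{k} \\ \mathbf{k}^{\mathrm{T}} & c \end{pmatrix} \tag{6.65}$$

    where $\mathbf{C}_N$ is the $N \times N$ matrix with elements (6.62), $\mathbf{k} = (k(\mathbf{x}_1, \mathbf{x}_{N+1}), k(\mathbf{x}_2, \mathbf{x}_{N+1}), \dots, k(\mathbf{x}_N, \mathbf{x}_{N+1}))^{\mathrm{T}}$, and $c = k(\mathbf{x}_{N+1}, \mathbf{x}_{N+1}) + \beta^{-1}$.

    Combining this with (2.81) and (2.82), $p(t_{N+1} \mid \mathbf{t}_N)$ is Gaussian, with mean $m(\mathbf{x}_{N+1})$ and variance $\sigma^2(\mathbf{x}_{N+1})$ given by

    $$m(\mathbf{x}_{N+1}) = \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{t}_N \tag{6.66}$$

    $$\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{k} \tag{6.67}$$

    In other words, the distribution of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$ is a Gaussian whose mean and variance both depend on $\mathbf{x}_{N+1}$.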
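The predictive equations (6.66) and (6.67) translate almost line by line into code. The sketch below is a minimal illustration, assuming 1-D inputs, the kernel (6.63) with arbitrarily chosen hyperparameters, and made-up training data; the function names are my own.

```python
import numpy as np

def kernel(xa, xb, theta=(1.0, 4.0, 0.0, 0.0)):
    """Kernel (6.63) for 1-D inputs; the default theta values are arbitrary."""
    t0, t1, t2, t3 = theta
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return t0 * np.exp(-0.5 * t1 * d2) + t2 + t3 * xa[:, None] * xb[None, :]

def gp_predict(x_train, t_train, x_new, beta=25.0):
    """Predictive mean (6.66) and variance (6.67) at each point of x_new."""
    C_N = kernel(x_train, x_train) + np.eye(len(x_train)) / beta   # (6.62)
    k = kernel(x_train, x_new)                                     # shape (N, n_new)
    c = np.diag(kernel(x_new, x_new)) + 1.0 / beta
    mean = k.T @ np.linalg.solve(C_N, t_train)
    var = c - np.sum(k * np.linalg.solve(C_N, k), axis=0)
    return mean, var

x_train = np.array([0.0, 0.5, 1.0, 1.5])
t_train = np.sin(x_train)
mean, var = gp_predict(x_train, t_train, np.array([0.5, 10.0]))
```

At the query point far from all training data (here `10.0`), $\mathbf{k}$ is essentially zero, so the prediction falls back to the prior: mean 0 and variance $\theta_0 + \beta^{-1}$. Near the data the variance is smaller.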
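  22. 6.4.3 Learning the hyperparameters

    Kernel methods require a choice of kernel function. Rather than fixing the kernel from scratch, it is sometimes easier to parameterize it, as in (6.63) below, and determine the hyperparameters from the training data.

    $$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{ -\frac{\theta_1}{2} \|\mathbf{x}_n - \mathbf{x}_m\|^2 \right\} + \theta_2 + \theta_3\, \mathbf{x}_n^{\mathrm{T}} \mathbf{x}_m \tag{6.63}$$

    To determine the hyperparameters, we choose the $\boldsymbol{\theta}$ that maximizes the logarithm $\ln p(\mathbf{t} \mid \boldsymbol{\theta})$ of the distribution $p(\mathbf{t} \mid \boldsymbol{\theta})$ in (6.61). The log likelihood is

    $$\ln p(\mathbf{t} \mid \boldsymbol{\theta}) = -\frac{1}{2} \ln |\mathbf{C}_N| - \frac{1}{2} \mathbf{t}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{t} - \frac{N}{2} \ln(2\pi) \tag{6.69}$$

    and the hyperparameters $\boldsymbol{\theta}$ are those that maximize it.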
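As an illustration of evaluating (6.69), the sketch below computes the log marginal likelihood for 1-D inputs and scans a coarse grid over $\theta_1$. It is a simplification under stated assumptions: only the Gaussian term of (6.63) is used ($\theta_2 = \theta_3 = 0$), the data and the grid are made up, and a grid scan stands in for the gradient-based optimization one would normally use.

```python
import numpy as np

def log_marginal_likelihood(x, t, theta0, theta1, beta):
    """Evaluate (6.69) for 1-D inputs with the Gaussian part of kernel (6.63)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    C_N = theta0 * np.exp(-0.5 * theta1 * d2) + np.eye(len(x)) / beta
    sign, logdet = np.linalg.slogdet(C_N)       # numerically stable ln|C_N|
    quad = t @ np.linalg.solve(C_N, t)          # t^T C_N^{-1} t
    return -0.5 * logdet - 0.5 * quad - 0.5 * len(x) * np.log(2.0 * np.pi)

# Made-up smooth data; the grid of candidate theta1 values is arbitrary.
x = np.linspace(0.0, 3.0, 15)
t = np.sin(2.0 * x)
grid = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
scores = [log_marginal_likelihood(x, t, 1.0, th1, beta=100.0) for th1 in grid]
best_theta1 = grid[int(np.argmax(scores))]
```

Using `slogdet` and `solve` avoids forming $\mathbf{C}_N^{-1}$ or $|\mathbf{C}_N|$ explicitly, which matters once $N$ grows.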
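  23. 6.4.4 Automatic relevance determination

    Section 6.4.3 made point estimates of the hyperparameters. These point estimates also reveal how important each input variable is for prediction.

    For example, consider the kernel

    $$k(\mathbf{x}, \mathbf{x}') = \theta_0 \exp\left\{ -\frac{1}{2} \sum_{i=1}^{2} \eta_i (x_i - x_i')^2 \right\} \tag{6.71}$$

    where $\theta_0, \eta_1, \eta_2$ are hyperparameters.

    With this kernel, consider the prior distribution over $\mathbf{y}$:

    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, \mathbf{K}) \tag{6.60}$$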
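The role of the $\eta_i$ can be seen by evaluating the kernel directly. This is a small sketch written with the prefactor $1/2$ as in PRML (6.71); the points and $\eta$ values are arbitrary.

```python
import numpy as np

def ard_kernel(x, x_prime, theta0=1.0, eta=(1.0, 1.0)):
    """ARD kernel for 2-D inputs, with one precision eta_i per input dimension."""
    eta = np.asarray(eta)
    diff = np.asarray(x) - np.asarray(x_prime)
    return theta0 * np.exp(-0.5 * np.sum(eta * diff**2))

x = np.array([0.0, 0.0])
moved_in_x2 = np.array([0.0, 2.0])     # differs from x only in the second input

k_relevant = ard_kernel(x, moved_in_x2, eta=(1.0, 1.0))      # equals exp(-2)
k_irrelevant = ard_kernel(x, moved_in_x2, eta=(1.0, 1e-3))   # close to theta0
```

With a small $\eta_2$ the two points stay almost perfectly correlated even though they differ in $x_2$, so functions drawn from the prior barely change along that input.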
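  24. 6.4.4 Automatic relevance determination

    The figure on this slide shows samples drawn from the prior over $\mathbf{y}$ for different values of $\eta_1, \eta_2$.

    Making $\eta_i$ small makes $y$ vary less as $x_i$ changes.

    Accordingly, when the data depend only weakly on $x_i$, maximizing $p(\mathbf{t} \mid \boldsymbol{\theta})$, the distribution of the targets $\mathbf{t}$ obtained by adding noise to $\mathbf{y}$, yields a small value for $\eta_i$.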
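  25. 6.4.5 Gaussian processes for classification

    We now use Gaussian processes for classification.

    For regression, we assumed as in (6.60) that the joint distribution $p(\mathbf{y})$ of the function values is Gaussian. In that case each $y_n$ takes values on the whole real line.

    For classification, the output should satisfy $0 \le y_n \le 1$. We therefore place the joint distribution on the activations $a_n = a(\mathbf{x}_n)$ rather than on the outputs, and set the output to $y_n = \sigma(a_n)$.

    Next we make the model probabilistic. Taking the probability of $t_n = 1$ to be $p(t_n = 1 \mid a_n) = \sigma(a_n)$, we have $p(t_n = 0 \mid a_n) = 1 - \sigma(a_n)$, and hence

    $$p(t_n \mid a_n) = \sigma(a_n)^{t_n} \left(1 - \sigma(a_n)\right)^{1 - t_n} \tag{6.73}$$

    As in regression, we use the training data $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ and $\mathbf{t}_N = (t_1, \dots, t_N)^{\mathrm{T}}$ to find the distribution $p(t_{N+1} \mid \mathbf{t}_N)$ of the target $t_{N+1}$ for a new input $\mathbf{x}_{N+1}$. (Again, the dependence on the input variables is omitted.)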
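  26. 6.4.5 Gaussian processes for classification

    First, with $\mathbf{a}_{N+1} = (a(\mathbf{x}_1), a(\mathbf{x}_2), \dots, a(\mathbf{x}_{N+1}))^{\mathrm{T}}$, the Gaussian process assumption gives the joint distribution of the activations:

    $$p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1} \mid \mathbf{0}, \mathbf{C}_{N+1}) \tag{6.74}$$

    where the elements of the covariance matrix $\mathbf{C}_{N+1}$ are

    $$(\mathbf{C}_{N+1})_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \nu \delta_{nm} \tag{6.75}$$

    and $\nu$ is a noise term.

    What we want is the distribution $p(t_{N+1} \mid \mathbf{t}_N)$ of the target. For binary classification $p(t_{N+1} = 0 \mid \mathbf{t}_N) = 1 - p(t_{N+1} = 1 \mid \mathbf{t}_N)$, so it suffices to find $p(t_{N+1} = 1 \mid \mathbf{t}_N)$.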
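  27. 6.4.5 Gaussian processes for classification

    Writing $a_{N+1} = a(\mathbf{x}_{N+1})$ for the activation at the new input, we have

    $$\begin{aligned} p(t_{N+1} = 1, \mathbf{t}_N) &= \int p(t_{N+1} = 1, \mathbf{t}_N, a_{N+1})\, da_{N+1} \\ &= \int p(t_{N+1} = 1 \mid \mathbf{t}_N, a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, p(\mathbf{t}_N)\, da_{N+1} \\ &= \int p(t_{N+1} = 1 \mid a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, p(\mathbf{t}_N)\, da_{N+1} \end{aligned}$$

    where the last step uses the fact that, given $a_{N+1}$, the target $t_{N+1}$ does not depend on $\mathbf{t}_N$. Dividing by $p(\mathbf{t}_N)$ gives

    $$p(t_{N+1} = 1 \mid \mathbf{t}_N) = \int p(t_{N+1} = 1 \mid a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, da_{N+1} \tag{6.76}$$

    where $p(t_{N+1} = 1 \mid a_{N+1}) = \sigma(a_{N+1})$.

    This integral cannot be evaluated analytically, and various methods are used to approximate it. Here we use the Laplace approximation.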
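  28. 6.4.6 Laplace approximation

    This section evaluates the integral (6.76) using the Laplace approximation.

    First, using the sum and product rules of probability, we rewrite $p(a_{N+1} \mid \mathbf{t}_N)$ as

    $$p(a_{N+1} \mid \mathbf{t}_N) = \int p(a_{N+1} \mid \mathbf{a}_N)\, p(\mathbf{a}_N \mid \mathbf{t}_N)\, d\mathbf{a}_N \tag{6.77}$$

    where $p(\mathbf{a}_N \mid \mathbf{t}_N)$ is the posterior distribution over the training activations.

    The conditional distribution $p(a_{N+1} \mid \mathbf{a}_N)$ follows by analogy with the regression results (6.66) and (6.67) for $p(t_{N+1} \mid \mathbf{t}_N)$:

    $$p(a_{N+1} \mid \mathbf{a}_N) = \mathcal{N}\left(a_{N+1} \mid \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{a}_N,\; c - \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{k}\right) \tag{6.78}$$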
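  29. 6.4.6 Laplace approximation

    We approximate $p(\mathbf{a}_N \mid \mathbf{t}_N)$ (the Laplace approximation). This requires the point $\mathbf{a}_N = \mathbf{a}_N^\star$ satisfying

    $$\frac{\partial p(\mathbf{a}_N \mid \mathbf{t}_N)}{\partial \mathbf{a}_N} = \nabla p(\mathbf{a}_N \mid \mathbf{t}_N) = \mathbf{0}$$

    and the Hessian $-\nabla\nabla \ln p(\mathbf{a}_N \mid \mathbf{t}_N)$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$. (See 4.4 and 4.5.)

    The prior $p(\mathbf{a}_N)$ is given by $p(\mathbf{a}_N) = \mathcal{N}(\mathbf{a}_N \mid \mathbf{0}, \mathbf{C}_N)$, which is (6.74) with $N + 1 \to N$.

    By independence of the data points, the likelihood $p(\mathbf{t}_N \mid \mathbf{a}_N)$ is

    $$p(\mathbf{t}_N \mid \mathbf{a}_N) = \prod_{n=1}^{N} \sigma(a_n)^{t_n} \left(1 - \sigma(a_n)\right)^{1 - t_n} = \prod_{n=1}^{N} e^{a_n t_n} \sigma(-a_n) \tag{6.79}$$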
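To see where the mode condition comes from, it helps to write out the log of prior times likelihood. The steps below follow PRML (6.80) to (6.82), which the slides skip; $\Psi$ is the book's name for the unnormalized log posterior, and maximizing $\Psi$ is equivalent to maximizing $p(\mathbf{a}_N \mid \mathbf{t}_N)$.

```latex
\begin{align}
\Psi(\mathbf{a}_N) &= \ln p(\mathbf{a}_N) + \ln p(\mathbf{t}_N \mid \mathbf{a}_N) \notag\\
 &= -\frac{1}{2}\mathbf{a}_N^{\mathrm{T}}\mathbf{C}_N^{-1}\mathbf{a}_N
    - \frac{N}{2}\ln(2\pi) - \frac{1}{2}\ln|\mathbf{C}_N|
    + \mathbf{t}_N^{\mathrm{T}}\mathbf{a}_N
    - \sum_{n=1}^{N}\ln\left(1 + e^{a_n}\right) \tag{6.80}\\
\nabla\Psi(\mathbf{a}_N) &= \mathbf{t}_N - \boldsymbol{\sigma}_N - \mathbf{C}_N^{-1}\mathbf{a}_N \tag{6.81}\\
\nabla\nabla\Psi(\mathbf{a}_N) &= -\mathbf{W}_N - \mathbf{C}_N^{-1} \tag{6.82}
\end{align}
```

Here $\boldsymbol{\sigma}_N = (\sigma(a_1), \dots, \sigma(a_N))^{\mathrm{T}}$ and $\mathbf{W}_N$ is diagonal with entries $\sigma(a_n)(1 - \sigma(a_n))$. Setting (6.81) to zero gives $\mathbf{a}_N^\star = \mathbf{C}_N(\mathbf{t}_N - \boldsymbol{\sigma}_N)$, and the negative of (6.82) is the Hessian used in the Gaussian approximation.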
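  30. 6.4.6 Laplace approximation

    By Bayes' theorem $p(\mathbf{a}_N \mid \mathbf{t}_N) \propto p(\mathbf{t}_N \mid \mathbf{a}_N)\, p(\mathbf{a}_N)$, and carrying out the calculation gives

    $$\mathbf{a}_N^\star = \mathbf{C}_N (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.84}$$

    where $\boldsymbol{\sigma}_N = (\sigma(a_1), \sigma(a_2), \dots, \sigma(a_N))^{\mathrm{T}}$ is evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$, so (6.84) is an implicit equation for the mode.

    The Hessian $\mathbf{H}$ at $\mathbf{a}_N = \mathbf{a}_N^\star$ is

    $$\mathbf{H} = \mathbf{W}^\star + \mathbf{C}_N^{-1} \tag{6.85}$$

    where $\mathbf{W}$ is the diagonal matrix with diagonal elements $\sigma(a_n)(1 - \sigma(a_n))$, and $\mathbf{W}^\star$ is $\mathbf{W}$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$.

    The posterior $p(\mathbf{a}_N \mid \mathbf{t}_N)$ is therefore approximated as follows (the Laplace approximation):

    $$p(\mathbf{a}_N \mid \mathbf{t}_N) \simeq \mathcal{N}(\mathbf{a}_N \mid \mathbf{a}_N^\star, \mathbf{H}^{-1}) \tag{6.86}$$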
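  31. 6.4.6 Laplace approximation

    From (6.78) and (6.86), the integral (6.77) can be approximated as

    $$p(a_{N+1} \mid \mathbf{t}_N) \simeq \int \mathcal{N}\left(a_{N+1} \mid \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{a}_N,\; c - \mathbf{k}^{\mathrm{T}} \mathbf{C}_N^{-1} \mathbf{k}\right) \mathcal{N}(\mathbf{a}_N \mid \mathbf{a}_N^\star, \mathbf{H}^{-1})\, d\mathbf{a}_N$$

    By (2.115), $p(a_{N+1} \mid \mathbf{t}_N)$ is then a Gaussian with mean and variance

    $$\mathbb{E}[a_{N+1} \mid \mathbf{t}_N] = \mathbf{k}^{\mathrm{T}} (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{6.87}$$

    $$\operatorname{var}[a_{N+1} \mid \mathbf{t}_N] = c - \mathbf{k}^{\mathrm{T}} (\mathbf{W}_N^{-1} + \mathbf{C}_N)^{-1} \mathbf{k} \tag{6.88}$$

    where $\mathbf{W}_N$ is the $\mathbf{W}^\star$ of (6.85).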
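The whole procedure (find the mode, then evaluate (6.87) and (6.88)) can be sketched as follows. This is an illustration under assumptions: the mode is found with the Newton update of PRML (6.83), which the slides do not show; the 1-D data, the Gaussian kernel, the value of `nu`, and the fixed iteration count are all made up.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gauss(xa, xb):
    """Gaussian kernel for 1-D inputs with unit length scale (arbitrary choice)."""
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2)

def laplace_gp_classifier(C_N, t_N, k, c, n_iter=50):
    """Find the mode a*_N by Newton's method, then evaluate (6.87) and (6.88)."""
    N = len(t_N)
    a = np.zeros(N)
    for _ in range(n_iter):
        s = sigmoid(a)
        W = np.diag(s * (1.0 - s))
        # PRML (6.83): a_new = C_N (I + W_N C_N)^{-1} (t_N - sigma_N + W_N a_N)
        a = C_N @ np.linalg.solve(np.eye(N) + W @ C_N, t_N - s + W @ a)
    s = sigmoid(a)
    W_inv = np.diag(1.0 / (s * (1.0 - s)))
    mean = k @ (t_N - s)                                  # (6.87)
    var = c - k @ np.linalg.solve(W_inv + C_N, k)         # (6.88)
    return a, mean, var

# Made-up 1-D data: class 0 on the left, class 1 on the right.
x = np.array([-2.0, -1.0, 1.0, 2.0])
t_N = np.array([0.0, 0.0, 1.0, 1.0])
nu = 1e-2                                                 # noise term in (6.75)
C_N = gauss(x, x) + nu * np.eye(4)

x_new = np.array([1.5])
k = gauss(x, x_new)[:, 0]
c = 1.0 + nu                                              # k(x_new, x_new) + nu
a_star, mean, var = laplace_gp_classifier(C_N, t_N, k, c)
```

At convergence `a_star` satisfies the fixed-point condition (6.84). For the query point between the two class-1 inputs, the mean activation is positive, so the predicted class-1 probability exceeds one half.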
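  32. 6.4.6 Laplace approximation

    Now that $p(a_{N+1} \mid \mathbf{t}_N)$ is known, $p(t_{N+1} = 1 \mid \mathbf{t}_N)$ in (6.76) can be computed approximately using (4.153).

    As discussed in 6.4.3, the hyperparameters $\boldsymbol{\theta}$ enter through $\mathbf{C}_N$, so we look for the $\boldsymbol{\theta}$ that maximizes $p(\mathbf{t}_N \mid \boldsymbol{\theta})$. As with (6.61), this is

    $$p(\mathbf{t}_N \mid \boldsymbol{\theta}) = \int p(\mathbf{t}_N \mid \mathbf{a}_N)\, p(\mathbf{a}_N \mid \boldsymbol{\theta})\, d\mathbf{a}_N \tag{6.89}$$

    but this integral cannot be evaluated analytically either.

    We again use the Laplace approximation to find the $\boldsymbol{\theta}$ that maximizes $p(\mathbf{t}_N \mid \boldsymbol{\theta})$.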
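For reference, the approximation cited as (4.153) replaces the integral of a logistic sigmoid against a Gaussian by a rescaled sigmoid. Writing $\mu_a$ and $\sigma_a^2$ (my notation, not the slides') for the mean (6.87) and variance (6.88), it reads:

```latex
\begin{align}
p(t_{N+1} = 1 \mid \mathbf{t}_N)
  &= \int \sigma(a_{N+1})\, \mathcal{N}(a_{N+1} \mid \mu_a, \sigma_a^2)\, da_{N+1}
  \simeq \sigma\!\left(\kappa(\sigma_a^2)\, \mu_a\right) \tag{4.153}\\
\kappa(\sigma^2) &= \left(1 + \frac{\pi \sigma^2}{8}\right)^{-1/2} \tag{4.154}
\end{align}
```

Since $\kappa \le 1$, a larger predictive variance pulls the class probability toward $1/2$, while the decision boundary $\mu_a = 0$ is unchanged.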
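  33. 6.4.7 Connection to neural networks

    In a Bayesian neural network (see 5.7) too, the network function $y(\mathbf{x}, \mathbf{w})$ together with a prior over $\mathbf{w}$ yields a prior over the output function.

    Let $M$ be the number of hidden units. It is known that in the limit $M \to \infty$ the prior over the network's output function approaches a Gaussian process prior (Neal 1996).

    In a neural network, the components $y_k(\mathbf{x}, \mathbf{w})$ of the output $y(\mathbf{x}, \mathbf{w})$ are not independent, because the outputs share hidden-unit weights.

    The fact that the network approaches a Gaussian process as $M \to \infty$ means that the outputs $y_k(\mathbf{x}, \mathbf{w})$ become independent in that limit.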