PRML Chapter 1

gucchi
October 29, 2018

Transcript

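  1. Chapter 1: Introduction
    ▶ In machine learning, and in supervised learning in particular, we first prepare a set of inputs $\{x_1, x_2, \dots, x_N\}$ and the corresponding set of target vectors $\{t_1, t_2, \dots, t_N\}$ (the training data).
    ▶ Using the training data, we build a function $y(x)$ that predicts the target vector from the input (learning).
    ▶ Once learning is finished, the target vector of unseen data $x$ is predicted by $y(x)$.
    ▶ Assigning each input vector to one of a finite number of discrete categories (for example, handwritten digit recognition) is called classification; when the output is a continuous variable, the task is called regression.
    ▶ We start with a simple regression example.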
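  2. 1.1 Example: Polynomial Curve Fitting
    ▶ As training data we prepare $N$ inputs $\mathbf{x} = (x_1, x_2, \dots, x_N)^T$ and the $N$ corresponding target values $\mathbf{t} = (t_1, t_2, \dots, t_N)^T$ (this is regression, so each output $t_n$ takes continuous values).
    ▶ Each $t_n$ is $\sin(2\pi x_n)$ plus random noise drawn from a Gaussian distribution.
    ▶ Using the training data $(\mathbf{x}, \mathbf{t})$, we want to predict the output $\hat{t}$ for a new input $\hat{x}$.
    ▶ The figure on this slide shows an example with $N = 10$.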
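  3. 1.1 Example: Polynomial Curve Fitting
    ▶ Because the training data are finite ($N$ points), $\hat{t}$ carries uncertainty; the framework that expresses this uncertainty quantitatively is introduced in Section 1.2.
    ▶ For now, in this section we fit and predict with a polynomial of the following form:
      $$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j \tag{1.1}$$
    ▶ We want to use the training data $(\mathbf{x}, \mathbf{t})$ to tune the polynomial parameters $\mathbf{w} = (w_0, w_1, \dots, w_M)^T$ well.
    ▶ To do so, we look for the $\mathbf{w}$ ($= \mathbf{w}^\star$) that minimizes the following error function $E(\mathbf{w})$:
      $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 \tag{1.2}$$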
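  4. 1.1 Example: Polynomial Curve Fitting
    ▶ The figure on this slide shows the fits for polynomial orders $M = 0, 1, 3, 9$ (green is $\sin(2\pi x)$, red is $y(x, \mathbf{w}^\star)$).
    ▶ Of these, $M = 3$ appears to match $\sin(2\pi x)$ best.
    ▶ For $M = 9$ we get $E(\mathbf{w}^\star) = 0$, yet the curve does not match $\sin(2\pi x)$ (over-fitting).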
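  5. 1.1 Example: Polynomial Curve Fitting
    ▶ For each $M$, we want to evaluate quantitatively how well $y(x, \mathbf{w}^\star)$ predicts unseen data, rather than judging roughly by eye from the figure.
    ▶ We therefore introduce the root-mean-square (RMS) error:
      $$E_{\mathrm{RMS}} = \sqrt{2 E(\mathbf{w}^\star) / N} \tag{1.3}$$
    ▶ The smaller $E_{\mathrm{RMS}}$ is on the test data, the better the model predicts unseen data.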
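As a minimal sketch of how (1.1), (1.2) and (1.3) fit together, the following evaluates the sum-of-squares error and the RMS error for a given coefficient vector. The function names and the tiny data set are illustrative assumptions, not part of the slides.

```python
import math

def poly(x, w):
    """Evaluate y(x, w) = sum_j w[j] * x**j, as in (1.1)."""
    return sum(w_j * x ** j for j, w_j in enumerate(w))

def sum_of_squares_error(xs, ts, w):
    """E(w) from (1.2): half the sum of squared residuals."""
    return 0.5 * sum((poly(x, w) - t) ** 2 for x, t in zip(xs, ts))

def rms_error(xs, ts, w):
    """E_RMS from (1.3): sqrt(2 E(w) / N)."""
    return math.sqrt(2.0 * sum_of_squares_error(xs, ts, w) / len(xs))

# Hypothetical toy data and a first-order polynomial y(x) = x.
xs = [0.0, 0.5, 1.0]
ts = [0.0, 1.0, 0.0]
w = [0.0, 1.0]
print(sum_of_squares_error(xs, ts, w), rms_error(xs, ts, w))
```

Dividing by $N$ makes errors on data sets of different sizes comparable, and the square root puts $E_{\mathrm{RMS}}$ in the same units as $t$.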
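  6. 1.1 Example: Polynomial Curve Fitting
    ▶ The figure on this slide shows $E_{\mathrm{RMS}}$ on the training data and on the test data for various $M$.
    ▶ $E_{\mathrm{RMS}}$ on the training data is blue; $E_{\mathrm{RMS}}$ on the test data (unseen data) is red.
    ▶ For $M = 9$ the model fits the training data well but does not fit the test data at all.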
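  7. 1.1 Example: Polynomial Curve Fitting
    ▶ The table on this slide lists the values of $\mathbf{w}^\star$ for each $M$.
    ▶ For $M = 9$, we can see that the coefficients take large positive and negative values so that the polynomial also fits the random noise.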
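  8. 1.1 Example: Polynomial Curve Fitting
    ▶ The figure on this slide shows fits with $M = 9$ for various numbers of training points.
    ▶ Even with $M = 9$, over-fitting can be avoided if the number of training points is large enough.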
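  9. 1.1 Example: Polynomial Curve Fitting
    ▶ Next, to use a complex model (say $M = 9$) with a limited amount of training data (say $N = 10$) without over-fitting, we apply regularization.
    ▶ In Table 1.1, the over-fitted $M = 9$ model had components of $\mathbf{w}^\star$ with large positive and negative values, so we consider the following error function:
      $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2 \tag{1.4}$$
    ▶ Here the norm is $\|\mathbf{w}\|^2 = \mathbf{w}^T \mathbf{w} = w_0^2 + w_1^2 + \dots + w_M^2$, and $\lambda$ is a positive parameter that controls the relative importance of the regularization term and the sum-of-squares term.
    ▶ With this error function, the fit keeps the norm of the parameter vector $\mathbf{w}$ from becoming large.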
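The regularized error (1.4) is quadratic in $\mathbf{w}$, so its minimizer solves the linear system $(\Phi^T \Phi + \lambda I)\,\mathbf{w} = \Phi^T \mathbf{t}$, where $\Phi_{nj} = x_n^j$. The sketch below solves it with plain Gaussian elimination. The helper names, the noise level and the random seed are assumptions made for illustration; the slides do not specify how the fit was computed.

```python
import math
import random

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w

def fit_regularized(xs, ts, M, lam):
    """Minimize (1.4): solve (Phi^T Phi + lam * I) w = Phi^T t."""
    Phi = [[x ** j for j in range(M + 1)] for x in xs]
    n = M + 1
    A = [[sum(row[i] * row[j] for row in Phi) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(Phi, ts)) for i in range(n)]
    return solve(A, b)

def squared_norm(w):
    return sum(w_j ** 2 for w_j in w)

# Hypothetical data: N = 10 points of sin(2*pi*x) plus Gaussian noise.
rng = random.Random(0)
xs = [n / 9 for n in range(10)]
ts = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]

for lam in (1e-3, 1.0):
    w = fit_regularized(xs, ts, M=9, lam=lam)
    print(lam, squared_norm(w))
```

Increasing `lam` shrinks $\|\mathbf{w}\|^2$. With `lam = 0` and $M = 9$ the system becomes badly conditioned, so a numerically more careful solver would be preferable in that case.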
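  10. 1.1 Example: Polynomial Curve Fitting
    ▶ The figure on this slide shows the fits obtained with $M = 9$ and the error function (1.4), for $\lambda = e^{-18}$ and $\lambda = 1$.
    ▶ With $\lambda = e^{-18}$, over-fitting is suppressed compared with $\lambda = 0$.
    ▶ With $\lambda = 1$, the second term on the right-hand side of (1.4) is weighted too heavily, and the components of $\mathbf{w}^\star$ are pushed too close to 0.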
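  11. 1.2 Probability Theory
    ▶ Uncertainty is a key concept in pattern recognition; we introduce probability theory to evaluate it quantitatively.
    ▶ Consider random variables $X$ and $Y$ that take the values $X = x_i$ ($i = 1, 2, \dots, M$) and $Y = y_j$ ($j = 1, 2, \dots, L$). The probability that $X = x_i$ and $Y = y_j$ (the joint probability) is written $p(X = x_i, Y = y_j)$.
    ▶ The probability $p(X = x_i)$ can be written in terms of $p(X = x_i, Y = y_j)$ as follows (sum rule):
      $$p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j) \tag{1.7}$$
    ▶ Writing the probability that $Y = y_j$ given $X = x_i$ (the conditional probability) as $p(Y = y_j \mid X = x_i)$, the following relation holds (product rule):
      $$p(X = x_i, Y = y_j) = p(Y = y_j \mid X = x_i)\, p(X = x_i) \tag{1.9}$$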
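  12. 1.2 Probability Theory
    ▶ Combining the product rule with the symmetry of the joint probability, $p(X, Y) = p(Y, X)$, gives Bayes' theorem:
      $$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \tag{1.12}$$
    ▶ Here $p(Y)$ is called the prior probability (the probability before $X$ is given) and $p(Y \mid X)$ the posterior probability (the probability after $X$ is given).
    ▶ Bayes' theorem says that multiplying the prior $p(Y)$ by the likelihood $p(X \mid Y)$ yields the posterior $p(Y \mid X)$, up to normalization ($p(X)$ is the normalization constant that ensures $p(Y \mid X)$ is normalized with respect to $Y$).
    ▶ Furthermore, when the joint distribution $p(X, Y)$ can be written as the product of the marginals as below, $X$ and $Y$ are said to be independent:
      $$p(X, Y) = p(X)\, p(Y)$$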
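The sum rule (1.7), the product rule (1.9) and Bayes' theorem (1.12) can be checked numerically on a small joint table. This is only a sketch: the joint probabilities below are made-up values and the function names are hypothetical.

```python
# Hypothetical joint distribution p(X = x_i, Y = y_j):
# rows index the value of X, columns index the value of Y.
joint = [[0.10, 0.20],
         [0.30, 0.40]]

# Sum rule (1.7): marginalize the joint over the other variable.
p_x = [sum(row) for row in joint]
p_y = [sum(col) for col in zip(*joint)]

def p_y_given_x(j, i):
    """p(Y = y_j | X = x_i), from the product rule (1.9)."""
    return joint[i][j] / p_x[i]

def p_x_given_y(i, j):
    """p(X = x_i | Y = y_j), the likelihood of y_j after observing x_i."""
    return joint[i][j] / p_y[j]

def bayes_posterior(j, i):
    """p(Y = y_j | X = x_i) via Bayes' theorem (1.12)."""
    return p_x_given_y(i, j) * p_y[j] / p_x[i]

for j in range(2):
    print(j, bayes_posterior(j, 0), p_y_given_x(j, 0))
```

Both columns of the output agree, and the posterior over $Y$ sums to 1 because of the normalization by $p(X)$.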
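  13. 1.2.1 Probability Densities
    ▶ Next we consider distributions of continuous random variables.
    ▶ If the probability that a random variable $x$ falls in the range $(x, x + \delta x)$ is given by $p(x)\,\delta x$ as $\delta x \to 0$, then $p(x)$ is called the probability density.
    ▶ The probability that $x$ lies in the interval $(a, b)$ is then
      $$p(x \in (a, b)) = \int_a^b p(x)\, dx \tag{1.24}$$
    ▶ From the non-negativity and normalization of probabilities, $p(x)$ satisfies
      $$p(x) \ge 0 \tag{1.25}$$
      $$\int_{-\infty}^{\infty} p(x)\, dx = 1 \tag{1.26}$$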
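  14. 1.2.1 Probability Densities
    ▶ The probability that $x$ falls in the range $(-\infty, z)$ is called the cumulative distribution function and is written
      $$P(z) = \int_{-\infty}^{z} p(x)\, dx \tag{1.28}$$
    ▶ The continuous-variable versions of the sum and product rules are
      $$p(x) = \int p(x, y)\, dy \tag{1.31}$$
      $$p(x, y) = p(y \mid x)\, p(x) \tag{1.32}$$
    ▶ A rigorous proof of the sum and product rules for continuous variables requires measure theory, which we do not go into here.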
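  15. 1.2.2 Expectations and Covariances
    ▶ Weighted averages are an important operation in probability theory.
    ▶ For a continuous random variable $x$, the average of a function $f(x)$ under the probability distribution $p(x)$ is
      $$\mathbb{E}[f] = \int p(x)\, f(x)\, dx \tag{1.34}$$
    ▶ As a notational convention, a subscript indicates which variable is summed (or integrated) over. For example, the following quantity is summed (or integrated) over $x$:
      $$\mathbb{E}_x[f(x, y)] \tag{1.36}$$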
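  16. 1.2.2 Expectations and Covariances
    ▶ The variance of a function $f(x)$ under the distribution $p(x)$ measures how much $f(x)$ varies around its mean $\mathbb{E}[f(x)]$:
      $$\mathrm{var}[f] = \mathbb{E}\left[ (f(x) - \mathbb{E}[f(x)])^2 \right] \tag{1.38}$$
    ▶ The covariance of two random variables $x$ and $y$, which expresses how the two variables depend on each other, is defined as
      $$\mathrm{cov}[x, y] = \mathbb{E}_{x,y}\left[ \{x - \mathbb{E}[x]\}\{y - \mathbb{E}[y]\} \right] = \mathbb{E}_{x,y}[xy] - \mathbb{E}[x]\,\mathbb{E}[y] \tag{1.41}$$
    ▶ When $x$ and $y$ are independent, $\mathrm{cov}[x, y] = 0$.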
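  17. 1.2.3 Bayesian Probabilities
    ▶ We explain Bayesian probability using polynomial curve fitting as the example.
    ▶ In the Bayesian interpretation, before observing any data we encode our assumptions about the parameters $\mathbf{w}$ in the form of a prior probability $p(\mathbf{w})$.
    ▶ We then use the observed data $\mathcal{D} = \{t_1, t_2, \dots, t_N\}$ to evaluate the likelihood function $p(\mathcal{D} \mid \mathbf{w})$.
    ▶ Bayes' theorem then gives the posterior probability $p(\mathbf{w} \mid \mathcal{D})$:
      $$p(\mathbf{w} \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \mathbf{w})\, p(\mathbf{w})}{p(\mathcal{D})} \tag{1.43}$$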
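  18. 1.2.3 Bayesian Probabilities
    ▶ The role of the likelihood function $p(\mathcal{D} \mid \mathbf{w})$ differs between the frequentist and Bayesian interpretations of probability.
    ▶ In the frequentist interpretation, $\mathbf{w}$ is regarded as a fixed parameter, and the $\mathbf{w}$ that maximizes the likelihood $p(\mathcal{D} \mid \mathbf{w})$ is taken as the estimator ($\mathbf{w}$ is determined as a single value).
    ▶ In the Bayesian interpretation, the likelihood is used to update the prior distribution into the posterior distribution using the observed data $\mathcal{D}$ (the posterior $p(\mathbf{w} \mid \mathcal{D})$ is a probability distribution over $\mathbf{w}$, so $\mathbf{w}$ carries uncertainty).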
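  19. 1.2.4 The Gaussian Distribution
    ▶ Under the Gaussian distribution $\mathcal{N}(x \mid \mu, \sigma^2)$, the mean and variance of $x$ are
      $$\mathbb{E}[x] = \int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^2)\, x\, dx = \mu \tag{1.49}$$
      $$\mathrm{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 = \sigma^2 \tag{1.51}$$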
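  20. 1.2.4 The Gaussian Distribution
    ▶ Given a data set $\mathbf{x} = (x_1, x_2, \dots, x_N)^T$ generated independently from a Gaussian distribution, we want to determine $\mu$ and $\sigma^2$.
    ▶ Because the data points are independent, the likelihood function is
      $$p(\mathbf{x} \mid \mu, \sigma^2) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2) \tag{1.53}$$
    ▶ Instead of working with (1.53) directly, we determine $\mu$ and $\sigma^2$ from the log likelihood:
      $$\ln p(\mathbf{x} \mid \mu, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 - \frac{N}{2} \ln \sigma^2 - \frac{N}{2} \ln(2\pi) \tag{1.54}$$
    ▶ The $\mu$ and $\sigma^2$ that maximize this log likelihood are
      $$\mu_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} x_n \tag{1.55}$$
      $$\sigma^2_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} (x_n - \mu_{\mathrm{ML}})^2 \tag{1.56}$$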
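  21. 1.2.4 The Gaussian Distribution
    ▶ We now describe a limitation of the maximum likelihood approach.
    ▶ $\mu_{\mathrm{ML}}$ and $\sigma^2_{\mathrm{ML}}$ are functions of the data points $x_1, x_2, \dots, x_N$. Taking their expectations under the Gaussian distribution with parameters $\mu, \sigma^2$ (the distribution that generates the data points) gives
      $$\mathbb{E}[\mu_{\mathrm{ML}}] = \mu \tag{1.57}$$
      $$\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \left( \frac{N - 1}{N} \right) \sigma^2 \tag{1.58}$$
    ▶ The expectation of $\mu_{\mathrm{ML}}$ equals the true value, but the variance is underestimated by a factor of $(N - 1)/N$.
    ▶ From (1.58), the following $\tilde{\sigma}^2$ is an unbiased estimator ($\mathbb{E}[\tilde{\sigma}^2] = \sigma^2$):
      $$\tilde{\sigma}^2 = \left( \frac{N}{N - 1} \right) \sigma^2_{\mathrm{ML}} = \frac{1}{N - 1} \sum_{n=1}^{N} (x_n - \mu_{\mathrm{ML}})^2 \tag{1.59}$$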
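The bias in (1.58) can be verified exactly on a toy case. The identity $\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \frac{N-1}{N}\sigma^2$ holds for i.i.d. samples from any distribution with finite variance, not only the Gaussian. The sketch below therefore swaps in a fair coin (values 0 and 1, variance 0.25) so that the expectation can be computed by enumerating every possible sample. The coin and the helper names are assumptions chosen to keep the example deterministic.

```python
from itertools import product

def mu_ml(xs):
    """Sample mean, (1.55)."""
    return sum(xs) / len(xs)

def sigma2_ml(xs):
    """Maximum likelihood variance estimate, (1.56)."""
    m = mu_ml(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def sigma2_unbiased(xs):
    """Bias-corrected estimate, (1.59)."""
    return len(xs) / (len(xs) - 1) * sigma2_ml(xs)

def expectation_over_coin_samples(estimator, N):
    """Exact expectation of an estimator over all 2**N equally likely samples."""
    samples = list(product([0.0, 1.0], repeat=N))
    return sum(estimator(s) for s in samples) / len(samples)

TRUE_VARIANCE = 0.25  # variance of a fair coin taking values 0 and 1
for N in (2, 3, 5):
    print(N,
          expectation_over_coin_samples(sigma2_ml, N),
          expectation_over_coin_samples(sigma2_unbiased, N))
```

The middle column approaches 0.25 from below as $N$ grows, while the corrected estimator gives 0.25 for every $N$.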
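  22. 1.2.5 Curve Fitting Re-visited
    ▶ We revisit the curve fitting of Section 1.1 from a probabilistic viewpoint.
    ▶ Assume that the target $t$ for an input $x$ follows a Gaussian distribution with mean $y(x, \mathbf{w})$ (with $\beta^{-1} = \sigma^2$):
      $$p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}(t \mid y(x, \mathbf{w}), \beta^{-1}) \tag{1.60}$$
    ▶ Using the training data $(\mathbf{x}, \mathbf{t})$, we look for the parameters $\mathbf{w}, \beta$ that maximize the likelihood function
      $$p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(x_n, \mathbf{w}), \beta^{-1}) \tag{1.61}$$
    ▶ In practice we maximize the log likelihood with respect to $\mathbf{w}, \beta$:
      $$\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta) = -\frac{\beta}{2} \sum_{n=1}^{N} (y(x_n, \mathbf{w}) - t_n)^2 + \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi) \tag{1.62}$$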
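  23. 1.2.5 Curve Fitting Re-visited
    ▶ First consider the $\mathbf{w}_{\mathrm{ML}}$ that maximizes $\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)$.
    ▶ The second and later terms on the right-hand side of (1.62) do not depend on $\mathbf{w}$ and can be ignored; since $\beta$ is positive, setting $\beta = 1$ does not change $\mathbf{w}_{\mathrm{ML}}$.
    ▶ Maximizing $\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)$ is equivalent to minimizing $-\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)$, so finding $\mathbf{w}_{\mathrm{ML}}$ is equivalent to finding the $\mathbf{w}$ that minimizes (1.2):
      $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 \tag{1.2}$$
    ▶ The sum-of-squares error function can thus be seen as a consequence of maximizing the likelihood under the assumption of Gaussian noise.
    ▶ Furthermore, $\beta_{\mathrm{ML}}$ is obtained as
      $$\frac{1}{\beta_{\mathrm{ML}}} = \frac{1}{N} \sum_{n=1}^{N} (y(x_n, \mathbf{w}_{\mathrm{ML}}) - t_n)^2 \tag{1.63}$$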
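For reference, (1.63) follows in one step from (1.62). With $\mathbf{w}$ fixed at $\mathbf{w}_{\mathrm{ML}}$, differentiate the log likelihood with respect to $\beta$ and set the result to zero:

```latex
\frac{\partial}{\partial \beta} \ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}_{\mathrm{ML}}, \beta)
  = -\frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}_{\mathrm{ML}}) - t_n \}^2 + \frac{N}{2\beta} = 0
\quad\Longrightarrow\quad
\frac{1}{\beta_{\mathrm{ML}}} = \frac{1}{N} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}_{\mathrm{ML}}) - t_n \}^2
```

In other words, the maximum likelihood noise variance $\beta_{\mathrm{ML}}^{-1}$ is the mean squared residual of the fitted curve.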
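  24. 1.2.5 Curve Fitting Re-visited
    ▶ Next, to take a Bayesian approach, we introduce the following prior distribution:
      $$p(\mathbf{w} \mid \alpha) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1} \mathbf{I}) = \left( \frac{\alpha}{2\pi} \right)^{(M+1)/2} \exp\left\{ -\frac{\alpha}{2} \mathbf{w}^T \mathbf{w} \right\} \tag{1.65}$$
    ▶ By Bayes' theorem, the posterior $p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta)$ is given in terms of the likelihood and the prior by
      $$p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \alpha) \tag{1.66}$$
    ▶ Finding the $\mathbf{w}$ that maximizes this posterior is equivalent to finding the $\mathbf{w}$ that minimizes
      $$\frac{\beta}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\alpha}{2} \mathbf{w}^T \mathbf{w} \tag{1.67}$$
    ▶ Hence minimizing (1.67) is equivalent to minimizing the regularized sum-of-squares error (1.4).
    ▶ In the Bayesian approach, over-fitting can be suppressed (thanks to being able to set the prior in advance).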
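The equivalence between (1.67) and (1.4) can be made explicit. Taking the negative logarithm of (1.66) with (1.61) and (1.65), and dropping terms that do not depend on $\mathbf{w}$:

```latex
-\ln p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta)
  = \frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2
    + \frac{\alpha}{2} \mathbf{w}^T \mathbf{w} + \text{const}
  = \beta \left[ \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2
    + \frac{\alpha / \beta}{2} \, \| \mathbf{w} \|^2 \right] + \text{const}
```

Since $\beta > 0$, the minimizer is the same as that of (1.4) with $\lambda = \alpha / \beta$.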
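  25. 1.2.6 Bayesian Curve Fitting
    ▶ Section 1.2.5 only makes a point estimate from the posterior, so it is not a fully Bayesian approach.
    ▶ Using the sum and product rules, the predictive distribution $p(t \mid x, \mathbf{x}, \mathbf{t})$ of the output $t$ for a new input $x$ is
      $$p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, d\mathbf{w} \tag{1.68}$$
    ▶ Using (1.60) and the normalized right-hand side of (1.66), the integral in (1.68) can be carried out, giving
      $$p(t \mid x, \mathbf{x}, \mathbf{t}) = \mathcal{N}(t \mid m(x), s^2(x)) \tag{1.69}$$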
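  26. 1.2.6 Bayesian Curve Fitting
    ▶ Here the mean and variance are
      $$m(x) = \beta\, \boldsymbol{\phi}(x)^T \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n \tag{1.70}$$
      $$s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^T \mathbf{S}\, \boldsymbol{\phi}(x) \tag{1.71}$$
    ▶ $\mathbf{S}$ is defined by
      $$\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, \boldsymbol{\phi}(x_n)^T \tag{1.72}$$
      where $\boldsymbol{\phi}(x)$ is the vector with elements $\phi_i(x) = x^i$ for $i = 0, \dots, M$.
    ▶ So $p(t \mid x, \mathbf{x}, \mathbf{t})$ is a Gaussian whose mean and variance depend on the input $x$ (the detailed derivation is in Section 3.3).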
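  27. 1.3 Model Selection
    ▶ We consider how to choose a model that generalizes well.
    ▶ In the polynomial curve fitting example, choosing the order $M$ and the regularization parameter $\lambda$ corresponds to model selection.
    ▶ The data set used to decide the values of the model-selection parameters (hyperparameters) is called the validation data.
    ▶ However, if we run validation while varying the hyperparameters and some setting gives good accuracy, it may simply be over-fitting to the validation data.
    ▶ We therefore need to measure the accuracy of the model (hyperparameters) selected with the validation data on separate test data.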
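  28. 1.3 Model Selection
    ▶ The approach above requires splitting all the data into three parts (training, validation and test data), but we would like to devote as much data as possible to training.
    ▶ One method uses a fraction $(S - 1)/S$ of the available data for training and $1/S$ for validation, rotating which part is held out (cross-validation).
    ▶ This lets us use most of the data for training and still perform validation.
    ▶ The figure on this slide shows cross-validation with $S = 4$.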
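A minimal sketch of the $S$-fold split described above: the data indices are partitioned into $S$ groups, and each group is held out once for validation while the remaining $S - 1$ groups are used for training. The function name and the interleaved assignment of indices to folds are illustrative choices; in practice the data are usually shuffled first.

```python
def s_fold_splits(n_samples, S):
    """Yield (train_indices, validation_indices) for each of the S folds."""
    indices = list(range(n_samples))
    # Fold k takes every S-th index starting at k, so fold sizes differ by at most 1.
    folds = [indices[k::S] for k in range(S)]
    for k in range(S):
        train = [i for m, fold in enumerate(folds) if m != k for i in fold]
        yield train, folds[k]

# Example: 12 data points, S = 4 as in the slide's figure.
for run, (train, validation) in enumerate(s_fold_splits(12, 4), start=1):
    print(run, "train:", train, "validation:", validation)
```

The model is trained $S$ times, once per run, and the validation scores are averaged; the cost is that training time grows by a factor of $S$.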
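  29. 1.4 The Curse of Dimensionality
    ▶ Next we look at why high-dimensional data are difficult to handle.
    ▶ In Section 1.1 the input variable $x$ was one-dimensional; here we take the input to be a $D$-dimensional vector $\mathbf{x}$.
    ▶ A general polynomial up to third order can then be written as
      $$y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{i=1}^{D} w_i x_i + \sum_{i=1}^{D} \sum_{j=1}^{D} w_{ij} x_i x_j + \sum_{i=1}^{D} \sum_{j=1}^{D} \sum_{k=1}^{D} w_{ijk} x_i x_j x_k \tag{1.74}$$
    ▶ The number of independent coefficients grows in proportion to $D^3$; for a polynomial of order $M$ it grows in proportion to $D^M$.
    ▶ As $D$ increases, the number of independent coefficients grows as a power of $D$ rather than exponentially, but it still becomes very large, which makes estimation difficult.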
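  30. 1.4 The Curse of Dimensionality
    ▶ We also give an example showing that geometric intuition from three-dimensional space does not carry over directly to high dimensions.
    ▶ Consider a sphere of radius $r = 1$ in $D$ dimensions, and compute the ratio of the volume of the shell between $r = 1 - \epsilon$ and $r = 1$ to the total volume.
    ▶ The volume $V_D(r)$ of a $D$-dimensional hypersphere of radius $r$ is proportional to $r^D$, so the ratio is
      $$\frac{V_D(1) - V_D(1 - \epsilon)}{V_D(1)} = 1 - (1 - \epsilon)^D \tag{1.76}$$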
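Evaluating (1.76) for a few dimensions shows how quickly the volume concentrates in a thin shell near the surface. This is a small sketch; the function name and the choice $\epsilon = 0.1$ are illustrative.

```python
def shell_fraction(D, eps):
    """Fraction of a unit D-sphere's volume lying between r = 1 - eps and r = 1, (1.76)."""
    return 1.0 - (1.0 - eps) ** D

for D in (1, 2, 5, 20, 100):
    print(D, shell_fraction(D, eps=0.1))
```

For $D = 1$ the shell holds 10% of the volume, as intuition suggests, but the fraction tends to 1 as $D$ grows even though the shell thickness stays fixed.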
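  31. 1.5 Decision Theory
    ▶ We have seen that probability theory lets us quantify uncertainty.
    ▶ The joint probability distribution $p(\mathbf{x}, \mathbf{t})$ of an unseen input $\mathbf{x}$ and its target vector $\mathbf{t}$ can be inferred from the training data.
    ▶ Decision theory provides the procedure for deciding on the target vector $\mathbf{t}$ given this $p(\mathbf{x}, \mathbf{t})$.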
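  32. 1.5.1 Minimizing the Misclassification Rate
    ▶ First we consider simply reducing the number of misclassifications.
    ▶ Let the output variable $t$ be binary, with $t = 0$ denoting class $\mathcal{C}_1$ and $t = 1$ denoting class $\mathcal{C}_2$.
    ▶ Divide the input space into $\mathcal{R}_1$ and $\mathcal{R}_2$, and assign every point in $\mathcal{R}_k$ to class $\mathcal{C}_k$ (the $\mathcal{R}_k$ are called decision regions).
    ▶ A mistake occurs when an input in $\mathcal{R}_1$ (and so assigned to $\mathcal{C}_1$) actually belongs to $\mathcal{C}_2$, or an input in $\mathcal{R}_2$ actually belongs to $\mathcal{C}_1$. The probability of this is
      $$p(\text{mistake}) = p(\mathbf{x} \in \mathcal{R}_1, \mathcal{C}_2) + p(\mathbf{x} \in \mathcal{R}_2, \mathcal{C}_1) = \int_{\mathcal{R}_1} p(\mathbf{x}, \mathcal{C}_2)\, d\mathbf{x} + \int_{\mathcal{R}_2} p(\mathbf{x}, \mathcal{C}_1)\, d\mathbf{x} \tag{1.78}$$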
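  33. 1.5.1 Minimizing the Misclassification Rate
    ▶ To minimize $p(\text{mistake})$, we minimize the integrals on the right-hand side of (1.78).
    ▶ That is, if for a given $\mathbf{x}$ we have $p(\mathbf{x}, \mathcal{C}_1) > p(\mathbf{x}, \mathcal{C}_2)$, then $\mathbf{x}$ should be assigned to class $\mathcal{C}_1$.
    ▶ By the product rule, $p(\mathbf{x}, \mathcal{C}_k) = p(\mathcal{C}_k \mid \mathbf{x})\, p(\mathbf{x})$, so $p(\text{mistake})$ is minimized when each $\mathbf{x}$ is assigned to the class with the largest posterior probability $p(\mathcal{C}_k \mid \mathbf{x})$.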
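  34. 1.5.2 Minimizing the Expected Loss
    ▶ In many cases, simply reducing the number of misclassifications is not enough.
    ▶ For example, when diagnosing cancer from examination results, it is more important to reduce the cases where a patient who has cancer is judged not to have it than the cases where a patient without cancer is judged to have it.
    ▶ To take this into account, we choose the decision regions $\mathcal{R}_j$ so as to minimize the following expected loss:
      $$\mathbb{E}[L] = \sum_k \sum_j \int_{\mathcal{R}_j} L_{kj}\, p(\mathbf{x}, \mathcal{C}_k)\, d\mathbf{x} \tag{1.80}$$
    ▶ Here $L_{kj}$ is an element of the loss matrix: the loss incurred when an input whose true class is $\mathcal{C}_k$ is assigned to $\mathcal{C}_j$. Normally $L_{ii} = 0$. When wrongly deciding class $\mathcal{C}_l$ is very risky (in the cancer example, the "no cancer" class), making $L_{jl}$ ($j \ne l$) large reduces the number of those risky misclassifications.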
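  35. 1.5.2 Minimizing the Expected Loss
    ▶ Minimizing (1.80) means that, for each $\mathbf{x}$, we evaluate the quantity $\sum_k L_{kj}\, p(\mathbf{x}, \mathcal{C}_k)$ for every class $j$ and assign $\mathbf{x}$ to the class $j$ for which it is smallest.
    ▶ By the product rule, $p(\mathbf{x}, \mathcal{C}_k) = p(\mathcal{C}_k \mid \mathbf{x})\, p(\mathbf{x})$, so we assign $\mathbf{x}$ to the class $j$ that minimizes
      $$\sum_k L_{kj}\, p(\mathcal{C}_k \mid \mathbf{x}) \tag{1.81}$$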
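The decision rule (1.81) is a few lines of code once the posteriors are known. In this sketch the loss values (1000 for missing a cancer, 1 for a false alarm) and the posterior probabilities are illustrative assumptions, and `decide` is a hypothetical helper name.

```python
def decide(posteriors, loss):
    """Return the class j minimizing sum_k loss[k][j] * posteriors[k], as in (1.81)."""
    n_classes = len(posteriors)
    expected_loss = [
        sum(loss[k][j] * posteriors[k] for k in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=lambda j: expected_loss[j])

CANCER, NORMAL = 0, 1

# loss[k][j]: true class k, decided class j (illustrative values).
asymmetric_loss = [[0, 1000],
                   [1, 0]]
zero_one_loss = [[0, 1],
                 [1, 0]]

posteriors = [0.05, 0.95]  # hypothetical p(cancer | x), p(normal | x)
print(decide(posteriors, zero_one_loss))    # minimizes the misclassification rate
print(decide(posteriors, asymmetric_loss))  # minimizes the expected loss
```

With the 0/1 loss the rule reduces to picking the largest posterior, so the patient is labelled normal. With the asymmetric loss, a 5% chance of cancer is already enough to decide "cancer", because the expected loss of deciding "normal" is $1000 \times 0.05 = 50$ against $1 \times 0.95 = 0.95$.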
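  36. 1.5.3 The Reject Option
    ▶ When the largest posterior probability $p(\mathcal{C}_k \mid \mathbf{x})$ is considerably smaller than 1, it can be better not to decide which class the input belongs to.
    ▶ To handle this, we can introduce a threshold $\theta$ ($0 \le \theta \le 1$) and reject any input $\mathbf{x}$ for which $\max_k p(\mathcal{C}_k \mid \mathbf{x}) \le \theta$ (that is, make no class decision for it).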
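A sketch of the reject option under the assumption that the posteriors for one input are given as a list; the function name and the use of `None` to signal rejection are illustrative choices.

```python
def classify_with_reject(posteriors, theta):
    """Return the index of the most probable class, or None if its posterior is <= theta."""
    best = max(range(len(posteriors)), key=lambda k: posteriors[k])
    if posteriors[best] <= theta:
        return None  # reject: leave the decision to a human expert or a further test
    return best

print(classify_with_reject([0.55, 0.45], theta=0.8))  # ambiguous input, rejected
print(classify_with_reject([0.90, 0.10], theta=0.8))  # confident input, class 0
```

Setting `theta = 1` rejects every input, while for $K$ classes any `theta < 1/K` rejects none, since the largest posterior is always at least $1/K$.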
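  37. 1.5.4 Inference and Decision
    We describe three different approaches to solving decision problems.
    (a) Determine the class-conditional densities $p(\mathbf{x} \mid \mathcal{C}_k)$ and the class priors $p(\mathcal{C}_k)$, obtain $p(\mathcal{C}_k \mid \mathbf{x})$ with Bayes' theorem, and then decide using decision theory (a generative model):
      $$p(\mathcal{C}_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \mathcal{C}_k)\, p(\mathcal{C}_k)}{p(\mathbf{x})} \tag{1.82}$$
    (b) Determine $p(\mathcal{C}_k \mid \mathbf{x})$ directly and then decide using decision theory (a discriminative model).
    (c) Find a function $f(\mathbf{x})$ that maps each input vector $\mathbf{x}$ directly onto a class label (a discriminant function; for example, $f = 0$ corresponds to $\mathcal{C}_1$ and $f = 1$ to $\mathcal{C}_2$).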
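  38. 1.5.5 Loss Functions for Regression
    ▶ So far we have considered decision theory for classification; we now consider decision theory for regression problems.
    ▶ For this we return to the curve fitting example of Section 1.1.
    ▶ Using the loss $L(t, y(\mathbf{x})) = \{y(\mathbf{x}) - t\}^2$, we look for the $y(\mathbf{x})$ that minimizes the expected loss
      $$\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt \tag{1.87}$$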
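  39. 1.5.5 Loss Functions for Regression
    ▶ Take the functional derivative of $\mathbb{E}[L]$ with respect to $y(\mathbf{x})$ and set it to 0:
      $$\frac{\delta \mathbb{E}[L]}{\delta y(\mathbf{x})} = 2 \int \{y(\mathbf{x}) - t\}\, p(\mathbf{x}, t)\, dt = 0 \tag{1.88}$$
    ▶ The $y(\mathbf{x})$ that satisfies (1.88) is
      $$y(\mathbf{x}) = \int t\, p(t \mid \mathbf{x})\, dt = \mathbb{E}_t[t \mid \mathbf{x}] \tag{1.89}$$
    ▶ As in the figure on this slide, given $x_0$ we set $y(x_0) = \mathbb{E}_t[t \mid x_0]$.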
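The step from (1.88) to (1.89) uses only the sum and product rules. Since $y(\mathbf{x})$ does not depend on $t$, it can be taken outside the integral:

```latex
y(\mathbf{x}) \int p(\mathbf{x}, t)\, dt = \int t\, p(\mathbf{x}, t)\, dt
\quad\Longrightarrow\quad
y(\mathbf{x}) = \frac{\int t\, p(\mathbf{x}, t)\, dt}{p(\mathbf{x})}
             = \int t\, p(t \mid \mathbf{x})\, dt
             = \mathbb{E}_t[t \mid \mathbf{x}]
```

Here $\int p(\mathbf{x}, t)\, dt = p(\mathbf{x})$ by the sum rule and $p(\mathbf{x}, t) / p(\mathbf{x}) = p(t \mid \mathbf{x})$ by the product rule, so the optimal prediction under squared loss is the conditional mean of $t$.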
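  40. 1.6 Information Theory
    ▶ We introduce some concepts from information theory that are useful in pattern recognition and machine learning.
    ▶ First consider how much information is gained by observing a random variable $x$: a frequently occurring event carries little information, whereas a rare event carries a lot.
    ▶ The information content $h(x)$ therefore depends on the probability $p(x)$. Requiring that the information from two independent events be additive, $h(x, y) = h(x) + h(y)$, and using $p(x, y) = p(x)\, p(y)$, we find that $h(x)$ takes the form
      $$h(x) = -\log_2 p(x) \tag{1.92}$$
    ▶ The expectation of the information content is defined as the entropy:
      $$\mathrm{H}[x] = -\sum_x p(x) \log_2 p(x) \tag{1.93}$$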
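  41. 1.6 Information Theory
    ▶ The base of the logarithm in (1.92) can be chosen freely; from here on we use base $e$.
    ▶ Writing the values of the discrete variable as $x_i$ and their probabilities as $p(x_i)$, the entropy becomes
      $$\mathrm{H}[p] = -\sum_i p(x_i) \ln p(x_i) \tag{1.98}$$
    ▶ The figure on this slide illustrates this property of the entropy: a distribution sharply peaked around a few values (left) has low entropy, whereas a distribution spread over many values (right) has high entropy.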
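The contrast between a peaked and a broad distribution can be reproduced with (1.98) directly. The two distributions below are made-up examples, and terms with $p(x_i) = 0$ are skipped using the usual convention $0 \ln 0 = 0$.

```python
import math

def entropy(p):
    """H[p] = -sum_i p_i ln p_i in nats, as in (1.98); zero-probability terms contribute 0."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0.0)

peaked = [0.90, 0.05, 0.03, 0.02, 0.0, 0.0, 0.0, 0.0]  # mass on a few values
uniform = [1 / 8] * 8                                   # mass spread over all 8 values

print(entropy(peaked), entropy(uniform), math.log(8))
```

The uniform distribution attains the maximum possible entropy for 8 states, $\ln 8$, while a distribution that puts all its mass on one state has entropy 0.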
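  42. 1.6 Information Theory
    ▶ For a continuous random variable, the entropy is defined as
      $$\mathrm{H}[p] = -\int p(x) \ln p(x)\, dx \tag{1.103}$$
    ▶ We look for the probability distribution $p(x)$ that maximizes the entropy (1.103) under the following constraints:
      $$\int_{-\infty}^{\infty} p(x)\, dx = 1 \tag{1.105}$$
      $$\int_{-\infty}^{\infty} x\, p(x)\, dx = \mu \tag{1.106}$$
      $$\int_{-\infty}^{\infty} (x - \mu)^2\, p(x)\, dx = \sigma^2 \tag{1.107}$$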
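  43. 1.6 Information Theory
    ▶ Using Lagrange multipliers, we set the functional derivative of the following quantity with respect to $p(x)$ to 0, set its partial derivatives with respect to each $\lambda_i$ to 0, and solve:
      $$-\int p(x) \ln p(x)\, dx + \lambda_1 \left( \int_{-\infty}^{\infty} p(x)\, dx - 1 \right) + \lambda_2 \left( \int_{-\infty}^{\infty} x\, p(x)\, dx - \mu \right) + \lambda_3 \left( \int_{-\infty}^{\infty} (x - \mu)^2\, p(x)\, dx - \sigma^2 \right)$$
    ▶ The solution is the Gaussian distribution:
      $$p(x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \tag{1.109}$$
    ▶ The entropy of the Gaussian is given below; it shows that the entropy is larger when the variance is larger (a more spread-out distribution):
      $$\mathrm{H}[x] = \frac{1}{2} \left\{ 1 + \ln(2\pi\sigma^2) \right\} \tag{1.110}$$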
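  44. 1.6 Information Theory
    ▶ We define the conditional entropy as
      $$\mathrm{H}[y \mid x] = -\iint p(y, x) \ln p(y \mid x)\, dy\, dx \tag{1.111}$$
    ▶ Using the product rule, the conditional entropy can be shown to satisfy
      $$\mathrm{H}[x, y] = \mathrm{H}[y \mid x] + \mathrm{H}[x] \tag{1.112}$$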
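  45. 1.6.1 Relative Entropy and Mutual Information
    ▶ Suppose there is an unknown distribution $p(x)$ that we model approximately by $q(x)$. To measure how similar these distributions are, we introduce the Kullback-Leibler divergence:
      $$\mathrm{KL}(p \,\|\, q) = -\int p(x) \ln q(x)\, dx - \left( -\int p(x) \ln p(x)\, dx \right) = -\int p(x) \ln \left\{ \frac{q(x)}{p(x)} \right\} dx \tag{1.113}$$
    ▶ We will show that the Kullback-Leibler divergence satisfies $\mathrm{KL}(p \,\|\, q) \ge 0$, with equality only when $p(x) = q(x)$.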
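  46. 1.6.1 Relative Entropy and Mutual Information
    ▶ For a convex function $f(x)$ and any set of points $\{x_i\}$, Jensen's inequality holds:
      $$f\left( \sum_{i=1}^{M} \lambda_i x_i \right) \le \sum_{i=1}^{M} \lambda_i f(x_i) \tag{1.115}$$
    ▶ Here $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$.
    ▶ Interpreting $\lambda_i$ as the probability of taking the value $x_i$, (1.115) becomes
      $$f(\mathbb{E}[x]) \le \mathbb{E}[f(x)] \tag{1.116}$$
    ▶ Applying (1.116) to a continuous variable gives
      $$f\left( \int x\, p(x)\, dx \right) \le \int f(x)\, p(x)\, dx \tag{1.117}$$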
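  47. 1.6.1 Relative Entropy and Mutual Information
    ▶ To apply Jensen's inequality to the Kullback-Leibler divergence, we go back to (1.115), interpret $\lambda_i$ as the probability of taking the value $z_i$, and set $x_i = \xi(z_i)$. The continuous version of (1.115) is then
      $$f\left( \int \xi(x)\, p(x)\, dx \right) \le \int f(\xi(x))\, p(x)\, dx$$
    ▶ Setting $f(x) = -\ln x$ and $\xi(x) = q(x)/p(x)$ in this inequality shows that $\mathrm{KL}(p \,\|\, q) \ge 0$:
      $$\mathrm{KL}(p \,\|\, q) = -\int p(x) \ln \left\{ \frac{q(x)}{p(x)} \right\} dx \ge -\ln \int q(x)\, dx = 0 \tag{1.118}$$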
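The properties derived above can be checked numerically on discrete distributions, where the integral in (1.113) becomes a sum. The two distributions are made-up examples, and the sketch assumes $q_i > 0$ wherever $p_i > 0$ (otherwise the divergence is infinite).

```python
import math

def kl_divergence(p, q):
    """Discrete KL(p || q) = -sum_i p_i ln(q_i / p_i); assumes q_i > 0 wherever p_i > 0."""
    return -sum(p_i * math.log(q_i / p_i) for p_i, q_i in zip(p, q) if p_i > 0.0)

p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q), kl_divergence(q, p), kl_divergence(p, p))
```

The divergence is positive for $p \ne q$, zero for $p = q$, and not symmetric in its arguments, so it is not a distance in the usual sense.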
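  48. 1.6.1 Relative Entropy and Mutual Information
    ▶ Finally, we use the Kullback-Leibler divergence to evaluate how close two random variables are to being independent.
    ▶ That is, we consider the Kullback-Leibler divergence $\mathrm{I}[x, y]$ between the joint distribution $p(x, y)$ and the product of the marginals $p(x)\, p(y)$:
      $$\mathrm{I}[x, y] = -\iint p(x, y) \ln \left\{ \frac{p(x)\, p(y)}{p(x, y)} \right\} dx\, dy \tag{1.120}$$
    ▶ Written in terms of entropies, $\mathrm{I}[x, y]$ is
      $$\mathrm{I}[x, y] = \mathrm{H}[x] - \mathrm{H}[x \mid y] = \mathrm{H}[y] - \mathrm{H}[y \mid x] \tag{1.121}$$
    ▶ Thus $\mathrm{I}[x, y]$ expresses how much the uncertainty about $x$ is reduced by learning $y$ (and vice versa).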
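For discrete variables, (1.120) becomes a double sum over a joint probability table. The sketch below computes it for an independent pair and a fully dependent pair; both tables are made-up examples and the function name is hypothetical.

```python
import math

def mutual_information(joint):
    """I[x, y] = sum_ij p(x_i, y_j) ln( p(x_i, y_j) / (p(x_i) p(y_j)) ), in nats."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return sum(
        p_xy * math.log(p_xy / (p_x[i] * p_y[j]))
        for i, row in enumerate(joint)
        for j, p_xy in enumerate(row)
        if p_xy > 0.0
    )

# Independent: the joint is the outer product of marginals [0.2, 0.8] and [0.3, 0.7].
independent = [[0.06, 0.14],
               [0.24, 0.56]]
# Fully dependent: y is determined by x.
dependent = [[0.5, 0.0],
             [0.0, 0.5]]

print(mutual_information(independent), mutual_information(dependent), math.log(2))
```

For the independent table the mutual information is 0 up to rounding. For the fully dependent table it equals $\mathrm{H}[x] = \ln 2$, consistent with (1.121): knowing $y$ removes all uncertainty about $x$, so $\mathrm{H}[x \mid y] = 0$.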