PRML Seminar

gucchi
March 20, 2019

Transcript

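  1. 0. About this seminar
    This seminar covers kernel methods (PRML Chapter 6) and sparse kernel machines (PRML Chapter 7). It also reviews the background needed to explain these topics: linear regression models (PRML Chapter 3) and the distance between a point and a hyperplane (PRML 4.1.1). Note that the equation numbers in these slides differ from those in PRML.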
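  2. Contents
    1. Preliminaries: 1-1. Linear regression model; 1-2. Distance between a point and a hyperplane
    2. Kernel methods: 2-1. Dual representation; 2-2. Constructing kernel functions; 2-3. Regression with Gaussian processes; 2-4. Classification with Gaussian processes
    3. Sparse kernel machines: 3-1. Maximum margin classifier; 3-2. Overlapping class distributions; 3-3. SVM for regression; 3-4. RVM for regression; 3-5. RVM for classification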
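  3. 1-1. Linear regression model
    We briefly review the linear regression model, which underlies regression problems (PRML 3.1.1). Let $\mathbf{x}$ be a $D$-dimensional input vector and $\mathbf{w} = (w_0, w_1, \cdots, w_{M-1})^T$ the parameters to be learned, and expand the function $y(\mathbf{x}, \mathbf{w})$ in nonlinear basis functions $\phi_j(\mathbf{x})$ $(j = 1, \cdots, M-1)$:
    $$y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{j=1}^{M-1} w_j \phi_j(\mathbf{x}) \tag{1.1}$$
    To shorten the notation, set $\phi_0(\mathbf{x}) = 1$ and define $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \cdots, \phi_{M-1}(\mathbf{x}))^T$. Then (1.1) can be written as
    $$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{1.2}$$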
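  4. 1-1. Linear regression model
    As training data, take a set of inputs $X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the corresponding target values $\{t_1, t_2, \cdots, t_N\}$, and consider the loss function
    $$E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 \tag{1.3}$$
    Learning means finding the parameters $\mathbf{w}$ that minimize this loss. Using $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$, the gradient of $E_D(\mathbf{w})$ with respect to $\mathbf{w}$ is
    $$\frac{\partial}{\partial \mathbf{w}} E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \frac{\partial}{\partial \mathbf{w}} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n))^2 = -\sum_{n=1}^{N} (t_n - \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n)) \boldsymbol{\phi}(\mathbf{x}_n) = -\left\{ \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w} \right\} \tag{1.4}$$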
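  5. 1-1. Linear regression model
    Hence the maximum likelihood solution $\mathbf{w}_{ML}$ satisfies
    $$\sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) - \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \mathbf{w}_{ML} = 0 \tag{1.5}$$
    Define the design matrix $\Phi$ (it appears again in later chapters):
    $$\Phi = \begin{pmatrix} \phi_0(\mathbf{x}_1) & \phi_1(\mathbf{x}_1) & \cdots & \phi_{M-1}(\mathbf{x}_1) \\ \phi_0(\mathbf{x}_2) & \phi_1(\mathbf{x}_2) & \cdots & \phi_{M-1}(\mathbf{x}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_0(\mathbf{x}_N) & \phi_1(\mathbf{x}_N) & \cdots & \phi_{M-1}(\mathbf{x}_N) \end{pmatrix} = \begin{pmatrix} \boldsymbol{\phi}(\mathbf{x}_1)^T \\ \boldsymbol{\phi}(\mathbf{x}_2)^T \\ \vdots \\ \boldsymbol{\phi}(\mathbf{x}_N)^T \end{pmatrix} \tag{1.6}$$
    The following identities then hold:
    $$\Phi^T \Phi = \sum_{n=1}^{N} \boldsymbol{\phi}(\mathbf{x}_n) \boldsymbol{\phi}(\mathbf{x}_n)^T \tag{1.7}$$
    $$\Phi^T \mathbf{t} = \sum_{n=1}^{N} t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{1.8}$$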
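  6. 1-1. Linear regression model
    With these, (1.5) becomes
    $$\Phi^T \mathbf{t} - \Phi^T \Phi \mathbf{w}_{ML} = 0 \tag{1.9}$$
    so the maximum likelihood solution is
    $$\mathbf{w}_{ML} = (\Phi^T \Phi)^{-1} \Phi^T \mathbf{t} \tag{1.10}$$
    Given an unseen input $\tilde{\mathbf{x}}$, the predicted output is $y(\tilde{\mathbf{x}}, \mathbf{w}_{ML})$.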
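  7. 1-1. Linear regression model
    Training with the error function (1.3) often leads to overfitting (good accuracy on the training data but poor accuracy on test data). When overfitting occurs, the components of $\mathbf{w}_{ML}$ tend to have large absolute values, so we consider the error function
    $$E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} (t_n - y(\mathbf{x}_n, \mathbf{w}))^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2 \tag{1.11}$$
    Here $\|\mathbf{w}\|^2 = \mathbf{w}^T \mathbf{w} = w_0^2 + w_1^2 + \cdots + w_{M-1}^2$, and $\lambda$ is a positive parameter that controls the relative importance of the regularization term and the sum-of-squares term. Using this error function can suppress overfitting. The solution $\mathbf{w}_{ML}$ in this case (it reappears later) is
    $$\mathbf{w}_{ML} = (\lambda I_M + \Phi^T \Phi)^{-1} \Phi^T \mathbf{t} \tag{1.12}$$
    (see PRML 3.1.4), where $I_M$ is the $M \times M$ identity matrix.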
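The closed forms (1.10) and (1.12) can be tried numerically. The following is a minimal sketch, assuming NumPy, a polynomial basis $\phi_j(x) = x^j$, and synthetic sine data; the basis, the data, and the helper names `design_matrix` and `fit_ridge` are illustrative choices, not taken from the slides.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are phi(x_n)^T with the assumed basis phi_j(x) = x**j, j = 0..M-1."""
    return np.vander(x, M, increasing=True)

def fit_ridge(Phi, t, lam):
    """Solve (lam * I_M + Phi^T Phi) w = Phi^T t, i.e. equation (1.12).

    With lam = 0 this reduces to the least-squares solution (1.10).
    """
    M = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Synthetic training data (an assumption made for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
t = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

Phi = design_matrix(x, M=6)
w_ls = fit_ridge(Phi, t, lam=0.0)       # equation (1.10)
w_ridge = fit_ridge(Phi, t, lam=1e-2)   # equation (1.12)

print("norm of w without regularization:", np.linalg.norm(w_ls))
print("norm of w with regularization:   ", np.linalg.norm(w_ridge))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical-stability choice; mathematically it is the same expression.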
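  8. 1-2. Distance between a point and a hyperplane
    The SVM discussed later is formulated in terms of the distance between data points and the class boundary (in general a hyperplane), so we discuss that distance here (PRML 4.1.1). Consider the linear function
    $$y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + w_0 \tag{1.13}$$
    where $\mathbf{w}$ and $\mathbf{x}$ are both $D$-dimensional vectors. In classification it is common to assign an input $\mathbf{x}$ with $y(\mathbf{x}) \geq 0$ to class $C_1$ and every other input to class $C_2$. The set $y(\mathbf{x}) = 0$, a $(D-1)$-dimensional hyperplane, is therefore the class boundary (decision surface).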
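  9. 1-2. Distance between a point and a hyperplane
    First take two distinct points $\mathbf{x}_A$ and $\mathbf{x}_B$ on the decision surface. Because both lie on the surface,
    $$y(\mathbf{x}_A) = \mathbf{w}^T \mathbf{x}_A + w_0 = 0 \tag{1.14}$$
    $$y(\mathbf{x}_B) = \mathbf{w}^T \mathbf{x}_B + w_0 = 0 \tag{1.15}$$
    Subtracting these gives $\mathbf{w}^T (\mathbf{x}_A - \mathbf{x}_B) = 0$. Since $\mathbf{x}_A - \mathbf{x}_B$ is parallel to the decision surface, $\mathbf{w}$ is orthogonal to the decision surface.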
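  10. 1-2. Distance between a point and a hyperplane
    Next we find the perpendicular distance $|r|$ between the class boundary $y(\mathbf{x}) = 0$ and a point $\mathbf{x}$. Decompose $\mathbf{x}$ into a component orthogonal to the decision surface and the remainder $\mathbf{x}_\perp$:
    $$\mathbf{x} = \mathbf{x}_\perp + r \frac{\mathbf{w}}{\|\mathbf{w}\|} \tag{1.16}$$
    Here $\mathbf{w}/\|\mathbf{w}\|$ is the unit vector orthogonal to the decision surface, and $\mathbf{x}_\perp$ is chosen as a point (vector) on the decision surface.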
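  11. 1-2. Distance between a point and a hyperplane
    Multiplying both sides of (1.16) by $\mathbf{w}^T$ and adding $w_0$ gives, using (1.13),
    $$\mathbf{w}^T \mathbf{x} + w_0 = \mathbf{w}^T \mathbf{x}_\perp + w_0 + r \|\mathbf{w}\| \;\Rightarrow\; y(\mathbf{x}) = y(\mathbf{x}_\perp) + r \|\mathbf{w}\| \tag{1.17}$$
    Because $\mathbf{x}_\perp$ lies on the decision surface, $y(\mathbf{x}_\perp) = 0$, so the perpendicular distance is
    $$|r| = \frac{|y(\mathbf{x})|}{\|\mathbf{w}\|} \tag{1.18}$$
    The result (1.18) is used in later chapters.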
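A small numeric check of (1.16) to (1.18), written as a sketch in plain Python. The vector $\mathbf{w} = (3, 4)$, the offset $w_0 = -5$, and the point $\mathbf{x} = (3, 4)$ are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def y(w, w0, x):
    """Linear function y(x) = w^T x + w0 from (1.13)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def signed_distance(w, w0, x):
    """Signed distance r = y(x) / ||w||; its absolute value is (1.18)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y(w, w0, x) / norm

w, w0 = [3.0, 4.0], -5.0
x = [3.0, 4.0]

r = signed_distance(w, w0, x)
norm_w = math.sqrt(sum(wi * wi for wi in w))

# Recover x_perp from the decomposition (1.16): x_perp = x - r * w / ||w||.
x_perp = [xi - r * wi / norm_w for xi, wi in zip(x, w)]

print("distance |r| =", abs(r))
print("y(x_perp)    =", y(w, w0, x_perp))  # should vanish: x_perp is on the surface
```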
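  12. 2. Kernel methods
    In 1-1 we considered a parametric model whose output $y(\mathbf{x}, \mathbf{w})$ is
    $$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{2.1}$$
    Here $\mathbf{x} = (x_1, x_2, \cdots, x_D)^T$ is a $D$-dimensional input vector, $\boldsymbol{\phi}(\mathbf{x}) = (\phi_0(\mathbf{x}), \phi_1(\mathbf{x}), \cdots, \phi_{M-1}(\mathbf{x}))^T$ is a vector function that maps $\mathbf{x}$ into an $M$-dimensional feature space, and $\mathbf{w} = (w_0, w_1, \cdots, w_{M-1})^T$ is an $M$-dimensional parameter vector. In 1-1 we took a set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ with corresponding targets $\{t_1, t_2, \cdots, t_N\}$ and used least squares to find the maximum likelihood solution $\mathbf{w}_{ML}$ such that the output $y(\mathbf{x}_n, \mathbf{w})$ for input $\mathbf{x}_n$ reproduces $t_n$.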
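  13. 2. Kernel methods
    In other words, in 1-1 the starting point for building the model was to construct the output $y(\mathbf{x}, \mathbf{w})$ from the vector function $\boldsymbol{\phi}(\mathbf{x})$. (For comparison, the neural networks of PRML Chapter 5 start by letting $\boldsymbol{\phi}(\mathbf{x})$ itself depend on learned parameters.) Many linear parametric models can be rewritten in a dual representation, in which the solution $\mathbf{w}_{ML}$ and the output $y(\mathbf{x}, \mathbf{w}_{ML})$ depend on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel function
    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{x}') \tag{2.2}$$
    (details in 2-1). We will also see that, when linear parametric models for regression and classification are treated probabilistically, they become examples of Gaussian processes (1-1 treated the regression case; details in 2-3 and 2-4).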
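  14. 2-1. Dual representation
    Consider a parametric model whose output is
    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{2.3}$$
    and minimize the regularized sum-of-squares error ($\lambda$ is a positive parameter)
    $$J(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) - t_n\}^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w} \tag{2.4}$$
    where the inputs are $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and the targets are $\{t_1, t_2, \cdots, t_N\}$.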
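  15. 2-1. Dual representation
    The stationarity condition $\partial J(\mathbf{w}) / \partial \mathbf{w} = 0$ can be rearranged (see (1.4)) as
    $$\mathbf{w} = \sum_{n=1}^{N} a_n \boldsymbol{\phi}(\mathbf{x}_n) = \Phi^T \mathbf{a} \tag{2.5}$$
    where
    $$a_n = -\frac{1}{\lambda} \{\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) - t_n\} \tag{2.6}$$
    with $\mathbf{a} = (a_1, \cdots, a_N)^T$, and $\Phi = (\boldsymbol{\phi}(\mathbf{x}_1), \cdots, \boldsymbol{\phi}(\mathbf{x}_N))^T$ is the design matrix (1.6). Substituting $\mathbf{w} = \Phi^T \mathbf{a}$ rewrites $J(\mathbf{w})$ as a function of the parameter vector $\mathbf{a}$ (a change of variables):
    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^T \Phi \Phi^T \Phi \Phi^T \mathbf{a} - \mathbf{a}^T \Phi \Phi^T \mathbf{t} + \frac{1}{2} \mathbf{t}^T \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^T \Phi \Phi^T \mathbf{a} \tag{2.7}$$
    where $\mathbf{t} = (t_1, \cdots, t_N)^T$.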
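  16. 2-1. Dual representation
    Define the Gram matrix $K = \Phi \Phi^T$. By definition its elements can be written with the kernel:
    $$K_{nm} = \boldsymbol{\phi}(\mathbf{x}_n)^T \boldsymbol{\phi}(\mathbf{x}_m) = k(\mathbf{x}_n, \mathbf{x}_m) \tag{2.8}$$
    In terms of $K$, the function $J(\mathbf{a})$ in (2.7) becomes
    $$J(\mathbf{a}) = \frac{1}{2} \mathbf{a}^T K K \mathbf{a} - \mathbf{a}^T K \mathbf{t} + \frac{1}{2} \mathbf{t}^T \mathbf{t} + \frac{\lambda}{2} \mathbf{a}^T K \mathbf{a} \tag{2.9}$$
    The least-squares algorithm can thus be expressed with the parameters $\mathbf{a}$ instead of $\mathbf{w}$; this is called the dual representation. In the dual representation, $J(\mathbf{a})$ depends on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel (2.8); there is no direct dependence on $\boldsymbol{\phi}(\mathbf{x})$.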
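  17. 2-1. Dual representation
    Minimizing $J(\mathbf{a})$, that is, setting its gradient with respect to $\mathbf{a}$ to zero, gives
    $$\mathbf{a} = (K + \lambda I_N)^{-1} \mathbf{t} \tag{2.10}$$
    where $I_N$ is the $N \times N$ identity matrix. Combining this solution with $\mathbf{w} = \Phi^T \mathbf{a}$ and $y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$, the prediction for a new input $\mathbf{x}$ is
    $$y(\mathbf{x}) = \mathbf{a}^T \Phi \boldsymbol{\phi}(\mathbf{x}) = \mathbf{k}(\mathbf{x})^T (K + \lambda I_N)^{-1} \mathbf{t} \tag{2.11}$$
    where $\mathbf{k}(\mathbf{x}) = (k(\mathbf{x}_1, \mathbf{x}), k(\mathbf{x}_2, \mathbf{x}), \cdots, k(\mathbf{x}_N, \mathbf{x}))^T$. The prediction $y(\mathbf{x})$ is therefore also expressed entirely in terms of the kernel function.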
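  18. 2-1. Dual representation
    Finding the solution $\mathbf{a}$ in the dual representation requires, by (2.10), inverting an $N \times N$ matrix ($N$ is the number of training points). In the primal representation (the one in terms of $\mathbf{w}$), the solution is, by (1.12),
    $$\mathbf{w} = (\lambda I_M + \Phi^T \Phi)^{-1} \Phi^T \mathbf{t} \tag{2.12}$$
    which requires inverting an $M \times M$ matrix ($M$ is the dimension of the feature space). When $N \gg M$, which is the usual situation, the primal representation is cheaper. On the other hand, the dual representation can handle feature spaces with infinite $M$ (an example is given in 2-2).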
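The equivalence of the primal solution (2.12) and the dual prediction (2.11) can be checked numerically. This is a minimal sketch assuming NumPy and a random design matrix; the sizes `N`, `M`, the value of `lam`, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, lam = 8, 3, 0.5

Phi = rng.standard_normal((N, M))   # rows play the role of phi(x_n)^T
t = rng.standard_normal(N)          # targets
phi_new = rng.standard_normal(M)    # phi(x) for a new input x

# Primal representation (2.12): solve an M x M system for w.
w = np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)
y_primal = w @ phi_new

# Dual representation (2.10), (2.11): solve an N x N system for a,
# touching the features only through inner products (the kernel).
K = Phi @ Phi.T                     # Gram matrix (2.8)
k_new = Phi @ phi_new               # vector k(x) of kernel values
a = np.linalg.solve(K + lam * np.eye(N), t)
y_dual = k_new @ a

print("primal prediction:", y_primal)
print("dual prediction:  ", y_dual)
```

The two predictions agree up to rounding, and `Phi.T @ a` reproduces `w`, which is relation (2.5).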
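  19. 2-2. Constructing kernel functions
    This section uses the definition of a kernel function to introduce various kernels. A function $k(\mathbf{x}, \mathbf{x}')$ is a kernel if a mapping $\boldsymbol{\phi}$ from the input $\mathbf{x}$ into a suitable $M$-dimensional feature space can be defined such that
    $$k(\mathbf{x}, \mathbf{x}') = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{x}') \tag{2.13}$$
    A simple example of a kernel function is
    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2 \tag{2.14}$$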
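  20. 2-2. Constructing kernel functions
    For example, with $\mathbf{x} = (x_1, x_2)^T$ and the mapping $\boldsymbol{\phi}(\mathbf{x}) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)^T$, we can write
    $$k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2 = \boldsymbol{\phi}(\mathbf{x})^T \boldsymbol{\phi}(\mathbf{z}) \tag{2.15}$$
    so $k(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^T \mathbf{z})^2$ is a kernel function. There is in fact a second definition of a kernel besides (2.13): the Gram matrix $K$ with elements $K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)$ is positive semidefinite. (I wrote up a proof that the two definitions are equivalent in the article below. Please take a look, and give it a like if you find it useful.) https://qiita.com/gucchi0403/items/544065345f91144524c4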
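Identity (2.15) can be confirmed for concrete numbers. The sketch below, in plain Python, evaluates both sides for two arbitrarily chosen points; the points and function names are illustrative only.

```python
import math

def kernel(x, z):
    """k(x, z) = (x^T z)^2 for 2-dimensional inputs, as in (2.14)."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Feature map phi(x) = (x1^2, sqrt(2) x1 x2, x2^2) used in (2.15)."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

x = (1.0, 2.0)
z = (3.0, -1.0)

lhs = kernel(x, z)                                  # computed in input space
rhs = sum(p * q for p, q in zip(phi(x), phi(z)))    # inner product in feature space

print("(x^T z)^2       =", lhs)
print("phi(x)^T phi(z) =", rhs)
```

Here $\mathbf{x}^T \mathbf{z} = 3 - 2 = 1$, so both sides equal 1 up to floating-point rounding.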
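  21. 2-2. Constructing kernel functions
    Next, new kernel functions $k(\mathbf{x}, \mathbf{x}')$ can be built from functions already known to be kernels (the construction rules of PRML 6.2). In those rules, $k_1(\cdot, \cdot)$ and $k_2(\cdot, \cdot)$ are kernel functions, $c > 0$ is a constant, $f(\cdot)$ is an arbitrary function, $q(\cdot)$ is a polynomial with non-negative coefficients, $\boldsymbol{\phi}(\cdot)$ is an $M$-dimensional vector function, $k_3(\cdot, \cdot)$ is a kernel defined on the $M$-dimensional vector space, $A$ is a symmetric positive semidefinite matrix, $\mathbf{x} = (\mathbf{x}_a, \mathbf{x}_b)$, and $k_a(\cdot, \cdot)$, $k_b(\cdot, \cdot)$ are kernel functions.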
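  22. 2-2. Constructing kernel functions
    With these construction rules one can show, for example, that the following function is a kernel:
    $$k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^T \mathbf{x}' + c)^M \tag{2.16}$$
    where $c \geq 0$ is a constant and $M$ is any natural number. One can also construct the very important Gaussian kernel:
    $$k(\mathbf{x}, \mathbf{x}') = \exp\left(-\|\mathbf{x} - \mathbf{x}'\|^2 / 2\sigma^2\right) \tag{2.17}$$
    where $\sigma^2$ is any positive constant. The feature vector corresponding to the Gaussian kernel is infinite-dimensional (see PRML Exercise 6.11).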
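  23. 2-3. Regression with Gaussian processes
    In 2-1 we saw that, for the non-probabilistic linear regression model (where the output $y(\mathbf{x}, \mathbf{w})$ is used directly as the prediction), rewriting the model in its dual representation makes a kernel appear. Here we treat the probabilistic linear regression model (deriving a probability distribution for the prediction $t$) and confirm that a kernel arises naturally there as well.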
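  24. 2-3. Regression with Gaussian processes
    As in 2-1, we start from the output for a given input $\mathbf{x}$:
    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \tag{2.18}$$
    Taking a Bayesian approach, we assume a prior over the parameter vector:
    $$p(\mathbf{w}) = \mathcal{N}(\mathbf{w} | \mathbf{0}, \alpha^{-1} I) \tag{2.19}$$
    Once $\mathbf{w}$ is drawn from $p(\mathbf{w})$ and a data point $\mathbf{x}$ is given, (2.18) fixes the value of $y(\mathbf{x})$. In other words, the distribution over $\mathbf{w}$ induces a distribution over $y(\mathbf{x})$ for a given $\mathbf{x}$. In practice, for a given set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$, the distribution over $\mathbf{w}$ together with (2.18) induces the joint distribution $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \cdots, y(\mathbf{x}_N))$ of the function values. (More precisely, with $X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$, this is $p(y(\mathbf{x}_1), y(\mathbf{x}_2), \cdots, y(\mathbf{x}_N) | X)$.)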
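  25. 2-3. Regression with Gaussian processes
    Define the random vector $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \cdots, y(\mathbf{x}_N))^T$. From (2.18),
    $$\mathbf{y} = \Phi \mathbf{w} \tag{2.20}$$
    where $\Phi$ is the design matrix (1.6). Since $\mathbf{y}$ is a linear transformation of $\mathbf{w}$, which follows the Gaussian (2.19), $\mathbf{y}$ is also Gaussian. Its distribution is therefore fully determined by its mean and covariance:
    $$\mathbb{E}[\mathbf{y}] = \Phi \mathbb{E}[\mathbf{w}] = \mathbf{0} \tag{2.21}$$
    $$\mathrm{cov}[\mathbf{y}] = \mathbb{E}[\mathbf{y} \mathbf{y}^T] = \Phi \mathbb{E}[\mathbf{w} \mathbf{w}^T] \Phi^T = \frac{1}{\alpha} \Phi \Phi^T = K \tag{2.22}$$
    Here $K$ is the Gram matrix whose elements are kernel values (this kernel differs from the definition (2.13) by a constant factor):
    $$K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) = \frac{1}{\alpha} \boldsymbol{\phi}(\mathbf{x}_n)^T \boldsymbol{\phi}(\mathbf{x}_m) \tag{2.23}$$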
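  26. 2-3. Regression with Gaussian processes
    The linear regression model described above is an example of a Gaussian process. A Gaussian process assumes that, for any given set of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$, the joint distribution of the function values $p(\mathbf{y}) = p(y(\mathbf{x}_1), y(\mathbf{x}_2), \cdots, y(\mathbf{x}_N))$ is Gaussian. The mean is usually taken to be zero, and the covariance is given by a kernel:
    $$\mathbb{E}[y(\mathbf{x}_n) y(\mathbf{x}_m)] = k(\mathbf{x}_n, \mathbf{x}_m) \tag{2.24}$$
    Comparing with (2.21) to (2.23) confirms that the linear regression model above is indeed an example of a Gaussian process.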
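  27. 2-3. Regression with Gaussian processes
    We now apply Gaussian processes to regression. Assume the target $t_n$ follows a Gaussian whose mean is the function value $y_n = y(\mathbf{x}_n)$:
    $$p(t_n | y_n) = \mathcal{N}(t_n | y_n, \beta^{-1}) \tag{2.25}$$
    where $\beta$ is a precision hyperparameter. By independence, the distribution of $\mathbf{t} = (t_1, \cdots, t_N)^T$ given $\mathbf{y} = (y(\mathbf{x}_1), y(\mathbf{x}_2), \cdots, y(\mathbf{x}_N))^T$ is
    $$p(\mathbf{t} | \mathbf{y}) = \mathcal{N}(\mathbf{t} | \mathbf{y}, \beta^{-1} I_N) \tag{2.26}$$
    By the Gaussian process assumption, the marginal $p(\mathbf{y})$ is a Gaussian with zero mean and covariance given by the Gram matrix $K$:
    $$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} | \mathbf{0}, K) \tag{2.27}$$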
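  28. 2-3. Regression with Gaussian processes
    Using $p(\mathbf{t} | \mathbf{y})$ from (2.26) and $p(\mathbf{y})$ from (2.27), the distribution $p(\mathbf{t})$ of the targets for given $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ is
    $$p(\mathbf{t}) = \int p(\mathbf{t} | \mathbf{y}) \, p(\mathbf{y}) \, d\mathbf{y} = \mathcal{N}(\mathbf{t} | \mathbf{0}, C) \tag{2.28}$$
    where the elements of the covariance $C$ are
    $$C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1} \delta_{nm} \tag{2.29}$$
    (using PRML equations (2.113) to (2.115)). A kernel often used in the covariance $C$ is
    $$k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\left\{ -\frac{\theta_1}{2} \|\mathbf{x}_n - \mathbf{x}_m\|^2 \right\} + \theta_2 + \theta_3 \mathbf{x}_n^T \mathbf{x}_m \tag{2.30}$$
    where $\theta_0, \cdots, \theta_3$ are hyperparameters.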
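  29. 2-3. Regression with Gaussian processes
    What we want is the distribution of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$, given the training data $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and $\{t_1, t_2, \cdots, t_N\}$. That is, with $\mathbf{t}_N = (t_1, \cdots, t_N)^T$, we want $p(t_{N+1} | \mathbf{t}_N)$ (the dependence on the inputs is omitted from the notation). To obtain it, we first find the joint distribution $p(\mathbf{t}_{N+1})$, where $\mathbf{t}_{N+1} = (t_1, \cdots, t_{N+1})^T$. Using the result (2.28),
    $$p(\mathbf{t}_{N+1}) = \mathcal{N}(\mathbf{t}_{N+1} | \mathbf{0}, C_{N+1}) \tag{2.31}$$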
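  30. 2-3. Regression with Gaussian processes
    The covariance matrix $C_{N+1}$ is
    $$C_{N+1} = \begin{pmatrix} C_N & \mathbf{k} \\ \mathbf{k}^T & c \end{pmatrix} \tag{2.32}$$
    where $C_N$ is the $N \times N$ matrix with elements (2.29), $\mathbf{k} = (k(\mathbf{x}_1, \mathbf{x}_{N+1}), k(\mathbf{x}_2, \mathbf{x}_{N+1}), \cdots, k(\mathbf{x}_N, \mathbf{x}_{N+1}))^T$, and $c = k(\mathbf{x}_{N+1}, \mathbf{x}_{N+1}) + \beta^{-1}$. Combining this with PRML equations (2.81) and (2.82), $p(t_{N+1} | \mathbf{t}_N)$ is Gaussian with mean $m(\mathbf{x}_{N+1})$ and variance $\sigma^2(\mathbf{x}_{N+1})$ given by
    $$m(\mathbf{x}_{N+1}) = \mathbf{k}^T C_N^{-1} \mathbf{t}_N \tag{2.33}$$
    $$\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^T C_N^{-1} \mathbf{k} \tag{2.34}$$
    So the distribution of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$ is a Gaussian whose mean and variance depend on $\mathbf{x}_{N+1}$.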
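The predictive equations (2.33) and (2.34) translate almost line by line into code. The following is a sketch assuming NumPy, one-dimensional inputs, the Gaussian kernel (2.17) with width `sigma2 = 0.1`, noise precision `beta = 25`, and noise-free sine targets; all of these values and the function names are illustrative assumptions, not settings from the slides.

```python
import numpy as np

def gauss_kernel(xa, xb, sigma2=0.1):
    """Gaussian kernel matrix between two 1-D arrays of scalar inputs."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-d ** 2 / (2.0 * sigma2))

def gp_predict(x_train, t_train, x_new, beta=25.0):
    """Predictive mean (2.33) and variance (2.34) at a single new input."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    C_N = gauss_kernel(x_train, x_train) + np.eye(x_train.size) / beta  # (2.29)
    k = gauss_kernel(x_train, x_new)[:, 0]
    c = gauss_kernel(x_new, x_new)[0, 0] + 1.0 / beta
    mean = k @ np.linalg.solve(C_N, t_train)
    var = c - k @ np.linalg.solve(C_N, k)
    return mean, var

x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2.0 * np.pi * x_train)

mean_in, var_in = gp_predict(x_train, t_train, 0.5)    # inside the data range
mean_far, var_far = gp_predict(x_train, t_train, 5.0)  # far from all data

print("near data: mean %.3f, variance %.3f" % (mean_in, var_in))
print("far away:  mean %.3f, variance %.3f" % (mean_far, var_far))
```

Far from the training inputs the vector $\mathbf{k}$ is essentially zero, so the prediction falls back to the prior: mean 0 and variance $c$.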
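  31. 2-4. Classification with Gaussian processes
    We now use Gaussian processes for classification. For regression we assumed, as in (2.27), that the joint distribution $p(\mathbf{y})$ of the function values is Gaussian, so each $y_n$ ranges over the whole real line. For classification the output should satisfy $0 \leq y_n \leq 1$. We therefore model the joint distribution of the activations $a_n = a(\mathbf{x}_n)$ instead of the outputs, and set $y_n = \sigma(a_n)$. Taking the probability of $t_n = 1$ to be $p(t_n = 1 | a_n) = \sigma(a_n)$, and hence $p(t_n = 0 | a_n) = 1 - \sigma(a_n)$, we get
    $$p(t_n | a_n) = \sigma(a_n)^{t_n} (1 - \sigma(a_n))^{1 - t_n} \tag{2.35}$$
    As in regression, we use the training data $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ and $\mathbf{t}_N = (t_1, \cdots, t_N)^T$ to find the distribution $p(t_{N+1} | \mathbf{t}_N)$ of the target $t_{N+1}$ for an unseen input $\mathbf{x}_{N+1}$ (again omitting the dependence on the inputs).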
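  32. 2-4. Classification with Gaussian processes
    With $\mathbf{a}_{N+1} = (a(\mathbf{x}_1), a(\mathbf{x}_2), \cdots, a(\mathbf{x}_{N+1}))^T$, the Gaussian process assumption gives the joint distribution of the activations (the counterpart of (2.27) in regression):
    $$p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1} | \mathbf{0}, C_{N+1}) \tag{2.36}$$
    where the covariance matrix $C_{N+1}$ has elements
    $$(C_{N+1})_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \nu \delta_{nm} \tag{2.37}$$
    and $\nu$ is a noise term. We want the distribution $p(t_{N+1} | \mathbf{t}_N)$; for two-class problems $p(t_{N+1} = 0 | \mathbf{t}_N) = 1 - p(t_{N+1} = 1 | \mathbf{t}_N)$, so it suffices to find $p(t_{N+1} = 1 | \mathbf{t}_N)$.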
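  33. 2-4. Classification with Gaussian processes
    From
    $$p(t_{N+1} = 1, \mathbf{t}_N) = \int p(t_{N+1} = 1, \mathbf{t}_N, a_{N+1}) \, da_{N+1} = \int p(t_{N+1} = 1 | \mathbf{t}_N, a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, p(\mathbf{t}_N) \, da_{N+1} = \int p(t_{N+1} = 1 | a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, p(\mathbf{t}_N) \, da_{N+1} \tag{2.38}$$
    we obtain
    $$p(t_{N+1} = 1 | \mathbf{t}_N) = \int p(t_{N+1} = 1 | a_{N+1}) \, p(a_{N+1} | \mathbf{t}_N) \, da_{N+1} \tag{2.39}$$
    where $p(t_{N+1} = 1 | a_{N+1}) = \sigma(a_{N+1})$. This integral cannot be evaluated analytically, and various approximation methods are used. Here we use the Laplace approximation (PRML 4.5.1).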
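  34. 2-4. Classification with Gaussian processes
    In this section we evaluate the integral (2.39) with the Laplace approximation. First rewrite $p(a_{N+1} | \mathbf{t}_N)$ as
    $$p(a_{N+1} | \mathbf{t}_N) = \int p(a_{N+1} | \mathbf{a}_N) \, p(\mathbf{a}_N | \mathbf{t}_N) \, d\mathbf{a}_N \tag{2.40}$$
    where $p(\mathbf{a}_N | \mathbf{t}_N)$ is the posterior. By analogy with the regression results (2.33) and (2.34) for $p(t_{N+1} | \mathbf{t}_N)$, the conditional distribution $p(a_{N+1} | \mathbf{a}_N)$ is
    $$p(a_{N+1} | \mathbf{a}_N) = \mathcal{N}(a_{N+1} | \mathbf{k}^T C_N^{-1} \mathbf{a}_N, \; c - \mathbf{k}^T C_N^{-1} \mathbf{k}) \tag{2.41}$$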
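  35. 2-4. Classification with Gaussian processes
    We approximate $p(\mathbf{a}_N | \mathbf{t}_N)$ with the Laplace approximation. This requires the point $\mathbf{a}_N = \mathbf{a}_N^\star$ satisfying
    $$\frac{\partial p(\mathbf{a}_N | \mathbf{t}_N)}{\partial \mathbf{a}_N} = \nabla p(\mathbf{a}_N | \mathbf{t}_N) = 0 \tag{2.42}$$
    and the Hessian $-\nabla \nabla \ln p(\mathbf{a}_N | \mathbf{t}_N)$ at $\mathbf{a}_N = \mathbf{a}_N^\star$ (PRML 4.5.1). The prior $p(\mathbf{a}_N)$ is
    $$p(\mathbf{a}_N) = \mathcal{N}(\mathbf{a}_N | \mathbf{0}, C_N) \tag{2.43}$$
    which is (2.36) with $N + 1 \to N$. By independence of the data points, the likelihood $p(\mathbf{t}_N | \mathbf{a}_N)$ is
    $$p(\mathbf{t}_N | \mathbf{a}_N) = \prod_{n=1}^{N} \sigma(a_n)^{t_n} (1 - \sigma(a_n))^{1 - t_n} = \prod_{n=1}^{N} e^{a_n t_n} \sigma(-a_n) \tag{2.44}$$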
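  36. 2-4. Classification with Gaussian processes
    By Bayes' theorem $p(\mathbf{a}_N | \mathbf{t}_N) \propto p(\mathbf{t}_N | \mathbf{a}_N) \, p(\mathbf{a}_N)$, and carrying out the calculation gives
    $$\mathbf{a}_N^\star = C_N (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{2.45}$$
    where $\boldsymbol{\sigma}_N = (\sigma(a_1), \sigma(a_2), \cdots, \sigma(a_N))^T$. The Hessian $H$ at $\mathbf{a}_N = \mathbf{a}_N^\star$ is
    $$H = W^\star + C_N^{-1} \tag{2.46}$$
    where $W$ is the diagonal matrix with diagonal elements $\sigma(a_n)(1 - \sigma(a_n))$ and $W^\star$ is $W$ evaluated at $\mathbf{a}_N = \mathbf{a}_N^\star$. The posterior $p(\mathbf{a}_N | \mathbf{t}_N)$ is then approximated (Laplace approximation) by
    $$p(\mathbf{a}_N | \mathbf{t}_N) \simeq \mathcal{N}(\mathbf{a}_N | \mathbf{a}_N^\star, H^{-1}) \tag{2.47}$$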
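  37. 2-4. Classification with Gaussian processes
    From (2.41) and (2.47), the integral (2.40) is approximated by
    $$p(a_{N+1} | \mathbf{t}_N) \simeq \int \mathcal{N}(a_{N+1} | \mathbf{k}^T C_N^{-1} \mathbf{a}_N, \; c - \mathbf{k}^T C_N^{-1} \mathbf{k}) \, \mathcal{N}(\mathbf{a}_N | \mathbf{a}_N^\star, H^{-1}) \, d\mathbf{a}_N \tag{2.48}$$
    By PRML equation (2.115), $p(a_{N+1} | \mathbf{t}_N)$ is a Gaussian with mean and variance
    $$\mathbb{E}[a_{N+1} | \mathbf{t}_N] = \mathbf{k}^T (\mathbf{t}_N - \boldsymbol{\sigma}_N) \tag{2.49}$$
    $$\mathrm{var}[a_{N+1} | \mathbf{t}_N] = c - \mathbf{k}^T (W_N^{-1} + C_N)^{-1} \mathbf{k} \tag{2.50}$$
    where $W_N$ is the matrix $W^\star$ of (2.46).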
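  38. 2-4. Classification with Gaussian processes
    Hence $p(t_{N+1} = 1 | \mathbf{t}_N)$ in (2.39) is approximated, using PRML equation (4.153), by
    $$p(t_{N+1} = 1 | \mathbf{t}_N) \simeq \int \sigma(a_{N+1}) \, \mathcal{N}(a_{N+1} | \mathbf{k}^T (\mathbf{t}_N - \boldsymbol{\sigma}_N), \; c - \mathbf{k}^T (W_N^{-1} + C_N)^{-1} \mathbf{k}) \, da_{N+1} \simeq \sigma\left( \kappa\left( c - \mathbf{k}^T (W_N^{-1} + C_N)^{-1} \mathbf{k} \right) \cdot \mathbf{k}^T (\mathbf{t}_N - \boldsymbol{\sigma}_N) \right) \tag{2.51}$$
    where, for a variance $\sigma^2$,
    $$\kappa(\sigma^2) = (1 + \pi \sigma^2 / 8)^{-1/2} \tag{2.52}$$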
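To get a feel for the quality of the approximation of PRML (4.153) used in (2.51), one can compare it with a brute-force numerical integral of the sigmoid against a Gaussian. This is a sketch in plain Python; the mean 1.0, variance 4.0, the midpoint-rule integrator, and the function names are all illustrative assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def kappa(var):
    """kappa(sigma^2) = (1 + pi * sigma^2 / 8)^(-1/2), equation (2.52)."""
    return (1.0 + math.pi * var / 8.0) ** -0.5

def approx_prob(mean, var):
    """Closed-form approximation sigma(kappa(var) * mean) to the integral."""
    return sigmoid(kappa(var) * mean)

def numeric_prob(mean, var, steps=4000):
    """Midpoint-rule estimate of the integral of sigmoid(a) * N(a | mean, var)."""
    sd = math.sqrt(var)
    lo, hi = mean - 8.0 * sd, mean + 8.0 * sd
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        a = lo + (i + 0.5) * h
        density = math.exp(-(a - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        total += sigmoid(a) * density * h
    return total

p_approx = approx_prob(1.0, 4.0)
p_numeric = numeric_prob(1.0, 4.0)
print("closed-form approximation:", p_approx)
print("numerical integration:    ", p_numeric)
```

A larger variance shrinks $\kappa$ and pulls the predicted probability toward 0.5, which is how uncertainty in the activation softens the class prediction.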
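  39. 3. Sparse kernel machines
    The previous chapter introduced several algorithms that take kernels as their starting point. In those algorithms the kernel function $k(\mathbf{x}_n, \mathbf{x}_m)$ has to be evaluated for every pair of training points (for example, in (2.11)), which makes both training and prediction very slow. This chapter introduces models that can predict for a new input by evaluating the kernel function on only a subset of the training data. In particular we look at the support vector machine (SVM) in detail. The SVM only provides a discriminant function and does not give a probability distribution for its predictions. The relevance vector machine (RVM), by contrast, is probabilistic and provides a predictive distribution for $t$ based on Bayesian inference.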
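  40. 3-1. Maximum margin classifier
    We start by solving a two-class classification problem with the linear model
    $$y(\mathbf{x}, \mathbf{w}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) + b \tag{3.1}$$
    where $\mathbf{w} = (w_1, w_2, \cdots, w_{M-1})^T$ is the parameter vector, $\boldsymbol{\phi}(\mathbf{x}) = (\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \cdots, \phi_{M-1}(\mathbf{x}))^T$ is a vector function mapping the input $\mathbf{x}$ into feature space, and $b$ is a bias parameter. A kernel function is introduced shortly, after which the feature space no longer has to be handled explicitly. The training data consist of inputs $\{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ with corresponding targets $\{t_1, t_2, \cdots, t_N\}$. Since this is two-class classification, the targets are discrete, with $t_n \in \{-1, 1\}$.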
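  41. 3-1. Maximum margin classifier
    For the time being we assume that the training data are linearly separable in feature space ($\boldsymbol{\phi}$ space). They need not be linearly separable in input space ($\mathbf{x}$ space). That is, there exists at least one choice of $\mathbf{w}$ and $b$ for which $y(\mathbf{x}, \mathbf{w}) = 0$ linearly separates the training data in feature space. For such $\mathbf{w}$ and $b$ we take $y(\mathbf{x}_n) > 0$ when $t_n = +1$ and $y(\mathbf{x}_n) < 0$ when $t_n = -1$. Note that these conditions can be combined as $t_n y(\mathbf{x}_n) > 0$.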
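  42. 3-1. Maximum margin classifier
    There may be many choices of $\mathbf{w}$ and $b$ that linearly separate the training data. To pick one of them, the SVM introduces the concept of the margin. As shown in the figure on this slide, the margin is the smallest distance between the decision boundary in feature space (the line $y(\mathbf{x}) = 0$) and the training data. We choose the $\mathbf{w}$ and $b$ that maximize this margin (right-hand panel of the figure). For the reason why the maximum margin solution is chosen among the candidates, see PRML 7.1.5.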
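  43. 3-1. Maximum margin classifier
    We now derive the equations for finding $\mathbf{w}$ and $b$. By (1.18) in section 1-2 of these slides, the distance between the hyperplane $y(\mathbf{x}) = 0$ and a point $\boldsymbol{\phi}(\mathbf{x})$ in feature space is $|y(\mathbf{x})| / \|\mathbf{w}\|$, so the distance from each training point to the hyperplane is $|y(\mathbf{x}_n)| / \|\mathbf{w}\|$. Because the training data are linearly separated, $|y(\mathbf{x}_n)| = t_n y(\mathbf{x}_n)$. Using (3.1), the distance between each training point and the hyperplane $y(\mathbf{x}) = 0$ is therefore
    $$\frac{t_n y(\mathbf{x}_n)}{\|\mathbf{w}\|} = \frac{t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b)}{\|\mathbf{w}\|} \tag{3.2}$$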
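  44. 3-1. Maximum margin classifier
    The margin is the smallest distance between the hyperplane $y(\mathbf{x}) = 0$ and the training data,
    $$\min_n \left\{ \frac{t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b)}{\|\mathbf{w}\|} \right\} = \frac{1}{\|\mathbf{w}\|} \min_n \left[ t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) \right] \tag{3.3}$$
    and we want the $\mathbf{w}$ and $b$ that maximize it, so we have to solve
    $$\arg\max_{\mathbf{w}, b} \left\{ \frac{1}{\|\mathbf{w}\|} \min_n \left[ t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) \right] \right\} \tag{3.4}$$
    Solving this directly is hard. Instead, observe that under the rescaling $\mathbf{w} \to \kappa \mathbf{w}$, $b \to \kappa b$, the distance $|y(\mathbf{x}_n)| / \|\mathbf{w}\| = t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) / \|\mathbf{w}\|$ between the hyperplane and a point $\boldsymbol{\phi}(\mathbf{x}_n)$ in feature space does not change.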
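  45. 3-1. Maximum margin classifier
    Let $n$ be the data point closest to the hyperplane $y(\mathbf{x}) = 0$ and write
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) = a_n \tag{3.5}$$
    so that $a_n \leq a_j$ for $j \neq n$. Rescaling $\mathbf{w} \to \mathbf{w} / a_n$, $b \to b / a_n$ then gives
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) = 1 \tag{3.6}$$
    For $j \neq n$, the same rescaling gives
    $$t_j (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_j) + b) = \frac{a_j}{a_n} \geq 1 \tag{3.7}$$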
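  46. 3-1. Maximum margin classifier
    Together these read
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) \geq 1 \quad (n = 1, \cdots, N) \tag{3.8}$$
    After this rescaling, the problem (3.4) becomes
    $$\arg\max_{\mathbf{w}, b} \left[ \frac{1}{\|\mathbf{w}\|} \right] = \arg\min_{\mathbf{w}, b} \left[ \frac{1}{2} \|\mathbf{w}\|^2 \right] \tag{3.9}$$
    That is, the problem reduces to solving
    $$\arg\min_{\mathbf{w}, b} \left[ \frac{1}{2} \|\mathbf{w}\|^2 \right] \tag{3.10}$$
    subject to the constraints (3.8).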
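  47. 3-1. Maximum margin classifier
    To solve this minimization problem it suffices to find the stationary point of the following Lagrangian with respect to $\mathbf{w}$, $b$, and $\mathbf{a}$ (see PRML Appendix E):
    $$L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2} \|\mathbf{w}\|^2 - \sum_{n=1}^{N} a_n \{ t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 \} \tag{3.11}$$
    subject to the Karush-Kuhn-Tucker conditions
    $$a_n \geq 0 \tag{3.12}$$
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 \geq 0 \tag{3.13}$$
    $$a_n \{ t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 \} = 0 \tag{3.14}$$
    I have written up this minimization problem in a Qiita article: https://qiita.com/gucchi0403/items/3d5f27f8d3b2ff0e766d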
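  48. 3-1. Maximum margin classifier
    Setting the derivatives of $L(\mathbf{w}, b, \mathbf{a})$ with respect to $\mathbf{w}$ and $b$ to zero gives
    $$\mathbf{w} = \sum_{n=1}^{N} a_n t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{3.15}$$
    $$0 = \sum_{n=1}^{N} a_n t_n \tag{3.16}$$
    Using these to eliminate $\mathbf{w}$ and $b$ from the right-hand side of (3.11), the Lagrangian becomes
    $$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \tag{3.17}$$
    where $k(\mathbf{x}_n, \mathbf{x}_m) = \boldsymbol{\phi}(\mathbf{x}_n)^T \boldsymbol{\phi}(\mathbf{x}_m)$ is the kernel function. The Lagrangian $\tilde{L}(\mathbf{a})$ depends on $\boldsymbol{\phi}(\mathbf{x})$ only through the kernel function.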
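  49. 3-1. Maximum margin classifier
    Now consider the output $y(\mathbf{x})$ for a new input $\mathbf{x}$. Substituting (3.15) into (3.1) gives
    $$y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n k(\mathbf{x}, \mathbf{x}_n) + b \tag{3.18}$$
    and the $a_n$ must satisfy
    $$a_n \geq 0 \tag{3.19}$$
    $$t_n y(\mathbf{x}_n) - 1 \geq 0 \tag{3.20}$$
    $$a_n \{ t_n y(\mathbf{x}_n) - 1 \} = 0 \tag{3.21}$$
    These show that $a_n = 0$ for every data point other than those closest to the hyperplane $y(\mathbf{x}) = 0$ (those with $t_n y(\mathbf{x}_n) - 1 = 0$), so points that are not at the minimum distance are not needed for the prediction (3.18).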
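The sparsity in (3.18) can be seen on a toy problem small enough to solve by hand. The sketch below, in plain Python, assumes three one-dimensional training points with a linear kernel $k(x, z) = xz$; for this data set the maximum margin solution is $w = 1$, $b = 0$, and the multipliers `a = [0.5, 0.5, 0.0]` were worked out by hand from (3.15), (3.16), and (3.21), not produced by a solver. All names and values are illustrative.

```python
def linear_kernel(x, z):
    """Linear kernel on scalars, an assumption made to keep the example tiny."""
    return x * z

# Toy training set: x = 1 and x = -1 sit on the margin, x = 3 lies beyond it.
x_train = [1.0, -1.0, 3.0]
t_train = [1, -1, 1]

# Hand-derived Lagrange multipliers and bias for this data set (w = 1, b = 0).
a = [0.5, 0.5, 0.0]
b = 0.0

def predict(x, indices):
    """Evaluate (3.18), summing only over the given training indices."""
    return sum(a[n] * t_train[n] * linear_kernel(x, x_train[n]) for n in indices) + b

# Support vectors are the points with nonzero multiplier.
support = [n for n in range(len(a)) if a[n] > 0.0]

y_all = predict(2.0, range(len(a)))   # sum over every training point
y_sv = predict(2.0, support)          # sum over support vectors only

print("support vector indices:", support)
print("y(2.0) using all points:     ", y_all)
print("y(2.0) using support vectors:", y_sv)
```

The third point has $t_n y(x_n) = 3 > 1$, so (3.21) forces its multiplier to zero and it drops out of the prediction.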
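  50. 3-1. Maximum margin classifier
    Maximum margin learning can also be expressed as the minimization of the error function
    $$\sum_{n=1}^{N} E_\infty (y(\mathbf{x}_n) t_n - 1) + \lambda \|\mathbf{w}\|^2 \tag{3.22}$$
    where $E_\infty(z)$ is $0$ for $z \geq 0$ and $\infty$ otherwise. If even one data point violates (3.8), this error function diverges, so minimizing it requires every data point to satisfy (3.8).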
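  51. 3-2. Overlapping class distributions
    The previous section assumed that the data points are linearly separable in feature space (they need not be linearly separable in input space). Here we consider the case where the data are not linearly separable in feature space. We introduce a slack variable $\xi_n \, (\geq 0)$ for each data point and use it to relax the constraint (3.8):
    $$t_n y(\mathbf{x}_n) \geq 1 - \xi_n \quad (n = 1, \cdots, N) \tag{3.23}$$
    where $y(\mathbf{x})$ is given by (3.1). The function to minimize is $\|\mathbf{w}\|^2 / 2$ from (3.10) plus a penalty term:
    $$C \sum_{n=1}^{N} \xi_n + \frac{1}{2} \|\mathbf{w}\|^2 \tag{3.24}$$
    where $C > 0$.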
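  52. 3-2. Overlapping class distributions
    Consider what the constraint (3.23) and the loss function (3.24) mean. A data point with $\xi_n = 0$ satisfies $t_n y(\mathbf{x}_n) \geq 1$ by (3.23), so it is correctly classified and lies either on the margin ($t_n y(\mathbf{x}_n) = 1$) or beyond the margin on the correct side ($t_n y(\mathbf{x}_n) > 1$). The previous section assumed that every data point could be classified correctly in this way. A point with $0 < \xi_n \leq 1$ lies inside the margin but is correctly classified, and a point with $\xi_n > 1$ is misclassified.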
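The three cases above can be written as a small helper. This sketch in plain Python assumes that, at the optimum of (3.24) under (3.23) and $\xi_n \geq 0$, each slack takes its smallest feasible value $\xi_n = \max(0, 1 - t_n y(\mathbf{x}_n))$; the function names and sample values are illustrative.

```python
def slack(t, y):
    """Smallest xi >= 0 satisfying t * y >= 1 - xi, the constraint (3.23)."""
    return max(0.0, 1.0 - t * y)

def describe(t, y):
    """Classify a point by its slack value, following the three cases above."""
    xi = slack(t, y)
    if xi == 0.0:
        return "on or beyond the margin, correctly classified"
    if xi <= 1.0:
        return "inside the margin, correctly classified"
    return "misclassified"

# (target t_n, model output y(x_n)) pairs chosen for illustration.
samples = [(1, 2.0), (1, 0.4), (-1, 0.5)]
for t, y in samples:
    print(t, y, "xi = %.2f:" % slack(t, y), describe(t, y))
```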
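  53. 3-2. Overlapping class distributions
    Unlike (3.8), the constraint (3.23) therefore allows correctly classified points inside the margin as well as misclassified points. The first term of the loss function (3.24) shows, however, that such points increase the loss (a penalty). The problem we want to solve is to minimize (3.24) under two kinds of inequality constraints, (3.23) and $\xi_n \geq 0$, so the Lagrangian $L(\mathbf{w}, b, \boldsymbol{\xi}, \mathbf{a}, \boldsymbol{\mu})$ is (see PRML Appendix E)
    $$L(\mathbf{w}, b, \boldsymbol{\xi}, \mathbf{a}, \boldsymbol{\mu}) = \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{n=1}^{N} \xi_n - \sum_{n=1}^{N} a_n \{ t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 + \xi_n \} - \sum_{n=1}^{N} \mu_n \xi_n \tag{3.25}$$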
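  54. 3-2. Overlapping class distributions
    The KKT conditions are
    $$a_n \geq 0 \tag{3.26}$$
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 + \xi_n \geq 0 \tag{3.27}$$
    $$a_n (t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 + \xi_n) = 0 \tag{3.28}$$
    $$\mu_n \geq 0 \tag{3.29}$$
    $$\xi_n \geq 0 \tag{3.30}$$
    $$\mu_n \xi_n = 0 \tag{3.31}$$
    Setting the derivatives of $L(\mathbf{w}, b, \boldsymbol{\xi}, \mathbf{a}, \boldsymbol{\mu})$ with respect to $\mathbf{w}$, $b$, and $\xi_n$ to zero gives
    $$\mathbf{w} = \sum_{n=1}^{N} a_n t_n \boldsymbol{\phi}(\mathbf{x}_n) \tag{3.32}$$
    $$0 = \sum_{n=1}^{N} a_n t_n \tag{3.33}$$
    $$a_n = C - \mu_n \tag{3.34}$$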
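  55. 3-2. Overlapping class distributions
    Using (3.32) to (3.34), the Lagrangian (3.25) becomes
    $$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \tag{3.35}$$
    which has the same form as the function (3.17) obtained under the linear separability assumption. In addition, the new condition (3.34) together with (3.26) and (3.29) constrains $a_n$ to
    $$0 \leq a_n \leq C \tag{3.36}$$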
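  56. 3-2. Overlapping class distributions
    Substituting (3.32) into (3.1) gives
    $$y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n k(\mathbf{x}, \mathbf{x}_n) + b \tag{3.37}$$
    which is the same expression as (3.18) from the linearly separable case. From (3.37), points with $a_n = 0$ do not contribute to predictions for new inputs. Points with $a_n \neq 0$ do contribute, and by (3.28) they satisfy
    $$t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) - 1 + \xi_n = 0 \tag{3.38}$$
    These are the support vectors. Unlike the linearly separable case, a point that is not on the margin can therefore be a support vector, depending on the value of $\xi_n$.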
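  57. 3-2. Overlapping class distributions
    For example, among the points with $a_n \neq 0$, those with $0 < a_n < C$ have $\mu_n > 0$ by (3.34) and hence $\xi_n = 0$ by (3.31). By (3.38), a point with $\xi_n = 0$ satisfies $t_n (\mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b) = 1$, so it lies on the margin boundary. Points with $a_n = C$ have $\mu_n = 0$ by (3.34), so (3.30) and (3.31) only require $\xi_n \geq 0$. From the earlier discussion, such a point lies on the margin boundary if $\xi_n = 0$, lies inside the margin but is correctly classified if $0 < \xi_n \leq 1$, and is misclassified if $\xi_n > 1$. Since the points with $a_n \neq 0$ are the support vectors, points that are not on the margin can indeed be support vectors.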
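  58. 3-3. SVM for regression
    So far the SVM has been applied to classification; we now apply it to regression. As usual, consider minimizing the regularized error function
    $$\frac{1}{2} \sum_{n=1}^{N} \{ y_n - t_n \}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2 \tag{3.39}$$
    where $y_n = y(\mathbf{x}_n) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b$. Here we replace the sum-of-squares part of this error function with the $\epsilon$-insensitive error function $E_\epsilon(y_n - t_n)$, defined by
    $$E_\epsilon(y_n - t_n) = \begin{cases} 0 & (|y_n - t_n| \leq \epsilon) \\ |y_n - t_n| - \epsilon & (\text{otherwise}) \end{cases} \tag{3.40}$$
    It gives zero error (tolerance) when $|y_n - t_n|$ is at most $\epsilon$ (a hyperparameter with $\epsilon > 0$) and a linear cost when it exceeds $\epsilon$.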
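The $\epsilon$-insensitive error function (3.40) is a one-liner. The sketch below, in plain Python, evaluates it for a few residuals; the value $\epsilon = 0.1$ and the sample numbers are illustrative assumptions.

```python
def eps_insensitive(y, t, eps):
    """E_eps(y - t) from (3.40): zero inside the tolerance, linear outside."""
    return max(0.0, abs(y - t) - eps)

eps = 0.1
inside = eps_insensitive(1.0, 1.05, eps)   # |y - t| = 0.05 <= eps, no cost
above = eps_insensitive(1.0, 1.5, eps)     # target above the tube
below = eps_insensitive(1.0, 0.7, eps)     # target below the tube

print("inside the tolerance:", inside)
print("target above:        ", above)
print("target below:        ", below)
```

Unlike the squared error, the cost grows only linearly with the residual, and residuals within the tolerance cost nothing.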
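  59. 3-3. SVM for regression
    With the $\epsilon$-insensitive error function, the error function (3.39) is modified to
    $$C \sum_{n=1}^{N} E_\epsilon(y_n - t_n) + \frac{1}{2} \|\mathbf{w}\|^2 \tag{3.41}$$
    where $C$ plays the role of the inverse of the regularization parameter $\lambda$. The $\epsilon$-insensitive error $E_\epsilon(y_n - t_n)$ is zero when $y_n - \epsilon \leq t_n \leq y_n + \epsilon$; this region is called the $\epsilon$-tube. The figure on this slide shows the regression function $y(\mathbf{x})$, the $\epsilon$-tube, and the data points.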
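  60. 3-3. SVM for regression
    As with the SVM for classification, the optimization problem for regression can be expressed with slack variables. Here we introduce two non-negative slack variables $\xi_n$ and $\hat{\xi}_n$. Points inside the $\epsilon$-tube satisfy $y_n - \epsilon \leq t_n \leq y_n + \epsilon$, so the constraints that allow points above and below the tube are
    $$t_n \leq y(\mathbf{x}_n) + \epsilon + \xi_n \tag{3.42}$$
    $$t_n \geq y(\mathbf{x}_n) - \epsilon - \hat{\xi}_n \tag{3.43}$$
    With these slack variables the error function becomes
    $$C \sum_{n=1}^{N} (\xi_n + \hat{\xi}_n) + \frac{1}{2} \|\mathbf{w}\|^2 \tag{3.44}$$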
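  61. 3-3. SVM for regression
    Consider what the constraints (3.42), (3.43) and the loss function (3.44) mean. A data point inside the $\epsilon$-tube satisfies $y(\mathbf{x}_n) - \epsilon \leq t_n \leq y(\mathbf{x}_n) + \epsilon$, so (3.42) and (3.43) allow $\xi_n = \hat{\xi}_n = 0$, which makes the smallest possible contribution ($= 0$) to the first term of (3.44). A data point above the $\epsilon$-tube satisfies $y(\mathbf{x}_n) - \epsilon \leq t_n$, so $\hat{\xi}_n = 0$ is allowed, but it violates $t_n \leq y(\mathbf{x}_n) + \epsilon$, so $\xi_n > 0$ is forced. A value $\xi_n > 0$ makes a positive contribution (a penalty) to the first term of (3.44). By the same argument, a data point below the $\epsilon$-tube can have $\xi_n = 0$ but must have $\hat{\xi}_n > 0$.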
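  62. 3-3. SVM for regression
    Next we minimize (3.44) under the inequality constraints (3.42), (3.43) and $\xi_n \geq 0$, $\hat{\xi}_n \geq 0$. The Lagrangian is therefore
    $$L = C \sum_{n=1}^{N} (\xi_n + \hat{\xi}_n) + \frac{1}{2} \|\mathbf{w}\|^2 - \sum_{n=1}^{N} (\mu_n \xi_n + \hat{\mu}_n \hat{\xi}_n) - \sum_{n=1}^{N} a_n (\epsilon + \xi_n + y_n - t_n) - \sum_{n=1}^{N} \hat{a}_n (\epsilon + \hat{\xi}_n - y_n + t_n) \tag{3.45}$$
    where $y_n = y(\mathbf{x}_n) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_n) + b$.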
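  63. 3-3. SVM for regression
    The KKT conditions are
    $$a_n \geq 0, \quad \hat{a}_n \geq 0 \tag{3.46}$$
    $$\epsilon + \xi_n + y_n - t_n \geq 0, \quad \epsilon + \hat{\xi}_n - y_n + t_n \geq 0 \tag{3.47}$$
    $$a_n (\epsilon + \xi_n + y_n - t_n) = 0, \quad \hat{a}_n (\epsilon + \hat{\xi}_n - y_n + t_n) = 0 \tag{3.48}$$
    $$\mu_n \geq 0, \quad \hat{\mu}_n \geq 0 \tag{3.49}$$
    $$\xi_n \geq 0, \quad \hat{\xi}_n \geq 0 \tag{3.50}$$
    $$\mu_n \xi_n = 0, \quad \hat{\mu}_n \hat{\xi}_n = 0 \tag{3.51}$$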
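  64. 3-3. SVM for regression
    Setting the derivatives of the Lagrangian (3.45) with respect to $\mathbf{w}$, $b$, $\xi_n$, and $\hat{\xi}_n$ to zero gives
    $$\mathbf{w} = \sum_{n=1}^{N} (a_n - \hat{a}_n) \boldsymbol{\phi}(\mathbf{x}_n) \tag{3.52}$$
    $$0 = \sum_{n=1}^{N} (a_n - \hat{a}_n) \tag{3.53}$$
    $$a_n = C - \mu_n, \quad \hat{a}_n = C - \hat{\mu}_n \tag{3.54}$$
    With these, the Lagrangian (3.45) can be written as a function of $a_n$ and $\hat{a}_n$ only:
    $$\tilde{L} = -\frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} (a_n - \hat{a}_n)(a_m - \hat{a}_m) k(\mathbf{x}_n, \mathbf{x}_m) - \epsilon \sum_{n=1}^{N} (a_n + \hat{a}_n) + \sum_{n=1}^{N} (a_n - \hat{a}_n) t_n \tag{3.55}$$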
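  65. 3-3. SVM for regression
    The condition (3.54) together with (3.46) and (3.49) constrains $a_n$ and $\hat{a}_n$ to
    $$0 \leq a_n \leq C \tag{3.56}$$
    $$0 \leq \hat{a}_n \leq C \tag{3.57}$$
    Substituting (3.52) into $y(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) + b$ gives
    $$y(\mathbf{x}) = \sum_{n=1}^{N} (a_n - \hat{a}_n) k(\mathbf{x}, \mathbf{x}_n) + b \tag{3.58}$$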
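  66. 3-3. SVM for regression
    We now characterize the support vectors that contribute to the prediction. By (3.48), a point with nonzero $a_n$ satisfies $\epsilon + \xi_n + y_n - t_n = 0$. Such a point lies on the boundary of the $\epsilon$-tube ($\xi_n = 0$) or above the tube ($\xi_n > 0$). Likewise, a point with nonzero $\hat{a}_n$ satisfies $\epsilon + \hat{\xi}_n - y_n + t_n = 0$, so it lies on the boundary of the $\epsilon$-tube ($\hat{\xi}_n = 0$) or below the tube ($\hat{\xi}_n > 0$). Suppose both $\epsilon + \xi_n + y_n - t_n = 0$ and $\epsilon + \hat{\xi}_n - y_n + t_n = 0$ held at once. Adding them gives
    $$2\epsilon + \xi_n + \hat{\xi}_n = 0 \tag{3.59}$$
    which cannot hold because $\epsilon > 0$, $\xi_n \geq 0$, and $\hat{\xi}_n \geq 0$ (a contradiction). So the two equalities cannot hold simultaneously, which means that at least one of $a_n$ and $\hat{a}_n$ is zero.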
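  67. 3-3. SVM for regression
    By (3.58), the points that contribute to the prediction are those for which $a_n$ or $\hat{a}_n$ is nonzero (by the argument above, at least one of the two is always zero). Points with $a_n = \hat{a}_n = 0$ do not contribute to the prediction; these are the points inside the $\epsilon$-tube. Points for which $a_n$ or $\hat{a}_n$ is nonzero lie on the boundary of the $\epsilon$-tube or outside it, and these are the support vectors. The solution is therefore sparse: only the support vectors need to be considered when predicting.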
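  68. 3-4. RVM for regression
    For a new input $\mathbf{x}$, the SVM only assigns a class according to the sign of the discriminant function $y(\mathbf{x})$ and does not give a predictive distribution for the target label $t$. The RVM, by contrast, is probabilistic and provides a predictive distribution for $t$ based on Bayesian inference. Like the SVM, the RVM can be used for both regression and classification; we start with regression. Expand the output $y(\mathbf{x})$ in functions $k(\mathbf{x}, \mathbf{x}_n)$:
    $$y(\mathbf{x}) = \sum_{n=1}^{N} w_n k(\mathbf{x}, \mathbf{x}_n) + b \tag{3.60}$$
    Here $\{w_n\}$ and $b$ are parameters, and (3.60) has the same form as the output (3.58) of the SVM for regression. The difference from (3.58) is that $k(\mathbf{x}, \mathbf{x}')$ may be an arbitrary function.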
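  69. 3-4. RVM for regression
    Given inputs $X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ with corresponding targets $\{t_1, t_2, \cdots, t_N\}$ and $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$, the likelihood function is
    $$p(\mathbf{t} | X, \mathbf{w}, \beta) = \prod_{n=1}^{N} p(t_n | \mathbf{x}_n, \mathbf{w}, \beta) \tag{3.61}$$
    As the prior we assume
    $$p(\mathbf{w} | \boldsymbol{\alpha}) = \prod_{i=1}^{N} \mathcal{N}(w_i | 0, \alpha_i^{-1}) \tag{3.62}$$
    Note that a separate hyperparameter $\alpha_i$ is defined for each parameter $w_i$.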
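  70. 3-4. RVM for regression
    From the likelihood and the prior we obtain the type-2 likelihood (marginal likelihood)
    $$p(\mathbf{t} | X, \boldsymbol{\alpha}, \beta) = \int p(\mathbf{t} | X, \mathbf{w}, \beta) \, p(\mathbf{w} | \boldsymbol{\alpha}) \, d\mathbf{w} \tag{3.63}$$
    Using this type-2 likelihood, the hyperparameters $\boldsymbol{\alpha}$ and $\beta$ can be determined by the evidence approximation (PRML 3.5). When $\boldsymbol{\alpha}$ is actually computed in this model, some of the $\{\alpha_i\}$ diverge to infinity. This means that the corresponding $w_i$ follows a Gaussian whose mean and variance are both zero (PRML 3.5). By (3.60), the corresponding training point $\mathbf{x}_i$ then does not contribute to the prediction, and a sparse solution is obtained.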
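  71. 3-5. RVM for classification
    We now apply the RVM to classification (two-class). For two-class classification the output should satisfy $0 \leq y(\mathbf{x}) \leq 1$, so we transform (3.60) with the logistic sigmoid:
    $$y(\mathbf{x}) = \sigma\left( \sum_{n=1}^{N} w_n k(\mathbf{x}, \mathbf{x}_n) + b \right) \tag{3.64}$$
    Given inputs $X = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ with corresponding targets $\{t_1, t_2, \cdots, t_N\}$ and $\mathbf{t} = (t_1, t_2, \cdots, t_N)^T$, the likelihood is, as in 2-4, a product of Bernoulli distributions:
    $$p(\mathbf{t} | \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n} (1 - y_n)^{1 - t_n} \tag{3.65}$$
    where $y_n = y(\mathbf{x}_n)$.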
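  72. 3-5. RVM for classification
    As the prior we again assume
    $$p(\mathbf{w} | \boldsymbol{\alpha}) = \prod_{i=1}^{N} \mathcal{N}(w_i | 0, \alpha_i^{-1}) \tag{3.66}$$
    Here too, a separate hyperparameter $\alpha_i$ is defined for each parameter $w_i$. From the likelihood and the prior we obtain the type-2 likelihood
    $$p(\mathbf{t} | \boldsymbol{\alpha}) = \int p(\mathbf{t} | \mathbf{w}) \, p(\mathbf{w} | \boldsymbol{\alpha}) \, d\mathbf{w} \tag{3.67}$$
    Because of the logistic sigmoid, however, this integral cannot be evaluated analytically. PRML evaluates it approximately with the Laplace approximation (PRML 4.5.1). As in the regression case of 3-4, computing $\boldsymbol{\alpha}$ shows that some of the $\{\alpha_i\}$ diverge to infinity, so a sparse solution is obtained.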