Watanabe 6.3

Naoya Umezaki
October 25, 2018

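Slides for a seminar presentation on Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory (Cambridge Monographs on Applied and Computational Mathematics). They review Chapter 4 and cover Section 6.3, examining the asymptotic behavior of the generalization error.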

Transcript

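  1. Gibbs estimation. Sample a parameter $\hat{w}$ from the posterior distribution and take $\hat{p}(x) = p(x \mid \hat{w})$ as the predictive distribution.

    Generalization error $G_g$: the KL divergence between $q(x)$ and $\hat{p}(x)$, integrated over $w$ with respect to the posterior $p(w \mid D_n)$:
    $$G_g = \int_W K(w)\, p(w \mid D_n)\, dw$$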
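  2. Problem. As $n \to \infty$, to what random variable does $nG_g$ converge?

    Main term:
    $$G_g(\epsilon) = \frac{\int_{K(w) \le \epsilon} K(w)\, p(w \mid D_n)\, dw}{\int_{K(w) \le \epsilon} p(w \mid D_n)\, dw}$$
    Lemma 1 (Lemma 6.3). $nG_g - nG_g(\epsilon)$ converges to 0 in probability.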
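
As a rough numerical illustration of $G_g$ and the main term $G_g(\epsilon)$ from slide 2, the sketch below uses a toy model that does not appear in the slides: $q = N(0,1)$, $p(x \mid w) = N(w,1)$, prior $N(0,1)$ and $\beta = 1$. Under these assumptions $K(w) = w^2/2$, the posterior is $N(m, v)$ with $v = 1/(n+1)$ and $m = v \sum_i x_i$, and $G_g = (m^2 + v)/2$ in closed form. The model is regular, so it only shows the definitions at work, not the singular behavior studied in Section 6.3. The function name `gibbs_error_toy` and all parameter values are hypothetical.

```python
import math
import random


def gibbs_error_toy(xs, eps, num_draws=100_000, seed=0):
    """Monte Carlo estimate of G_g and G_g(eps) for the toy Gaussian model.

    Assumptions (not from the slides): q = N(0,1), p(x|w) = N(w,1),
    prior N(0,1), beta = 1, hence K(w) = w^2 / 2.
    """
    rng = random.Random(seed)
    n = len(xs)
    v = 1.0 / (n + 1.0)          # posterior variance
    m = sum(xs) * v              # posterior mean
    sd = math.sqrt(v)

    # K(w) evaluated at posterior draws w ~ p(w | D_n).
    ks = [0.5 * rng.gauss(m, sd) ** 2 for _ in range(num_draws)]

    g = sum(ks) / num_draws                  # G_g = E[K(w)]
    near = [k for k in ks if k <= eps]       # draws with K(w) <= eps
    g_eps = sum(near) / len(near)            # G_g(eps) = E[K(w) | K(w) <= eps]
    exact = 0.5 * (m * m + v)                # closed form for this toy model
    return g, g_eps, exact


if __name__ == "__main__":
    data_rng = random.Random(1)
    xs = [data_rng.gauss(0.0, 1.0) for _ in range(1000)]
    g, g_eps, exact = gibbs_error_toy(xs, eps=0.1)
    print(g, g_eps, exact)
```

For large $n$ the posterior mass sits well inside $\{K(w) \le \epsilon\}$, so the two estimates agree, which is the content of Lemma 1 in this simple setting.
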
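  3. Evaluating the main term. We would like to evaluate

    $$G_g(\epsilon) = \frac{\int_{K(w) \le \epsilon} K(w)\, p(w \mid D_n)\, dw}{\int_{K(w) \le \epsilon} p(w \mid D_n)\, dw} = E[K(w) \mid K(w) \le \epsilon],$$
    but doing so directly is difficult. We use resolution of singularities.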
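  4. Standard form. Write

    $$f(x, g(u)) = \log\frac{q(x)}{p(x \mid g(u))} = a(x, u)\, u^k,$$
    so that
    $$K(g(u)) = u^{2k}, \qquad K_n(g(u)) = u^{2k} - \frac{1}{\sqrt{n}}\, u^k\, \xi_n(u),$$
    $$\xi_n(u) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \{ E_X[a(X, u)] - a(X_i, u) \}.$$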
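
The identity for $K_n$ in slide 4 can be checked numerically on a toy model. The sketch below assumes $q = N(0,1)$, $p(x \mid w) = N(w,1)$ and the reparametrization $w = g(u) = \sqrt{2}\,u$, none of which comes from the slides. Then $f(x, g(u)) = u\,(u - \sqrt{2}\,x)$, so $k = 1$, $a(x, u) = u - \sqrt{2}\,x$ and $E_X[a(X, u)] = u$. It uses the sign convention $\xi_n(u) = n^{-1/2} \sum_i \{E_X[a(X,u)] - a(X_i,u)\}$, which is the one for which $K_n(g(u)) = u^{2k} - n^{-1/2} u^k \xi_n(u)$ holds. All function names are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def log_ratio(x, u):
    """f(x, g(u)) = log q(x) - log p(x | g(u)) for q = N(0,1), p(.|w) = N(w,1)."""
    w = SQRT2 * u
    return -0.5 * x * x + 0.5 * (x - w) ** 2


def a(x, u):
    """Factor a(x, u) in the standard form f(x, g(u)) = a(x, u) * u**k, k = 1."""
    return u - SQRT2 * x


def xi_n(xs, u):
    """Empirical process; E_X[a(X, u)] = u in this toy model."""
    return sum(u - a(x, u) for x in xs) / math.sqrt(len(xs))


def k_n_direct(xs, u):
    """K_n(g(u)) as the empirical mean of the log likelihood ratio."""
    return sum(log_ratio(x, u) for x in xs) / len(xs)


def k_n_standard(xs, u, k=1):
    """K_n(g(u)) = u^(2k) - u^k * xi_n(u) / sqrt(n)."""
    return u ** (2 * k) - u ** k * xi_n(xs, u) / math.sqrt(len(xs))


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for u in (0.0, 0.3, 1.2):
        print(u, k_n_direct(xs, u), k_n_standard(xs, u))
```

In this example $\xi_n(u) = \sqrt{2}\, n^{-1/2} \sum_i X_i$ does not depend on $u$; in a singular model it is a genuine function on $M$.
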
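  5. $\xi_n(u)$ is a stochastic process that depends on the sample $D_n$. $\xi_n$ converges in law to a Gaussian process $\xi$.

    Lemma 2 (6.51). Defining $G_g^*(\xi_n) = E_{y,t}[t \mid \xi_n]$, we have
    $$nG_g(\epsilon) - G_g^*(\xi_n) \xrightarrow{P} 0.$$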
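  6. The integrals $E_{y,t}$ and $E_u$ on the resolution of singularities $M$.

    • $\xi(u)$: a function of class $C^1$ on $M$ (the stochastic process of the sample)
    • $f(u)$: a function on $M$ (for $K(w)$, $f(u) = u^{2k}$)
    • $0 \le \sigma \le 1$
    $$E_u^{\sigma}[f(u) \mid \xi] = \frac{\sum_{\alpha \in A} \int_{[0,b]^d} f(u)\, Z(u, \xi)\, du}{\sum_{\alpha \in A} \int_{[0,b]^d} Z(u, \xi)\, du}$$
    Here $A$ is the (finite) set of coordinate neighborhoods.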
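  7. $Z(u, \xi)$ is given by
    $$Z(u, \xi) = u^h \phi^*(u) \exp\bigl(-\beta n u^{2k} + \beta \sqrt{n}\, u^k \xi(u) - \sigma u^k a(X, u)\bigr).$$

    Relation to the posterior $p(w \mid D_n)$:
    • $u^h \phi^*(u)$ corresponds to the prior $\phi(w)$.
    • Taking $\sigma = 0$, and up to the prior factor, $Z_n^0\, p(w \mid D_n) = \exp(-n\beta K_n(w)) = \exp\bigl(-\beta n u^{2k} + \beta \sqrt{n}\, u^k \xi_n(u)\bigr)$.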
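  8. Essential part. Take coordinates $u = (x, y)$ and the essential part $A^* \subset A$ (determined by the resolution of singularities of $K(w)$), and define

    $$E_{y,t}[f(y, t) \mid \xi] = \frac{\sum_{\alpha \in A^*} \int_0^\infty dt \int_{[0,b]^{d-m}} f(y, t)\, Z_0(y, t, \xi)\, dy}{\sum_{\alpha \in A^*} \int_0^\infty dt \int_{[0,b]^{d-m}} Z_0(y, t, \xi)\, dy}$$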
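  9. Here
    $$Z_0(y, t, \xi) = \gamma_b\, y^{\mu}\, t^{\lambda - 1} \exp\bigl(-\beta t + \beta \sqrt{t}\, \xi_0(y)\bigr)\, \phi_0^*(y).$$

    Lemma 4 (Lemma 6.6 with $p = 1$, $f = 1$, $\xi = \xi_n$).
    $$\bigl| E_u^0[n u^{2k} \mid \xi_n] - E_{y,t}[t \mid \xi_n] \bigr| \le \frac{D(\xi_n, 1, \phi^*)}{\log n}$$
    The proof uses the partition function computations from Chapter 4.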
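
To get a feel for the expectation $E_{y,t}[t \mid \xi]$ built from $Z_0$, the sketch below evaluates it in a deliberately simplified case: a single chart in $A^*$ and $\xi_0(y)$ equal to a constant `xi0`, both assumptions that are not in the slides. The $y$-integral then cancels between numerator and denominator, leaving a ratio of one-dimensional integrals with weight $t^{\lambda - 1} \exp(-\beta t + \beta \sqrt{t}\, \xi_0)$. For $\xi_0 = 0$ this weight is a Gamma density and the ratio equals $\lambda / \beta$. The function name `expected_t` and its parameters are hypothetical.

```python
import math


def expected_t(lam, beta, xi0, t_max=200.0, grid=200_000):
    """E[t] under the weight t^(lam-1) * exp(-beta*t + beta*sqrt(t)*xi0) on [0, t_max].

    Uses the substitution s = t^lam, under which t^(lam-1) dt = ds / lam,
    and a midpoint rule in s. The constant 1/lam cancels in the ratio.
    """
    s_max = t_max ** lam
    h = s_max / grid
    num = 0.0
    den = 0.0
    for i in range(grid):
        s = (i + 0.5) * h
        t = s ** (1.0 / lam)
        weight = math.exp(-beta * t + beta * math.sqrt(t) * xi0)
        num += t * weight
        den += weight
    return num / den


if __name__ == "__main__":
    print(expected_t(0.5, 1.0, 0.0))   # Gamma case, close to lam / beta
    print(expected_t(0.5, 1.0, 1.0))   # a positive xi0 shifts mass to larger t
```
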
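  10. Constructing the limit. Definition 1 (6.46). For a function $\psi(u)$ on $M$, define

    $$G_g^*(\psi) = E_{y,t}[t \mid \psi].$$
    Rewriting the previous lemma with this definition gives
    Lemma 5.
    $$\bigl| nG_g(\epsilon) - G_g^*(\xi_n) \bigr| \le \frac{D(\xi_n, 1, \phi^*)}{\log n}$$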
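  11. Conclusion. Since $\xi_n$ converges in law to $\xi$:

    • by Lemma 4, $nG_g(\epsilon) - G_g^*(\xi_n) \to 0$;
    • $G_g^*(\xi_n)$ converges in law to $G_g^*(\xi)$.
    Combining everything, $nG_g$ converges in law to $G_g^*(\xi)$, which is what we wanted to prove. $G_g^*(\xi)$ is a random variable that depends on $\xi$.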
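
The chain of steps behind slide 11 can be written out as a single decomposition. This is a sketch of how the lemmas fit together, not a full proof; in particular, the last step assumes that $\psi \mapsto G_g^*(\psi)$ is continuous, so that convergence in law of $\xi_n$ carries over to $G_g^*(\xi_n)$. That assumption is not stated on the slides.

```latex
nG_g
  = \underbrace{\bigl( nG_g - nG_g(\epsilon) \bigr)}_{\to\, 0 \text{ in probability (Lemma 1)}}
  + \underbrace{\bigl( nG_g(\epsilon) - G_g^*(\xi_n) \bigr)}_{\to\, 0 \text{ in probability (Lemma 2, via Lemma 4)}}
  + G_g^*(\xi_n),
\qquad
G_g^*(\xi_n) \xrightarrow{\;d\;} G_g^*(\xi).
```

Because the first two terms vanish in probability and the third converges in law, Slutsky's theorem gives convergence in law of $nG_g$ to $G_g^*(\xi)$.
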
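  12. Review of Chapter 4: the zeta function. For a nonnegative analytic function $K(w)$ on an open set $U \subset \mathbb{R}^d$ and a compactly supported $C^\infty$ function $\phi(w)$, define

    $$\zeta(z) = \int K(w)^z\, \phi(w)\, dw.$$
    What information do the positions of its poles and their orders carry?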
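
A small worked example may help with the question on slide 12. It assumes $K(w) = w_1^2 w_2^2$ on $[0,1]^2$ and, for simplicity, $\phi \equiv 1$ on that square instead of a compactly supported smooth function; both choices are illustrative and not from the slides.

```latex
\zeta(z) = \int_0^1 \!\! \int_0^1 \bigl( w_1^2 w_2^2 \bigr)^z \, dw_1\, dw_2
         = \Bigl( \int_0^1 w^{2z}\, dw \Bigr)^{2}
         = \frac{1}{(2z+1)^2},
\qquad \operatorname{Re} z > -\tfrac{1}{2}.
```

The continuation has a single pole at $z = -1/2$ of order 2. In the notation of Theorem 4.7 on slide 16 this corresponds to $\lambda = 1/2$ and $r = 2$: the position of the largest pole gives the power of $n$ and its order gives the power of $\log n$ in the asymptotics of the partition function.
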
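  13. By a resolution of singularities for $K$, the partition function can be written in terms of the normal crossing integral $Z(n, \xi, \phi)$ as

    $$Z = \sum_{\alpha} Z\bigl(n,\ \xi \circ g_\alpha,\ \phi \circ g_\alpha\, |g'_\alpha|\bigr),$$
    so the goal of Section 4.4 is to study $Z(n, \xi, \phi)$.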
  14. Define

    $$Z^p(n, \xi, \phi) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\ K(x, y)^p\, x^h y^{h'}\, \phi(x, y)\, \exp\bigl(-n\beta K(x, y)^2 + \sqrt{n}\,\beta K(x, y)\, \xi(x, y)\bigr).$$
  15. Furthermore, setting $\xi = 0$ and $\phi = 1$ here, we write

    $$Z^p(n) = \int_{[0,b]^r} dx \int_{[0,b]^s} dy\ K(x, y)^p\, x^h y^{h'}\, \exp\bigl(-n\beta K(x, y)^2\bigr).$$
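  16. Theorem 1 (Theorem 4.7). Suppose that $\frac{h_i + 1}{2k_i} = \lambda$ is constant in $i$

    and that $\frac{h'_j + 1}{2k'_j} > \lambda$. When $K(x, y) = x^k y^{k'}$, there exist $a_1, a_2 > 0$ such that for every $n$
    $$a_1 \frac{(\log n)^{r-1}}{n^{\lambda + p}} \le Z^p(n) \le a_2 \frac{(\log n)^{r-1}}{n^{\lambda + p}}.$$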
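
The rate in Theorem 4.7 can be checked numerically in the simplest nontrivial case. The sketch below assumes $p = 0$, $r = 2$, $s = 0$, $h = (0, 0)$, $k = (1, 1)$, $\beta = 1$ and $b = 1$, so that $\lambda = 1/2$ and $Z^0(n) = \int_0^1 \int_0^1 \exp(-n x^2 y^2)\, dx\, dy$, for which the theorem predicts the order $\log n / \sqrt{n}$. These parameter choices and the function names are illustrative, not taken from the slides. The inner integral over $x$ is $\sqrt{\pi}\, \mathrm{erf}(\sqrt{n}\, y) / (2 \sqrt{n}\, y)$, and the substitution $s = \sqrt{n}\, y$ reduces the computation to $\int_0^{\sqrt{n}} \mathrm{erf}(s)/s\, ds$.

```python
import math


def trapezoid(values, step):
    """Composite trapezoid rule on an equally spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def partition_function(n, grid=20001):
    """Z^0(n) = int_0^1 int_0^1 exp(-n x^2 y^2) dx dy (beta = 1, b = 1), for n >= 1.

    Computed as sqrt(pi) / (2 sqrt(n)) * int_0^sqrt(n) erf(s) / s ds.
    """
    big_t = math.sqrt(n)

    # Part 1: s in [0, 1]; erf(s)/s extends continuously to 2/sqrt(pi) at s = 0.
    h1 = 1.0 / (grid - 1)
    g1 = [2.0 / math.sqrt(math.pi)]
    g1 += [math.erf(i * h1) / (i * h1) for i in range(1, grid)]
    part1 = trapezoid(g1, h1)

    # Part 2: s in [1, sqrt(n)] with s = exp(v), so erf(s)/s ds = erf(exp(v)) dv.
    h2 = math.log(big_t) / (grid - 1)
    g2 = [math.erf(math.exp(i * h2)) for i in range(grid)]
    part2 = trapezoid(g2, h2)

    return math.sqrt(math.pi) / (2.0 * big_t) * (part1 + part2)


if __name__ == "__main__":
    for n in (1e2, 1e4, 1e6, 1e8):
        z = partition_function(n)
        # The ratio should stay between two positive constants as n grows.
        print(n, z, z * math.sqrt(n) / math.log(n))
```

The printed ratio $Z^0(n)\, \sqrt{n} / \log n$ stays bounded away from 0 and infinity, while $Z^0(n)\, \sqrt{n}$ alone keeps growing, which is the role of the $(\log n)^{r-1}$ factor.
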