Bayesian Deep Learning (5.1~5.2)

catla
February 28, 2020

Contents: Bayesian neural networks (Section 5.1), speeding up approximate Bayesian inference (Section 5.2)

Transcript

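  1. Bayesian neural network model

    Setup: for input data $X = \{x_1, \dots, x_N\}$, observed data $Y = \{y_1, \dots, y_N\}$ and parameters $W$, the joint distribution is defined as
    $$p(Y, W \mid X) = p(W) \prod_{n=1}^{N} p(y_n \mid x_n, W).$$
    Each observation is assumed to be drawn from
    $$p(y_n \mid x_n, W) = \mathcal{N}(y_n \mid f(x_n; W), \sigma_y^2 I),$$
    where $f(x_n; W)$ is the output of the neural network and $\sigma_y^2$ is a fixed noise parameter. Each parameter is assumed to be drawn from
    $$p(w) = \mathcal{N}(w \mid 0, \sigma_w^2), \quad w \in W,$$
    where $\sigma_w^2$ is a fixed noise parameter.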
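  2. Learning with the Laplace approximation

    Laplace approximation:
    $$p(Z \mid X) \approx \mathcal{N}(Z \mid Z_{\mathrm{MAP}}, \{\Lambda(Z_{\mathrm{MAP}})\}^{-1}), \qquad \Lambda(Z) = -\nabla_Z^2 \log p(Z \mid X).$$
    For simplicity, the output dimension of the NN is taken to be 1.

    Approximating the posterior: first find the MAP estimate of the posterior, that is, the parameters $W_{\mathrm{MAP}}$ at which $p(W \mid Y, X)$ attains its maximum. Maximizing the posterior is equivalent to maximizing the log posterior, so the MAP estimate can be found by the following gradient-based optimization:
    $$W_{\mathrm{new}} = W_{\mathrm{old}} + \alpha \nabla_W \log p(W \mid Y, X)\big|_{W = W_{\mathrm{old}}},$$
    where $\alpha$ is the learning rate.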
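  3. Learning with the Laplace approximation: approximating the posterior

    The gradient of the posterior is obtained as follows. By Bayes' theorem,
    $$p(W \mid Y, X) = \frac{p(W)\, p(Y \mid X, W)}{p(Y \mid X)} \propto p(W)\, p(Y \mid X, W),$$
    and therefore
    $$\log p(W \mid Y, X) = \log p(Y \mid X, W) + \log p(W) + c = \sum_{n=1}^{N} \log p(y_n \mid x_n, W) + \sum_{w \in W} \log p(w) + c.$$
    Taking the partial derivative with respect to a parameter $w \in W$ gives the derivative of a cost function:
    $$\frac{\partial}{\partial w} \log p(W \mid Y, X) = -\left\{ \frac{1}{\sigma_y^2} \frac{\partial}{\partial w} E(W) + \frac{1}{\sigma_w^2} \frac{\partial}{\partial w} \Omega_{L2}(W) \right\}.$$
    Here $E(W)$ is the error function of the NN and $\Omega_{L2}(W)$ is the regularization term that comes from the prior on each parameter. With the Gaussian likelihood and prior of slide 1, these are $E(W) = \frac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2$ and $\Omega_{L2}(W) = \frac{1}{2} \sum_{w \in W} w^2$.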
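
The gradient-ascent update for the MAP estimate can be sketched on a toy problem. The block below is only an illustration under stated assumptions: the network is replaced by a hypothetical one-parameter model f(x; w) = w * x, and the data, variances, learning rate and function names are all made up for the example. For this linear-Gaussian case the MAP estimate has a closed form, which gives something to compare against.

```python
def grad_log_posterior(w, xs, ys, sigma_y2, sigma_w2):
    """d/dw log p(w | Y, X) for the toy model f(x; w) = w * x (an assumption)."""
    # dE/dw with E(w) = 1/2 * sum_n (y_n - w * x_n)^2
    d_error = -sum((y - w * x) * x for x, y in zip(xs, ys))
    # dOmega/dw with Omega_L2(w) = 1/2 * w^2
    d_reg = w
    return -(d_error / sigma_y2 + d_reg / sigma_w2)


def map_estimate(xs, ys, sigma_y2, sigma_w2, alpha=0.01, steps=2000):
    """Gradient ascent: w_new = w_old + alpha * grad log p(w | Y, X)."""
    w = 0.0
    for _ in range(steps):
        w += alpha * grad_log_posterior(w, xs, ys, sigma_y2, sigma_w2)
    return w


# Hypothetical toy data and fixed variances.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.1, 2.9]
sigma_y2, sigma_w2 = 0.25, 1.0

w_map = map_estimate(xs, ys, sigma_y2, sigma_w2)

# Closed-form MAP for the linear-Gaussian toy model, used only as a check.
w_closed = (sum(x * y for x, y in zip(xs, ys)) / sigma_y2) / (
    sum(x * x for x in xs) / sigma_y2 + 1.0 / sigma_w2
)
print(w_map, w_closed)
```

For a real network the gradient would come from backpropagation rather than a hand-written derivative, and no closed form would be available.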
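  4. Learning with the Laplace approximation: approximating the predictive distribution

    With the Laplace approximation $q(W)$ of the posterior, the predictive distribution can be approximated as
    $$p(y_* \mid x_*, Y, X) = \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \approx \int p(y_* \mid x_*, W)\, q(W)\, dW.$$
    However, $p(y_* \mid x_*, W)$ contains the NN, so the integral cannot be computed analytically. We therefore assume that the posterior density of the parameters is concentrated around the MAP estimate and that, within that small region, $f(x_*; W)$ is well approximated by a linear function of $W$. Under this assumption, a first-order Taylor expansion of $f(x_*; W)$, viewed as a function of $W$, around $W_{\mathrm{MAP}}$ gives
    $$f(x_*; W) \approx f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \qquad g = \nabla_W f(x_*; W)\big|_{W = W_{\mathrm{MAP}}}.$$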
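  5. Learning with the Laplace approximation: approximating the predictive distribution

    Putting this together gives the following approximation:
    $$\begin{aligned}
    p(y_* \mid x_*, Y, X) &= \int p(y_* \mid x_*, W)\, p(W \mid X, Y)\, dW \\
    &\approx \int p(y_* \mid x_*, W)\, q(W)\, dW \\
    &= \int \mathcal{N}(y_* \mid f(x_*; W), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW \\
    &\approx \int \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}}), \sigma_y^2)\, \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})\, dW \\
    &= \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*)),
    \end{aligned}$$
    $$\sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g.$$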
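  6. Learning with the Laplace approximation: approximating the predictive distribution

    The same derivation as on slide 5, with two annotations. Replacing $p(W \mid X, Y)$ by the Gaussian $q(W) = \mathcal{N}(W \mid W_{\mathrm{MAP}}, \{\Lambda(W_{\mathrm{MAP}})\}^{-1})$ is the Laplace approximation. Replacing $f(x_*; W)$ by $f(x_*; W_{\mathrm{MAP}}) + g^{\top}(W - W_{\mathrm{MAP}})$ is the first-order Taylor approximation. The result is
    $$p(y_* \mid x_*, Y, X) \approx \mathcal{N}(y_* \mid f(x_*; W_{\mathrm{MAP}}), \sigma^2(x_*)), \qquad \sigma^2(x_*) = \sigma_y^2 + g^{\top} \{\Lambda(W_{\mathrm{MAP}})\}^{-1} g.$$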
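
As a rough illustration of the predictive variance formula, the sketch below evaluates it for a hypothetical one-parameter model f(x; w) = w * x, for which the precision and the gradient g are scalars that can be written by hand. The data, the value used for the MAP estimate and the function name are assumptions made for this example, not something taken from the slides.

```python
def laplace_predictive(x_star, w_map, xs, sigma_y2, sigma_w2):
    """Mean and variance of the Laplace predictive for f(x; w) = w * x."""
    # Lambda(w_MAP) = -d^2/dw^2 log p(w | Y, X) for the toy model
    precision = sum(x * x for x in xs) / sigma_y2 + 1.0 / sigma_w2
    # g = d f(x_*; w) / dw evaluated at w_MAP; for f = w * x this is x_*
    g = x_star
    mean = w_map * x_star
    variance = sigma_y2 + g * (1.0 / precision) * g
    return mean, variance


# Hypothetical inputs: training inputs, fixed variances, and a MAP estimate.
xs = [0.0, 1.0, 2.0, 3.0]
sigma_y2, sigma_w2 = 0.25, 1.0
w_map = 0.97

mean0, var0 = laplace_predictive(0.0, w_map, xs, sigma_y2, sigma_w2)
mean2, var2 = laplace_predictive(2.0, w_map, xs, sigma_y2, sigma_w2)
print(var0, var2)
```

The variance equals the observation noise at x_* = 0, where the prediction does not depend on w, and grows with |x_*| as parameter uncertainty is added through g.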
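  7. Learning with Hamiltonian Monte Carlo (HMC): inference of hyperparameters

    By placing priors on the hyperparameters $\sigma_w$ and $\sigma_y$ as well, they can be inferred jointly with $W$.
    Introduce the precision parameter $\gamma_w = \sigma_w^{-2}$ and define its prior as a gamma distribution:
    $$p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid a_w, b_w) \quad (a_w, b_w \text{ are fixed positive values}).$$
    Similarly, for $\sigma_y$ define $\gamma_y = \sigma_y^{-2}$ and
    $$p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid a_y, b_y) \quad (a_y, b_y \text{ are fixed positive values}).$$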
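  8. Learning with Hamiltonian Monte Carlo (HMC): inference of hyperparameters

    Rewriting the model (the joint distribution including the hyperparameters) gives
    $$p(Y, W, \gamma_w, \gamma_y \mid X) = p(\gamma_w)\, p(\gamma_y)\, p(W \mid \gamma_w) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
    [Figure: graphical model with nodes $x_n$, $y_n$ (plate over $n = 1, \dots, N$), $W$, $\gamma_y$, $\gamma_w$ and hyperparameters $\alpha_y$, $\beta_y$, $\alpha_w$, $\beta_w$]
    The posterior to be inferred is $p(W, \gamma_w, \gamma_y \mid X, Y)$.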
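  9. Learning with Hamiltonian Monte Carlo (HMC): inference of hyperparameters

    Gibbs sampling is used to sample $W$, $\gamma_w$ and $\gamma_y$.
    • Sampling $W$: as before, sample with HMC,
    $$W \sim p(W \mid Y, X, \gamma_w, \gamma_y).$$
    • Sampling $\gamma_w$:
    $$p(\gamma_w \mid Y, X, W, \gamma_y) \propto p(W \mid \gamma_w)\, p(\gamma_w).$$
    Since $p(W \mid \gamma_w)$ is Gaussian and $p(\gamma_w)$ is a gamma distribution (the conjugate prior for the Gaussian precision), $p(\gamma_w \mid Y, X, W, \gamma_y)$ is also a gamma distribution. Therefore
    $$\gamma_w \sim \mathrm{Gam}(\hat{a}_w, \hat{b}_w), \qquad \hat{a}_w = a_w + \frac{K_w}{2}, \qquad \hat{b}_w = b_w + \frac{1}{2} \sum_{w \in W} w^2,$$
    where $K_w$ is the total number of weight parameters.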
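  10. Learning with Hamiltonian Monte Carlo (HMC): inference of hyperparameters

    • Sampling $\gamma_y$:
    $$p(\gamma_y \mid Y, X, W, \gamma_w) \propto p(\gamma_y) \prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y).$$
    Since $\prod_{n=1}^{N} p(y_n \mid x_n, W, \gamma_y)$ is a product of Gaussians and hence Gaussian, and $p(\gamma_y)$ is a gamma distribution, $p(\gamma_y \mid Y, X, W, \gamma_w)$ is a gamma distribution. Therefore
    $$\gamma_y \sim \mathrm{Gam}(\hat{a}_y, \hat{b}_y), \qquad \hat{a}_y = a_y + \frac{N}{2}, \qquad \hat{b}_y = b_y + \frac{1}{2} \sum_{n=1}^{N} \{y_n - f(x_n; W)\}^2.$$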
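
The two conjugate gamma updates used inside the Gibbs sampler can be sketched in a few lines. This is a minimal illustration assuming the rate parameterization Gam(a, b); the weights, targets, network outputs and function names below are hypothetical placeholders, and the HMC step for W is not shown.

```python
import random


def gamma_w_posterior(weights, a_w, b_w):
    """Parameters of p(gamma_w | W): a_w + K_w / 2 and b_w + 1/2 * sum w^2."""
    a_hat = a_w + len(weights) / 2.0
    b_hat = b_w + 0.5 * sum(w * w for w in weights)
    return a_hat, b_hat


def gamma_y_posterior(ys, preds, a_y, b_y):
    """Parameters of p(gamma_y | Y, X, W): a_y + N / 2 and b_y + 1/2 * SSE."""
    a_hat = a_y + len(ys) / 2.0
    b_hat = b_y + 0.5 * sum((y - f) ** 2 for y, f in zip(ys, preds))
    return a_hat, b_hat


def sample_gamma(a_hat, b_hat, rng):
    # random.gammavariate takes (shape, scale), so the rate b_hat is inverted.
    return rng.gammavariate(a_hat, 1.0 / b_hat)


rng = random.Random(0)
weights = [0.5, -1.0, 0.25, 2.0]   # hypothetical current sample of W
ys = [1.0, 2.0, 3.0]               # hypothetical targets
preds = [0.9, 2.2, 2.7]            # hypothetical values of f(x_n; W)

a_w_hat, b_w_hat = gamma_w_posterior(weights, a_w=1.0, b_w=1.0)
a_y_hat, b_y_hat = gamma_y_posterior(ys, preds, a_y=1.0, b_y=1.0)
gamma_w = sample_gamma(a_w_hat, b_w_hat, rng)
gamma_y = sample_gamma(a_y_hat, b_y_hat, rng)
print(gamma_w, gamma_y)
```

In a full sampler these two draws would alternate with an HMC update of W that uses the current values of gamma_w and gamma_y.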
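  11. Learning with Hamiltonian Monte Carlo (HMC): inference of hyperparameters

    The gamma distribution $\mathrm{Gam}(a, b)$ has mean $a/b$ and variance $a/b^2$. Hence the larger $\hat{b}_y$ is, that is, the worse $f(x_n; W)$ predicts $y_n$, the smaller the precision $\gamma_y$ tends to be, and the model learns a larger variance for the observations.
    Here a single precision parameter $\gamma_w$ was shared by all weight parameters, but it is also possible to use a separate precision parameter for each layer of the NN, $(\gamma_w^{(1)}, \dots, \gamma_w^{(L)})$.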
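  12. Learning with stochastic gradient Langevin dynamics

    [Learning] Consider learning with stochastic gradient Langevin dynamics, which combines stochastic gradient descent with Langevin dynamics.
    Write the parameter update as $W_{\mathrm{new}} = W_{\mathrm{old}} + \Delta W$. In stochastic gradient descent the update step can be written as
    $$\Delta W = \frac{\alpha_t}{2} \nabla_W \log p(W \mid X_S, Y_S) = \frac{\alpha_t}{2} \left\{ \frac{N}{M} \sum_{n \in S} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\},$$
    where $M$ is the size of the subsample $S$. In addition, to fit the framework of the Robbins-Monro algorithm, the learning rate $\alpha_t$ at step $t$ is chosen to satisfy
    $$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty.$$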
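  13. Learning with stochastic gradient Langevin dynamics

    [Learning] On the other hand, consider the step needed to obtain a sample with Langevin dynamics as a batch learning algorithm. With potential energy $\mathcal{U}(W) = -\log p(W \mid X, Y)$, step size $\epsilon = \sqrt{\alpha_t}$ and momentum vector $p$, the update step is
    $$\begin{aligned}
    \Delta W &= -\frac{\epsilon^2}{2} \nabla_W \mathcal{U}(W) + \epsilon p \\
    &= \frac{\alpha_t}{2} \nabla_W \log p(W \mid X, Y) + \sqrt{\alpha_t}\, p \\
    &= \frac{\alpha_t}{2} \left\{ \sum_{n=1}^{N} \nabla_W \log p(y_n \mid x_n, W) + \nabla_W \log p(W) \right\} + \sqrt{\alpha_t}\, p, \qquad p \sim \mathcal{N}(0, I).
    \end{aligned}$$
    By making $\alpha_t$ small, the acceptance rate of the MH method can be brought arbitrarily close to 1.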
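
Combining the two updates, a stochastic gradient Langevin dynamics step replaces the full-data gradient in the Langevin update with the rescaled minibatch gradient. The sketch below shows this on a hypothetical one-parameter model f(x; w) = w * x. The data, the variances, the function names and the polynomially decaying step-size schedule are all assumptions for this example; the schedule is one common choice that satisfies the two summation conditions when the decay exponent lies in (0.5, 1].

```python
import math
import random


def grad_log_lik(w, x, y, sigma_y2):
    """d/dw log N(y | w * x, sigma_y2) for the toy model (an assumption)."""
    return (y - w * x) * x / sigma_y2


def grad_log_prior(w, sigma_w2):
    """d/dw log N(w | 0, sigma_w2)."""
    return -w / sigma_w2


def sgld_step(w, batch, n_total, alpha_t, noise, sigma_y2, sigma_w2):
    """One update: alpha_t / 2 * minibatch gradient + sqrt(alpha_t) * noise."""
    scale = n_total / len(batch)  # N / M
    grad = scale * sum(grad_log_lik(w, x, y, sigma_y2) for x, y in batch)
    grad += grad_log_prior(w, sigma_w2)
    return w + 0.5 * alpha_t * grad + math.sqrt(alpha_t) * noise


def step_size(t, a=0.01, b=1.0, decay=0.55):
    """alpha_t = a * (b + t)^(-decay), with decay in (0.5, 1]."""
    return a * (b + t) ** (-decay)


def run_sgld(data, steps, batch_size, sigma_y2, sigma_w2, seed=0):
    rng = random.Random(seed)
    w, samples = 0.0, []
    for t in range(steps):
        batch = rng.sample(data, batch_size)   # random minibatch S
        noise = rng.gauss(0.0, 1.0)            # p ~ N(0, 1)
        w = sgld_step(w, batch, len(data), step_size(t), noise,
                      sigma_y2, sigma_w2)
        samples.append(w)
    return samples


# Hypothetical toy data.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
samples = run_sgld(data, steps=5000, batch_size=2, sigma_y2=0.25, sigma_w2=1.0)
kept = samples[1000:]                          # discard an initial burn-in
posterior_mean_estimate = sum(kept) / len(kept)
print(posterior_mean_estimate)
```

No accept/reject step is included, which relies on the step sizes being small, as noted on the slide.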
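  14. Stochastic variational inference

    Minibatches are introduced for efficiency. The ELBO
    $$\mathcal{L}(\xi) = \sum_{n=1}^{N} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$$
    is replaced by its minibatch version
    $$\mathcal{L}_S(\xi) = \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    The minibatch quantity $\mathcal{L}_S$ is an unbiased estimator of $\mathcal{L}$:
    $$\mathbb{E}_S[\mathcal{L}_S(\xi)] = \mathcal{L}(\xi).$$
    Therefore, instead of maximizing $\mathcal{L}(\xi)$ directly, the posterior over the parameters can be approximated efficiently by maximizing $\mathcal{L}_S(\xi)$.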
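  15. Monte Carlo approximation of gradients

    When maximizing the ELBO of a neural network, the parameters $W$ in the ELBO cannot be integrated out analytically.
    $\Longrightarrow$ Maximize $\mathcal{L}_S(\xi)$ with a gradient method.
    To use a gradient method, the gradient of $\mathcal{L}_S(\xi)$ with respect to the variational parameters $\xi$ is needed.
    In $D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)]$ both distributions are Gaussian, so its gradient can be computed analytically. In contrast, the log-likelihood term $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$ cannot be integrated analytically.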
  16. Monte Carlo approximation of gradients

    The same content as slide 15, with the conclusion added: approximate the integral (the log-likelihood term $\int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW$) with the Monte Carlo method and use it to obtain an estimate of the gradient.
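  17. Monte Carlo approximation of gradients

    [Goal] For a parameter $w \in \mathbb{R}$, consider a function $f(w)$ and a distribution $q(w; \xi)$, and estimate the gradient
    $$I(\xi) = \nabla_\xi \int f(w)\, q(w; \xi)\, dw.$$
    [Methods] Score function estimation, the reparameterization gradient, the generalized reparameterization gradient, implicit differentiation, and others.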
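  18. Monte Carlo approximation of gradients: score function estimation

    Rewrite $I(\xi)$ as follows:
    $$\begin{aligned}
    I(\xi) &= \nabla_\xi \int f(w)\, q(w; \xi)\, dw = \int f(w)\, \nabla_\xi q(w; \xi)\, dw \\
    &= \int f(w)\, q(w; \xi)\, \nabla_\xi \log q(w; \xi)\, dw = \mathbb{E}_{q(w; \xi)}[f(w)\, \nabla_\xi \log q(w; \xi)].
    \end{aligned}$$
    Therefore, drawing several samples of $w$ from $q(w; \xi)$ and then evaluating the derivative gives an unbiased estimator of $I(\xi)$.
    [Condition for use] The derivative of $\log q(w; \xi)$ can be computed.
    [Problem] In practice the estimator has very high variance.
    [Remedy] Combine it with a variance reduction technique such as control variates.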
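
A small numerical check of the score function estimator, under assumptions chosen for this example only: q(w; xi) is a Gaussian N(w | mu, sigma^2), the gradient is taken with respect to mu, and f(w) = w^2, for which the exact gradient of E[f(w)] = mu^2 + sigma^2 is 2 * mu. The function name and sample size are hypothetical.

```python
import random


def score_function_gradient(f, mu, sigma, num_samples, rng):
    """Monte Carlo estimate of d/dmu E_{N(w | mu, sigma^2)}[f(w)]."""
    total = 0.0
    for _ in range(num_samples):
        w = rng.gauss(mu, sigma)           # sample w ~ q(w; xi)
        score = (w - mu) / sigma ** 2      # d/dmu log N(w | mu, sigma^2)
        total += f(w) * score
    return total / num_samples


rng = random.Random(0)
mu, sigma = 1.0, 1.0
estimate = score_function_gradient(lambda w: w * w, mu, sigma, 20000, rng)
exact = 2.0 * mu
print(estimate, exact)
```

Even on this one-dimensional problem many samples are needed for a stable estimate, which is the high-variance behaviour the slide warns about.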
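  19. Monte Carlo approximation of gradients: the reparameterization gradient

    Instead of sampling $w$ directly from $q(w; \xi)$, sample $\epsilon$ from a distribution $q(\epsilon)$ that does not depend on $\xi$ and apply a transformation $w = g(\xi; \epsilon)$, which samples $w$ indirectly. This gives the following unbiased estimator of the gradient:
    $$\mathbb{E}_{q(\epsilon)}[f'(g(\xi; \epsilon))\, \nabla_\xi g(\xi; \epsilon)] = I(\xi).$$
    [Example] The case $\xi = \{\hat{\mu}, \hat{\sigma}^2\}$, $q(w; \xi) = \mathcal{N}(w \mid \hat{\mu}, \hat{\sigma}^2)$.
    With $\tilde{\epsilon} \sim \mathcal{N}(0, 1) = q(\epsilon)$ and $\tilde{w} = g(\xi; \tilde{\epsilon}) = \hat{\mu} + \hat{\sigma} \tilde{\epsilon}$, the value $\tilde{w}$ is a sample from $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$. The derivatives with respect to the variational parameters become
    $$\frac{\partial}{\partial \hat{\mu}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\mu}) = \mathbb{E}_{q(w; \xi)}[f'(w)],$$
    $$\frac{\partial}{\partial \hat{\sigma}} \int f(w)\, q(w; \xi)\, dw = \int f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}}\, q(w; \xi)\, dw \quad \therefore\ I(\hat{\sigma}) = \mathbb{E}_{q(w; \xi)}\!\left[ f'(w)\, \frac{w - \hat{\mu}}{\hat{\sigma}} \right],$$
    which gives an unbiased estimator of the gradient for each variational parameter.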
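
The Gaussian example can be checked numerically. The sketch below assumes f(w) = w^2, so that f'(w) = 2 * w and the exact gradients of E[f(w)] = mu^2 + sigma^2 are 2 * mu and 2 * sigma; the function name, parameter values and sample size are hypothetical.

```python
import random


def reparam_gradients(f_prime, mu, sigma, num_samples, rng):
    """Reparameterization estimates of d/dmu and d/dsigma of E_q[f(w)]."""
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)          # eps ~ q(eps) = N(0, 1)
        w = mu + sigma * eps               # w = g(xi; eps)
        grad_mu += f_prime(w)              # f'(w) * dw/dmu, with dw/dmu = 1
        grad_sigma += f_prime(w) * eps     # dw/dsigma = eps = (w - mu) / sigma
    return grad_mu / num_samples, grad_sigma / num_samples


rng = random.Random(0)
mu, sigma = 1.0, 0.5
grad_mu, grad_sigma = reparam_gradients(lambda w: 2.0 * w, mu, sigma, 20000, rng)
print(grad_mu, grad_sigma)   # exact values are 2 * mu and 2 * sigma
```

Unlike the score function estimator, this one needs the derivative of f, which for a neural network is supplied by backpropagation.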
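  20. Monte Carlo approximation of gradients: generalizing the reparameterization gradient

    [Advantage of the reparameterization gradient] The variance of the gradient is kept smaller than with score function estimation.
    [Problem with the reparameterization gradient] A change of variables $g$ is required, so it cannot be applied to every distribution.
    [Remedy, example 1] Generalized reparameterization gradient: the constraints on $g$ are relaxed so that the method applies to many more kinds of distributions. The noise distribution is allowed to retain a dependence on the variational parameters, as in $q(\epsilon; \xi)$.
    [Remedy, example 2] Implicit differentiation.
    [Conditions for use]
    • $g$ is difficult to obtain, but the inverse transformation $g^{-1}$ is easy to obtain.
    • The distribution is continuous.
    The gradient of the expectation is obtained by differentiating $\epsilon = g^{-1}(w; \xi)$ with respect to $\xi$.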
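  21. Variational inference with approximate gradients

    The reparameterization gradient is now used to maximize the ELBO of a Bayesian neural network.
    ① Draw a minibatch $S$ at random from the dataset $\mathcal{D}$.
    ② Draw $M$ noise samples ($M$ is the number of samples in the minibatch):
    $$\tilde{\epsilon}_i \sim \mathcal{N}(0, I).$$
    ③ Compute the gradient with respect to the variational parameters:
    $$\begin{aligned}
    \mathcal{L}_S(\xi) &= \frac{N}{M} \sum_{n \in S} \int q(W; \xi) \log p(y_n \mid f(x_n; W))\, dW - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
    &= \frac{N}{M} \sum_{n \in S} \int p(\epsilon) \log p(y_n \mid f(x_n; g(\xi; \epsilon)))\, d\epsilon - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)] \\
    &\approx \mathcal{L}_{S,\epsilon}(\xi) \qquad (\because\ \mathbb{E}_{S,\epsilon}[\mathcal{L}_{S,\epsilon}(\xi)] = \mathcal{L}(\xi)) \\
    &= \frac{N}{M} \sum_{n \in S} \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)],
    \end{aligned}$$
    $$\nabla_\xi \mathcal{L}_S(\xi) \approx \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi) = \frac{N}{M} \sum_{n \in S} \nabla_\xi \log p(y_n \mid f(x_n; g(\xi; \tilde{\epsilon}_n))) - \nabla_\xi D_{\mathrm{KL}}[q(W; \xi) \,\|\, p(W)].$$
    ④ Update the variational parameters in the direction that increases the ELBO:
    $$\xi \leftarrow \xi + \alpha \nabla_\xi \mathcal{L}_{S,\epsilon}(\xi).$$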
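
The four steps can be traced on a toy problem. The sketch below is an illustration under several assumptions that are not in the slides: the network is a hypothetical one-parameter model f(x; w) = w * x with known noise variance, q(w; xi) = N(w | mu, sigma^2), the prior is N(0, sigma_w2), and sigma is parameterized as exp(rho) to keep it positive. Because this toy model is linear-Gaussian, the exact posterior is Gaussian and the result can be compared against it.

```python
import math
import random


def train_vi(data, sigma_y2, sigma_w2, batch_size, steps, lr, seed=0):
    rng = random.Random(seed)
    n_total = len(data)
    mu, rho = 0.0, 0.0                      # sigma = exp(rho)
    for _ in range(steps):
        sigma = math.exp(rho)
        batch = rng.sample(data, batch_size)          # step 1: minibatch S
        grad_mu = 0.0
        grad_sigma = 0.0
        for x, y in batch:
            eps = rng.gauss(0.0, 1.0)                 # step 2: one noise draw per point
            w = mu + sigma * eps                      # w = g(xi; eps)
            dlogp_dw = (y - w * x) * x / sigma_y2     # d/dw log p(y | f(x; w))
            grad_mu += dlogp_dw                       # dw/dmu = 1
            grad_sigma += dlogp_dw * eps              # dw/dsigma = eps
        scale = n_total / batch_size                  # N / M
        # step 3: subtract the analytic gradient of KL[N(mu, sigma^2) || N(0, sigma_w2)]
        grad_mu = scale * grad_mu - mu / sigma_w2
        grad_sigma = scale * grad_sigma - (sigma / sigma_w2 - 1.0 / sigma)
        # step 4: gradient ascent on the ELBO (chain rule: dsigma/drho = sigma)
        mu += lr * grad_mu
        rho += lr * grad_sigma * sigma
    return mu, math.exp(rho)


# Hypothetical toy data.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
mu_hat, sigma_hat = train_vi(data, sigma_y2=0.25, sigma_w2=1.0,
                             batch_size=2, steps=20000, lr=0.001)

# Exact Gaussian posterior of the toy model, used only for comparison.
precision = sum(x * x for x, _ in data) / 0.25 + 1.0
exact_mean = (sum(x * y for x, y in data) / 0.25) / precision
exact_sd = math.sqrt(1.0 / precision)
print(mu_hat, sigma_hat, exact_mean, exact_sd)
```

A constant learning rate is used here for brevity, so the final parameters keep fluctuating slightly around the optimum.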
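  22. Learning with expectation propagation: model

    [Setup] Let $y_n \in \mathbb{R}$ and define the likelihood as
    $$p(Y \mid X, W, \gamma_y) = \prod_{n=1}^{N} \mathcal{N}(y_n \mid f(x_n; W), \gamma_y^{-1}), \qquad p(\gamma_y) = \mathrm{Gam}(\gamma_y \mid \alpha_0^{\gamma_y}, \beta_0^{\gamma_y}).$$
    The activation function of $f(x_n; W)$ is the rectified linear unit (ReLU).
    The parameters $W$ follow independent Gaussian distributions:
    $$p(W \mid \gamma_w) = \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid 0, \gamma_w^{-1}), \qquad p(\gamma_w) = \mathrm{Gam}(\gamma_w \mid \alpha_0^{\gamma_w}, \beta_0^{\gamma_w}).$$
    [Goal] Approximate inference of the posterior
    $$p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w).$$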
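  23. Learning with expectation propagation: approximate distribution

    Probabilistic backpropagation is based on assumed density filtering.
    The approximate distribution over the parameters is chosen as
    $$q(W, \gamma_y, \gamma_w) = \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w}) \prod_{l=1}^{L} \prod_{i=1}^{H_l} \prod_{j=1}^{H_{l-1}} \mathcal{N}(w_{i,j}^{(l)} \mid m_{i,j}^{(l)}, v_{i,j}^{(l)}) = q(\gamma_y)\, q(\gamma_w)\, q(W).$$
    This distribution is updated sequentially by the moment matching of assumed density filtering.
    Assumed density filtering:
    $$q_{i+1}(\theta) \approx r_{i+1}(\theta) = \frac{1}{Z_{i+1}} f_{i+1}(\theta)\, q_i(\theta), \qquad f_i(\theta)\text{: factor}.$$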
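  24. Learning with expectation propagation: initialization and introduction of the prior factors

    [Initialization] So that the approximate distribution is uninformative, initialize with $m_{i,j}^{(l)} = 0$, $v_{i,j}^{(l)} = \infty$, $\alpha^{\gamma_y} = 1$, $\beta^{\gamma_y} = 0$, $\alpha^{\gamma_w} = 1$, $\beta^{\gamma_w} = 0$.
    [Introducing the prior factors] The approximate distribution is updated by adding the factors of the target posterior one at a time. The prior factors of this model are
    $$p(\gamma_y), \quad p(\gamma_w), \quad \{p(w_{i,j}^{(l)} \mid \gamma_w)\}_{i,j,l}.$$
    Posterior: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
    Approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$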
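  25. Learning with expectation propagation: initialization and introduction of the prior factors

    [Introducing the prior factors]
    • Adding the factors $p(\gamma_w)$ and $p(\gamma_y)$.
    Because the approximate factors $q(\gamma_y)$, $q(\gamma_w)$ have the same form as the priors $p(\gamma_y)$, $p(\gamma_w)$, the update is
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx p(\gamma_y)\, p(\gamma_w)\, q(W),$$
    $$\alpha_{\mathrm{new}}^{\gamma_y} = \alpha_0^{\gamma_y}, \quad \beta_{\mathrm{new}}^{\gamma_y} = \beta_0^{\gamma_y}, \quad \alpha_{\mathrm{new}}^{\gamma_w} = \alpha_0^{\gamma_w}, \quad \beta_{\mathrm{new}}^{\gamma_w} = \beta_0^{\gamma_w}.$$
    In other words, $q(\gamma_y) \leftarrow p(\gamma_y)$ and $q(\gamma_w) \leftarrow p(\gamma_w)$.
    Assumed density filtering:
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx r = \frac{1}{Z} f^{\mathrm{new}}(\gamma_y, \gamma_w, W)\, q(\gamma_y)\, q(\gamma_w)\, q(W).$$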
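  26. Learning with expectation propagation: initialization and introduction of the prior factors

    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$:
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(w_{i,j}^{(l)} \mid \gamma_w)\, q(\gamma_w)\, q(W).$$
    From here on the indices $i, j, l$ are omitted.
    The factors that get updated are $q(W)$ and $q(\gamma_w)$, so each is updated as follows:
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, \underline{p(w \mid \gamma_w)\, q(\gamma_w)}\, q(W), \qquad q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0}\, \underline{p(w \mid \gamma_w)\, q(W)}\, q(\gamma_w).$$
    The underlined part is regarded as the factor. Note that neither update uses the newly updated version of the other distribution, so the order of the two updates does not matter.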
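  27. Learning with expectation propagation: initialization and introduction of the prior factors

    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(W)$
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, p(w \mid \gamma_w)\, q(\gamma_w)\, q(W).$$
    Since $q(W)$ is Gaussian, moment matching (as in the Gaussian example referenced on the slide) updates the approximate distribution as follows:
    $$m^{\mathrm{new}} = m + v \frac{\partial}{\partial m} \log Z_0, \qquad v^{\mathrm{new}} = v - v^2 \left\{ \left( \frac{\partial}{\partial m} \log Z_0 \right)^{2} - 2 \frac{\partial}{\partial v} \log Z_0 \right\},$$
    $$Z_0 = Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) = \int p(w \mid \gamma_w)\, q(W)\, q(\gamma_w)\, dw\, d\gamma_w = \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w.$$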
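  28. Learning with expectation propagation: initialization and introduction of the prior factors

    [Introducing the prior factors]
    • Adding the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$: update of $q(\gamma_w)$
    $$q^{\mathrm{new}}(\gamma_w) \approx \frac{1}{Z_0}\, p(w \mid \gamma_w)\, q(W)\, q(\gamma_w).$$
    Since $q(\gamma_w)$ is a gamma distribution, moment matching (as in the gamma example referenced on the slide) updates the approximate distribution as follows:
    $$\alpha_{\mathrm{new}}^{\gamma_w} = \left\{ Z_0 Z_2 Z_1^{-2}\, \frac{\alpha^{\gamma_w} + 1}{\alpha^{\gamma_w}} - 1 \right\}^{-1}, \qquad \beta_{\mathrm{new}}^{\gamma_w} = \left\{ Z_2 Z_1^{-1}\, \frac{\alpha^{\gamma_w} + 1}{\beta^{\gamma_w}} - Z_1 Z_0^{-1}\, \frac{\alpha^{\gamma_w}}{\beta^{\gamma_w}} \right\}^{-1},$$
    where $Z_1 = Z(\alpha^{\gamma_w} + 1, \beta^{\gamma_w})$ and $Z_2 = Z(\alpha^{\gamma_w} + 2, \beta^{\gamma_w})$.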
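  29. Learning with expectation propagation: initialization and introduction of the prior factors

    [Introducing the prior factors] The normalization constant $Z(\alpha^{\gamma_w}, \beta^{\gamma_w})$ cannot be computed exactly, so the Student's t distribution that appears during the calculation is approximated by a Gaussian with the same mean and variance:
    $$\begin{aligned}
    Z(\alpha^{\gamma_w}, \beta^{\gamma_w}) &= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
    &= \int \mathcal{N}(w \mid 0, \gamma_w^{-1})\, \mathcal{N}(w \mid m, v)\, \mathrm{Gam}(\gamma_w \mid \alpha^{\gamma_w}, \beta^{\gamma_w})\, dw\, d\gamma_w \\
    &= \int \mathrm{St}(w \mid 0, \alpha^{\gamma_w}/\beta^{\gamma_w}, 2\alpha^{\gamma_w})\, \mathcal{N}(w \mid m, v)\, dw \\
    &\approx \int \mathcal{N}(w \mid 0, \beta^{\gamma_w}/(\alpha^{\gamma_w} - 1))\, \mathcal{N}(w \mid m, v)\, dw \\
    &= \mathcal{N}(m \mid 0, \beta^{\gamma_w}/(\alpha^{\gamma_w} - 1) + v).
    \end{aligned}$$
    The variance of $\mathrm{St}(w \mid 0, \alpha/\beta, 2\alpha)$, which has precision $\alpha/\beta$ and $2\alpha$ degrees of freedom, is $\beta/(\alpha - 1)$ for $\alpha > 1$; this is the variance used for the Gaussian in the second-to-last line.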
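
The moment-matching update for one weight can be sketched once the normalizer is approximated by a Gaussian density in m. The block below is an illustration under assumptions: `s` stands for the variance of the Gaussian that replaces the Student's t, the current variance `v` is finite (the uninformative initial value of infinity is not handled), and all names and numbers are hypothetical. With this approximation the update reduces to the familiar product of two Gaussians, which the comments note as a sanity check.

```python
import math


def log_z(m, v, s):
    """log N(m | 0, s + v): Gaussian approximation of log Z_0."""
    return -0.5 * math.log(2.0 * math.pi * (s + v)) - 0.5 * m * m / (s + v)


def moment_matching_update(m, v, s):
    """New mean and variance of q(w) after adding the factor p(w | gamma_w)."""
    dlogz_dm = -m / (s + v)
    dlogz_dv = -0.5 / (s + v) + 0.5 * m * m / (s + v) ** 2
    m_new = m + v * dlogz_dm
    v_new = v - v * v * (dlogz_dm ** 2 - 2.0 * dlogz_dv)
    return m_new, v_new


# Hypothetical current moments of q(w) and prior variance s.
m_new, v_new = moment_matching_update(m=1.0, v=2.0, s=0.5)
# Product of N(w | 0, s) and N(w | m, v): mean m*s/(s+v), variance v*s/(s+v).
print(m_new, v_new)
```

The same two update formulas are reused for the likelihood factors; only the expression for the normalizer changes.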
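  30. Learning with expectation propagation: introduction of the likelihood factors

    After all prior factors have been added, the factors of the likelihood $p(Y \mid X, W, \gamma_y)$ are added one at a time:
    $$q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(\gamma_w)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(\gamma_w)\, q(W)$$
    $$\Leftrightarrow\ q^{\mathrm{new}}(\gamma_y)\, q^{\mathrm{new}}(W) \approx \frac{1}{Z}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W).$$
    Since $q(W)$ is Gaussian and $q(\gamma_y)$ is a gamma distribution, the update proceeds in the same way as before:
    $$q^{\mathrm{new}}(W) \approx \frac{1}{Z_0}\, p(y_i \mid x_i, W, \gamma_y)\, q(\gamma_y)\, q(W), \qquad q^{\mathrm{new}}(\gamma_y) \approx \frac{1}{Z_0}\, p(y_i \mid x_i, W, \gamma_y)\, q(W)\, q(\gamma_y).$$
    $\Longrightarrow$ The goal is to compute the normalization constant for the newly added likelihood factor $p(y_i \mid x_i, W, \gamma_y)$, which is the only part of the update that differs from the case of adding $p(w_{i,j}^{(l)} \mid \gamma_w)$.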
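  31. Learning with expectation propagation: introduction of the likelihood factors

    The normalization constant when the $i$-th likelihood factor is added is obtained approximately as follows:
    $$\begin{aligned}
    Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) &= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y, \gamma_w)\, dW\, d\gamma_y\, d\gamma_w \\
    &= \int \mathcal{N}(y_i \mid f(x_i; W), \gamma_y^{-1})\, q(W, \gamma_y)\, dW\, d\gamma_y \\
    &\approx \int \mathcal{N}(y_i \mid z^{(L)}, \gamma_y^{-1})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, \mathrm{Gam}(\gamma_y \mid \alpha^{\gamma_y}, \beta^{\gamma_y})\, dz^{(L)}\, d\gamma_y \\
    &= \int \mathrm{St}(y_i \mid z^{(L)}, \alpha^{\gamma_y}/\beta^{\gamma_y}, 2\alpha^{\gamma_y})\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
    &\approx \int \mathcal{N}(y_i \mid z^{(L)}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1))\, \mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})\, dz^{(L)} \\
    &= \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1) + v_{z^{(L)}}).
    \end{aligned}$$
    Annotations: in the first approximation, the hidden units $z^{(l)} \in \mathbb{R}^{H_l}$ of layer $l$ are assumed to follow a distribution with mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ (details on the next slide). In the second approximation, the t distribution is replaced by a Gaussian with the same mean and variance.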
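  32. Learning with expectation propagation: introduction of the likelihood factors

    The mean $m_{z^{(L)}}$ and variance $v_{z^{(L)}}$ of $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$ are obtained approximately by a recursive computation.
    [Computation] Assume that the values of the hidden units of layer $l$, $z^{(l)} \in \mathbb{R}^{H_l}$, have mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$. Let the vector obtained after multiplying by the weight matrix $W^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ of layer $l$ (the activation) be
    $$a^{(l)} = W^{(l)} z^{(l-1)} / \sqrt{H_{l-1}}.$$
    The mean and variance of $a^{(l)}$ are
    $$m_{a^{(l)}} = M^{(l)} m_{z^{(l-1)}} / \sqrt{H_{l-1}},$$
    $$v_{a^{(l)}} = \left\{ (M^{(l)} \odot M^{(l)})\, v_{z^{(l-1)}} + V^{(l)} (m_{z^{(l-1)}} \odot m_{z^{(l-1)}}) + V^{(l)} v_{z^{(l-1)}} \right\} / H_{l-1},$$
    where the entries of $M^{(l)}, V^{(l)} \in \mathbb{R}^{H_l \times H_{l-1}}$ are the means $m_{i,j}^{(l)}$ and variances $v_{i,j}^{(l)}$ of the individual parameters, and $\odot$ denotes the Hadamard product.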
  33. Learning with expectation propagation: introduction of the likelihood factors

    The same computation as on slide 32, with one annotation: from the mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units of layer $l - 1$, the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activation of layer $l$ are obtained.
  34. Learning with expectation propagation: introduction of the likelihood factors

    The same computation as on slide 32, with two annotations. From the mean $m_{z^{(l-1)}}$ and variance $v_{z^{(l-1)}}$ of the hidden units of layer $l - 1$, the mean $m_{a^{(l)}}$ and variance $v_{a^{(l)}}$ of the activation of layer $l$ are obtained. If the mean $m_{z^{(l)}}$ and variance $v_{z^{(l)}}$ of the hidden units of layer $l$ can in turn be obtained from $m_{a^{(l)}}$ and $v_{a^{(l)}}$, the whole computation can be carried out recursively.
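  35. Learning with expectation propagation: distribution of the activation

    Compute the distribution $p(a^{(l)} \mid W^{(l)}, z^{(l-1)})$ of the activation $a^{(l)}$. By the central limit theorem, when the number of hidden units $H_{l-1}$ is large, $a^{(l)}$ approximately follows a Gaussian distribution:
    $$p(a^{(l)} \mid W^{(l)}, z^{(l-1)}) \approx q(a^{(l)}) = \mathcal{N}(a^{(l)} \mid m_{a^{(l)}}, v_{a^{(l)}}).$$
    When a Gaussian variable passes through the ReLU, the result is a mixture of two distributions, as in the right-hand panel of the figure on the slide.
    ① Samples that came from a negative input become a point mass with mean $\mu_p = 0$ and variance $\sigma_p = 0$.
    ② Samples that came from a non-negative input follow a truncated Gaussian with the part below 0 cut off.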
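  36. Learning with expectation propagation: distribution of the activation

    [Applying this to the mixture for the activation]
    Let $\pi_p$ and $\pi_t$ be the mixing coefficients of the point mass and the truncated Gaussian, so that $\pi_p + \pi_t = 1$.
    Writing $\bar{\mu} = -\mu/\sigma$, the coefficient $\pi_p$ is
    $$\pi_p = \int_{-\infty}^{0} \mathcal{N}(x \mid \mu, \sigma^2)\, dx = \Phi(-\mu/\sigma) = \Phi(\bar{\mu}).$$
    The coefficient of the truncated Gaussian is therefore
    $$\pi_t = 1 - \pi_p = \Phi(-\bar{\mu}).$$
    From [S. Kotz], the mean $\mu_t$ and variance $\sigma_t^2$ of the truncated Gaussian are
    $$\mu_t = \mu + \sigma\, \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})}, \qquad \sigma_t^2 = \sigma^2 \left\{ 1 + \bar{\mu}\, \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} - \left( \frac{\mathcal{N}(\bar{\mu} \mid 0, 1)}{\Phi(-\bar{\mu})} \right)^{2} \right\}.$$
    Substituting these into the general formulas for the mixture mean $\mathbb{E}[x_{\mathrm{mix}}]$ and variance $\mathbb{V}[x_{\mathrm{mix}}]$ gives the mean and variance of $z$.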
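
The ReLU step can be sketched as a small function that maps the mean and variance of an activation to the mean and variance of the hidden unit. The mixture formulas used at the end are an assumption here, since the slide only refers to "general formulas": the mean is taken as pi_t * mu_t and the variance as pi_t * sigma_t^2 + pi_t * pi_p * mu_t^2, which is the standard mean and variance of a two-component mixture whose first component is a point mass at 0. Function names are hypothetical, and very negative means, where Phi(-mu_bar) underflows, are not handled.

```python
import math


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu_moments(mu, sigma2):
    """Mean and variance of max(0, a) for a ~ N(mu, sigma2)."""
    sigma = math.sqrt(sigma2)
    mu_bar = -mu / sigma
    pi_p = std_normal_cdf(mu_bar)        # weight of the point mass at 0
    pi_t = std_normal_cdf(-mu_bar)       # weight of the truncated Gaussian
    ratio = std_normal_pdf(mu_bar) / pi_t
    mu_t = mu + sigma * ratio
    sigma_t2 = sigma2 * (1.0 + mu_bar * ratio - ratio ** 2)
    # Two-component mixture: point mass (mean 0, variance 0) and truncated Gaussian.
    mean = pi_t * mu_t
    var = pi_t * sigma_t2 + pi_t * pi_p * mu_t ** 2
    return mean, var


m_z, v_z = relu_moments(0.0, 1.0)
print(m_z, v_z)
```

For a standard normal input the known values are 1/sqrt(2*pi) for the mean and 1/2 - 1/(2*pi) for the variance, which gives a quick way to check the implementation.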
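  37. Learning with expectation propagation: gradient-based learning

    The input $z^{(0)}$ is treated as having mean $x_i$ and variance $0$; these are the initial values $m_{z^{(0)}}$, $v_{z^{(0)}}$ of the recursion. The preceding slides showed how, starting from the output $z^{(l-1)}$ of layer $l - 1$ and passing through the activation $a^{(l)}$, the mean and variance of the output $z^{(l)}$ of layer $l$ are obtained (with a Gaussian approximation justified by the central limit theorem). Applying this approximation recursively, the distribution of the final layer $z^{(L)}$ can be approximated by the Gaussian $\mathcal{N}(z^{(L)} \mid m_{z^{(L)}}, v_{z^{(L)}})$.
    This gives an approximate expression for the normalization constant:
    $$Z(\alpha^{\gamma_y}, \beta^{\gamma_y}) \approx \mathcal{N}(y_i \mid m_{z^{(L)}}, \beta^{\gamma_y}/(\alpha^{\gamma_y} - 1) + v_{z^{(L)}}).$$
    Once the normalization constant is available, the gradients needed for the updates are obtained by differentiating it with respect to the parameters.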
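  38. Learning with expectation propagation: summary of probabilistic backpropagation

    Model definition: $p(W, \gamma_y, \gamma_w \mid \mathcal{D}) \propto p(Y \mid X, W, \gamma_y)\, p(W \mid \gamma_w)\, p(\gamma_y)\, p(\gamma_w)$
    Introduce the approximate distribution: $q(W, \gamma_y, \gamma_w) = q(\gamma_y)\, q(\gamma_w)\, q(W)$
    Initialize the approximate distribution: $q_0(\gamma_y)$, $q_0(\gamma_w)$, $q_0(W)$
    Introduce the prior factors (part 1):
      Add the factor $p(\gamma_y)$: $q(\gamma_y) \leftarrow p(\gamma_y)$
      Add the factor $p(\gamma_w)$: $q(\gamma_w) \leftarrow p(\gamma_w)$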
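  39. Learning with expectation propagation: summary of probabilistic backpropagation

    Introduce the prior factors (part 2):
      for l = 1 to L do
        for j = 1 to H_{l-1} do
          for i = 1 to H_l do
            Add the factor $p(w_{i,j}^{(l)} \mid \gamma_w)$:
              · update $q(W)$
              · update $q(\gamma_w)$
    For each likelihood factor $p(y_i \mid x_i, W, \gamma_y)$ where $i \in S$:
      Forward propagation: recursively compute the means and variances of the hidden units and activations
      Introduce the likelihood factor $p(y_i \mid x_i, W, \gamma_y)$: update $q(W)$ and $q(\gamma_y)$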