Introduction to Approximate Dynamic Programming

Lecture slides by Prof. Kobayashi (Aoyama Gakuin)

MIKIO KUBO

May 26, 2025

Transcript

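  1. Dynamic programming

    Example: the shortest path from origin $s$ to destination $t$ through intermediate nodes $i$, $j$, $k$. $p_i$: the shortest distance from node $i$ to the destination. $c_{ij}$: the cost of moving directly from node $i$ to node $j$.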
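  2. Dynamic programming

    Example: the shortest path from origin $s$ to destination $t$. $p_i$: the shortest distance from node $i$ to the destination. $c_{ij}$: the cost of moving directly from node $i$ to node $j$. The recursion is $p_s = \min\{c_{si} + p_i,\; c_{sj} + p_j,\; c_{sk} + p_k\}$.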
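The recursion $p_s = \min\{c_{si} + p_i, c_{sj} + p_j, c_{sk} + p_k\}$ can be sketched as a memoized recursion that starts from the destination. The network and arc costs below are hypothetical values chosen for illustration; they do not come from the slides.

```python
from functools import lru_cache

# Hypothetical arc costs c[i][j] on a small acyclic network from "s" to "t".
COSTS = {
    "s": {"i": 2, "j": 5, "k": 4},
    "i": {"j": 1, "t": 7},
    "j": {"t": 3},
    "k": {"t": 6},
    "t": {},
}


@lru_cache(maxsize=None)
def shortest_to_terminal(node: str) -> float:
    """Return p_node = min over successors j of (c[node][j] + p_j), with p_t = 0."""
    if node == "t":
        return 0.0
    return min(cost + shortest_to_terminal(nxt) for nxt, cost in COSTS[node].items())
```

With these costs, `shortest_to_terminal("s")` evaluates to 6.0, attained by the path s, i, j, t.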
  3. Approximate dynamic programming

    When the values $p_i$, $p_j$, $p_k$ are uncertain, $p_s = \min\{c_{si} + p_i,\; c_{sj} + p_j,\; c_{sk} + p_k\}$ cannot be computed, so an approximation is used.
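  4. Multistage decision problems

    Many decision problems are multistage decision problems. Example: production management. On each successive date: decide the production plan, carry out production, and observe the state after production (inventory level, etc.). Sources of uncertainty: unplanned breakdowns of production machines; unplanned increases in demand for the product; delays in transporting the product.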
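  5. Multistage decision problems: Powell's notation

    States, decisions, and new information: $S_0, x_0, W_1, S_1, x_1, W_2, \dots, S_t, x_t, W_{t+1}, \dots$. $S_t$: the state of the system in period $t$. $x_t$: the decision made in period $t$. $W_{t+1}$: the information revealed during the transition from period $t$ to period $t+1$. Diagram: $S_0$ (state: demand forecast, inventory level), $x_0$ (decision: production quantity), $W_1$, $S_1$ (state), $x_1$ (decision), $W_2$. As time passes, $W$ carries the realized production quantity and the observed demand.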
  6. Notation and formulation by W. B. Powell

    W. B. Powell, Professor Emeritus, Princeton University; Optimal Dynamics. References: W. B. Powell, Approximate Dynamic Programming, Wiley. W. B. Powell, Minisymposium: A Unified Framework for Optimization Under Uncertainty, INFORMS, Washington DC (available on YouTube).
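  7. Multistage decision problems

    States, decisions, and new information: $S_0, x_0, W_1, S_1, x_1, W_2, \dots, S_t, x_t, W_{t+1}, \dots$. $S_t$: the state of the system in period $t$ (what is known). $x_t$: the decision made in period $t$ (what is decided). $W_{t+1}$: the information revealed during the transition from period $t$ to period $t+1$ (the realization of what was uncertain). Diagram: $S_0$ (state: demand forecast, inventory level), $x_0$ (decision: production quantity), $W_1$, $S_1$, $x_1$, $W_2$, with the realized production quantity and the observed demand arriving as time passes.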
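  8. Multistage decision problems

    $S_t$: the state of the system in period $t$. $x_t$: the decision in period $t$. $W_{t+1}$: new information. Contribution function $C(S_t, x_t)$: the function giving the contribution of choosing decision $x_t$ in period $t$. A contribution can be a cost, a profit, and so on. Example: the cost of producing the quantity $x_t$ that was decided in period $t$ after looking at the demand forecast, the inventory level, etc.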
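  9. Multistage decision problems

    $S_t$: the state of the system in period $t$. $x_t$: the decision in period $t$. $W_{t+1}$: new information. Decision: $x_t = X^\pi(S_t)$. The decision $x_t$ is determined by a policy $\pi$. The policy depends on the state $S_t$. Even in the same state $S_t$, policies $\pi_1$ and $\pi_2$ may produce different decisions $x_t$. The goal is to find a policy $\pi$ that optimizes the contribution $C(S_t, x_t)$.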
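  10. Formulating multistage decision problems

    Without uncertain elements, the problem can be formulated as a mathematical optimization problem. Example (linear optimization): $\min_x\; cx$ s.t. $Ax = b$, $x \ge 0$. Example (multi-period linear optimization): $\min_{x_0, x_1, \dots, x_T} \sum_{t=0}^{T} c_t x_t$ s.t. $A_t x_t - B_{t-1} x_{t-1} = b_t$, $D_t x_t \le u_t$, $x_t \ge 0$.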
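  11. The five elements of the mathematical model

    (1) State variables: $S_t = (R_t, I_t, B_t)$. (2) Decision variables: $(x_t, a_t, u_t)$. (3) Exogenous information: $W_{t+1}$. (4) Transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$. (5) Objective function: $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \sum_{t=0}^{T} C(S_t, X^\pi(S_t))$.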
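  12. The five elements of the mathematical model (elements 1 to 3)

    (1) State variables $S_t = (R_t, I_t, B_t)$. $R_t$: the physical state. $I_t$, $B_t$: non-physical information, such as the premises on which decisions are based. (2) Decision variables $(x_t, a_t, u_t)$: their values are determined by the policy $X^\pi(S_t \mid \theta)$. (3) Exogenous information $W_{t+1}$: information that is revealed, or realized, between $t$ and $t+1$.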
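  13. The five elements of the mathematical model (elements 4 and 5)

    (4) Transition function $S_{t+1} = S^M(S_t, x_t, W_{t+1})$: in state $S_t$ the policy fixes the value of the decision variable $x_t$; once the information $W_{t+1}$ has been revealed, the system reaches state $S_{t+1}$ in period $t+1$. (5) Objective function: find a policy $\pi$ that maximizes the expected value of the contribution function over all periods, $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \sum_{t=0}^{T} C(S_t, X^\pi(S_t))$.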
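  15. Example of the state variable $S_t = (R_t, I_t, B_t)$

    $R_t$: the physical state. $I_t$, $B_t$: non-physical information and the premises on which decisions are based. $R_t$ = state of raw materials, positions of transport vehicles, product inventory levels. $I_t$ = raw material prices, weather. $B_t$ = the market's response to prices, the operating status of production equipment.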
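  16. Example: a wind power generation and storage system

    $R_t$: the energy stored in the battery in period $t$. $D_t$: the electricity demand in period $t$. $p_t$: the electricity price on the power grid in period $t$. $B_t$: the forecast of future wind. State variable: $S_t = (R_t, D_t, p_t, B_t)$.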
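  18. Example: a wind power generation and storage system (decision variables)

    $x_t$: the amount of electricity traded with the grid, where $x_t > 0$ is an amount sold and $x_t < 0$ is an amount purchased. Constraints: $x_t \le R_t$ (sales cannot exceed the stored energy) and $-x_t \le R^{\max} - R_t$ (purchases cannot exceed the remaining storage capacity). Policy: $x_t = X^\pi(S_t)$, with $S_t = (R_t, D_t, p_t, B_t)$.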
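  20. Example of exogenous information $W_t$

    Information revealed in period $t$: $W_t = (\hat R_t, \hat D_t, \hat E_t, \hat p_t)$. $\hat R_t$: equipment failures, delays, newly contracted transport vehicles, etc. $\hat D_t$: new customer demand. $\hat p_t$: price changes. $\hat E_t$: environmental information such as the weather. Because $W_{t+1}$ may depend on $S_t$ and $x_t$, it is sometimes written $W_{t+1}(S_t, x_t)$.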
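  21. Example: a wind power generation and storage system (exogenous information)

    $W_{t+1} = (\hat E_{t+1}, \hat D_{t+1}, \hat p_{t+1})$. $\hat E_{t+1}$: the change in the amount of electricity (from generation) between periods $t$ and $t+1$. $\hat D_{t+1}$: the electricity demand revealed between periods $t$ and $t+1$. $\hat p_{t+1}$: the change in the electricity price between periods $t$ and $t+1$.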
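  23. Example of the transition function $S_{t+1} = S^M(S_t, x_t, W_{t+1})$

    Inventory balance: $R_{t+1} = R_t + x_t + \hat R_{t+1}$. Price: $p_{t+1} = p_t + \hat p_{t+1}$, that is, the price in period $t+1$ is $p_t$ adjusted by the price change. Demand: $D_{t+1} = \hat D_{t+1}$, the demand revealed for period $t+1$.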
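  25. Example: a wind power generation and storage system (objective function)

    Electricity sales revenue in period $t$: $C(S_t, x_t) = p_t x_t$. Because $x_t$ is determined by the policy $\pi$, it is written $X^\pi(S_t)$. Maximize the expected sum of contributions over $t = 0, 1, \dots, T$ given the initial state $S_0$: $\max_\pi \mathbb{E}\Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$. What is optimized is the policy $\pi$ that sets the decision variable $x_t$ in each period.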
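  26. Example of a policy $\pi$

    When the policy is defined by a function with parameters, $X^\pi(S_t \mid \rho)$ is piecewise, with one value for each of three price ranges: $p_t < \rho^{\mathrm{charge}}$; $\rho^{\mathrm{charge}} < p_t < \rho^{\mathrm{discharge}}$; and $p_t > \rho^{\mathrm{discharge}}$. The price $p_t$ becomes known only in period $t$, so $x_t$ cannot be fixed ahead of time (for example as the optimal solution of a mathematical optimization problem).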
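A threshold policy of this kind might be sketched as follows. The sign convention (positive means sell, negative means buy) follows the decision-variable slide; the action sizes, which fill or empty the battery completely, are an assumption made for illustration and are not specified in the transcript.

```python
def buy_low_sell_high(price: float, stored: float, capacity: float,
                      rho_charge: float, rho_discharge: float) -> float:
    """Return x_t for a two-threshold policy (x > 0 sells, x < 0 buys).

    Assumed action sizes: buy up to full capacity below rho_charge,
    sell everything stored above rho_discharge, otherwise do nothing.
    """
    if rho_charge >= rho_discharge:
        raise ValueError("rho_charge must be below rho_discharge")
    if price < rho_charge:
        return -(capacity - stored)   # respects -x <= R_max - R
    if price > rho_discharge:
        return stored                 # respects x <= R
    return 0.0
```

The two thresholds play the role of the tunable parameter $\rho$: the structure of the rule is fixed, and only the thresholds are searched over.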
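  27. Deterministic and stochastic problems

    Deterministic problem. Objective function: $\min_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Decision variables: $(x_0, \dots, x_T)$. Constraints in period $t$: $A_t x_t = R_t$, $x_t \ge 0$ (together defining $\mathcal{X}_t$). Transition function: $R_{t+1} = b_{t+1} + B_t x_t$.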
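  28. Deterministic and stochastic problems

    Deterministic problem. Objective function: $\min_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Decision variables: $(x_0, \dots, x_T)$. Constraints in period $t$: $A_t x_t = R_t$, $x_t \ge 0$, which define $\mathcal{X}_t$, the set of values that $x_t$ can take in period $t$. Transition function: $R_{t+1} = b_{t+1} + B_t x_t$.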
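  29. Deterministic and stochastic problems

    Deterministic problem. Objective function: $\min_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Decision variables: $(x_0, \dots, x_T)$. Constraints in period $t$: $A_t x_t = R_t$, $x_t \ge 0$ (defining $\mathcal{X}_t$). Transition function: $R_{t+1} = b_{t+1} + B_t x_t$. Stochastic problem. Objective function: $\max_\pi \mathbb{E}\Big\{ \sum_{t=0}^{T} C_t(S_t, X^\pi_t(S_t), W_{t+1}) \,\Big|\, S_0 \Big\}$. Policy: $X^\pi : S \to \mathcal{X}$. Constraint: $x_t = X^\pi_t(S_t) \in \mathcal{X}_t$. Transition function: $S_{t+1} = S^M(S_t, x_t, W_{t+1})$. Exogenous information: $(S_0, W_1, W_2, \dots, W_T)$.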
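  30. Stochastic multistage linear programming

    Deterministic objective: $\max_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Replace $x_t$ by $X^\pi(S_t)$ and take the expectation: $\max_\pi \mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t)$.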
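  31. Stochastic multistage linear programming

    Deterministic objective: $\max_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Replace $x_t$ by $X^\pi(S_t)$ and take the expectation: $\max_\pi \mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t) \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} c_t X^\pi_t(S^n_t(\omega^n))$.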
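  32. Stochastic multistage linear programming

    Deterministic objective: $\max_{x_0,\dots,x_T} \sum_{t=0}^{T} c_t x_t$. Replace $x_t$ by $X^\pi(S_t)$ and take the expectation: $\max_\pi \mathbb{E} \sum_{t=0}^{T} c_t X^\pi_t(S_t) \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} c_t X^\pi_t(S^n_t(\omega^n))$. The expectation is evaluated by sampling.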
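The sample-average evaluation of a policy can be sketched as a small simulation. Everything about the model below is a hypothetical stand-in: a single battery, the contribution $p_t x_t$, a Gaussian random-walk price, and the default parameter values are all assumptions for illustration.

```python
import random
from typing import Callable

Policy = Callable[[float, float, float], float]  # (stored, price, capacity) -> x_t


def simulate_revenue(policy: Policy, horizon: int, rng: random.Random,
                     r0: float = 5.0, r_max: float = 10.0, p0: float = 30.0) -> float:
    """Follow one sample path omega^n and return sum_t p_t * x_t."""
    stored, price, total = r0, p0, 0.0
    for _ in range(horizon + 1):
        x = policy(stored, price, r_max)
        x = max(-(r_max - stored), min(stored, x))  # enforce x <= R and -x <= R_max - R
        total += price * x                          # contribution C(S_t, x_t) = p_t x_t
        stored -= x                                 # selling drains, buying fills
        price = max(0.0, price + rng.gauss(0.0, 5.0))  # assumed exogenous price change
    return total


def estimate_policy_value(policy: Policy, horizon: int = 24,
                          n_samples: int = 200, seed: int = 0) -> float:
    """Approximate the expectation by (1/N) * sum over N simulated paths."""
    rng = random.Random(seed)
    return sum(simulate_revenue(policy, horizon, rng) for _ in range(n_samples)) / n_samples
```

Fixing the seed means two candidate policies are compared on the same sample paths, which reduces the noise in their difference.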
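  33. Example: a wind power generation and storage system (state variables)

    $S_t = (R_t, D_t, E_t, p_t)$. $R_t$: the energy stored in the battery in period $t$. $D_t$: the electricity demand in period $t$. $E_t$: the amount generated in period $t$. $p_t$: the electricity price on the power grid in period $t$.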
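  34. Example: a wind power generation and storage system (decision variables)

    $x_t = (x^{GB}_t, x^{GD}_t, x^{EB}_t, x^{ED}_t, x^{BD}_t)$. $x^{GB}_t$: electricity sent from the grid to the battery in period $t$. $x^{GD}_t$: demand met from the grid in period $t$. $x^{EB}_t$: energy flowing from the wind farm into the battery in period $t$. $x^{ED}_t$: demand met with electricity from the wind farm in period $t$. $x^{BD}_t$: demand met from the battery in period $t$.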
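  35. Example: a wind power generation and storage system (constraints)

    $x^{EB}_t + x^{ED}_t \le E_t$. $x^{GD}_t + x^{BD}_t + x^{ED}_t = D_t$ (demand is fully met by electricity from the grid, the battery, and the wind farm). $x^{BD}_t \le R_t$ (demand met from the battery cannot exceed the stored energy). $x^{GD}_t, x^{EB}_t, x^{ED}_t, x^{BD}_t \ge 0$.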
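  36. Example: a wind power generation and storage system (constraints)

    $x^{EB}_t + x^{ED}_t \le E_t$. $x^{GD}_t + x^{BD}_t + x^{ED}_t = D_t$ (demand is fully met by electricity from the grid, the battery, and the wind farm). $x^{BD}_t \le R_t$ (demand met from the battery cannot exceed the stored energy). $x^{GD}_t, x^{EB}_t, x^{ED}_t, x^{BD}_t \ge 0$. The policy $X^\pi(S_t)$ must produce an $x$ that satisfies these constraints.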
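  37. Example: a wind power generation and storage system (exogenous information)

    $W_{t+1} = (\hat E_{t+1}, \hat D_{t+1}, \hat p_{t+1})$. $\hat E_{t+1}$: the change in wind generation from period $t$ to period $t+1$. $\hat D_{t+1}$: the change in electricity demand from period $t$ to period $t+1$. $\hat p_{t+1}$: the grid electricity price for period $t+1$.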
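  38. Example: a wind power generation and storage system (transition function)

    $S_{t+1} = S^M(S_t, x_t, W_{t+1})$, with components: $R_{t+1} = R_t + \eta\,(x^{GB}_t + x^{EB}_t - x^{BD}_t)$ (change in the energy stored in the battery); $E_{t+1} = E_t + \hat E_{t+1}$ (realized generation in period $t+1$); $D_{t+1} = D_t + \hat D_{t+1}$ (revealed demand in period $t+1$); $p_{t+1} = \hat p_{t+1}$.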
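The transition function and the feasibility conditions of the wind and storage example translate almost line by line into code. This is a minimal sketch: the class and field names are hypothetical, and the default efficiency `eta=0.9` is an assumed value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    R: float  # energy stored in the battery
    D: float  # electricity demand
    E: float  # wind generation
    p: float  # grid price


@dataclass(frozen=True)
class Decision:
    gb: float  # grid -> battery
    gd: float  # grid -> demand
    eb: float  # wind -> battery
    ed: float  # wind -> demand
    bd: float  # battery -> demand


@dataclass(frozen=True)
class Exogenous:
    dE: float  # change in generation
    dD: float  # change in demand
    p: float   # new grid price


def is_feasible(s: State, x: Decision, tol: float = 1e-9) -> bool:
    """Check wind availability, demand balance, battery limit, non-negativity."""
    return (x.eb + x.ed <= s.E + tol
            and abs(x.gd + x.bd + x.ed - s.D) <= tol
            and x.bd <= s.R + tol
            and min(x.gd, x.eb, x.ed, x.bd) >= 0.0)


def transition(s: State, x: Decision, w: Exogenous, eta: float = 0.9) -> State:
    """S_{t+1} = S^M(S_t, x_t, W_{t+1}) for the wind and storage example."""
    return State(R=s.R + eta * (x.gb + x.eb - x.bd),
                 D=s.D + w.dD,
                 E=s.E + w.dE,
                 p=w.p)
```

Keeping the state immutable makes it straightforward to replay the same sample path under different policies.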
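  39. Example: a wind power generation and storage system (objective function)

    $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$, where $S_{t+1} = S^M(S_t, x_t = X^\pi(S_t), W_{t+1})$ and $(S_0, W_1, W_2, \dots, W_T)$ is assumed to be known.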
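  40. Example using a time-series model (price)

    $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon^p_{t+1}$ (ARIMA model). The parameters that define this model are $(\theta_0, \theta_1, \theta_2)$. In vector form, $p_{t+1} = \bar\theta^\top \bar p_t + \epsilon^p_{t+1}$ with $\bar\theta = (\theta_0, \theta_1, \theta_2)^\top$ and $\bar p_t = (p_t, p_{t-1}, p_{t-2})^\top$.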
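  41. Example: a wind power generation and storage system (state variables, revisited)

    $S_t = (R_t, D_t, E_t, p_t)$. $R_t$: the energy stored in the battery in period $t$. $D_t$: the electricity demand in period $t$. $E_t$: the amount generated in period $t$. $p_t$: the electricity price on the power grid in period $t$.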
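  42. Example: a wind power generation and storage system (state variables, revisited)

    $S_t = (R_t, D_t, E_t, p_t)$. $R_t$: the energy stored in the battery in period $t$. $D_t$: the electricity demand in period $t$. $E_t$: the amount generated in period $t$. $p_t$: the electricity price on the power grid in period $t$. Price model: $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$.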
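  43. Example: a wind power generation and storage system (state variables, revisited)

    Because the price model $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$ uses three past prices, the state is extended: $S_t = (R_t, D_t, E_t, p_t) \to (R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}))$. $R_t$: stored energy. $D_t$: demand. $E_t$: generation. $p_t$: grid price.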
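  44. Example: a wind power generation and storage system (summary)

    State variables: $S_t = (R_t, D_t, E_t, p_t) \to (R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}))$. Constraints: $x^{EB}_t + x^{ED}_t \le E_t$, $x^{GD}_t + x^{BD}_t + x^{ED}_t = D_t$, $x^{BD}_t \le R_t$. Transition function (price): $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$.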
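  45. Estimating and updating the parameters

    An estimate $\bar\theta_t = (\bar\theta_{t0}, \bar\theta_{t1}, \bar\theta_{t2})$ is required: $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} + \epsilon_{t+1}$.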
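  46. Estimating and updating the parameters

    An estimate is required: $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} + \epsilon_{t+1}$. Let $\bar\theta_t = (\bar\theta_{t0}, \bar\theta_{t1}, \bar\theta_{t2})$ and $\bar p_t = (p_t, p_{t-1}, p_{t-2})^\top$. Then $\bar F_t(\bar p_t \mid \bar\theta_t) = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} = \bar\theta_t^\top \bar p_t$ and $\epsilon_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} - p_{t+1} = \bar F^{\mathrm{price}}_t(\bar p_t \mid \bar\theta_t) - p_{t+1}$.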
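  47. Estimating and updating the parameters

    $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} + \epsilon_{t+1}$, with $\bar p_t = (p_t, p_{t-1}, p_{t-2})^\top$. Update: $\bar\theta_{t+1,0} = \bar\theta_{t,0} + \frac{1}{\gamma_t}\,(m_0 p_t + m_1 p_{t-1} + m_2 p_{t-2})$. The estimate $\bar\theta_{t+1,0}$ for period $t+1$ is updated from $\bar\theta_{t,0}$, $p_t$, $p_{t-1}$, $p_{t-2}$ (recursive least squares).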
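  48. Estimating and updating the parameters

    $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} + \epsilon_{t+1}$, with $\bar p_t = (p_t, p_{t-1}, p_{t-2})^\top$. Update in matrix form: $\begin{pmatrix} \bar\theta_{t+1,0} \\ \bar\theta_{t+1,1} \\ \bar\theta_{t+1,2} \end{pmatrix} = \begin{pmatrix} \bar\theta_{t,0} \\ \bar\theta_{t,1} \\ \bar\theta_{t,2} \end{pmatrix} + \frac{1}{\gamma_t} \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} p_t \\ p_{t-1} \\ p_{t-2} \end{pmatrix}$. The estimate for period $t+1$ is updated from $\bar\theta_t$, $p_t$, $p_{t-1}$, $p_{t-2}$ (recursive least squares).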
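  49. Estimating and updating the parameters

    $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1} \;\to\; p_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} + \epsilon_{t+1}$. Error: $\epsilon_{t+1} = \bar\theta_{t0} p_t + \bar\theta_{t1} p_{t-1} + \bar\theta_{t2} p_{t-2} - p_{t+1} = \bar F^{\mathrm{price}}_t(\bar p_t \mid \bar\theta_t) - p_{t+1}$. Updates: $\bar\theta_{t+1} = \bar\theta_t + \frac{1}{\gamma_t} M_t \bar p_t\, \epsilon_{t+1}$; $M_{t+1} = M_t - \frac{1}{\gamma_t} M_t \bar p_t (\bar p_t)^\top M_t$; $\gamma_{t+1} = 1 - (\bar p_t)^\top M_t \bar p_t$.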
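One recursive least squares step for the three-lag price model can be sketched in plain Python. The sketch uses the standard textbook form, with the error taken as observation minus prediction and $\gamma = 1 + \bar p^\top M \bar p$; the sign conventions on the slide may differ, and the function name is hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def rls_update(theta: Vector, M: Matrix, x: Vector, y: float) -> Tuple[Vector, Matrix]:
    """One recursive least squares step.

    theta: current estimate, M: current (symmetric) matrix,
    x: regressors (p_t, p_{t-1}, p_{t-2}), y: observed p_{t+1}.
    """
    n = len(theta)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    gamma = 1.0 + sum(x[i] * Mx[i] for i in range(n))
    error = y - sum(theta[i] * x[i] for i in range(n))  # observation minus prediction
    new_theta = [theta[i] + Mx[i] * error / gamma for i in range(n)]
    # M is symmetric, so M x x^T M equals the outer product of Mx with itself.
    new_M = [[M[i][j] - Mx[i] * Mx[j] / gamma for j in range(n)] for i in range(n)]
    return new_theta, new_M
```

Starting from a large multiple of the identity for `M` expresses weak prior knowledge, so the first few observations dominate the estimate.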
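  50. Example: a wind power generation and storage system (state variables, revisited)

    $S_t = (R_t, D_t, E_t, p_t) \to (R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}))$. $R_t$: stored energy. $D_t$: demand. $E_t$: generation. $p_t$: grid price. With the price model $p_{t+1} = \theta_0 p_t + \theta_1 p_{t-1} + \theta_2 p_{t-2} + \epsilon_{t+1}$ and the update $\bar\theta_{t+1} = \bar\theta_t + \frac{1}{\gamma_t} M_t \bar p_t\, \epsilon_{t+1}$, the state also carries the estimate: $(R_t, D_t, E_t, (p_t, p_{t-1}, p_{t-2}), (\bar\theta_t, M_t))$.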
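  51. How to determine the decision $x_t$

    The method used to choose the decision $x_t$ is called the policy $\pi$. Given the state $S_t$ and a parameter $\theta$, the decision is $x_t = X^\pi(S_t \mid \theta)$. Can such a policy $\pi$ be found?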
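  52. How to determine the decision $x_t$

    Determining a policy by learning. In supervised learning, sufficient training data $(x^n, y^n)$ is required, and the model can be (1) a lookup table, (2) a parametric model, or (3) a nonparametric model: $\min_{f,\theta} \frac{1}{N} \sum_{n=1}^{N} \big(y^n - f(x^n \mid \theta)\big)^2$. In multistage decision making: $\max_\pi \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} C(S_t, x_t)$.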
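  53. How to determine the decision $x_t$

    Determining a policy by learning. Supervised learning requires sufficient training data $(x^n, y^n)$ and searches for a function: $\min_{f,\theta} \frac{1}{N} \sum_{n=1}^{N} \big(y^n - f(x^n \mid \theta)\big)^2$. Multistage decision making searches for a policy: $\max_\pi \frac{1}{N} \sum_{n=1}^{N} \sum_{t=0}^{T} C(S_t, x_t)$.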
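  54. Approaches to designing a policy

    (A) Policy search: find a function that optimizes the objective, $\max_{\pi = (f,\theta)} \mathbb{E}\Big\{ \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t \mid \theta)) \,\Big|\, S_0 \Big\}$. (B) Lookahead approximation: approximate the effect of the current decision on the future, $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$.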
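  55. Approaches to designing a policy

    (A) Policy search: find a policy $\pi$ that optimizes the objective, that is, find a function $f$ and its parameters $\theta$: $\max_{\pi = (f,\theta)} \mathbb{E}\Big\{ \sum_{t=0}^{T} C(S_t, X^\pi_t(S_t \mid \theta)) \,\Big|\, S_0 \Big\}$. (B) Lookahead approximation: identify a way of choosing the decision $x_t$ by approximating the effect of the current decision on the future: $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$.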
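  56. (A) Policy search

    (1) Policy function approximations: (i) lookup tables, (ii) parametric functions, (iii) nonparametric functions. (2) Cost function approximations: modify a deterministic model so that it can handle stochastic elements, $X^{CFA}(S_t \mid \theta) = \arg\max_{x_t} \bar C^\pi(S_t, x_t \mid \theta)$.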
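  57. (A) Policy search: policy function approximations

    (i) Lookup table: a case-by-case rule, such as "when demand is at a given level, produce a given quantity". (ii) Functions with parameters: the $(s, S)$ policy of inventory control (when inventory falls to $s$, replenish by $S - s$), or a neural network. (iii) Nonparametric functions: kernel regression, deep neural networks.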
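  58. (A) Policy search

    Approach: a policy based on a function (for example, the $(s, S)$ policy of inventory control). Step a: choose a function with parameters. Step b: tune the parameters. Example (inventory control): step a, adopt the $(s, S)$ policy; step b, determine the values of $s$ and $S$ by tuning.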
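The two steps, fixing the functional form and then tuning its parameters, can be sketched for the $(s, S)$ policy with a simulation and a grid search. The demand distribution, price, and cost figures below are hypothetical, and exhaustive grid search is only one possible tuning method.

```python
import random
from typing import List, Tuple


def order_quantity(inventory: int, s: int, S: int) -> int:
    """(s, S) policy: once inventory has dropped to s or below, order up to S."""
    return S - inventory if inventory <= s else 0


def simulate_profit(s: int, S: int, demands: List[int], price: float = 5.0,
                    unit_cost: float = 3.0, holding: float = 0.5) -> float:
    """Profit of the (s, S) policy along one demand path (assumed cost model)."""
    inventory, profit = S, 0.0
    for demand in demands:
        order = order_quantity(inventory, s, S)
        inventory += order
        sold = min(inventory, demand)
        inventory -= sold
        profit += price * sold - unit_cost * order - holding * inventory
    return profit


def tune_s_S(max_level: int = 20, n_paths: int = 50, horizon: int = 30,
             seed: int = 0) -> Tuple[Tuple[int, int], float]:
    """Grid search over (s, S) using the same sampled demand paths for every pair."""
    rng = random.Random(seed)
    paths = [[rng.randint(0, 8) for _ in range(horizon)] for _ in range(n_paths)]

    def average_profit(params: Tuple[int, int]) -> float:
        s, S = params
        return sum(simulate_profit(s, S, path) for path in paths) / n_paths

    candidates = [(s, S) for S in range(1, max_level + 1) for s in range(S)]
    best = max(candidates, key=average_profit)
    return best, average_profit(best)
```

Evaluating every candidate on the same demand paths keeps the comparison between parameter pairs from being dominated by sampling noise.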
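  59. (A) Policy search

    Set an objective function and choose the policy so that it performs well on the samples. The policy $\pi$ is determined by $f$ and $\theta$: $\pi = (f, \theta)$. Choose the policy $\pi$ that optimizes $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$.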
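  60. (A) Policy search

    Set an objective function and choose the policy so that it performs well on the samples. The policy $\pi$ is determined by $f$ and $\theta$: $\pi = (f, \theta)$. Choose the policy $\pi$ that optimizes $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$. Starting from the initial state $S_0$, the policy $\pi$ determines the decision $X^\pi(S_t)$ in period $t$. The goal is a policy that maximizes the expected sum of the per-period contributions $C(S_t, X^\pi(S_t))$.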
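  61. (A) Policy search

    $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$. This optimization is hard, so it is approximated using two strategies: (i) policy function approximations and (ii) cost function approximations (approximating the contribution function).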
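  62. (A) Policy search, strategy (i): policy function approximation

    $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$. Approximate the policy function $X^\pi(S_t)$ (with a lookup table, a function with parameters, etc.). Example (linear function approximation): $X^\pi(S_t \mid \theta) = \theta_0 + \theta_1 \phi_1(S_t) + \theta_2 \phi_2(S_t)$. Nonlinear functions, neural networks, and so on can also be used.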
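  63. (A) Policy search, strategy (ii): contribution function approximation

    $\max_\pi \mathbb{E}_{S_0} \mathbb{E}_{W_1,\dots,W_T \mid S_0} \Big\{ \sum_{t=0}^{T} C(S_t, X^\pi(S_t)) \,\Big|\, S_0 \Big\}$. Use the decision that maximizes an approximate contribution function: $X^\pi(S_t \mid \theta) = \arg\max_{x \in \mathcal{X}^\pi_t(\theta)} \bar C^\pi_t(S_t, x \mid \theta)$, where $C(S_t, X^\pi(S_t))$ is approximated by $\bar C^\pi_t(S_t, x \mid \theta)$.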
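  64. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Diagram: state $S_0$, decision $x_0$, post-decision state $(S_0, x_0)$, stochastic transition $W_1$, state $S_1$, decision $x_1$, post-decision state $(S_1, x_1)$, stochastic transition $W_2$.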
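  68. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Diagram: state $S_0$, decision $x_0$, post-decision state $(S_0, x_0)$, stochastic transition $W_1$, state $S_1$, decision $x_1$, post-decision state $(S_1, x_1)$, stochastic transition $W_2$, state $S_2$.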
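  69. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Value function approximation: $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big)$.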
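  70. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Value function approximation: $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big)$. The inner term $\max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\}$ is approximated by $V_{t+1}(S_{t+1})$.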
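  72. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Diagram: $S_0$, $x_0$, $(S_0, x_0)$, $W_1$, $S_1$, $x_1$, $(S_1, x_1)$, $W_2$, $S_2$, with the remainder of the horizon summarized by $V_{t+1}(S_{t+1})$.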
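  73. (B) Lookahead approximation

    $X^*_t(S_t) = \arg\max_{x_t} \Big( C(S_t, x_t) + \mathbb{E}\Big\{ \max_\pi \mathbb{E}\Big\{ \sum_{t'=t+1}^{T} C(S_{t'}, X^\pi_{t'}(S_{t'})) \,\Big|\, S_{t+1} \Big\} \,\Big|\, S_t, x_t \Big\} \Big)$. Diagram: from $S_0$ and $x_0$, the post-decision state $(S_0, x_0)$ branches under $W_1$ into possible next states $S_{10}$, $S_{11}$, $S_{12}$ with values $V_1(S_{10})$, $V_1(S_{11})$, $V_1(S_{12})$. $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big)$.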
  74. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big)$.
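A one-step lookahead decision of the form $\arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{V_{t+1}(S_{t+1}) \mid S_t, x_t\} \big)$ can be sketched when the exogenous information has a small discrete distribution. All function and parameter names here are hypothetical, and the value function is simply passed in as a given approximation.

```python
from typing import Callable, Iterable, Sequence, Tuple, TypeVar

S = TypeVar("S")  # state type
X = TypeVar("X")  # decision type
W = TypeVar("W")  # exogenous information type


def lookahead_decision(state: S,
                       decisions: Iterable[X],
                       contribution: Callable[[S, X], float],
                       transition: Callable[[S, X, W], S],
                       outcomes: Sequence[Tuple[W, float]],
                       value_next: Callable[[S], float]) -> X:
    """Pick the decision maximizing C(S, x) + E[V(S') | S, x].

    outcomes is a list of (w, probability) pairs for the exogenous information.
    """
    def score(x: X) -> float:
        expected = sum(prob * value_next(transition(state, x, w))
                       for w, prob in outcomes)
        return contribution(state, x) + expected

    return max(decisions, key=score)
```

For instance, with inventory 0, order quantities 0 to 2 at unit cost 1, demand 0 or 2 with equal probability, and an assumed value of 3 per unit left in stock, the scores are 0, 0.5, and 1.0, so the sketch selects an order of 2.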
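  75. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big)$. $S^x_t$ is the post-decision state: the state reached after decision $x_t$ has been made in state $S_t$.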
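  76. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big)$. $S^x_t$ is the post-decision state: the state reached after decision $x_t$ has been made in state $S_t$. The term $V^x_t(S^x_t)$ is the one to be approximated.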
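  77. Post-decision state

    $S^x_t$: the state reached after decision $x_t$ has been made in state $S_t$. Sequence: $S_t \to S^x_t \to S_{t+1} \to S^x_{t+1} \to S_{t+2}$, where each step from a post-decision state to the next state is a stochastic transition. $V_t(S_t) = \max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big)$.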
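  81. Post-decision state

    $S^x_t$: the state reached after decision $x_t$ has been made in state $S_t$. Sequence: $S_t \to S^x_t \to S_{t+1} \to S^x_{t+1} \to S_{t+2}$, where each step from a post-decision state to the next state is a stochastic transition. $V^x_t(S^x_t) = \mathbb{E}_{W_{t+1}} \{ V_{t+1}(S_{t+1}) \mid S^x_t \}$.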
  86. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big)$. $V^x_t(S^x_t)$ is not known (it is a conceptual object), so it has to be represented in some way: $\bar V^x_t(S^x_t)$.
  87. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big) \approx \arg\max_{x_t} \big( C(S_t, x_t) + \bar V^x_t(S^x_t) \big)$.
  88. (B) Lookahead approximation: value function approximation

    $X^*_t(S_t) = \arg\max_{x_t} \big( C(S_t, x_t) + \mathbb{E}\{ V_{t+1}(S_{t+1}) \mid S_t, x_t \} \big) = \arg\max_{x_t} \big( C(S_t, x_t) + V^x_t(S^x_t) \big) \approx \arg\max_{x_t} \big( C(S_t, x_t) + \bar V^x_t(S^x_t) \big) = \arg\max_{x_t} \bar Q_t(S_t, x_t)$ (Q-learning).
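Choosing $x_t$ as $\arg\max_{x_t} \bar Q_t(S_t, x_t)$ can be illustrated with tabular Q-learning on a toy storage problem. The problem is entirely hypothetical: a one-unit battery, a price that alternates deterministically between low and high, and a discounted stationary $\bar Q$ in place of the time-indexed $\bar Q_t$. The learning rate and discount factor are assumed values.

```python
import random
from typing import Dict, List, Tuple

PRICES = (1.0, 3.0)    # hypothetical low/high price, alternating each period
ACTIONS = (-1, 0, 1)   # -1 buys one unit, 0 holds, +1 sells one unit

State = Tuple[int, int]  # (price index, units stored)


def feasible(storage: int) -> List[int]:
    """Actions that keep the storage level within {0, 1}."""
    return [x for x in ACTIONS if 0 <= storage - x <= 1]


def step(state: State, x: int) -> Tuple[float, State]:
    """Return the contribution p * x and the next state."""
    price_idx, storage = state
    return PRICES[price_idx] * x, (1 - price_idx, storage - x)


def q_learning(n_steps: int = 20000, alpha: float = 0.1, discount: float = 0.9,
               seed: int = 0) -> Dict[Tuple[State, int], float]:
    rng = random.Random(seed)
    q = {((p, r), x): 0.0 for p in (0, 1) for r in (0, 1) for x in feasible(r)}
    state: State = (0, 0)
    for _ in range(n_steps):
        x = rng.choice(feasible(state[1]))  # purely exploratory behaviour policy
        reward, nxt = step(state, x)
        target = reward + discount * max(q[(nxt, y)] for y in feasible(nxt[1]))
        q[(state, x)] += alpha * (target - q[(state, x)])
        state = nxt
    return q


def greedy(q: Dict[Tuple[State, int], float], state: State) -> int:
    """X(S) = argmax_x Q(S, x) over the feasible actions."""
    return max(feasible(state[1]), key=lambda x: q[(state, x)])
```

After training, the greedy rule buys when the price is low and the battery is empty, and sells when the price is high and the battery is full, which is the intuitive optimum for this toy problem.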