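Information Systems That Adapt to Diverse and Continuously Changing Environments / thesis-defense-presentation

Kyushu University, Graduate School of Information Science and Electrical Engineering, Department of Advanced Information Technology
2024.08.22 Public hearing on the degree thesis (thesis defense)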

monochromegane

August 22, 2024

Transcript

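  1. Information systems and environmental change
     • For an information system to keep functioning in a diverse and continuously changing environment, its configuration and logic must be updated to follow the changes.
     • → Examples: the load on the system, changes in user behavior.
     • Until now, operators have handled this follow-up as part of operations and maintenance work.
     • Detecting environmental change and updating the system by hand introduces a time lag before the system catches up.
     • As a result, stability and user satisfaction fall and the operators' burden grows.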
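  2. Approach toward realizing adaptive information systems
     • In conventional system development, engineers design a function $f$ that determines the output $y$ for a user's input $x$: $y = f(x)$.
     • Detecting environmental change and updating the system by hand introduces a time lag before the system catches up.
     • As a result, stability and user satisfaction fall and the operators' burden grows.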
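  3. Approach toward realizing adaptive information systems
     • Machine learning automates the "design of the function" (the parameters are obtained from data): $y = f(x)$.
       ① Define the correspondence between inputs and outputs.
       ② Define the error of a prediction.
       ③ Minimize the prediction error on the training data.
       ④ The parameters are determined (the function has been designed from data).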
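  4. Challenges toward adaptive information systems, seen through a concrete example
     • A recommender system is introduced to address information overload in an information system.
     • It proposes, from a large number of options, the items a user is likely to be interested in, based on some machine learning model (= recommendation method).
     • Many recommendation methods have been proposed → effective "selection of the recommendation method" is important.
     Operational issues
     • Which recommendation method is effective depends on the situation.
     • However, continuously evaluating recommendation methods in the production environment incurs opportunity loss.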
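  5. Multi-armed bandit policies
     • The simplest policy, $\epsilon$-Greedy, selects an arm uniformly at random with ratio $\epsilon$ (exploration) and selects the arm with the highest mean reward so far with ratio $1 - \epsilon$ (exploitation).
     • Figure labels: Bandit ($\arg\max_{l=1,\dots,L} \hat{y}^{(l)}$ with probability $1 - \epsilon$; $\forall a \in A$ with probability $\epsilon / L$) versus A/B testing ($\forall a \in A$, $\epsilon / L$).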
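As a rough illustration of the $\epsilon$-Greedy rule on this slide, the following Python sketch selects an arm from a list of current mean rewards. The function names and the `rng` argument are hypothetical choices made for this example and do not come from the presentation.

```python
import random


def epsilon_greedy_select(mean_rewards, epsilon, rng=random):
    """Pick an arm index: explore uniformly with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Exploration: every arm is equally likely.
        return rng.randrange(len(mean_rewards))
    # Exploitation: the arm with the largest mean reward so far.
    return max(range(len(mean_rewards)), key=lambda l: mean_rewards[l])


def selection_probabilities(mean_rewards, epsilon):
    """Probability of selecting each arm under the rule above."""
    num_arms = len(mean_rewards)
    best = max(range(num_arms), key=lambda l: mean_rewards[l])
    probs = [epsilon / num_arms] * num_arms
    probs[best] += 1.0 - epsilon
    return probs
```

Note that the best arm is chosen with probability $1 - \epsilon + \epsilon / L$ in total, because the uniform exploration step can also land on it.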
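  6. Adaptive information systems using multi-armed bandit policies
     • Diagram labels: User(s); System; Exploitation and Exploration using Multi-armed bandits; Feedback; Estimation; Truth; t-1.
     • Multi-armed bandit policies reduce the opportunity loss incurred when selecting a machine learning model.
     • → This removes a barrier to applying adaptive information systems in production.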
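  7. Adaptive information systems using multi-armed bandit policies
     • Diagram labels: User(s); System; Exploitation and Exploration using Multi-armed bandits; Feedback; Estimation; Truth; t-1.
     • Multi-armed bandit policies reduce the opportunity loss incurred when selecting a machine learning model.
     • → This removes a barrier to applying adaptive information systems in production.
     • Needed: practical policies that take the characteristics of machine learning models into account.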
  8. Overview of this research
     Adaptable Information systems for Dynamic and Evolving Environment
     • Diagram labels: Context; Non stationarity; Online performance; Context (Non-linear); Multi-armed bandit policies; Chapters 3, 4, 5, 6.
  9. Overview diagram (Adaptable Information systems for Dynamic and Evolving Environment; Chapters 3 to 6), with reference:
     [48] Yusuke Miyake, Tsunenori Mine, Synapse: A recommender system that continuously optimizes the selection of recommendation methods according to context (in Japanese), IEICE Transactions D (Japanese Edition), Vol.J103-D, No.11, pp.764-775, Nov. 2020.
  10. Overview diagram (Adaptable Information systems for Dynamic and Evolving Environment; Chapters 3 to 6), with reference:
     [49] Yusuke Miyake, Tsunenori Mine, Synapse: A meta-recommender system that optimizes the selection of recommendation methods according to context and the passage of time (in Japanese), IEICE Transactions D (Japanese Edition), Vol.J105-D, No.11, pp.641-652, Nov. 2022.
  11. Overview diagram (Adaptable Information systems for Dynamic and Evolving Environment; Chapters 3 to 6), with reference:
     [50] Yusuke Miyake, Tsunenori Mine, Contextual and Nonstationary Multi-armed Bandits Using the Linear Gaussian State Space Model for the Meta-Recommender System, 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp.3138-3145, Oct. 2023.
  12. Overview diagram (Adaptable Information systems for Dynamic and Evolving Environment; Chapters 3 to 6), with reference:
     [51] Yusuke Miyake, Ryuji Watanabe, Tsunenori Mine, Online Nonstationary and Nonlinear Bandits with Recursive Weighted Gaussian Process, The 48th IEEE International Conference on Computers, Software, and Applications (COMPSAC 2024), pp.11-20, Jul. 2024.
  13. Overview of this research
     Adaptable Information systems for Dynamic and Evolving Environment
     • Diagram labels: Context; Non stationarity; Online performance; Context (Non-linear); Multi-armed bandit policies; Chapters 3, 4, 5, 6.
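  14. AdaptEx [30] [W. Black 2023]
     • A platform for personalizing user experiences with multi-armed bandit policies.
     • It optimizes user experiences (the arms are the content and display position of information) on an online travel site (Expedia).
     • Developers of an experience (arm) can introduce and roll out bandit-based comparative evaluation without specialist knowledge.
     • They can choose between basic and contextual multi-armed bandit policies.
     • → Choosing a policy that suits the characteristics of the arms can be expected to reduce opportunity loss.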
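  15. AdaptEx [30] [W. Black 2023]
     • A platform for personalizing user experiences with multi-armed bandit policies.
     • Developers can choose between basic and contextual multi-armed bandit policies.
     • → Choosing a policy that suits the characteristics of the arms can be expected to reduce opportunity loss.
     • These existing policies cannot account for all the characteristics of machine learning models.
     • → Besides switching among several policies, new policies themselves need to be developed.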
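  16. Proposed system (Synapse)
     1. Selection according to the characteristics of the recommendation methods
     2. Any recommendation method can be used as a candidate for comparison
     3. Continuous evaluation of the effectiveness of the recommendation methods
     4. Reduction of the opportunity loss caused by evaluation
     ② Register recommendation methods: any method can be used as long as it satisfies the common interface.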
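  17. Proposed system (Synapse)
     1. Selection according to the characteristics of the recommendation methods
     2. Any recommendation method can be used as a candidate for comparison
     3. Continuous evaluation of the effectiveness of the recommendation methods
     4. Reduction of the opportunity loss caused by evaluation
     ③ Select and evaluate recommendation methods: a multi-armed bandit policy selects the recommendation method. The results of user behavior are stored, and the evaluation is updated at fixed intervals.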
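  18. A meta-recommender system that continuously optimizes the selection of recommendation methods
     • As its standard policy, the proposed system adopts Linear Thompson Sampling (LTS), a conventional contextual policy.
     • Arm selection in LTS (probability matching):
       $l_t^* = \arg\max_{l=1,\dots,L} x_t^\top \tilde{w}_N^{(l)}, \quad \tilde{w}_N^{(l)} \sim \mathcal{N}_D\left(A_N^{-1} b_N,\ \sigma_\epsilon^2 A_N^{-1}\right)$
     • A random vector is drawn from the distribution defined by the estimated mean reward and by the covariance matrix, which expresses the uncertainty that depends on the number of trials. The arm whose draw has the largest inner product with the context is selected.
     • Figure: as learning proceeds, the generated random numbers shift toward a distribution that depends on the context.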
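The selection rule above can be sketched in Python for the special case of one-hot contexts, which is the setting used later in the evaluation. The slide does not define $A_N$ and $b_N$; this sketch assumes the usual Bayesian linear regression statistics $A_N = \lambda I + \sum_n x_n x_n^\top$ and $b_N = \sum_n x_n y_n$, which are diagonal and element-wise for one-hot $x_n$. The class name, `prior_precision`, and `noise_var` are hypothetical.

```python
import random


class OneHotLTSArm:
    """Per-arm statistics for Linear Thompson Sampling with one-hot contexts.

    Assumption: A_N = prior_precision * I + sum(x x^T), b_N = sum(x * y).
    With one-hot contexts A_N is diagonal, so only its diagonal is stored.
    """

    def __init__(self, dim, prior_precision=1.0, noise_var=0.25):
        self.a = [prior_precision] * dim  # diagonal of A_N
        self.b = [0.0] * dim              # b_N
        self.noise_var = noise_var        # sigma_epsilon^2

    def update(self, category, reward):
        self.a[category] += 1.0
        self.b[category] += reward

    def sample_score(self, category, rng):
        # x^T w~ with w~ ~ N(A^-1 b, sigma^2 A^-1) reduces to one Gaussian draw.
        mean = self.b[category] / self.a[category]
        std = (self.noise_var / self.a[category]) ** 0.5
        return rng.gauss(mean, std)


def select_arm(arms, category, rng):
    """Probability matching: sample a score per arm and take the largest."""
    scores = [arm.sample_score(category, rng) for arm in arms]
    return max(range(len(arms)), key=lambda l: scores[l])
```

For general (non one-hot) contexts the full matrix $A_N$ and a multivariate normal draw would be needed; this sketch deliberately avoids that.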
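  19. Evaluation method (1/2)
     • Simulation of the number of clicks obtained from the selected recommendation method.
     • A recommendation produced by the method that the policy selected is clicked according to a Bernoulli distribution with a configured click-through rate.
     • Each recommendation method has an 18-dimensional parameter $\tilde{w}_t^{(l)}$, one dimension per product category.
     • The click-through rate is the inner product of $\tilde{w}_t^{(l)}$ and the context $x_t$.
     • The context $x_t$ is the one-hot vector of the product category the user is browsing at time $t$.
     • To match the behavior of a real recommender system, rewards are fed back in hourly batches.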
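A minimal sketch of this click simulation, assuming the click-through rates are supplied as a plain list per recommendation method. The function names and the `rng` argument are hypothetical; only the 18 categories, the one-hot context, the inner product, the Bernoulli draw, and the hourly batching come from the slide.

```python
import random

NUM_CATEGORIES = 18  # one dimension per product category


def one_hot(category, dim=NUM_CATEGORIES):
    """Context x_t: one-hot vector of the category being browsed."""
    return [1.0 if i == category else 0.0 for i in range(dim)]


def click_through_rate(w, x):
    """Click-through rate as the inner product of the parameter w and the context x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def simulate_click(w, x, rng):
    """Bernoulli draw: 1 if the recommendation is clicked, else 0."""
    return 1 if rng.random() < click_through_rate(w, x) else 0


def simulate_hour(w, categories, rng):
    """Collect (context, reward) pairs for one hour; the caller feeds them back as one batch."""
    return [(one_hot(c), simulate_click(w, one_hot(c), rng)) for c in categories]
```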
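  20. Evaluation method (2/2)
     • Four groups of simulations are run to separate the contribution of considering context from that of considering the passage of time.
       A (context ×, passage of time ×): use the recommendation method that is best at the start of the period throughout the whole period.
       B (context ○, passage of time ×): use the recommendation method that is best for each context throughout the whole period.
       C (context ×, passage of time ○): select the highly rated recommendation method at each point in time with a bandit.
       D (context ○, passage of time ○): select the highly rated recommendation method for each context at each point in time with a bandit.
     • The context is the product category being browsed at recommendation time.
     • Multi-armed bandit policies: $\epsilon$-Greedy (B), LinUCB (D), LTS (D).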
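  21. Evaluation method (2/2)
     • Four groups of simulations are run to separate the contribution of considering context from that of considering the passage of time.
       A (context ×, passage of time ×): use the recommendation method that is best at the start of the period throughout the whole period.
       B (context ○, passage of time ×): use the recommendation method that is best for each context throughout the whole period.
       C (context ×, passage of time ○): select the highly rated recommendation method at each point in time with a bandit.
       D (context ○, passage of time ○): select the highly rated recommendation method for each context at each point in time with a bandit.
     • Questions: is continuous online evaluation effective in the first place? Is switching policies according to their characteristics effective?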
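  22. Evaluation results: difference in cumulative reward relative to group A
     • B (context only): the best recommendation method at the start of the period differs by product category, so responding to this improves the result.
     • C (passage of time only): the best method across all product categories does not change during the period, so exploration ended up being wasted.
     • → If the best recommendation method is obvious and does not change, introducing a policy instead creates opportunity loss.
     • D (context and passage of time): following the changes in each product category improves the result. The effect differed between policies.
     • The standard policy that considers both context and the passage of time increased cumulative reward by about 2%.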
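  23. Evaluation results: breakdown of cumulative reward by product category
     • Analysis of the cumulative reward of the standard policy (LTS) in group D shows that the cost of exploration is not recovered in every product category.
     • However, it improves greatly in the categories where the best recommendation method switched during the period, and it suppresses unnecessary exploration elsewhere, so the overall opportunity loss is reduced.
     • → $\epsilon$-Greedy, which explores uniformly, and LinUCB, whose arm becomes fixed when rewards are delayed, were at a disadvantage in this evaluation setting.
     • Table: cumulative reward of the standard policy (LTS) in group D compared with consistently using the best method at the start of the period in group B. (Values in parentheses are the ratio.)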
  24. Overview of this research
     Adaptable Information systems for Dynamic and Evolving Environment
     • Diagram labels: Context; Non stationarity; Online performance; Context (Non-linear); Multi-armed bandit policies; Chapters 3, 4, 5, 6.
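  25. The contextual multi-armed bandit problem
     • A multi-armed bandit setting in which each arm has several contexts and the reward distribution is determined by the context.
     • In this research, a context is a state expressed as a combination of the parameters of several factors.
     • → If each factor parameter takes a value in {0,1}, there are $2^d$ context patterns for $d$ factors.
     • A contextual bandit policy predicts the reward in each context by estimating a coefficient per factor (linear parameters) rather than a probability distribution per context.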
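  26. The non-stationary multi-armed bandit problem
     • A multi-armed bandit setting in which the reward distribution changes over time even in the same context.
     • Periodic change can be handled by including it in the factor parameters, but irregular (non-stationary) change cannot.
     • A non-stationary bandit policy predicts the reward in each context by updating the evaluation of the arms quickly, without being bound by rewards observed in the past.
     • Types: forgetting; sliding window; change detection; state space model.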
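  27. Non-stationary and contextual multi-armed bandit policies: the state space model type
     • The transition process of the state is handled with a state space model, and the sequentially estimated state is used.
     • Temporal change in the reward series is handled naturally. The type is particularly suited to settings with gradual change.
     • Performance depends on the expressive power of the process and on estimation accuracy.
     • + Because relationships between a multivariate state and the observations can also be expressed, the contextual setting is handled naturally as well.
     • Diagram labels: State space model; State; Context.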
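  28. Time-varying Thompson Sampling (TVTP) [42] [C. Zeng 2016]
     • A non-stationary and contextual policy of the state space model type.
     • Proposes Dynamic Context drift Modeling to express fluctuation in the usefulness of the arms.
     • Uses a particle filter for the complex parameter estimation of the state.
     • → Addresses the issues of process expressiveness and estimation accuracy.
     • On the other hand, the share of exploration drops sharply as the number of trials grows.
     • → It does not follow upsets (reversals in which arm is best) well enough.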
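  29. Time-varying Thompson Sampling (TVTP) [42] [C. Zeng 2016]
     [42] Fig.2 Graphical model representation for bandit problem
     • State space model: to handle reward changes that are both contextual and non-stationary, a model that incorporates reward fluctuation is used.
       $y_t^{(l)} \sim \mathcal{N}\left(x_t^\top \left(c^{(l)} + \theta^{(l)} \odot \eta_t^{(l)}\right),\ \sigma_\epsilon^{2(l)}\right)$
     • Terms: context $x_t$; stationary term $c^{(l)}$; non-stationary term (scale term $\theta^{(l)}$, drift term $\eta_t^{(l)}$); observation error $\sigma_\epsilon^{2(l)}$.
     • Note: to match the notation of this thesis, the notation of Fig.2 (upper right) of the original paper is replaced as follows:
       $k \to l$, $c_{w_k} \to c^{(l)}$, $\theta_k \to \theta^{(l)}$, $\eta_{k,t} \to \eta_t^{(l)}$, $\sigma_k^2 \to \sigma_\epsilon^{2(l)}$, $\mu_c \to \mu_w^{(l)}$, $\Sigma_c \to \Sigma_w^{(l)}$, $\alpha \to \alpha_\epsilon^{(l)}$, $\beta \to \beta_\epsilon^{(l)}$, $\mu_\theta \to \mu_\theta^{(l)}$, $\Sigma_\theta \to \Sigma_\theta^{(l)}$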
  30. Time-varying Thompson Sampling (TVTP) [42] [C. Zeng 2016]
     [42] Fig.2 Graphical model representation for bandit problem
       $y_t^{(l)} \sim \mathcal{N}\left(x_t^\top \left(c^{(l)} + \theta^{(l)} \odot \eta_t^{(l)}\right),\ \sigma_\epsilon^{2(l)}\right)$
     • State estimation: a particle filter and a Kalman filter are used to sequentially estimate the posterior distribution of the reward model parameters and the latent state of the drift term.
     • Figure labels: Kalman filter; particle filter.
  31. Time-varying Thompson Sampling (TVTP) [42] [C. Zeng 2016]
     [42] Fig.2 Graphical model representation for bandit problem
       $y_t^{(l)} \sim \mathcal{N}\left(x_t^\top \left(c^{(l)} + \theta^{(l)} \odot \eta_t^{(l)}\right),\ \sigma_\epsilon^{2(l)}\right)$
     • Probability matching: samples drawn from the posterior distribution of the parameters obtained for each arm are used for arm selection, which integrates the model with a multi-armed bandit policy.
       $l_t^* = \arg\max_{l=1,\dots,L} x^\top \bar{w}_{t-1}^{(l)}, \quad \bar{w}_{t-1}^{(l)} \sim \mathcal{N}_D\left(\bar{\mu}_w^{(l)},\ \bar{\Sigma}_w^{(l)}\right)$
     • Figure labels: posterior distribution (twice).
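  32. Issues with TVTP
     • Arm selection becomes biased as the number of trials grows.
     • The chance of exploring an arm that is rated low at some point in time falls to an extreme degree.
     • Following an upset is delayed.
       $l_t^* = \arg\max_{l=1,\dots,L} x^\top \bar{w}_{t-1}^{(l)}, \quad \bar{w}_{t-1}^{(l)} \sim \mathcal{N}_D\left(\bar{\mu}_w^{(l)},\ \bar{\Sigma}_w^{(l)}\right)$
       $\bar{\Sigma}_w^{(l)} = \frac{1}{p^2} \sum_{i=1}^{p} \sigma_\epsilon^{2(l,i)} \Sigma_w^{(l,i)}$, where $p$ is the number of particles
     • Because the covariance shrinks sharply as the number of trials grows, selection becomes fixed on the evaluation at some point in time.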
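  33. Proposed policy: Aggressive Exploration TVTP (AE-TVTP)
     • Removes the bias in arm selection that comes with a growing number of trials.
     • Provides opportunities to actively explore arms that are rated low at some point in time.
     • Aims to follow upsets better.
       $\bar{\Sigma}_w^{(l)} = \frac{1}{p^2} \sum_{i=1}^{p} \sigma_\epsilon^{2(l,i)} \Sigma_w^{(l,i)}$, where $p$ is the number of particles
       $l_t^* = \arg\max_{l=1,\dots,L} x^\top \bar{w}_{t-1}^{(l)}, \quad \bar{w}_{t-1}^{(l)} \sim \mathcal{N}_D\left(\bar{\mu}_w^{(l)},\ \bar{\Sigma}_w^{(l)}\right)$
     • Annotations on the covariance formula: "mean over particles"; "multiplication within each particle only".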
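  34. Evaluation method (1/2) (same as Chapter 3)
     • Simulation of the number of clicks obtained from the selected recommendation method.
     • A recommendation produced by the method that the policy selected is clicked according to a Bernoulli distribution with a configured click-through rate.
     • Each recommendation method has an 18-dimensional parameter $\tilde{w}_t^{(l)}$, one dimension per product category.
     • The click-through rate is the inner product of $\tilde{w}_t^{(l)}$ and the context $x_t$.
     • The context $x_t$ is the one-hot vector of the product category the user is browsing at time $t$.
     • To match the behavior of a real recommender system, rewards are fed back in hourly batches.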
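  35. Evaluation method (2/2) (same as Chapter 3 except for the policies)
     • Four groups of simulations are run to separate the contribution of considering context from that of considering the passage of time.
       A (context ×, passage of time ×): use the recommendation method that is best at the start of the period throughout the whole period.
       B (context ○, passage of time ×): use the recommendation method that is best for each context throughout the whole period.
       C (context ×, passage of time ○): select the highly rated recommendation method at each point in time with a bandit.
       D (context ○, passage of time ○): select the highly rated recommendation method for each context at each point in time with a bandit.
     • The context is the product category being browsed at recommendation time.
     • Multi-armed bandit policies: LTS (context), TVTP (context / passage of time), AE-TVTP (context / passage of time).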
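  36. Evaluation method (2/2) (same as Chapter 3 except for the policies)
     • Four groups of simulations are run to separate the contribution of considering context from that of considering the passage of time.
       A (context ×, passage of time ×): use the recommendation method that is best at the start of the period throughout the whole period.
       B (context ○, passage of time ×): use the recommendation method that is best for each context throughout the whole period.
       C (context ×, passage of time ○): select the highly rated recommendation method at each point in time with a bandit.
       D (context ○, passage of time ○): select the highly rated recommendation method for each context at each point in time with a bandit.
     • Question: is a policy that accounts for non-stationarity and upsets effective?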
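  37. Evaluation results: difference in cumulative reward relative to group A
     • Group B (context only): at the start of the period there is almost no difference from group A due to context, so the results do not differ either.
     • Group C (passage of time only): improves by following changes in the effectiveness of the recommendation methods.
     • Group D (context and passage of time): improves further by following the changes in each product category.
     • Considering context and the passage of time, together with the proposed method's improved ability to follow reversals, increased cumulative reward by about […].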
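  38. Evaluation results: degree of fluctuation in recommendation effectiveness versus improvement rate
     • Analysis of the per-context improvement in group D shows that, in contexts where the effectiveness of the recommendation methods fluctuates little, no policy recovers the cost of exploration.
     • In contexts with large fluctuation, AE-TVTP improved greatly thanks to active exploration. TVTP became fixed on one recommendation method and did not improve.
     • Horizontal axis (size of the fluctuation): for the method with the highest click-through rate at the start of the period, the difference from the maximum click-through rate at each point in time, summed over the period.
     • Vertical axis (improvement rate of cumulative reward): the ratio of the cumulative reward of each policy in group D to the result of consistently using one recommendation method in group B.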
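  39. Discussion
     • Figure: cumulative regret over time in product categories where the effectiveness of the recommendation methods reverses (left) and where it does not (right).
     • In the left column, where effectiveness reverses, the proposed method is effective.
     • In the right column, cumulative regret keeps growing throughout the period because of the proposed method's active exploration. The same was observed in roughly 30% of product categories.
     • Note: cumulative regret is the difference between the largest expected value among the recommendation methods and the expected value of the selected method, summed over the period.
     • → This motivates research on adaptive exploration that reduces opportunity loss even in periods without change.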
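  40. Discussion of the proposal
     • Even with the proposed policy AE-TVTP, the elements of the covariance matrix $\bar{\Sigma}_w^{(l)}$ converge to small values in the long run, and the promotion of exploration applies only to the selected arm. Reduced ability to follow changes is therefore a concern in long-term operation.
       $\bar{\Sigma}_w^{(l)} = \frac{1}{p} \sum_{i=1}^{p} \Sigma_w^{(l,i)}$
       $l_t^* = \arg\max_{l=1,\dots,L} x^\top \bar{w}_{t-1}^{(l)}, \quad \bar{w}_{t-1}^{(l)} \sim \mathcal{N}_D\left(\bar{\mu}_w^{(l)},\ \bar{\Sigma}_w^{(l)}\right)$
     • The particle filter used for state estimation involves several matrix inversions per particle, so computation time grows with the dimension of the context and the number of particles.
     • → The number of particles affects the accuracy of the posterior estimate, so there is a trade-off between accuracy and execution time.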
  41. Overview of this research
     Adaptable Information systems for Dynamic and Evolving Environment
     • Diagram labels: Context; Non stationarity; Online performance; Context (Non-linear); Multi-armed bandit policies; Chapters 3, 4, 5, 6.
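  42. Purpose and outline of the proposal
     • Which machine learning model is better depends not only on context and the passage of time but also on responsiveness and the ability to follow changes.
     • The goal is to switch between effective models according to context and the passage of time without opportunity loss, and to do so without the user noticing any cost in execution time.
     • The optimization of selection according to context and the passage of time is treated as a non-stationary and contextual multi-armed bandit problem, and a policy that solves it quickly using a linear Gaussian state space model and a linear Kalman filter is proposed.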
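  43. Execution-time issues of multi-armed bandit policies and countermeasures
     • Rewards and contexts arrive sequentially, so learning must also be sequential.
     • Main factors that affect training time in sequential learning:
       1. Recomputation over all training data → a recursive learning mechanism that uses only the result computed up to pair $N$ and the observation of pair $N + 1$.
       2. Matrix inversion → application of the Woodbury identity within the recursive learning above.
     • Together these give recursive least squares.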
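The two countermeasures can be sketched as a recursive least squares update in Python. With a single new observation, the Woodbury identity reduces to its rank-one form (the Sherman-Morrison formula), so no matrix is inverted after initialization. The function names and the `regularization` parameter are hypothetical, and the sketch assumes the inverse $P_N = (\lambda I + \sum_n x_n x_n^\top)^{-1}$ is kept as state.

```python
def rls_init(dim, regularization=1e-6):
    """Start from P_0 = (regularization * I)^-1 and w_0 = 0."""
    p = [[(1.0 / regularization if i == j else 0.0) for j in range(dim)]
         for i in range(dim)]
    w = [0.0] * dim
    return p, w


def rls_update(p, w, x, y):
    """Fold one observation (x, y) into (P, w) without re-inverting any matrix."""
    dim = len(w)
    # P_N x; since P is symmetric, x^T P_N is the same vector transposed.
    px = [sum(p[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * px[i] for i in range(dim))
    # Rank-one Woodbury (Sherman-Morrison): P_{N+1} = P_N - P_N x x^T P_N / (1 + x^T P_N x)
    p_new = [[p[i][j] - px[i] * px[j] / denom for j in range(dim)]
             for i in range(dim)]
    # Correct the weights by the prediction error on the new pair only.
    residual = y - sum(x[i] * w[i] for i in range(dim))
    p_new_x = [sum(p_new[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    w_new = [w[i] + p_new_x[i] * residual for i in range(dim)]
    return p_new, w_new
```

Each update costs $O(D^2)$ for a $D$-dimensional context and is independent of the number of observations seen so far.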
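  44. Issues in making adaptation faster
     • Execution-time issues when context and non-stationarity are considered at the same time (columns: type; schematic; issue):
       Forgetting (schematic: Regression model with $\gamma^0, \gamma^1, \gamma^2, \gamma^3$ weights): recursive least squares with a forgetting factor and regularization loses generalization performance [41].
       Sliding window (schematic: Regression model): conventional policies do not show how to apply recursive least squares [69].
       Change detection (schematic: Regression model, Detection model): the execution time of the change detection model grows in order to handle multivariate data [43,70,71].
       State space model (schematic: State space model, State): estimation methods for models that can express complex processes take a long time to run [42].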
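  45. Phased Initial Exploration of System [73] [J. Cornet 2022]
     • A non-stationary and contextual policy of the state space model type.
     • To express fluctuation in the usefulness of the arms, it adopts a linear Gaussian state space (LGSS) model, which assumes that the relationships between states and observations are linear and that the errors follow a normal distribution.
     • Uses a lightweight linear Kalman filter for the parameter estimation of the state.
     • → Shortens execution time by restricting the degrees of freedom of the state transition.
     • On the other hand, it assumes that the context depends on the state at the previous point in time.
     • → It cannot handle contextual problems in which the context is given from outside.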
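  46. Phased Initial Exploration of System [73] [J. Cornet 2022]
     • A non-stationary and contextual policy of the state space model type.
     • To express fluctuation in the usefulness of the arms, it adopts a linear Gaussian state space model, which assumes that the relationships between states and observations are linear and that the errors follow a normal distribution.
     • Uses a lightweight linear Kalman filter for the parameter estimation of the state.
     • → Shortens execution time by restricting the degrees of freedom of the state transition.
     • It also has no mechanism that encourages exploration.
     • → It does not respond well enough to upsets.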
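  47. KF-MANB [40] [O. Granmo 2010]
     • A non-stationary policy of the state space model type.
     • Adopts the local level model to express fluctuation in the usefulness of the arms.
     • Uses a lightweight Kalman filter for the parameter estimation of the state.
     • Proposes exploration of unselected arms by borrowing missing-value handling.
     • → Achieves both shorter execution time and less biased exploration.
     • On the other hand, the policy cannot take a context as input.
     • → It cannot handle contextual problems.
     • Note: all arms are updated on every trial, so this approach is best paired with a lightweight estimation method.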
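  48. Reward model based on the linear Gaussian state space model
     • The proposed policy assumes that the reward model follows the linear Gaussian state space model below.
       $r_t = Z_t \alpha_t + \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, \sigma_\epsilon^2)$
       $\alpha_{t+1} = T_t \alpha_t + R_t \eta_t, \quad \eta_t \sim \mathcal{N}(0, \sigma_\eta^2), \quad t = 1, \dots, \tau$
       $\alpha_1 \sim \mathcal{N}_d(\mu_1, \Sigma_1)$
     • The model expresses that the state $\alpha_t$ depends on the previous state $\alpha_{t-1}$ and that the reward $r_t$ depends on the state $\alpha_t$.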
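  49. Estimating the state of an arm with the linear Kalman filter
     • The state is estimated continuously by repeating one-step-ahead prediction and filtering with the observed result.
     • The state $\alpha$ can be expressed by its mean $\mu$ and its covariance matrix $\Sigma$.
     • State update cycle of the linear Kalman filter:
       One-step-ahead observation prediction: $\hat{y}_t = Z \mu_t$
       Observation error: $v_t = y_t - \hat{y}_t, \quad F_t = Z \Sigma_t Z^\top + H$
       Filtering: $\mu_{t|t} = \mu_t + G v_t = \mu_t + (\Sigma_t Z^\top F_t^{-1}) v_t, \quad \Sigma_{t|t} = \Sigma_t - G F_t G^\top$
       One-step-ahead state prediction: $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$
       Then $t = t + 1$.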
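The cycle above can be written out in plain Python for a scalar reward. As a simplifying assumption, this sketch fixes the matrices to $T = I$, $Z = x^\top$, $R = x$, $Q = \sigma_\eta^2$, and $H = \sigma_\epsilon^2$ for a context vector $x$, so that $F_t$ is a scalar and no matrix inverse is needed. The function names and the argument names `obs_var` and `state_var` are hypothetical.

```python
def kalman_filter_step(mu, sigma, x, y, obs_var):
    """Filtering: fold the observed reward y into the predicted state (mu, sigma)."""
    d = len(mu)
    y_hat = sum(x[i] * mu[i] for i in range(d))               # y^_t = Z mu_t
    v = y - y_hat                                             # v_t = y_t - y^_t
    sx = [sum(sigma[i][j] * x[j] for j in range(d)) for i in range(d)]  # Sigma_t Z^T
    f = sum(x[i] * sx[i] for i in range(d)) + obs_var         # F_t = Z Sigma_t Z^T + H
    g = [s / f for s in sx]                                   # G = Sigma_t Z^T F_t^-1
    mu_f = [mu[i] + g[i] * v for i in range(d)]               # mu_{t|t}
    sigma_f = [[sigma[i][j] - g[i] * f * g[j] for j in range(d)]
               for i in range(d)]                             # Sigma_{t|t}
    return mu_f, sigma_f


def kalman_predict_step(mu_f, sigma_f, x, state_var):
    """One-step-ahead prediction with T = I and R Q R^T = state_var * x x^T."""
    d = len(mu_f)
    sigma_next = [[sigma_f[i][j] + state_var * x[i] * x[j] for j in range(d)]
                  for i in range(d)]
    return list(mu_f), sigma_next
```

Filtering shrinks the covariance along the observed context, and prediction inflates it again, which is what lets the estimate follow a drifting reward.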
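  50. Updating the evaluation of an arm according to context
     • State update cycle of the linear Kalman filter:
       One-step-ahead observation prediction: $\hat{y}_t = Z \mu_t$
       Observation error: $v_t = y_t - \hat{y}_t, \quad F_t = Z \Sigma_t Z^\top + H$ (adds the error at observation time)
       Filtering: $\mu_{t|t} = \mu_t + G v_t = \mu_t + (\Sigma_t Z^\top F_t^{-1}) v_t, \quad \Sigma_{t|t} = \Sigma_t - G F_t G^\top$
       One-step-ahead state prediction: $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$ (adds the error at state transition time)
       Then $t = t + 1$.
     • In the linear Kalman filter, the matrices $Z$ and $R$ determine which elements of the state mean and covariance are updated during filtering and one-step-ahead prediction.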
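  51. Updating the evaluation of an arm according to context
     • One-step-ahead state prediction: $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$ (adds the error at state transition time)
       $R = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad Q = (\sigma_\eta^2), \quad R Q R^\top = \begin{pmatrix} \sigma_\eta^2 & \sigma_\eta^2 & \sigma_\eta^2 \\ \sigma_\eta^2 & \sigma_\eta^2 & \sigma_\eta^2 \\ \sigma_\eta^2 & \sigma_\eta^2 & \sigma_\eta^2 \end{pmatrix}$
     • Error accumulates even in the dimensions of mutually exclusive contexts. An arm that is weak in context A has error added for the other contexts as well through missing-value handling. An arm that would need only about 10 explorations in context B may be explored 20 times.
     • In the linear Kalman filter, the matrices $Z$ and $R$ determine which elements of the state mean and covariance are updated during filtering and one-step-ahead prediction.
     • → In a setting with diverse contexts, always updating with the same matrices is not appropriate.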
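  52. Updating the evaluation of an arm according to context
     • One-step-ahead state prediction: $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$ (adds the error at state transition time)
     • In the linear Kalman filter, the matrices $Z$ and $R$ determine which elements of the state mean and covariance matrix are updated during filtering and one-step-ahead prediction.
     • → Prepare time-varying matrices that depend on the context: use an $R$ ($= R_t$) that matches the context (and likewise for $Z$).
       $R_t = x_t = (1, 0, 1)^\top \quad (x_t \in \{0,1\}^m), \quad Q = (\sigma_\eta^2)$
       $R_t Q R_t^\top = \begin{pmatrix} \sigma_\eta^2 & 0 & \sigma_\eta^2 \\ 0 & 0 & 0 \\ \sigma_\eta^2 & 0 & \sigma_\eta^2 \end{pmatrix}$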
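The effect of the context-dependent $R_t$ can be checked with a few lines of Python. With a scalar $Q = \sigma_\eta^2$, the product $R_t Q R_t^\top$ is $\sigma_\eta^2$ times the outer product of $R_t$ with itself. The function name is hypothetical and the numeric value of $\sigma_\eta^2$ is arbitrary.

```python
def state_noise_covariance(r, q):
    """Return R Q R^T for a column vector R (given as a list) and a scalar Q."""
    return [[q * ri * rj for rj in r] for ri in r]


# A constant R adds transition noise to every element of the covariance.
dense = state_noise_covariance([1, 1, 1], 0.5)

# R_t = x_t adds noise only where the context is active.
masked = state_noise_covariance([1, 0, 1], 0.5)
```

Here `masked` leaves the row and column of the inactive context dimension at zero, so the uncertainty of that dimension is not inflated at this step.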
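  53. Coefficient matrices that take context into account
     • The design of the proposed coefficient matrices is based on structural time series models.
     • Design when the state transition handles only the level component:
       $Z_t = x_t^\top, \quad R_t = x_t, \quad T_t = I_D, \quad x_t \in \{0,1\}^D$
     • The design extends easily so that the state transition includes a trend component as well as the level component:
       $Z_t = (x_t^\top, 0_{1 \times D}), \quad R_t = \begin{pmatrix} x_t & 0_{D \times 1} \\ 0_{D \times 1} & x_t \end{pmatrix}, \quad T_t = \begin{pmatrix} I_D & \mathrm{diag}(x_t) \\ 0_{D \times D} & I_D \end{pmatrix}, \quad x_t \in \{0,1\}^m, \quad \alpha_t[0:D] = w_t$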
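  54. Proposed policy: LGSS bandits
     • Probability matching using the state estimates of the linear Kalman filter:
       1. For each arm $l$, estimate the mean $\mu_t^{(l)}$ and covariance matrix $\Sigma_t^{(l)}$ of the state $\alpha_t^{(l)}$ ($= w_t^{(l)}$).
       2. Draw a random vector from the multivariate normal distribution defined by these estimates and use it as the parameter $\tilde{w}_t^{(l)}$ of each arm.
       3. Select the arm for which the inner product of $\tilde{w}_t^{(l)}$ and the context $x_t$ is largest.
       $l_t^* = \arg\max_{l=1,\dots,L} x_t^\top \tilde{w}_t^{(l)}, \quad \tilde{w}_t^{(l)} \sim \mathcal{N}_D\left(\mu_t^{(l)},\ \Sigma_t^{(l)}\right)$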
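  55. Estimating the state of an arm with the linear Kalman filter (shown again)
     • The state is estimated continuously by repeating one-step-ahead prediction and filtering with the observed result.
     • The state $\alpha$ can be expressed by its mean $\mu$ and its covariance matrix $\Sigma$.
     • State update cycle of the linear Kalman filter:
       One-step-ahead observation prediction: $\hat{y}_t = Z \mu_t$
       Observation error: $v_t = y_t - \hat{y}_t, \quad F_t = Z \Sigma_t Z^\top + H$
       Filtering (makes the covariance smaller): $\mu_{t|t} = \mu_t + G v_t = \mu_t + (\Sigma_t Z^\top F_t^{-1}) v_t, \quad \Sigma_{t|t} = \Sigma_t - G F_t G^\top$
       One-step-ahead state prediction (makes the covariance larger): $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$
       Then $t = t + 1$.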
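  56. Virtual exploration using missing values
     • The linear Kalman filter can treat the case where no observation is obtained as a missing value.
     • This missing-value handling is adopted as the update operation for the arms that were not selected.
     • Missing-value cycle of the linear Kalman filter:
       Filtering (skipped): $\mu_{t|t} = \mu_t, \quad \Sigma_{t|t} = \Sigma_t$
       One-step-ahead state prediction (makes the covariance larger): $\mu_{t+1} = T \mu_{t|t}, \quad \Sigma_{t+1} = T \Sigma_{t|t} T^\top + R Q R^\top$
       Then $t = t + 1$.
     • Expected effect: through probability matching, exploration of arms that have had few chances to be selected is encouraged.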
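A scalar sketch may make the effect of the missing-value cycle concrete. It assumes a one-dimensional local level model ($T = Z = R = 1$), which is a simplification of the contextual model on the slides, and the function and parameter names are hypothetical. Passing `reward=None` stands for an arm that was not selected on this trial.

```python
def local_level_step(mu, var, state_var, obs_var, reward=None):
    """One Kalman cycle for a scalar local level model.

    If reward is None the observation is treated as missing: filtering is
    skipped and only the one-step-ahead prediction is applied.
    """
    if reward is not None:
        f = var + obs_var             # F_t = Sigma_t + H
        gain = var / f                # G = Sigma_t / F_t
        mu = mu + gain * (reward - mu)
        var = var - gain * f * gain   # Sigma_{t|t} = Sigma_t - G F_t G
    # Prediction: the mean is carried over and the variance grows by the state noise.
    return mu, var + state_var


# An unselected arm: the mean is unchanged while the variance keeps growing.
mu_idle, var_idle = 0.0, 1.0
for _ in range(3):
    mu_idle, var_idle = local_level_step(mu_idle, var_idle, state_var=0.1, obs_var=1.0)

# A selected arm: the observation shrinks the variance before the prediction step.
mu_seen, var_seen = local_level_step(0.0, 1.0, state_var=0.1, obs_var=1.0, reward=1.0)
```

Because the sampled parameter of the idle arm comes from an ever wider distribution, probability matching will eventually select it again, which is the virtual exploration described above.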
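  57. Evaluation data and recommendation methods
     • The effectiveness of the proposed system is evaluated with actual data, collected from a real e-commerce site, on how the click-through rates of four recommendation methods changed over time in each product category.
     • The data is the same as in Chapter 4, but the target period was shortened because of the limit on evaluation time.
     • From "about 2.25 million recommendations from 2019/6/20 to 8/4" to "about 1.49 million recommendations up to 7/22".
     • Figure: click-through rate (0.05 to 0.15) over hours (0 to 600) for Browsing path, Demographic, LLR, and Similar image. Annotation: the method with the highest click-through rate switches. Even in this period, the typical characteristics of the recommendation methods appear.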
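  58. Evaluation method (1/2) (same as Chapter 4)
     • Simulation of the number of clicks obtained from the selected recommendation method.
     • A recommendation produced by the method that the policy selected is clicked according to a Bernoulli distribution with a configured click-through rate.
     • Each recommendation method has an 18-dimensional parameter $\tilde{w}_t^{(l)}$, one dimension per product category.
     • The click-through rate is the inner product of $\tilde{w}_t^{(l)}$ and the context $x_t$.
     • The context $x_t$ is the one-hot vector of the product category the user is browsing at time $t$.
     • To match the behavior of a real recommender system, rewards are fed back in hourly batches.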
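  59. Evaluation method (2/2)
     • Policies used in the simulation (target; policy; remarks):
       Non-stationary and contextual:
         LGSS bandits (proposed policy): state space model type.
         TVTP [42]: state space model type. Parameter tuning and the experiment run did not finish within the evaluation time (24 hours), so it was only partly evaluated.
         AdTS [43]: change detection type. Detects changes with the bootstrap method on the series of Mahalanobis distances from the mean and covariance matrix of the split series.
         Decay LinUCB [68]: forgetting type.
         dLinUCB [70]: change detection type. Detects errors in reward prediction and adds a bandit model for the new reward distribution. Re-initializes periodically when a threshold drops.
       Contextual:
         LTS [47]
         LinUCB [33]
         Neural Linear (+ LTS) [76]: the training interval of the neural network is exponential, chosen in view of the evaluation time limit and the ability to follow changes.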
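  60. Evaluation results: effect on response time
     • Figure: elapsed time of arm selection and of evaluation update (2665 trials, 18 dimensions, 4 arms). Values are paired with policies in legend order:
       adaptive thompson sampling: select 0.48 ms per trial; update 1.49 s per 2665 trials
       decay linucb: select 0.19 ms; update 0.04 s
       dynamic linucb: select 0.04 ms; update 0.06 s
       LGSS bandits: select 0.24 ms; update 0.09 s
       linear thompson sampling: select 0.29 ms; update 0.04 s
       linucb: select 0.04 ms; update 0.05 s
       neural linear: select 0.27 ms; update 1.82 s
     • Arm selection takes 0.24 milliseconds for LGSS bandits.
     • All measurements are below 0.5 milliseconds, which can be judged sufficient performance with no serious impact.
     • The evaluation update is sufficiently short compared with the unit time (1 hour).
     • TVTP, AdTS, and Neural Linear do not scale as the number of trials grows.
     • For reference, with TVTP (5 particles), arm selection takes 1.32 milliseconds and the evaluation update takes 13.6 seconds.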
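  61. Evaluation results: effectiveness in the non-stationary and contextual setting
     • Figures: cumulative regret and best-arm selection rate for each policy (adaptive thompson sampling, decay linucb, dynamic linucb, LGSS bandits, linear thompson sampling, linucb, neural linear, select best arm at first, select random arm); click-through rate of each recommendation method over time (Browsing path, Demographic, LLR, Similar image) over 0 to 600 hours.
     • The proposed policy, LGSS bandits, achieved the highest cumulative reward (the lowest regret).
     • The change detection policy AdTS came second.
     • → The stationary policies LTS and LinUCB came next, which shows that in this evaluation it was also important to reduce opportunity loss by suppressing exploration in stationary periods. LGSS bandits and AdTS reduced opportunity loss during stationary periods as well as when the best arm switched.
     • Compared with consistently using the arm that was best at the start of the period across all product categories, the proposed policy increased cumulative reward by about 6.5%.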
  62. Evaluation results: effectiveness in the non-stationary and contextual setting
     • Figures: cumulative regret and best-arm selection rate for each policy; click-through rate of each recommendation method over time.
     • The change detection policy dLinUCB followed changes well after detecting them, but its regret grew in the first half, so its cumulative regret ended up higher than that of the stationary policies.
     • → This is caused by the periodic addition of bandit models and their re-evaluation.
     • The forgetting policy Decay LinUCB struggled to balance exploitation and exploration because the number of observations was uneven across the context factors.
  63. Evaluation results: effectiveness in the non-stationary and contextual setting
     • With virtual exploration, the proposed policy followed changes after detecting them while also reducing opportunity loss in stationary periods.
     • → AdTS and dLinUCB, which lack virtual exploration, cope with only one of the stationary and non-stationary cases, so their overall performance was worse than that of the proposed policy.
     • Figures (categories Stationery and Plants): change of click-through rate (Browsing path, Demographic, LLR, Similar image) and best-arm rate (adaptive thompson sampling, dynamic linucb, LGSS bandits, linear thompson sampling) over 0 to 600 hours, with Periods 1 to 3 marked.
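  64. Discussion of the proposal
     • For the variance of the observation noise $\sigma_\epsilon^2$, a value larger than the maximum likelihood estimate of the Kalman filter was effective for stable state estimation. This should be kept in mind when deploying to production.
     • → The reason is that missing-value handling keeps the values of the state covariance matrix high, so the state is corrected by a large amount.
     • The proposed policy shortens execution time by assuming a linear Gaussian state space model. In non-linear contextual settings where this assumption does not hold, lower estimation accuracy and higher opportunity loss are a concern.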
  65. Overview of this research
     Adaptable Information systems for Dynamic and Evolving Environment
     • Diagram labels: Context; Non stationarity; Online performance; Context (Non-linear); Multi-armed bandit policies; Chapters 3, 4, 5, 6.
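  66. The contextual multi-armed bandit problem and non-linearity
     Approaches considered for handling non-linearity between the context $x_t$ and the reward $r_t$:
       1. A linear policy whose input is $\tilde{x}_t$, the context transformed non-linearly by a pre-trained model → it is unclear whether results obtained in a different task domain help reward estimation in the target domain.
       2. A policy using a state space model with non-linear equations [42,74] → it is difficult to design and supply the state equation and the observation equation explicitly and individually.
       3. A policy using a non-linear regression model [76~82] → neural network policies that provide neither uncertainty nor a recursive learning mechanism are not suited to bandits.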
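  67. The contextual multi-armed bandit problem and non-linearity
     Issues of policies that use Gaussian process regression (GP) as the non-linear regression model
     • GP obtains, from data, a distribution over functions as a stochastic process.
     • High expressive power through a large number of basis functions $\phi: \mathbb{R}^D \to \mathbb{R}$.
     • With the kernel method, the basis functions need not be written out explicitly.
       $k(p, q) \triangleq (p^\top q + c)^m, \quad m = 2, \quad p, q \in \mathbb{R}^2$
       This polynomial kernel equals the result of transforming with six basis functions and taking the inner product.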
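The claim about the six basis functions can be verified numerically. Expanding $(p_1 q_1 + p_2 q_2 + c)^2$ gives six terms, which correspond to the feature map $\phi(p) = (p_1^2,\ p_2^2,\ \sqrt{2} p_1 p_2,\ \sqrt{2c}\, p_1,\ \sqrt{2c}\, p_2,\ c)$. The following Python sketch checks this for one pair of points; the function names are hypothetical and $c \ge 0$ is assumed.

```python
import math


def polynomial_kernel(p, q, c=1.0):
    """k(p, q) = (p^T q + c)^2 for two-dimensional p and q."""
    return (p[0] * q[0] + p[1] * q[1] + c) ** 2


def feature_map(p, c=1.0):
    """The six explicit basis functions whose inner product reproduces the kernel."""
    return [
        p[0] ** 2,
        p[1] ** 2,
        math.sqrt(2.0) * p[0] * p[1],
        math.sqrt(2.0 * c) * p[0],
        math.sqrt(2.0 * c) * p[1],
        c,
    ]


p, q = (1.0, 2.0), (3.0, -1.0)
via_kernel = polynomial_kernel(p, q)
via_features = sum(a * b for a, b in zip(feature_map(p), feature_map(q)))
```

Both `via_kernel` and `via_features` evaluate the same quantity, but the kernel form never builds the six-dimensional vectors.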
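  68. The contextual multi-armed bandit problem and non-linearity
     Issues of policies that use Gaussian process regression (GP) as the non-linear regression model
     • GP obtains, from data, a distribution over functions as a stochastic process.
     • High expressive power through a large number of basis functions $\phi: \mathbb{R}^D \to \mathbb{R}$.
     • With the kernel method, the basis functions need not be written out explicitly.
     • Because it handles both non-linearity and uncertainty, it fits the multi-armed bandit problem well [88,89,90].
     • → It involves computing the kernel matrix $K \in \mathbb{R}^{N \times N}$, with entries $k(x_i, x_j)$, and its inverse $K^{-1}$, so training time grows steeply with the number of training samples $N$ (the slide says "exponentially"; inverting a dense $N \times N$ matrix costs $O(N^3)$).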
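  69. Weighted GP-UCB [95] [Y. Deng 2022]
     • A non-stationary, non-linear contextual policy of the forgetting type.
     • Uses random Fourier features to compute the predictive distribution of Gaussian process regression in the form of an $R$-dimensional linear regression with $R \lll N$.
     • Handles non-stationarity flexibly with two weight matrices.
     • → Avoids the exponential growth of training time as the number of samples $N$ increases.
     • Diagram: $K \simeq Z Z^\top$ with $K \in \mathbb{R}^{N \times N}$ and $Z \in \mathbb{R}^{N \times R}$; the matrix actually handled is $Z^\top Z \in \mathbb{R}^{R \times R}$, which is far smaller than $K$.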
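  70. Weighted GP-UCB [95] [Y. Deng 2022]
     • A non-stationary, non-linear contextual policy of the forgetting type.
     • Uses random Fourier features to compute the predictive distribution of Gaussian process regression in the form of an $R$-dimensional linear regression with $R \lll N$.
     • → Avoids the exponential growth of training time as the number of samples $N$ increases.
     • No recursive learning mechanism is provided.
     • → The growth of training time with the number of samples is solved only in part.
     • Diagram: $K \simeq Z Z^\top$ with $K \in \mathbb{R}^{N \times N}$ and $Z \in \mathbb{R}^{N \times R}$; the matrix actually handled is $Z^\top Z \in \mathbb{R}^{R \times R}$, which is far smaller than $K$.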
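  71. NysKRLS [41] [T. Zhang 2020]
     • A non-stationary kernel recursive least squares method.
     • Uses the Nyström approximation to compute the predictive distribution of Gaussian process regression in the form of an $R$-dimensional linear regression with $R \lll N$.
     • Applies recursive least squares.
     • Handles non-stationarity by introducing a forgetting factor.
     • → Makes sequential learning faster as the number of samples $N$ increases.
     • Diagram: $(Z^\top \Gamma Z + \gamma^M \Lambda)^{-1}$, updated with the forgetting factor $\gamma$ and the feature vector $z$ of the new input $x_N$.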
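  72. NysKRLS [41] [T. Zhang 2020]
     • A non-stationary kernel recursive least squares method.
     • Uses the Nyström approximation to compute the predictive distribution of Gaussian process regression in the form of an $R$-dimensional linear regression with $R \lll N$.
     • → Makes sequential learning faster as the number of samples $N$ increases.
     • The forgetting operation is applied repeatedly to the regularization term.
     • → The regularization effect weakens, and overfitting degrades estimation accuracy.
     • Even the countermeasures increase the amount of computation compared with the ordinary method or do not converge to the optimum [98, 46].
     • Diagram: $(Z^\top \Gamma Z + \gamma^M \Lambda)^{-1}$, updated with the forgetting factor $\gamma$ and the feature vector $z$ of the new input $x_N$.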
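  73. Gaussian process regression with random Fourier features (1/2)
     • Random Fourier features approximate the kernel function $k(x_i, x_j)$, the entries of $K \in \mathbb{R}^{N \times N}$, as $\hat{k}(x_i, x_j) = z(x_i)^\top z(x_j)$ using $R' = R/2$ samples from a probability distribution $p(\omega)$.
     • Here $z(x_i) = \sqrt{1/R'} \left(\cos(\omega_1^\top x_i), \sin(\omega_1^\top x_i), \dots, \cos(\omega_{R'}^\top x_i), \sin(\omega_{R'}^\top x_i)\right)$.
     • The probability distribution $p(\omega)$ is determined by the type of kernel function.
     • The computational cost for an individual kernel evaluation increases, but the method can be seen as decomposing the result of the combined operation of applying basis functions and taking the inner product:
       $\phi(x) = (\phi_1(x), \dots, \phi_\infty(x))^\top \in \mathbb{R}^\infty, \quad \phi(x_i)^\top \phi(x_j) = k(x_i, x_j) \approx z(x_i)^\top z(x_j)$
     • → The number of parameter dimensions can be fixed at $R$.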
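As a hedged illustration of the approximation, the following Python sketch builds random Fourier features for one specific kernel. It assumes the Gaussian (RBF) kernel $k(x, y) = \exp(-\lVert x - y \rVert^2 / (2 \ell^2))$, for which $p(\omega)$ is a normal distribution with standard deviation $1/\ell$ per dimension; the slide itself does not fix the kernel. The function names and the `length_scale` parameter are hypothetical.

```python
import math
import random


def sample_frequencies(dim, num_pairs, length_scale, rng):
    """Draw R' frequency vectors omega from p(omega) for the assumed RBF kernel."""
    return [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(dim)]
            for _ in range(num_pairs)]


def rff_features(x, omegas):
    """z(x) = sqrt(1/R') * (cos(w1.x), sin(w1.x), ..., cos(wR'.x), sin(wR'.x))."""
    scale = math.sqrt(1.0 / len(omegas))
    z = []
    for omega in omegas:
        proj = sum(o * xi for o, xi in zip(omega, x))
        z.append(scale * math.cos(proj))
        z.append(scale * math.sin(proj))
    return z


def rbf_kernel(x, y, length_scale):
    """Exact kernel value, used here only to compare against the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))
```

The feature vector has $R = 2R'$ entries regardless of how many data points have been seen, which is what allows a fixed-size linear model to stand in for the kernel matrix.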
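  74. Gaussian process regression with random Fourier features (2/2)
     • Applying the approximation $k(x_i, x_j) \simeq z(x_i)^\top z(x_j)$ yields a form equivalent to a Bayesian linear regression model with a fixed dimension $R$, with $\Lambda = \sigma_w^{-2} I_R$.
     • → Linear recursive least squares can then be applied.
     • Diagram: $K \simeq Z Z^\top$ with $K \in \mathbb{R}^{N \times N}$ and $Z \in \mathbb{R}^{N \times R}$; the matrix actually handled is $Z^\top Z \in \mathbb{R}^{R \times R}$, which is far smaller than $K$.
     • Comparison with the predictive distribution of the original Gaussian process regression model.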
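  75. Recursive weighted Gaussian process regression model (1/3)
     • A forgetting mechanism is introduced into the Gaussian process regression model.
     • It is formulated so that the variance of the distribution of the observation error grows the further back in the past an observation lies:
       $\Gamma^{-1} = \mathrm{diag}\left((g(n))_{1 \le n \le N}\right) \in \mathbb{R}^{N \times N}$
     • Predictive distribution of the weighted Gaussian process regression: in particular, when $\sigma_\epsilon^2 = 1$, $\hat{\mu}''$ coincides with the formulation of linear regression by least squares with a forgetting factor.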
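  76. Recursive weighted Gaussian process regression model (2/3)
     • Recursive least squares is applied to the weighted Gaussian process regression model.
     • To cut computation by avoiding matrix inversion, an update formula linking $P_N$ and $P_{N+1}$ is derived.
     • As the number of sequential updates $N$ grows, an error of $(\gamma^N - 1)\Lambda$ accumulates in $P$ and the regularization effect decreases.
     • → Correcting this error makes recursive least squares inapplicable.
     • Note: when $\gamma = 1$ or $\Lambda = \mathrm{diag}(0)$, the term in question vanishes.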
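  77. Recursive weighted Gaussian process regression model (3/3)
     • Recursive least squares is applied to the weighted Gaussian process regression model.
     • A reformulation that introduces $M$, which tracks the number of times the error has occurred, makes both least squares and error correction sequential.
     • In the proposed method, an estimation error is tolerated, and the correction is applied at an arbitrary interval rather than every time.
     • Diagram: $Z^\top \Gamma Z + \Lambda$ and its inverse $(Z^\top \Gamma Z + \Lambda)^{-1}$ compared with the recursively maintained $(Z^\top \Gamma Z + \gamma^{M} \Lambda)^{-1}$, updated with $\gamma$ and $z$ from $x_1, \dots, x_{N-1}, x_N$.
     • Error correction: 1. compute the inverse of […]; 2. remove the error relative to […]; 3. compute the inverse of […].
     • This operation involves two matrix inversions of an $R \times R$ matrix and is therefore costly. It is run at an arbitrary interval chosen in view of the acceptable estimation accuracy.
     • Note: the value of the error correction is determined not by past training data but by […], so this error correction method is also recursive learning.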
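  78. A policy using the recursive weighted Gaussian process regression model
     ① Prepare a recursive weighted Gaussian process regression (RW-GPR) model for each arm.
     ② Select an arm with the UCB1 policy, using the mean and variance parameters of the predictive distribution of each arm's RW-GPR model.
     ③ Update the RW-GPR model by recursive learning with forgetting, using the reward from the selected arm.
     ③' Correct the error every interval $\tau$.
     • Diagram: the model state $(P_{N-2,M-2}, Q_{N-2,M-2}) \to (P_{N-1,M-1}, Q_{N-1,M-1}) \to (P_{N,M}, Q_{N,M})$ is updated with $\gamma$, $z$, and each new pair $(x_{N-1}, y_{N-1})$, $(x_N, y_N)$, with $M = 0$ after a correction. It yields the predictive distributions $p(y_{N-1} \mid x_{N-1}, X, y)$, $p(y_N \mid x_N, X, y)$, and $p(y_* \mid x_*, X, y)$.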
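  79. Analysis of the regression model
     Evaluation of the regression model's performance and of the characteristics of the error and its correction
     • As a non-stationary, non-linear regression problem, data was generated from two non-linear functions that represent the situation before and after a change ($f_A$ and $f_B$).
     • To evaluate generalization, the training data range was set to $[-3, 3]$ and predictions were made out to $[-4, 4]$, beyond that range.
     • The proposed regression model ($M = 0$) fitted the newer data well.
     • Figure: predictive distribution ($\hat{\mu}''$ and 1σ confidence based on $\hat{\Sigma}''$) with $y_A$, $f_A(x)$, $y_B$, $f_B(x)$; $x$ from −4 to 4, $y$ from −4 to 4.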
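  80. Analysis of the regression model
     Evaluation of the regression model's performance and of the characteristics of the error and its correction
     • The difference between the optimal values of the mean and variance parameters of the predictive distribution and those obtained by recursive learning is zero when $M = 0$, which confirms that the method works correctly.
     • The difference widens as $M$ grows and is especially pronounced outside the training data range.
     • → The reason is that the accumulated error is equivalent to assuming a considerably large variance for the weight parameters of the regression model. Worse generalization was in fact also observed.
     • Figures: estimation error of $\hat{\mu}''$ and of $\hat{\Sigma}''$ for each $M$ (M=0, 200, 400, 600), $x$ from −4 to 4.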
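  81. Evaluation method for the policies
     Simulation of a non-stationary, non-linear contextual multi-armed bandit problem
     • Evaluated with non-stationary Wheel bandits, an extension of Wheel bandits [71].
     • The reward of an arm follows the normal distribution $\mathcal{N}(\mu, \sigma^2)$, and the mean $\mu$ depends on the context $x_t = (x_{t,d})_{1 \le d \le 2}$ as shown in the figure.
     • The bands move counterclockwise over time (one rotation per 4000 trials).
     • Parameters: $\mu_1 = 0.1, \mu_2 = 0.0, \mu_3 = 1.0, \sigma^2 = 0.01, \delta = 0.8, \rho = 4000$.
     • The mean reward of arm $a^{(1)}$ is always the same regardless of context and time.
     • The means satisfy $\mu_2 < \mu_1 \ll \mu_3$, so the expected reward is maximized by selecting the arm that corresponds to a band within that band and arm $a^{(1)}$ elsewhere.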
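  82. Evaluation method for the policies
     • Policies used in the simulation (properties marked ✓ among non-linear, non-stationary, recursive learning; policy; notes):
       RW-GPB (Proposal): non-linear ✓, non-stationary ✓, recursive learning ✓. Evaluated with several correction intervals $\tau$ to compare the effect of error correction.
       GP+UCB (Weighted, RFF) [95]: non-linear ✓, non-stationary ✓. State-of-the-art policy; adopts random Fourier features.
       GP+UCB (Weighted) [95]: non-linear ✓, non-stationary ✓.
       GP+UCB (Sliding Window) [93]: non-linear ✓, non-stationary ✓.
       GP+UCB (Restarting) [93]: non-linear ✓, non-stationary ✓.
       GP+UCB [88~90]: non-linear ✓.
       Decay LinUCB [68]: non-stationary ✓, recursive learning ✓.
     • So that the effectiveness of each policy's regression model can be compared easily, the exploration scaling term was set to $\beta = 1$ for all policies.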
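  83. Evaluation results for the policies
     Comparison of the trade-off between cumulative reward and execution time
     • Figure: cumulative rewards (750 to 950) versus trials per second (log scale, 10^2 to 10^3) for RW-GPB at τ = 1, 4, 40, 100, 400, 800, 1600, GP+UCB (Sliding Window), GP+UCB (Weighted), and GP+UCB (Weighted, RFF).
     • The proposed policy achieved higher cumulative reward and shorter execution time than the state-of-the-art policy.
     • GP+UCB (Weighted) has the highest cumulative reward and the longest execution time.
     • Better approximation accuracy matters for the policy as well.
     • The other non-stationary methods explored chronically more.
     • The linear policy could not estimate correctly.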
  84. Evaluation results for the policies
     Analysis of execution time and effectiveness of the error correction method
     • The proposed policy keeps execution time constant through recursive learning.
     • Policies that do not use RFF show exponentially growing execution time, and training time grows even for those that do.
     • The accumulated error and the number of corrections affect cumulative reward more than they affect execution time.
     • Active error correction is effective.
     • Figures: cumulative execution time in seconds (0 to 80) over the number of trials (500 to 4000) for RW-GPB (τ = 4), GP+UCB (Sliding Window), GP+UCB (Weighted), and GP+UCB (Weighted, RFF); the cumulative rewards versus trials per second trade-off plot.
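  85. Discussion of the proposal
     • The proposed policy inherits the following issues of forgetting-type policies discussed earlier.
       i. Because it is based on a regression model, the countermeasure against upsets that relies on the missing-value handling of state space models cannot be adopted.
       ii. Tuning the forgetting-rate parameter is difficult when the intervals between changes are irregular.
       iii. Issue ii becomes harder when the number of observations is uneven across context factors.
     • The simulation in this evaluation was set up so that these issues have no effect, but countermeasures are desirable before applying the policy in production.
     • → For i, a scheme that applies only the forgetting operation to unselected arms is under consideration.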
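  86. Research summary: Information systems that adapt to diverse and continuously changing environments
     Background and purpose (optimizing selection toward adaptive information systems)
       1. Information systems and environmental change: for an information system to keep functioning in a diverse and continuously changing environment, the challenge is to realize an automated adaptation mechanism in place of conventional manual operation.
       2. Information systems that adapt to environmental change by themselves: adaptive information systems require integration with machine learning models, which design behavior dynamically from data, but it is hard to know in advance which model is truly effective.
       3. Optimizing selection: selecting a machine learning model by evaluating it in production carries opportunity loss from short-term evaluation and the risk of missing the best model, so a mechanism is needed that optimizes this selection process and reduces opportunity loss.
     Issues (in optimizing selection)
       1. Opportunity loss from evaluation in production: information systems that optimize model selection with conventional multi-armed bandit policies cannot take the characteristics of machine learning models into account and cannot suppress opportunity loss sufficiently.
       2. Fluctuation of usefulness with context and the passage of time: the usefulness of a machine learning model varies with the situation of users and the system, and varies over time even in the same situation.
       3. Effect of adaptation time on usefulness: introducing continuous comparative evaluation that also considers context and the passage of time delays responses.
       4. Complex relationships in estimating usefulness: the relationship between context and the usefulness of a machine learning model can be non-linear.
     Results (solutions to the issues in optimizing selection)
       1. Design of an information system that continuously optimizes selection [1]: for issue 1, taking a recommender system as the subject, an adaptive information system platform was designed and evaluated that optimizes selection automatically and continuously with the multi-armed bandit policies proposed below, which take the characteristics of machine learning models into account.
       2. Adaptation to diverse and continuously changing environments [2]: for issue 2, a contextual and non-stationary multi-armed bandit policy using a particle filter was proposed, improving adaptability to diverse and continuously changing environments.
       3. Faster adaptation [3]: for issues 2 and 3, a contextual and non-stationary multi-armed bandit policy using the linear Kalman filter was proposed, making adaptation faster.
       4. Handling non-linearity [4]: for issues 2, 3, and 4, a non-stationary, non-linear contextual multi-armed bandit policy using recursive weighted Gaussian process regression was proposed, handling non-stationarity in non-linear problem settings and improving processing speed.
     References
       [1] Yusuke Miyake, Tsunenori Mine, Synapse: A recommender system that continuously optimizes the selection of recommendation methods according to context (in Japanese), IEICE Transactions D (Japanese Edition), Vol.J103-D, No.11, pp.764-775, Nov. 2020.
       [2] Yusuke Miyake, Tsunenori Mine, Synapse: A meta-recommender system that optimizes the selection of recommendation methods according to context and the passage of time (in Japanese), IEICE Transactions D (Japanese Edition), Vol.J105-D, No.11, pp.641-652, Nov. 2022.
       [3] Yusuke Miyake, Tsunenori Mine, Contextual and Nonstationary Multi-armed Bandits Using the Linear Gaussian State Space Model for the Meta-Recommender System, 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp.3138-3145, Oct. 2023.
       [4] Yusuke Miyake, Ryuji Watanabe, Tsunenori Mine, Online Nonstationary and Nonlinear Bandits with Recursive Weighted Gaussian Process, The 48th IEEE International Conference on Computers, Software, and Applications (COMPSAC 2024) (to appear)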