Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
solving of multi-armed bandit problem in advert...
Search
yoppi
April 20, 2018
2
7.3k
solving of multi-armed bandit problem in advertisement recommendation
yoppi
April 20, 2018
Tweet
Share
More Decks by yoppi
See All by yoppi
Applying oCPC algorithm for production
yoppi
2
720
recommendation system with document similarity
yoppi
0
3.1k
RailsはRubyだ
yoppi
0
230
Featured
See All Featured
Building Your Own Lightsaber
phodgson
103
6.1k
What’s in a name? Adding method to the madness
productmarketing
PRO
22
3.2k
Designing for Performance
lara
604
68k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
232
17k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
44
9.3k
Art, The Web, and Tiny UX
lynnandtonic
298
20k
Building Flexible Design Systems
yeseniaperezcruz
327
38k
The Cost Of JavaScript in 2023
addyosmani
45
7k
The Pragmatic Product Professional
lauravandoore
32
6.3k
The Invisible Side of Design
smashingmag
298
50k
A Modern Web Designer's Workflow
chriscoyier
693
190k
Building a Modern Day E-commerce SEO Strategy
aleyda
38
7k
Transcript
ࠂϨίϝϯυʹ͓͚Δ ଟόϯσΟοτͷద༻ͱղ๏ )JSPLB[V:PTIJEB!ZPQQJCMPH 4QFFF *OD ΦϨγΧφΠτ
import “github.com/yoppi” w !ZPQQJCMPH4QFFF *OD w 6;06UFDIMFBEFOHJOFFS w ϨίϝϯυΤϯδϯ࡞ͬͨΓ w
Ξυαʔό࡞ͬͨΓ w ࠂۀք͘Β͍ʢಈըࠂɺωΠςΟϒΞυʣ w (Pɺ+BWB4DSJQUɺ3VCZ
ࠓ͢͜ͱ w 6;06ͷࠂϨίϝϯυ w ଟόϯσΟοτ w 6;06ͷࠂϨίϝϯυͷద༻ͱղ๏ w ద༻݁Ռ w
·ͱΊͱࠓޙͷ՝
ࠓ͢͜ͱ w 6;06ͷࠂϨίϝϯυ w ଟόϯσΟοτ w 6;06ͷࠂϨίϝϯυͷద༻ͱղ๏ w ద༻݁Ռ w
·ͱΊͱࠓޙͷ՝
6;06ͷࠂϨίϝϯυ ࠂೖߘ ࠂ৴ ΞυωοτϫʔΫ ʜ ʜ ϝσΟΞ ࠂཧళ
6;06ͷࠂϨίϝϯυ هࣄɾίϯςϯπ ʮ͋ͳͨʹ͓͢͢Ίͷهࣄʯ ϝσΟΞͷهࣄԼ෦ʹ ΟδΣοτͱ͍͏ܗͰ ࠂΛ৴͢Δ
6;06ͷࠂϨίϝϯυ
6;06ͷࠂϨίϝϯυ ࠂ ࠂ ࠂ/ ͋ΔࠂͰԿΛ৴͢ΕΫϦοΫͯ͘͠ΕΔͷ͔ʁ ʜ
ࠂϨίϝϯυͷछྨ w ίϯςϯπϕʔε৴ ‣ ͦͷϝσΟΞͷهࣄͱࠂͷྨࣅ͔ΒϨίϝϯυ͢Δ w ϦλʔήςΟϯά৴ ‣ ࠂओͷΛݟͨ͋ͱʹϢʔβʹରͯͦ͠ͷΛϨί ϝϯυ͢Δ
w λʔήςΟϯά৴ ‣ ϢʔβͷଐੑΛ%.1͔Βਪଌ͠ྨࣅ͢ΔଐੑͷࠂΛϨ ίϝϯυ͢Δ w աڈͷՌʹجͮ͘৴ ‣ ͦͷʹաڈʹ৴͞ΕͨใΛͱʹՌ͕Α͘ͳΔΑ ͏ʹϨίϝϯυ͢Δ
ࠓճͷϨίϝϯυΤϯδϯͷత w 6;06ͷച্ʢF$1.ʣΛ্͍͛ͨ w ܭࢉྔΛݮΒ͠ޮΑ͘ϨίϝϯυγεςϜΛӡ༻ ͍ͨ͠
ࠂϨίϝϯυͷछྨ w ίϯςϯπϕʔε৴ w ͦͷϝσΟΞͷهࣄͱࠂͷྨࣅ͔ΒϨίϝϯυ͢Δ w ϦλʔήςΟϯά৴ w ࠂओͷΛݟͨ͋ͱʹϢʔβʹରͯͦ͠ͷΛϨί ϝϯυ͢Δ
w λʔήςΟϯά৴ w ϢʔβͷଐੑΛ%.1͔Βਪଌ͠ྨࣅ͢ΔଐੑͷࠂΛϨ ίϝϯυ͢Δ w աڈͷՌʹجͮ͘৴ w ͦͷʹաڈʹ৴͞ΕͨใΛͱʹՌ͕Α͘ͳΔΑ ͏ʹϨίϝϯυ͢Δ
ࠓ͢͜ͱ w 6;06ͷࠂϨίϝϯυ w ଟόϯσΟοτ w 6;06ͷࠂϨίϝϯυͷద༻ͱղ๏ w ద༻݁Ռ w
·ͱΊͱࠓޙͷ՝
ଟόϯσΟοτ ͕Kݸ ࣌ؒtʹ͓͍ͯi(t)Λબ͠ ใुXi(t)ΛಘΔ ͦΕΛ܁Γฦͯ͠ใुΛ࠷େ Խ͍ͤͨ͞ i(t) Xi(t)
ଟόϯσΟοτ w બࢶͷू߹͔ΒҰͭΛબͨ͠ͱ͖ʹɺͦͷબࢶʹ ର͢ΔใुΛಘΔ͕ଞͷબࢶͷใुใಘΒΕͳ͍ w ͦͷঢ়گԼͰ܁Γฦ͠બͨ͠ͱ͖ʹใुͷΛ࠷େԽ ͢Δ͜ͱΛࢦ͢ ‣ iͷਅͷظµi Λ͍ͬͯΔͳΒظ͕࠷େʹͳ
ΔΛҾ͖ଓ͚Ε͍͍͕ͦΕલఏ݅ͱͯ͠Ͱ͖ ͳ͍ w ͦΕ·ͰʹಘͨใुΛͱʹબࢶΛܾఆ͢ΔઓུΛํ ࡦʢQPMJDZʣͱݺͿ
ଟόϯσΟοτ ʲྫʳ εϩοτϚγϯΛճ(T= 100) Ҿ͚Δͱ͖ʹ ใुΛ࠷େԽͤ͞Δʹʁ K = 4
ଟόϯσΟοτ K = 4 ʲࡶͳํࡦʳ nճશ෦ͷΛҾ͍ͯՌͷྑ͔ͬͨΛɺ Γͷճ(100-4n) Ҿ͖ଓ͚Δ
ଟόϯσΟοτ K = 4 n = 3 ͜ͷͱ͖ճՌ͕Α͔ͬͨΛҾ͖ଓ͚Δ͕ɺ ͔͔ͨͩճͷ୳ࡧͰਅͷՌΛݕূ͢Δ͜ͱ ͦ͠͏
i* = argmaxi∈{1, 2, 3, 4}µi
ଟόϯσΟοτ K = 4 n = 20 ͜ͷͱ͖ճՌ͕Α͔ͬͨΛҾ͖ଓ͚Δ͕ɺ ใुͷྑ͍Λར༻͢Δճ͕গͳ͘ɺ ใुΛ࠷େԽͤ͞Δ͜ͱ͍͔͠
i* = argmaxi∈{1, 2, 3, 4}µi
୳ࡧͱࣝར༻ͷτϨʔυΦϑ w ୳ࡧʢFYQMPSBUJPOʣ w ଞͷͷ΄͏͕Ռ͕ྑ͍ͷͰͱ༧ଌͯ͠Ҿ͍ͯΈΔ w ͨͩɺʮͬͱଞʹྑ͍͕͋ΔͷͰʁʯΛ܁Γฦ͢ͱɺՌ ͷ͍͍ΛҾ͘ճ͕গͳ͘ͳΓύϑΥʔϚϯεΛͩ͢͜ͱ͕Ͱ͖ ͳ͍ w
ࣝར༻ʢFYQMPJUBUJPOʣ w ͋Δظؒͷ͍͔ͭ͘ͷͷՌ͕࠷ྑ͍ͷΛ͞Βʹબ͠ଓ͚ Δ w ͨͩɺʮ͜ͷ࠷ߴʂʯͱͳͬͯͦΕ͔Γબ͍ͯ͠Δͱɺଞʹ ͬͱύϑΥʔϚϯε͕ྑ͍Λݟ͚ͭΒΕͳ͍ w ୳ࡧͱࣝར༻ΛͦΕͧΕదʹ࣮ߦ͢ΔํࡦΛݕ౼͢Δ͜ͱ͕ॏཁ
ଟόϯσΟοτͷओཁͳํࡦ w ЏHSFFEZ w ୯७ͳ૯Γ݁Ռ͔Βྑ͔ͬͨͷΛҾ͖ଓ͚Δ w 6$#ʢ6QQFSDPOpEFODFCPVOEʣ w ඪຊฏۉʹิਖ਼߲ΛՃ͑ͨͷΛείΞͱͯ͠ΛબͿ w
54ʢ5IPNQTPO4BNQMJOHʣ w աڈʹ؍ଌͰ͖ͨ݁Ռ͔ΒϕΠζ౷ܭͰະདྷͷظΛ༧ ଌ͢Δ
ଟόϯσΟοτͷओཁͳํࡦ w ЏHSFFEZ w 6$# w 5IPNQTPO4BNQMJOH
5IPNQTPO4BNQMJOH w ʮ֤͕ظ࠷େͱͳΔ֬ʯΛͱʹϥϯμϜͰΛબ͢Δํ ࡦʢ֬Ұக๏ʣ w ͜ͷ֬ͷΈΛϕΠζ౷ܭͰఆࣜԽͨ͠ͷ͕5IPNQTPO 4BNQMJOH w ظµi ͕ԿΒ͔ͷࣄલ͔ΒΓཱ͍ͬͯΔͱ͠ɺਅͷظµi
ͷࣄޙΛٻΊΛબ͢Δ w ྫ͑ࠂΛྫʹڍ͛Δͱɺࠂ͕ΫϦοΫ͞ΕͨɺΫϦοΫ͞ Εͳ͔ͬͨͱ͍͏ʹྨͰ͖ΔͷͰࣄલϕϧψʔΠ ʢBer(µ): µ∈[0, 1]ʣͱͳΔ w ܭࢉΛ؆ૉʹ͢ΔͨΊɺϕϧψʔΠͷڞࣄલͰ͋Δ#FUB Λ༻͍Δ
5IPNQTPO4BNQMJOHͷٙࣅίʔυ ύϥϝʔλ: α>0, β>0 1: ֤iʹ͍ͭͯni ←0, mi ←0 2:
for i = 1, 2,...,T do 3: ^μi = Beta(mi+α, (ni-mi)+β) 4: i ← argmaxi∈{1,2,...,K}^μi 5: iΛͻ͍ͯใुXi(t)∈{0,1}Λ؍ଌ͢Δ 6: ni ← ni + 1, mi ← mi + Xi(t) 7: end for
ࠓ͢͜ͱ w 6;06ͷࠂϨίϝϯυ w ଟόϯσΟοτ w 6;06ͷࠂϨίϝϯυͷద༻ͱղ๏ w ద༻݁Ռ w
·ͱΊͱࠓޙͷ՝
6;06ͷࠂϨίϝϯυͷద༻ w ଟόϯσΟοτΛࠂϨίϝϯυΞϧΰϦζϜͱͯ͠ 6;06ద༻ w ͕ࠂʹରԠ͢Δ w ํࡦͱͯ͠5IPNQTPO4BNQMJOHΛ࠾༻ w ࣄલͱࣄޙͦͷࠂͷ$53ͷ
w ຖͷ৴݁Ռ $53 Λࣄલͱͯ͠ࣄޙΛܭࢉ͠ɺ ৴͢ΔࠂΛબ͢Δ w #FUBͷܭࢉ࣌ʹ͓͚ΔЋ ЌώϡʔϦεςΟοΫʹܾΊ ଧͪ
5IPNQTPO4BNQMJOHΛબΜͩ༁ w όονߋ৽ʹؤ݈Ͱ͋Δ w ຊ൪ӡ༻͢ΔࡍʹσʔλαΠζ͔Β͘Δܭࢉྔͷ੍ʹΑΓ ใु݁ՌΛϦΞϧλΠϜͰ؍ଌ͢Δ͜ͱ͕͘͠Ԇͯ͠ߋ৽ ͢Δ͜ͱ͕ଟ͍ w ͕ɺͷબ࣌ʹʮظ࠷େͱͳΔ֬ʹ͕ͨͬͯ͠ϥϯμ ϜͰબ͢ΔʯͨΊɺՌͷѱ͍ʹภΔ͜ͱ͕গͳ͍
w ܭࢉྔ͕গͳ͍ w جຊతͳϩδοΫࣄલ͔ΒࣄޙΛ#FUBΛܭࢉ͢ Δ͚ͩͰΑ͍ͷͰܭࢉྔΛ͑ΒΕΔ w ଟ͘ͷϞσϧͰཧݶքΛୡͰ͖Δ͜ͱ͕ΒΕ͍ͯΔ
54Λࠂʹద༻ͨ͠߹ͷٙࣅίʔυ ύϥϝʔλ: α>0, β>0 1: ֤ࠂiʹ͍ͭͯ impressioni ←0, clicki ←0
2: for i = 1, 2,...,T do 3: ^μi = Beta(clicki+α, (impressioni-clicki)+β) 4: i←argmaxi∈{1,2,…,K}^μi*CPCi*1000 5: ࠂάϧʔϓiΛͻ͍ͯใुXi(t)∈{0, 1} Λ؍ଌ͢Δ 6: impressioni ←impressioni+1, clicki ←clicki+Xi(t) 7: end for
Ћ ЌͬͯͲ͏ܾΊΕ͍͍ͷʁ w ࣄલʢ৽نʹೖߘ͞Εͨࠂʣ͕ͳ͍ࠂʹ͓͍ ͯҙຯΛͳ͢ w ؾΛ͚ͭΔ͖ͱ͜ΖฏۉͱࢄͰΦʔμʔ͕มΘ Δ
ϨίϝϯυγεςϜߏ ࠂϚελ ৴ϩά ৴αʔόɾ࣮ੵूܭαʔό ৴σʔλ 0O.FN $BDIF ৴σʔλ࡞όον 0O.FN $BDIF
.FEJB "EWFSUJTFS
(PWT1ZUIPO w ݁ہͷͱ͜Ζ5IPNQTPO4BNQMJOHࣄલ͔Βࣄ ޙΛٻΊΔͱ͖ʹ#FUBΛܭࢉ͢Δ͚ͩͰΑ͍ w ߴͳֶతͳܭࢉٻΊΒΕͳ͍ w ѹతʹ(PͰܭࢉ͢Δ΄͏͕ճͷόονʹΑΔ࣮ߦ ࣌ؒΛ͘Ͱ͖Δ w
όϯσΟοτΛղ͘ͱ͖ʹͷߋ৽ִؒΛͳ Δ͘খ͘͞ͳΔ΄͏͕ྑ͍ w ͦͷͨΊ(PΛ࠾༻࣮ͯ͠
(PWT1ZUIPO ࣮ߦ࣌ؒ T /
(P 1ZUIPO
ద༻݁Ռ w 6;06Λಋೖ͍ͯ͠ΔϝσΟΞʹద༻ w ৴͍ͯ͠Δɺ͔ͭɺJNQSFTTJPO͕ଟ͍ϝσΟΞ w 6;06શମͷF$1.Λ΄Ͳ্Ͱ͖ͨ
·ͱΊͱࠓޙͷ՝ w తͱ͍ͯͨ͠ɺܭࢉྔͷݮٴͼF$1.ͷ্ΛୡͰ͖ͨ w ϋΠύʔύϥϝʔλͷ࠷దԽ w Ћ ЌͷΛࠓώϡʔϦεςΟοΫʹఆ͍ٛͯ͠Δঢ়ଶͳͷͰ ͝ͱͷ࠷దͳΛٻΊ͍ͨ w
બ͢Δͷߋ৽සͷ࠷దԽ w શͯͷʹ͓͍ͯ࣌ؒݻఆͰΛߋ৽͍ͯ͠ΔͷͰ͝ͱͷ࠷ దͳߋ৽සΛٻΊ͍ͨ w $73ΛՃຯͨ͠ͷ༧ଌ w $53ͷΈ༧ଌͨ͠ͷͰΛબΜͰ͍ΔͷͰ$73༧ଌ౿·͑ Λબ͍ͨ͠
ࢀߟจݙ w ྛ ΞυωοτϫʔΫʹ͓͚Δࠂ৴ܭըͷ ࠷దԽ ਓೳֶձࢽ7PM/P w ଟόϯσΟοτͷཧͱΞϧΰϦζϜ IUUQJCJTNMPSHBSDIJWFJCJT JCJT@CBOEJUQEGT
w ຊଟ३ʰόϯσΟοτͷཧͱΞϧΰϦζ Ϝʱߨஊࣾ