Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Offline A/B testing for Recommender Systems
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
alpicola
November 20, 2018
Technology
2.2k
0
Share
Offline A/B testing for Recommender Systems
alpicola
November 20, 2018
More Decks by alpicola
See All by alpicola
[AEON TECH HUB #24] お客様の長期的興味の理解に向けて
alpicola
0
160
商品レコメンドでのexplicit negative feedbackの活用
alpicola
2
960
Recommending What Video to Watch Next: A Multitask Ranking System
alpicola
1
940
Kibanaを用いたアクセスログ調査と解析 / Access Log Analysis Using Kibana
alpicola
0
1k
Other Decks in Technology
See All in Technology
Cortex Codeでデータの仕事を全部Agenticにやりきろう!
gappy50
0
320
ふりかえりを 「あそび」にしたら、 学習が勝手に進んだ / Playful Retros Drive Learning
katoaz
0
380
OCI技術資料 : ロード・バランサ 概要 - FLB・NLB共通
ocise
4
28k
パワポ作るマンをMCP Apps化してみた
iwamot
PRO
0
310
試されDATA SAPPORO [LT]Claude Codeで「ゆっくりデータ分析」
ishikawa_satoru
0
300
【Findy FDE登壇_2026_04_14】— 現場課題を本気で解いてたら、FDEになってた話
miyatakoji
0
170
2026-04-02 IBM Bobオンボーディング入門
yutanonaka
0
250
レガシーシステムをどう次世代に受け継ぐか
tachiiri
0
300
Oracle Cloud Infrastructure(OCI):Onboarding Session(はじめてのOCI/Oracle Supportご利⽤ガイド)
oracle4engineer
PRO
2
17k
すごいぞManaged Kubernetes
harukasakihara
1
350
互換性のある(らしい)DBへの移行など考えるにあたってたいへんざっくり
sejima
PRO
0
560
OCI技術資料 : 証明書サービス概要
ocise
1
7.2k
Featured
See All Featured
Tips & Tricks on How to Get Your First Job In Tech
honzajavorek
1
480
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
160
Git: the NoSQL Database
bkeepers
PRO
432
67k
Leo the Paperboy
mayatellez
6
1.6k
Game over? The fight for quality and originality in the time of robots
wayneb77
1
160
Sam Torres - BigQuery for SEOs
techseoconnect
PRO
0
230
It's Worth the Effort
3n
188
29k
Stop Working from a Prison Cell
hatefulcrawdad
274
21k
More Than Pixels: Becoming A User Experience Designer
marktimemedia
3
370
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.2k
We Are The Robots
honzajavorek
0
210
Rebuilding a faster, lazier Slack
samanthasiow
85
9.4k
Transcript
Offline A/B testing for Recommender Systems ͯͳ ాத (alpicola) @
จಡΈձ 11/19 1
Offline A/B testing for Recommender Systems — CriteoͷWSDM'18ͷจ — SpotifyͷRecSys'18จͰݴٴ
2
Offline A/B testing for Recommender Systems — CriteoͷWSDM'18ͷจ — SpotifyͷRecSys'18จͰݴٴ
— ΫοΫύου։࠵ͷಡΈձͰ͢Ͱʹհ͞Ε͍ͯͨ — ͕ɺվΊͯ۷ΓԼ͕͛ͨͰ͖Εͱࢥ͍·͢ 3
ΦϑϥΠϯABςετ? — ΦϯϥΠϯͰߦ͏ABςετ࣌ؒͱ͕͔͔ۚΔ — ΦϑϥΠϯͰͦΕʹ͍ۙධՁ͕ߦ͑ΕΞϧΰϦζ ϜվળͷαΠΫϧΛߴԽͰ͖Δ — Ͱਫ਼? ! 4
ϩάʹجͮ͘ΦϑϥΠϯධՁͷݚڀ — Counterfactual estimationͱ͔off-policy estimationͱ ݺΕΔ — WSDM'15ͷνϡʔτϦΞϧ — SIGIR'16ͷνϡʔτϦΞϧ
— ධՁ͚ͩͰͳֶ͘शͷతؔʹ͏͜ͱͰ͖Δ — ͜ͷจͰධՁͷΈΛѻ͏ 5
จͷߩݙ — ΦϑϥΠϯABςετͰ༻͍Δใुͷਪఆख๏NCISͷ ͋Δछͷ࠷దੑΛࣔ͢ — ͜ͷݟʹج͍ͮͯNCISͷ֦ுPieceNCISͱ PointNCISΛఏҊ — ΦϯϥΠϯABςετ݁Ռͱͷ૬͕ؔେ্͖͘ 6
ઃఆ — Top-k ϥϯΩϯά — : ϩά — : ίϯςΩετ
— : ΞΫγϣϯ — : ใु 7
ઃఆ — : ίϯςΩετ͔ΒΞΫγϣϯΛબͿϙϦγʔ — : ݱߦͷϙϦγʔ — : ςετ͍ͨ͠ϙϦγʔ
— : ฏۉॲஔޮՌ — ͜ΕΛਪఆ͍ͨ͠ 8
ઃఆ — ΦϯϥΠϯABςετ — ͷݩͰͷϩάͱ ͷݩͰͷϩά͕͋Δ — ඪຊฏۉͰ , ͦΕͧΕਪఆ
— ΦϑϥΠϯABςετ — ͷݩͰͷϩά͔Β ਪఆ ! 9
ैདྷख๏ — Importance sampling (IS) — Normalized importance sampling (NIS)
— Doubly robust estimator (DR) — Capped importance sampling (CIS) — Normalized capped importance sampling (NCIS) ౷ܭϞϯςΧϧϩ๏ͷจ຺Ͱొ 10
Importance sampling (IS) — ! όΠΞε͕ͳ͍ — — " ʹΑΔߴόϦΞϯε
(unbounded) — όϦΞϯε͕େ͖͍ͱ ͱ ΛൺֱͰ͖ͳ͍ 11
Normalized importance sampling (NIS) Λͬͯ Λஔ͖͑ — ! ҰகਪఆྔʹͳΔ —
— " ґવͱͯ͠όϦΞϯεେ 12
Capped importance sampling (CIS) ॏΈͷ࠷େΛ ʹ (max capping) ॏΈ͕ Ҏ্ͷ߲ࣺͯΔ
(zero capping) 13
CISͷόΠΞε 14
CISͷόΠΞε — όΠΞε ͷ࣌ͷ Ͱbound͞ΕΔ — — ใु͕େ͖͍ͱ͜ΖΛऔΕΔΑ͏ʹվળ͍ͨ͠ ͕ͦ͏͢ΔͱόΠΞε͕େ͖͘ͳΔ !
15
CISͷόΠΞε Cappingͷઃఆʹ͍͍τϨʔυΦϑ͕ଘࡏ͠ͳ͍ ! 16
Normalized capped importance sampling (NCIS) NIS, CIS྆ํͷΞΠσΞΛ࣋ͪࠐΉ 17
NCISͱCISͷؔ 18
NCISͱCISͷؔ CIS͕͍࣋ͬͯͨόΠΞε Λୈೋ߲ͰϞσϧ ͍ͯ͠ΔͱݟͳͤΔ 19
NCISͱCISͷؔ (ಛʹzero cappingͷ࣌) 20
NCISͱCISͷؔ (ಛʹzero cappingͷ࣌) — ͳΒۙతʹόΠΞ ε͕ͳ͘ͳΔ ! — ͷ ,
ʹର͢Δґଘ͕খ͍࣌͞ͳͲ 21
NCISͷόΠΞε 22
NCISͷόΠΞε — ͱcappingͷ༗ແʹ૬͕ؔ͋ΔͱόΠΞε͕େ͖͘ ͳΔ ! — ަབྷҼࢠϢʔβʔͷλΠϓͳͲ͕ߟ͑ΒΕΔ (Table 1) 23
NCISͷόΠΞε 24
จͷΞΠσΞ — ͷϞσϦϯάΛάϩʔόϧ㱺ϩʔΧϧʹ — ίϯςΩετ ʹରͯ͠ہॴతͳNCIS — ͱcappingͷ૬ؔΛݮΒ͢ — Piecewise
NCIS: ׂ͞ΕͨྖҬ͝ͱʹNCIS — Pointwise NCIS: ཁૉ͝ͱʹNCIS 25
Piecewise NCIS (PieceNCIS) ίϯςΩετͷू߹ ͷׂ Λߟ͑Δ 26
Piecewise NCIS (PieceNCIS) ׂ֤ʹରͯ͠NCIS 27
ׂͷྫ దͳؔ ΛఆΊͯ ֤ Ͱ ͷ ʹର͢Δґଘ͕খ͘͞ͳΔΑ͏ʹ 28
Pointwise NCIS (PointNCIS) ཁૉ୯ҐͰׂ͢Δ (i.e. ) ಛఆͷίϯςΩετʹର͢Δαϯϓϧ͘͝গͳ͍ͷ ͰૉʹNCISΛద༻Ͱ͖ͳ͍ 29
Pointwise NCIS (PointNCIS) — ΞΫγϣϯʹ͍ͭͯपลԽ͢Δ ͱਖ਼֬ʹٻΊΒΕΔ — ΞΫγϣϯͷ͕ଟ͍ͱܭࢉ͕ߴίετ ! —
ΛαϯϓϦϯάͰٻΊΔ 30
Midzuno-Sen method 1. Λαϯϓϧ 2. Λ ͔Β ͳͷ͕ಘΒΕΔ·Ͱαϯϓϧ 3. Λ
͔Βαϯϓϧ 4. Λฦ͢ ͜͏ͯ͠ಘΒΕΔΛ ͱॻ͘ 31
Pointwise NCIS (PointNCIS) — ͷ͏ͪ ͕ ͷσʔλແࢹͰ͖Δ — ใु͕εύʔεͳ࣌ʹޮతʹܭࢉͰ͖Δ !
32
࣮ݧ — ϓϩϓϥΠΤλϦͷσʔληοτ — 39छɺ߹ܭͰઍԯ݅ͷϩάσʔλ — ΫϦοΫϕʔεͷใु (εύʔε͔ͭࢄେ) — ରCIS,
NCIS, PieceNCIS, PointNCIS ( ) — IS, NISόϦΞϯε͕ߴ͗͢ΔͷͰআ֎ 33
ΦϯϥΠϯʗΦϑϥΠϯABςετͷ૬ؔ 34
ద߹ͱِӄੑ ʮ ͕ ΑΓΑ͍͔Ͳ͏͔ʯͷ2༧ଌͱͯ͠ݟΔ 35
࣮ݧ݁Ռͷ·ͱΊ — CIS૬͕ؔෛ — શମతʹΊͷਪఆ͕ग़͍ͯͨ (Figure 4) — CIS⇒NCISͰେ͖͘վળ —
NCIS⇒PointNCISͰِཅੑ͕͞ΒʹԼ͕Δ — ద߹NCISҎޙͦ͜·ͰΑ͘ͳΒͳ͍ — ࣮ߦʹ͓͍ͯਫ਼ʹ͓͍ͯPointNCIS͕Α͍ 36
Appendix — ͕খ͍͞ͱ ͕ cappingΛ͑Δ͜ͱ — Max cappingͰ ʹͳΔΑ͏ͳ ৽͍͠capping
͕ͱΕΔ (Lemma A.3) 37