Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
FukuokaR #7
Search
Hiroki Mizukami
March 25, 2017
Science
0
320
FukuokaR #7
https://www.amazon.co.jp/dp/4774188778
Hiroki Mizukami
March 25, 2017
Tweet
Share
More Decks by Hiroki Mizukami
See All by Hiroki Mizukami
音楽配信サービスにおける 推薦システムの概要と 数理モデルについて
hiroki_mizukami
0
200
CADEDA #6 AWAにおけるデータ利活用の取り組みと今後の展望について
hiroki_mizukami
4
2.2k
オンライン広告の数理モデルと数学ソフトウェア MSFD#23
hiroki_mizukami
6
4.6k
Other Decks in Science
See All in Science
眼科AIコンテスト2024_特別賞_6位Solution
pon0matsu
0
240
AI科学の何が“哲学”の問題になるのか ~問いマッピングの試み~
rmaruy
1
2.4k
位相的データ解析とその応用例
brainpadpr
1
820
All-in-One Bioinformatics Platform Realized with Snowflake ~ From In Silico Drug Discovery, Disease Variant Analysis, to Single-Cell RNA-seq
ktatsuya
PRO
0
280
Design of three-dimensional binary manipulators for pick-and-place task avoiding obstacles (IECON2024)
konakalab
0
120
MoveItを使った産業用ロボット向け動作作成方法の紹介 / Introduction to creating motion for industrial robots using MoveIt
ry0_ka
0
240
科学で迫る勝敗の法則(名城大学公開講座.2024年10月) / The principle of victory discovered by science (Open lecture in Meijo Univ. 2024)
konakalab
0
250
(論文読み)贈り物の交換による地位の競争と社会構造の変化 - 文化人類学への統計物理学的アプローチ -
__ymgc__
1
150
Pericarditis Comic
camkdraws
0
1.5k
The Incredible Machine: Developer Productivity and the Impact of AI
tomzimmermann
0
460
局所保存性・相似変換対称性を満たす機械学習モデルによる数値流体力学
yellowshippo
1
140
Snowflakeによる統合バイオインフォマティクス
ktatsuya
PRO
0
560
Featured
See All Featured
VelocityConf: Rendering Performance Case Studies
addyosmani
327
24k
Product Roadmaps are Hard
iamctodd
PRO
50
11k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
3
360
Making the Leap to Tech Lead
cromwellryan
133
9k
Build your cross-platform service in a week with App Engine
jlugia
229
18k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
59k
The World Runs on Bad Software
bkeepers
PRO
66
11k
The Straight Up "How To Draw Better" Workshop
denniskardys
232
140k
Statistics for Hackers
jakevdp
797
220k
Building a Scalable Design System with Sketch
lauravandoore
460
33k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
356
29k
Producing Creativity
orderedlist
PRO
343
39k
Transcript
Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ
Ӣ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ొ ཽ
ཽ ܭ ౷ ొ ཧ ֬ 2
ཧϞσϦϯάͱɺ ౷ܭϞσϦϯάͱɺ ͦΕ͔Βɺࢲɻ 2
ཽ ܭ ౷ ొ ཧ ֬ 3
@ Fukuoka R Mar 25, 2017 य़ Hiroki Mizukami Destroy 3
ཽ ܭ ౷ ొ ཧ ֬ 4
※ݸਓͷݟղɻɻɻ 4
ཽ ܭ ౷ ొ ཧ ֬ 5
5 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
• Έ͔ͣΈ ͻΖ͖ • LINE_ID: @piroyoung • αΠόʔܥͷAI Labɽ •
αʔόαΠυΤϯδχΞ • σʔλαΠΤϯςΟετ • ౦ژࡏॅʗԬग़ • ֶʗࠂʗWeb • Love έΰύʔΫ • R/Python/Scala/javascript/Spark/ Docker/AWS/Stan/Tableau/AWS/GCP ࣗݾհ ϔϏϝλ
Rݴޠ ʢڱٛʣ Rݴޠʢ͋ʔΔ͛Μ͝ʣΦʔϓϯιʔεɾϑϦʔιϑτΣΞͷ౷ܭղੳ͚ ͷϓϩάϥϛϯάݴޠٴͼͦͷ։ൃ࣮ߦڥͰ͋Δɻ RݴޠχϡʔδʔϥϯυͷΦʔΫϥϯυେֶͷRoss IhakaͱRobert Clifford GentlemanʹΑΓ࡞ΒΕͨɻݱࡏͰR Development Core
TeamʢSݴޠ։ൃऀ Ͱ͋ΔJohn M. Chambersࢀը͍ͯ͠Δ[1]ɻʣʹΑΓϝϯςφϯεͱ֦ு͕ͳ ͞Ε͍ͯΔɻ RݴޠͷιʔείʔυओʹCݴޠɺFORTRANɺͦͯ͠RʹΑͬͯ։ൃ͞Εͨɻ - wikipedia -
Rݴޠ ʢٛʣ σʔλੳΛੜۀͱ͢Δܑ͓͞Μ͓Ͷ͐͞ΜୡͷίϛϡχςΟͷ૯শɾ֓೦ɾε ϥϯάɻདྷΔͷશͯڋ·ͳ͍ελΠϧͰɺ࣮ࡍʹσʔλੳΛ͍ͬͯΔ͔ ͢ΒجຊతʹࣗݾਃࠂɻϢʔϞΞͱϢʔϞΞͱਓฑ͕ΛूΊΔϙΠϯτɽෳ ͷελʔτΞοϓϕϯνϟʔΛੜΈग़͍ͯ͠Δɽ ͱ͋Δ౷ܭʹΑΔͱ࣮ࡍʹRΛ͔ͭͬͯΔͻͱ Α͏͢ΔʹɼࠓRͷίΞͳ͠ͳ͍ͬͯ͜ͱͰ͢͢Έ·ͤΜɽ - mikipedia
-
ཽ ܭ ౷ ొ ཧ ֬ 9
9 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
ཧϞσϦϯά ཧϞσϦϯά ͱσʔλͷதʹ͋ΔߏΛࣜͰهड़͢Δ͜ͱ ྫ͑͜Μͳσʔλ͕༗Δ ͜ͷͱ͖όωAʹؔͯ͠ ʦόωͷ͞ʧʹ 0.2 x [͓Γͷॏ͞] +
3 ͱݱʹؔ͢Δࣜͷදݱ͕ಘΒΕΔɽ
ཧϞσϦϯά Ͳ͏ͬͨʁ όωAʹؔͯ͠ҎԼͷ࿈ཱํఔ͕ࣜͨͯΒΕΔ ͜ΕΛղ͚
ཧϞσϦϯά Կ͕͏Ε͍͠ʁ • ݱ࣮ͷͷߟʹֶͷςΫχοΫͰ͑ΒΕΔɽ • ײ͕ٴͳ͍ʹ͑Δ • ݫີ • ఆྔత
• ʮόωAͷํ͕৳ͼ͍͢ʯ
ཧϞσϦϯά ݫີʻʼײɼఆྔతʻʼఆੑత ʮؾԹ͕ߴ͍ͱδϝδϝ͢ΔͶ͐ʯ ͜Ε͜ΕͰॏཁɽ
ཽ ܭ ౷ ొ ཧ ֬ 14
ʮͱΓ͋͑ͣɺՄࢹԽ͠Αʁʯ 14
ཽ ܭ ౷ ొ ཧ ֬ 15
ʮࣜɺͨͯΐʁʯ 15
ཽ ܭ ౷ ొ ཧ ֬ 16
ʮσʔλΛೖ͠Αʁʯ 16
ཽ ܭ ౷ ొ ཧ ֬ 17
ʮύϥϝλܭࢉͰ͖ͨ͊ʂʂʯ 17
ཽ ܭ ౷ ొ ཧ ֬ 18
18 Click = CTR · Imp
ཽ ܭ ౷ ొ ཧ ֬ 19
19 pV = nRT
ཽ ܭ ౷ ొ ཧ ֬ 20
20
ཽ ܭ ౷ ొ ཧ ֬ 21
21 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
౷ܭϞσϦϯά ౷ܭϞσϦϯά ͱ֬ʹجͮ͘ཧϞσϦϯάɽ ֬มΛؚΉϞσϧࣜΛ༻͍Δɽ ֬มͱϥϯμϜͳৼΔ͍ʹ؍ଌΛରԠ͚ΔΈͷ͜ͱɽ ཁ͢Δʹ ʮ ͕ग़ͨ−ʂʂʯʹʼ 1 ͬͯͳ۩߹ɽ
X : ! 2 ⌦ 7! X(!) 2 R
౷ܭϞσϦϯά ࣄͱߟͷରͱ͢ΔϥϯμϜͳৼΔ͍ͷ͋ͭ·Γɽ ͜Ε؍ଌ͕͇ΛԼճΔͱ͍͏ৼΔ͍ͷू·Γͷ͜ͱ ֶతͳఆٛ X : ! 2 ⌦ 7!
X(!) 2 R X < x [ X < x ] := X 1([ 1 , x )) = { ! 2 ⌦| X ( ! ) < x }
౷ܭϞσϦϯά ֬ͱ؍ଌͷཚࡶ͞ͷֶతදݱ ͜Εਖ਼نͰ͜Μͳײ͡ʹද͢ɽ ʮ֬มX͕ฏۉμɼඪ४ภࠩσͷਖ਼نʹै͏ʯͱಡΉɽ μσͳͲͷΛݸੑ͚ΔύϥϝλΛ ͱ͍͏ɽ X ⇠ N(µ,
2)
౷ܭϞσϦϯά ਪఆͱɼσʔλΛͱʹΛ༧͢Δ͜ͱ ʮΉΉʔʂ͜Ε֬0.5Ͱද͕ग़Δͷ͔͠Εͳ͍ʂʯ ʮͬͺ10ͷ1͘Β͍͔͠Εͳ͍ɽɽɽʯ ͜ͷਪఆͷʢͬͱʣΒ͠͞ͱݺΕ͍ͯΔ ද ཪ ද ཪ ཪ
ཪ ཪ ཪ ཪ ཪ ཪ ཪ
ཽ ܭ ౷ ొ ཧ ֬ 26
26 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
ઢܗճؼϞσϧ ҎԼͷΑ͏ͳσʔλ͕༗Δɽ ͕ɼ࣮෩͕ਧ͍ͯͯਖ਼֬ʹܭଌग़དྷͯͳ͍ͬΆ͍ɽ ࠷ॳͱ͓ͳ͡ઢܗͷϞσϧࣜʹσʔλΛೖͯ͠ΈΔͱ
ઢܗճؼϞσϧ
ཽ ܭ ౷ ొ ཧ ֬ 29
ղ͚ͳ͌ɻɻɻ 29
ཽ ܭ ౷ ొ ཧ ֬ 30
ղͷͳ͌ɺ࿈ཱํఔࣜɻɻɻ 30
ཽ ܭ ౷ ొ ཧ ֬ 31
୳ͯ͠ɺݟ͔ͭΒͳ͌ͬͯίτɻɻɻ 31
ཽ ܭ ౷ ొ ཧ ֬ 32
͏ŵŧƄແཧɻ౷ܭ͠ΐɻɻɻ 32
ઢܗճؼϞσϧ ͜ͷϞσϧ؍ଌޡ͕ࠩߟྀ͞Ε͍ͯͳ͌ɻɻɻ ਖ਼نͷޡࠩԾఆ͢Δ y = ✓0 + ✓1x +✏ y
= ✓0 + ✓1x ✏ ⇠ N(0, 2)
ઢܗճؼϞσϧ ਖ਼نʹै͏ޡࠩΛԾఆͨ͠ϞσϧΛઢܗճؼϞσϧͱ͍͏ ✏ ⇠ N(0, 2) Y (✓0 + ✓1X)
⇠ N(0, 2) Y ⇠ N(✓0 + ✓1X, 2)
ཽ ܭ ౷ ొ ཧ ֬ 35
ਪఆ͠ΐɻɻɻ 35
ઢܗճؼϞσϧ ਖ਼نʹै͏ޡࠩΛԾఆͨ͠ϞσϧΛઢܗճؼϞσϧͱ͍͏ Ұ൪Β͍͠θͱσΛܭࢉ͢Δ ͜͜Ͱ Y ⇠ N(✓0 + ✓1X, 2)
L(✓1, ✓2, ) = Y i 1 p 2⇡ 2 e (yi µi)2 2 2 µi = ✓0 + ✓1xi
ઢܗճؼϞσϧ ରؔ θͷਪఆԼઢ෦Λ࠷খʹ͢Ε͍͍ࣄ͕Θ͔Δ ͜ΕΛ࠷খ2๏ͱ͍͏ɽ = 0
ઢܗճؼϞσϧ σʹؔ͢Δํఔࣜ ͜ΕΛղ͚ ͕ಘΒΕΔɽ͜Εඪຊࢄɽ @ @ log L ( ✓1,
✓2, ) = 0 2 = 1 n X i (yi µi)2
ઢܗճؼϞσϧ Rͩͱ؆୯ʹܭࢉͰ͖Δɽ
ཽ ܭ ౷ ొ ཧ ֬ 40
PythonͩͬͨΒɻɻɻ statsmodels / sklearn.linear_model.*** 40
ཽ ܭ ౷ ొ ཧ ֬ 41
41 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ղऍͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά
ઢܗճؼϞσϧ ղऍλεΫ ؍ଌ͞Εͨσʔλͷੑ࣭ΛௐΔɽ ੑผ༧ଌϞσϧ αΠτAΛݟͯΔͷஉੑ͕ଟ͍ɽ
ઢܗճؼϞσϧ ղऍλεΫ ͜ͷCPAʢ͋ͨΓίετʣࢪࡦͷྑ͞ͷධՁͱͯ͠༗ޮ Ͱɽɽɽ ʮ2ஹԁग़ͨ͠ΔΘ ɼ2ԯCVΖʯʹʼ͑ͬɾɾɾ ͪΖΜແཧ͕͋Δ CV = 1
CPA · Cost
ઢܗճؼϞσϧ ղऍλεΫ ͜ͷCPAʢ͋ͨΓίετʣࢪࡦͷྑ͞ͷධՁͱͯ͠༗ޮ Ͱɽɽɽ ʮ2ஹग़ͨ͠ΔΘʯ ʹʼ 2ԯCVʁʁʁ ͪΖΜແཧ͕͋Δ CV =
1 CPA · Cost y=x/CPA
ઢܗճؼϞσϧ ൚ԽλεΫ ະͷσʔλʹର͢Δ༧ଌੑೳࢸ্ओٛ • Neural Network • Gradient Boosting Decision
Tree • SVM with some kernel • Ridge/Lasso • Feature Hashing ౷ܭతͳͷΈͰಈ͍͍ͯͳ͍͕ଟ͍ Α͘Θ͔ΒΜ͕Կނ͔ͨΔ
ઢܗճؼϞσϧ ൚ԽλεΫ minimize: loss(label, Feature) Feature Label
ཽ ܭ ౷ ొ ཧ ֬ 47
47 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ղऍͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά
• ཧϞσϦϯάΛ༻͍Εݱ࣮ͷΛֶͷϊ ϋͰղܾͰ͖Δ • ౷ܭతͳςΫχοΫΛ͏͜ͱͰߋʹॊೈʹ • ൚ԽͱղऍϞσϧผͷςΫχοΫ ·ͱΊ
ੈా୩۠ࡏॅ H.M͞Μ ʮ࠷ॳʰ͜Μͳॻ੶Ͱඞཁͳ͕ࣝΈʹͭ͘ͳΜͯɾɾɾʱͱ͍͏ؾ࣋ͪ ͋Γɺ৴ٙͰ͜ͷຊΛखʹऔΓ·ͨ͠ɻ͍͟खʹͱͬͯݟΔͱShell ScriptSQLͷجૅͪΖΜɼPythonʹΑΔ࣮ફతͳΞϓϦέʔγϣϯͷ ࡞Γํ·Ͱஸೡʹղઆ͞Ε͍ͯͯ༧Ҏ্ͷϘϦϡʔϜͰͨ͠ɻͱ͘ʹۤख ͩͬͨ౷ܭϞσϦϯάטΈࡅ͍ͯॻ͔Ε͍ͯͯऔֻ͔ͬΓʹ࠷ߴͩͬͨ ͱࢥ͍·͢ɻ2000ԁऑͱ͍͏Ձֶ֨ੜʹخ͍͠Ͱ͢ɻࠓͰຖ൴ঁͱ ͤʹΒͯ͠ډ·͢ɻʯ ͨͳ͠ΎΜύΫͬͨ͝ΊΜ