Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
データサイエンティストに同じクエリは二度も通じぬ
Search
Takahiro Yoshinaga
December 07, 2019
Technology
2
960
データサイエンティストに同じクエリは二度も通じぬ
Presentation in Japan.R 2019
Takahiro Yoshinaga
December 07, 2019
Tweet
Share
More Decks by Takahiro Yoshinaga
See All by Takahiro Yoshinaga
ビッグデータビジネスによる継続的な価値創造と人材育成
yoshinaga0106
0
120
社内LINE公式アカウント メッセージ送りすぎ問題を データサイエンスで解決する
yoshinaga0106
0
220
[ICML2021 論文読み会] A General Framework For Detecting Anomalous Inputs to DNN Classifiers
yoshinaga0106
0
1.4k
Data Science API
yoshinaga0106
5
2.6k
Anomaly Detection in KDD2019
yoshinaga0106
1
380
Data Engineering & Data Analysis #8
yoshinaga0106
1
2.5k
Conversion Prediction Using Multi-task Conditional Attention Networks to Support the Creation of Effective Ad Creatives
yoshinaga0106
0
1.5k
Introduction of Clumpiness
yoshinaga0106
0
150
データにまつわる苦労話から考えるデータ活用
yoshinaga0106
0
150
Other Decks in Technology
See All in Technology
EKS Pod Identity における推移的な session tags
z63d
1
200
Preferred Networks (PFN) とLLM Post-Training チームの紹介 / 第4回 関東Kaggler会 スポンサーセッション
pfn
PRO
1
180
R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization
takmin
0
430
mruby(PicoRuby)で ファミコン音楽を奏でる
kishima
1
230
Understanding Go GC #coefl_go_jp
bengo4com
0
1.1k
AIとTDDによるNext.js「隙間ツール」開発の実践
makotot
5
670
[CV勉強会@関東 CVPR2025 読み会] MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos (Li+, CVPR2025)
abemii
0
190
Yahoo!広告ビジネス基盤におけるバックエンド開発
lycorptech_jp
PRO
1
270
Android Studio の 新しいAI機能を試してみよう / Try out the new AI features in Android Studio
yanzm
0
270
新卒(ほぼ)専業Kagglerという選択肢
nocchi1
1
2.2k
[OCI Skill Mapping] AWSユーザーのためのOCI(2025年8月20日開催)
oracle4engineer
PRO
2
140
小さなチーム 大きな仕事 - 個人開発でAIをフル活用する
himaratsu
0
120
Featured
See All Featured
GraphQLの誤解/rethinking-graphql
sonatard
71
11k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
358
30k
Optimising Largest Contentful Paint
csswizardry
37
3.4k
Writing Fast Ruby
sferik
628
62k
The Cult of Friendly URLs
andyhume
79
6.5k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
31
2.2k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
VelocityConf: Rendering Performance Case Studies
addyosmani
332
24k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.9k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
183
54k
Transcript
2019/12/7 Takahiro Yoshinaga, LINE Corporation
© 2015 KURUMADA PRODUCTION
@t_yoshinaga0106 Takahiro Yoshinaga aE l l , l hi RE
S R E s l e t a t o l l / BL cDn IPN
!
# , , cost, impression Web service df #>
gender age cost impression click conversion #> 1 M 10 51 101 0 0 #> 2 F 20 52 102 3 1 #> 3 M 30 53 103 6 2 #> 4 F 40 54 104 9 3 #> 5 M 50 55 105 12 4 #> 6 F 60 56 106 15 5 #> 7 M 70 57 107 18 6 #> 8 F 80 58 108 21 7 #> 9 M 90 59 109 24 8 #> 10 F 100 60 110 27 9 Sample # !" !
:
dplyr # Summarize by gender df_summarized_gender <- df %>% group_by(gender)
%>% summarize( cost = sum(cost), impression = sum(impression), click = sum(click), conversion = sum(conversion), ctr = sum(click) / sum(impression), cvr = sum(conversion) / sum(click), ctvr = sum(conversion) / sum(impression), cpa = sum(cost) / sum(conversion), cpc = sum(cost) / sum(click), ecpm = sum(cost) / sum(impression) * 1000 ) df_summarized_gender #> # A tibble: 2 x 11 #> gender cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <fct> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 F 280 530 75 25 0.142 0.333 0.0472 11.2 3.73 528. #> 2 M 275 525 60 20 0.114 0.333 0.0381 13.8 4.58 524. # Summarize by age df_summarized_age <- df %>% group_by(age) %>% summarize( cost = sum(cost), impression = sum(impression), click = sum(click), conversion = sum(conversion), ctr = sum(click) / sum(impression), cvr = sum(conversion) / sum(click), ctvr = sum(conversion) / sum(impression), cpa = sum(cost) / sum(conversion), cpc = sum(cost) / sum(click), ecpm = sum(cost) / sum(impression) * 1000 ) df_summarized_age #> # A tibble: 10 x 11 #> age cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <dbl> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 10 51 101 0 0 0 NaN 0 Inf Inf 505. #> 2 20 52 102 3 1 0.0294 0.333 0.00980 52 17.3 510. #> 3 30 53 103 6 2 0.0583 0.333 0.0194 26.5 8.83 515. #> 4 40 54 104 9 3 0.0865 0.333 0.0288 18 6 519. #> 5 50 55 105 12 4 0.114 0.333 0.0381 13.8 4.58 524. #> 6 60 56 106 15 5 0.142 0.333 0.0472 11.2 3.73 528. #> 7 70 57 107 18 6 0.168 0.333 0.0561 9.5 3.17 533. #> 8 80 58 108 21 7 0.194 0.333 0.0648 8.29 2.76 537. #> 9 90 59 109 24 8 0.220 0.333 0.0734 7.38 2.46 541. #> 10 100 60 110 27 9 0.245 0.333 0.0818 6.67 2.22 545.
dplyr # Summarize by gender df_summarized_gender <- df %>% group_by(gender)
%>% summarize( cost = sum(cost), impression = sum(impression), click = sum(click), conversion = sum(conversion), ctr = sum(click) / sum(impression), cvr = sum(conversion) / sum(click), ctvr = sum(conversion) / sum(impression), cpa = sum(cost) / sum(conversion), cpc = sum(cost) / sum(click), ecpm = sum(cost) / sum(impression) * 1000 ) df_summarized_gender #> # A tibble: 2 x 11 #> gender cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <fct> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 F 280 530 75 25 0.142 0.333 0.0472 11.2 3.73 528. #> 2 M 275 525 60 20 0.114 0.333 0.0381 13.8 4.58 524. # Summarize by age df_summarized_age <- df %>% group_by(age) %>% summarize( cost = sum(cost), impression = sum(impression), click = sum(click), conversion = sum(conversion), ctr = sum(click) / sum(impression), cvr = sum(conversion) / sum(click), ctvr = sum(conversion) / sum(impression), cpa = sum(cost) / sum(conversion), cpc = sum(cost) / sum(click), ecpm = sum(cost) / sum(impression) * 1000 ) df_summarized_age #> # A tibble: 10 x 11 #> age cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <dbl> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 10 51 101 0 0 0 NaN 0 Inf Inf 505. #> 2 20 52 102 3 1 0.0294 0.333 0.00980 52 17.3 510. #> 3 30 53 103 6 2 0.0583 0.333 0.0194 26.5 8.83 515. #> 4 40 54 104 9 3 0.0865 0.333 0.0288 18 6 519. #> 5 50 55 105 12 4 0.114 0.333 0.0381 13.8 4.58 524. #> 6 60 56 106 15 5 0.142 0.333 0.0472 11.2 3.73 528. #> 7 70 57 107 18 6 0.168 0.333 0.0561 9.5 3.17 533. #> 8 80 58 108 21 7 0.194 0.333 0.0648 8.29 2.76 537. #> 9 90 59 109 24 8 0.220 0.333 0.0734 7.38 2.46 541. #> 10 100 60 110 27 9 0.245 0.333 0.0818 6.67 2.22 545. !? !?
%! $ # "
mmetrics GI EI - C l ü . : .
: A - . . / l - ü - .: C - . l : ü LD ND R l - : ü .: .: - : : : - C .
# metrics <- mmetrics::define( cost = sum(cost), impression = sum(impression),
click = sum(click), conversion = sum(conversion), ctr = sum(click) / sum(impression), cvr = sum(conversion) / sum(click), ctvr = sum(conversion) / sum(impression), cpa = sum(cost) / sum(conversion), cpc = sum(cost) / sum(click), ecpm = sum(cost) / sum(impression) * 1000) # axis df_summarized_gender <- mmetrics::add(df, gender, metrics = metrics) df_summarized_age <- mmetrics::add(df, age, metrics = metrics) Use Case of mmetrics
Result # df_summarized_gender #> # A tibble: 2 x
11 #> gender cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <fct> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 F 280 530 75 25 0.142 0.333 0.0472 11.2 3.73 528. #> 2 M 275 525 60 20 0.114 0.333 0.0381 13.8 4.58 524. # df_summarized_age #> # A tibble: 10 x 11 #> age cost impression click conversion ctr cvr ctvr cpa cpc ecpm #> <dbl> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 10 51 101 0 0 0 NaN 0 Inf Inf 505. #> 2 20 52 102 3 1 0.0294 0.333 0.00980 52 17.3 510. #> 3 30 53 103 6 2 0.0583 0.333 0.0194 26.5 8.83 515. #> 4 40 54 104 9 3 0.0865 0.333 0.0288 18 6 519. #> 5 50 55 105 12 4 0.114 0.333 0.0381 13.8 4.58 524. #> 6 60 56 106 15 5 0.142 0.333 0.0472 11.2 3.73 528. #> 7 70 57 107 18 6 0.168 0.333 0.0561 9.5 3.17 533. #> 8 80 58 108 21 7 0.194 0.333 0.0648 8.29 2.76 537. #> 9 90 59 109 24 8 0.220 0.333 0.0734 7.38 2.46 541. #> 10 100 60 110 27 9 0.245 0.333 0.0818 6.67 2.22 545.
© ,0%"/4)"-UE1VCMJTIFST