Efforts to Improve the Analytics Environment at Mercari (メルカリにおける分析環境整備の取り組み)
nagai shinya
August 19, 2020
Presentation slides for the following event:
https://forkwell.connpass.com/event/182769/
Transcript
1 Efforts to Improve the Analytics Environment at Mercari. Mercari, Inc. / JP Data Analyst, Shinya Nagai
2 Introduction
3 Self-introduction
  • Shinya Nagai
  • Mercari, Inc. / JP
  • Data Analyst
    ◦ Responsible for work such as maintaining the analytics environment
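4 Agenda: a case study of improving the analytics environment at Mercari
  • Current state
    ◦ Why work on improvement?
  • Desired state
    ◦ Keep the improvement cycle turning.
  • Initiative
    ◦ Retire the legacy dataset.
    ◦ To do that, consider business operations, KPIs, and the data platform as a set.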
5 Current state | Why work on improvement?
6 Data usage at Mercari
  • Platform
    ◦ BigQuery + Looker
  • Scale
    ◦ Users running queries: 700+ per month
    ◦ Tables referenced: 100+ per month
    ◦ Analysts, PdMs, ML, CS, and others
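7 Why work on improvement?
  • Many people use data in their daily work.
  • Maintenance of the analytics environment → still far from sufficient.
  • Improving the data platform leads to improved productivity.
  In terms of phase, the larger challenges are in operations rather than in launch or adoption.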
8 Desired state | Keep the improvement cycle turning
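9 What we want to avoid → a negative cycle
  ① Improvement work does not happen
  ② Maintaining the status quo gradually gets harder
  ③ Time available for improvement shrinks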
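10 What we want to do → keep the improvement cycle turning
  ① Work on improvement
  ② Maintaining the status quo gets a little easier
  ③ Time available for improvement grows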
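11 What we want to do → keep the improvement cycle turning
  ① Work on improvement
  ② Maintaining the status quo gets a little easier
  ③ Time available for improvement grows
  ④ Once there is enough slack, proactive improvements also become possible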
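12 What we want to do → keep the improvement cycle turning
  ① Work on improvement
  ② Maintaining the status quo gets a little easier
  ③ Time available for improvement grows
  ④ Once there is enough slack, proactive improvements also become possible
  (Callout on the diagram: "This is where we are working right now.")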
13 Initiative | Retire the legacy dataset
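14 Two pipelines
  [Diagram: Production → BigQuery, containing source tables, analytics tables (new), and analytics tables (legacy)]
  • Data is copied from Production to BigQuery.
  • There are two separate pipelines from Production.
  • Maintenance costs pile up.
  • The numbers stop matching.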
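The "numbers stop matching" problem on this slide can be made concrete with a small reconciliation check. The sketch below is only an illustration under stated assumptions: the function name `find_mismatches`, the per-day dictionaries, and the sample values are hypothetical and do not come from the talk or from Mercari's actual pipelines.

```python
def find_mismatches(legacy, new, rel_tol=0.0):
    """Return (day, legacy_value, new_value) for days where the pipelines disagree.

    `legacy` and `new` are hypothetical dicts mapping a day string to a KPI value.
    A day missing from either side is always reported as a mismatch.
    """
    mismatches = []
    for day in sorted(set(legacy) | set(new)):
        a, b = legacy.get(day), new.get(day)
        if a is None or b is None:
            mismatches.append((day, a, b))
        elif abs(a - b) > rel_tol * max(abs(a), abs(b)):
            mismatches.append((day, a, b))
    return mismatches


# Hypothetical daily KPI values computed from each pipeline.
legacy_kpi = {"2020-08-01": 1000, "2020-08-02": 1200, "2020-08-03": 900}
new_kpi = {"2020-08-01": 1000, "2020-08-02": 1188, "2020-08-03": 900}

strict = find_mismatches(legacy_kpi, new_kpi)
tolerant = find_mismatches(legacy_kpi, new_kpi, rel_tol=0.05)
print(strict)
print(tolerant)
```

With an exact comparison the second day is flagged; allowing a 5% relative tolerance accepts it. Which tolerance is acceptable is a business decision, which is in line with the deck's later point about considering KPIs and the platform together.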
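15 Consolidating onto the new pipeline
  [Diagram: Production → BigQuery, containing source tables, analytics tables (new), and analytics tables (legacy)]
  • "We want to retire the legacy pipeline and consolidate onto the new pipeline."
  • However, some teams use KPIs that are computed from the legacy pipeline.
  • We tried to reproduce the legacy pipeline's processing in the new pipeline, but could not.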
16 Consolidating onto the new pipeline
  Why can't we drop the dependency on the legacy pipeline?
  → Because KPIs computed from the legacy pipeline are used in business operations.
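17 Organizing the business requirements
  Why is that KPI needed?
  Requirements were organized from both the implementation side and the business side.
  → It turned out that, in essence, the new pipeline's data could serve as a substitute.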
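18 Revisiting the KPI starting from its definition
  • The KPI definition was revised based on the business requirements.
    → At the definition level, the dependency on the legacy pipeline was eliminated.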
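19 Key point of the initiative: consider the platform and business operations as a set
  [Diagram: platform, metrics, business operations]
  • Changing only one part in isolation is difficult.
  • Consider them as a set.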
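20 Narrowing the scope of the initiative
  [Chart: referencing users per table (last 30 days), with tables ① and ② highlighted]
  • Users who referenced ①: 200 per month
  • Users who referenced ②: 2 per month
  • The number of users differs by a factor of 100 or more.
    → It is better to start with the parts that matter most.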
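The per-table ranking behind this slide can be sketched as a distinct-user count over query logs. This is a minimal sketch, assuming the logs have already been exported as (user, table, date) rows; the function `users_per_table`, the table names, and the sample rows are hypothetical, and the talk does not describe how the real chart was computed.

```python
from collections import defaultdict
from datetime import date, timedelta


def users_per_table(log_rows, today, window_days=30):
    """Count distinct users that referenced each table within the window.

    `log_rows` is an iterable of (user, table, query_date) tuples.
    Returns (table, distinct_user_count) pairs, most-referenced first.
    """
    cutoff = today - timedelta(days=window_days)
    users = defaultdict(set)
    for user, table, day in log_rows:
        if cutoff < day <= today:
            users[table].add(user)
    return sorted(
        ((table, len(names)) for table, names in users.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Hypothetical log rows for illustration only.
rows = [
    ("alice", "analytics_new.orders", date(2020, 8, 1)),
    ("bob", "analytics_new.orders", date(2020, 8, 2)),
    ("alice", "analytics_new.orders", date(2020, 8, 3)),   # repeat user, counted once
    ("carol", "analytics_legacy.orders", date(2020, 8, 3)),
    ("dave", "analytics_legacy.orders", date(2020, 6, 1)),  # outside the 30-day window
]

ranking = users_per_table(rows, today=date(2020, 8, 19))
print(ranking)
```

Sorting by distinct users rather than by query count keeps a single heavy user from making a rarely shared table look important, which matches the slide's framing in terms of people per month.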
21 Summary
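22 Summary: a case study of improving the analytics environment at Mercari
  • Current state
    ◦ Why work on improvement? → It leads to improved productivity.
  • Desired state
    ◦ Keep the improvement cycle turning → keep reinvesting the time gained.
  • Initiative
    ◦ Retire the legacy dataset.
    ◦ To do that, consider business operations, KPIs, and the data platform as a set.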