Improving the Analytics Environment at Mercari (メルカリにおける分析環境整備の取り組み)
nagai shinya
August 19, 2020
Slides presented at the following event:
https://forkwell.connpass.com/event/182769/
Transcript
1 Improving the Analytics Environment at Mercari
Mercari, Inc. / JP Data Analyst, Nagai Shinya
2 Introduction
3 Self-introduction
• Nagai Shinya
• Mercari, Inc. / JP
• Data Analyst
  ◦ Responsible for work such as maintaining the analytics environment
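4 Agenda: a case study of maintaining the analytics environment at Mercari
• Current state
  ◦ Why work on improvement?
• Where we want to be
  ◦ We want to keep an improvement cycle turning.
• Initiative
  ◦ Retire legacy datasets.
  ◦ To do that, think about business operations, KPIs, and the platform as a set.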
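5 Current state | Why work on improvement?
6 Data usage at Mercari
• Platform
  ◦ BigQuery + Looker
• Scale
  ◦ Users running queries: 700+ people per month
  ◦ Tables referenced: 100+ per month
  ◦ Analyst, PdM, ML, CS, and others
7 Why work on improvement?
• Many people use data in their day-to-day work.
• Maintenance of the analytics environment → still far from sufficient.
• Improving the data platform leads to improved productivity.
In terms of phase, the bigger challenges are in operations rather than in launch or adoption.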
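8 Where we want to be | Keeping an improvement cycle turning
9 What we want to avoid → a negative cycle
① No work on improvement
② Maintaining the status quo gradually gets harder
③ Time available for improvement shrinks
10 What we want to do → turn the improvement cycle
① Work on improvement
② Maintaining the status quo gets a little easier
③ Time available for improvement grows
11 What we want to do → turn the improvement cycle
① Work on improvement
② Maintaining the status quo gets a little easier
③ Time available for improvement grows
④ With enough slack, proactive improvements also become possible
12 What we want to do → turn the improvement cycle
① Work on improvement
② Maintaining the status quo gets a little easier
③ Time available for improvement grows
④ With enough slack, proactive improvements also become possible
This is where we are working now.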
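13 Initiative | Retiring legacy datasets
14 Two pipelines
Diagram labels: Production; BigQuery; source tables; analytics tables (new); analytics tables (legacy)
Data is copied from Production to BigQuery.
There are two pipelines from Production:
• Maintenance costs pile up.
• The numbers stop matching.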
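The "numbers stop matching" problem on slide 14 can be illustrated with a small check that compares one daily KPI as computed from each pipeline. This is only a sketch under stated assumptions: the function `find_mismatches`, the dictionaries `legacy_kpi` and `new_kpi`, and all values are hypothetical and do not come from the deck, which does not describe how discrepancies were detected.

```python
from datetime import date


def find_mismatches(legacy, new, tolerance=0.0):
    """Return (day, legacy_value, new_value) for every day the pipelines disagree.

    A day counts as a mismatch when it is missing from either side or when
    the absolute difference between the two values exceeds `tolerance`.
    """
    mismatches = []
    for day in sorted(set(legacy) | set(new)):
        a = legacy.get(day)
        b = new.get(day)
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches.append((day, a, b))
    return mismatches


# Hypothetical daily KPI values from each pipeline (not real Mercari data).
legacy_kpi = {date(2020, 8, 1): 100, date(2020, 8, 2): 120, date(2020, 8, 3): 90}
new_kpi = {date(2020, 8, 1): 100, date(2020, 8, 2): 118, date(2020, 8, 4): 95}

mismatches = find_mismatches(legacy_kpi, new_kpi)
for day, a, b in mismatches:
    print(day.isoformat(), "legacy:", a, "new:", b)
```

A check like this only reports where the two pipelines diverge; deciding which value is correct still requires going back to the KPI definition.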
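15 Consolidating onto the new pipeline
Diagram labels: Production; BigQuery; source tables; analytics tables (new); analytics tables (legacy)
"We want to retire the legacy pipeline and consolidate onto the new pipeline."
However, some teams use KPIs that are computed from the legacy pipeline.
We tried to reproduce the legacy pipeline's processing in the new pipeline, but could not.
16 Consolidating onto the new pipeline
Why can't the dependency on the legacy pipeline be dropped?
→ Because KPIs computed from the legacy pipeline are used in business operations.
17 Sorting out the business requirements
Why is that KPI needed?
We sorted out the requirements from both the implementation side and the business-requirements side.
→ It turned out that, in essence, data from the new pipeline could serve as a substitute.
18 Revisiting the KPI definitions
• We revised the KPI definitions based on the business requirements.
→ At the definition level, the dependency on the legacy pipeline was eliminated.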
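19 Key point of the initiative
Diagram labels: platform; metrics; business operations
Changing only one of these in isolation is difficult.
Think about them as a set: the platform and business operations go together.
20 Narrowing the scope of the initiative
Users who referenced table ①: 200 people per month
Users who referenced table ②: 2 people per month
The number of users differs by a factor of 100 or more → it is better to start with the important tables.
Chart: number of referencing users per table (last 30 days), with ① and ② marked.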
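The prioritization on slide 20 (count the distinct users referencing each table over the last 30 days, then start with the most used tables) can be sketched as follows. This is a minimal illustration, assuming an access log is already available as `(user, table, day)` records; the function `users_per_table`, the table names, and the sample records are hypothetical, and the deck does not state how its own figures were computed.

```python
from collections import defaultdict
from datetime import date, timedelta


def users_per_table(access_log, today, window_days=30):
    """Rank tables by the number of distinct users within the recent window.

    `access_log` is an iterable of (user, table, day) tuples. Records older
    than `window_days` before `today` are ignored. Ties are broken by table
    name so the ordering is deterministic.
    """
    cutoff = today - timedelta(days=window_days)
    users = defaultdict(set)
    for user, table, day in access_log:
        if cutoff < day <= today:
            users[table].add(user)
    counts = {table: len(names) for table, names in users.items()}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical access records (not real Mercari data).
log = [
    ("alice", "table_a", date(2020, 8, 10)),
    ("bob", "table_a", date(2020, 8, 11)),
    ("alice", "table_a", date(2020, 8, 12)),  # same user again, counted once
    ("carol", "table_b", date(2020, 8, 1)),
    ("dave", "table_a", date(2020, 6, 1)),    # outside the 30-day window
    ("carol", "table_a", date(2020, 8, 15)),
]

ranking = users_per_table(log, today=date(2020, 8, 19))
for table, n_users in ranking:
    print(table, n_users)
```

Counting distinct users rather than raw query counts keeps one heavy user or a scheduled job from making a table look more widely used than it is.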
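21 Summary
22 Summary: a case study of maintaining the analytics environment at Mercari
• Current state
  ◦ Why work on improvement? → It leads to improved productivity.
• Where we want to be
  ◦ We want to keep an improvement cycle turning → keep reinvesting the time gained.
• Initiative
  ◦ Retire legacy datasets.
  ◦ To do that, think about business operations, KPIs, and the platform as a set.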