Starting Small with Data Utilization on GCP
Takashi Nishibayashi
September 06, 2016
Technology
Slides presented at bq_sushi #4.
Transcript
1 Starting Small with Data Utilization on GCP (#bq_sushi ver.)
bq_sushi #4, 2016-09-06
Takashi Nishibayashi
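2 Takashi Nishibayashi
Software Engineer, Zucks AdNetwork, Zucks Inc.
Data analysis team
Currently working on ad delivery efficiency optimization: developing the automatic bid price adjustment logic and the ad selection logic of the delivery server
@hagino3000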
3 What is this?
A condensed version of the talk I gave in a case-study session at GCP NEXT TOKYO the same day.
4 The evolution of data utilization at Zucks AdNetwork
5 Ideal and reality at the start of the project
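6 Goal (tentative)
In the ad delivery server, predict conversions and clicks for every impression with machine learning models to raise delivery efficiency.
Reality
Large volumes of log files sit in AWS S3 in assorted formats.
Master data is stored in MySQL.
Elastic Search holds only the most recent 2 weeks.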
7
8 We cannot get there in one jump
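9 Phase 1: First, make the data usable by data scientists
✓ Machine learning is popular in the online ad industry, but we wanted to verify whether it is feasible with our own service's data
✓ People need easy access to the data for experiments and hypothesis testing
✓ It is enough if a limited number of people can run queries and aggregations
✓ Response times of a few hundred milliseconds are not required
✓ A data store that takes little effort to manage is important
✓ Data volume is around 600 GByte/day and looks set to keep growing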
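10 Phase 1: First, make the data usable by data scientists
• Loaded the ad delivery logs into BigQuery
• Synced the MySQL master data to BigQuery as well
• Used via the WebUI, Pandas, and BigQuery Python
• Subsample in BigQuery, then train on a local machine
• AWS EMR was retired
• Elastic Search was retired too
• Jumped on the Cloud Datalab beta and crashed and burned (January 2016)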
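11 Phase 2: Make the data usable from batch jobs too
✓ We want to run recurring experiments and prediction batches from cron
✓ We want to use it not only for analysis tasks but also for batch jobs on the delivery system side
✓ We want to track usage (query cost) per feature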
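12 Phase 2: Make the data usable from batch jobs too
• Export the BigQuery audit logs into BigQuery through the CloudLogging settings
• Issue a service account per feature to track usage
• Send a notification when cost jumps
• The automatic bid price adjustment batch and the fraudulent click detection batch are in operation
• Rule-based and anomaly-detection-based classification tasks can be written in SQL
• Experiment results are saved to Cloud Storage/BigQuery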
13
14
15 Uses of the Audit Log
• Query cost per feature
• Query cost per day
• Finding out who created test tables
• Finding tables that are no longer used
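As a rough illustration of the per-feature cost tracking above, here is a minimal sketch that sums billed bytes per service account and converts them to an estimated cost. The record keys (`principal_email`, `total_billed_bytes`), the sample account names, and the price constant are assumptions made for this example; they are not taken from the slides or from the actual audit log schema.

```python
from collections import defaultdict

# Assumption: on-demand price in USD per TiB scanned. Check current pricing.
USD_PER_TIB = 5.0
BYTES_PER_TIB = 2 ** 40


def cost_by_account(records, usd_per_tib=USD_PER_TIB):
    """Return the estimated query cost in USD per service account.

    `records` is an iterable of dicts with the hypothetical keys
    `principal_email` and `total_billed_bytes`.
    """
    billed = defaultdict(int)
    for rec in records:
        billed[rec["principal_email"]] += rec["total_billed_bytes"]
    return {
        account: total / BYTES_PER_TIB * usd_per_tib
        for account, total in billed.items()
    }


# Hypothetical rows, shaped like a query result over exported audit logs.
sample_records = [
    {"principal_email": "bid-batch@example.com", "total_billed_bytes": 2 * BYTES_PER_TIB},
    {"principal_email": "fraud-batch@example.com", "total_billed_bytes": BYTES_PER_TIB // 2},
    {"principal_email": "bid-batch@example.com", "total_billed_bytes": 1 * BYTES_PER_TIB},
]
costs = cost_by_account(sample_records)
```

With one service account per feature, grouping by account is the same as grouping by feature, which is presumably why the accounts are issued that way.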
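16 Phase 3: Make the data usable by members in every job role
✓ Engineers do not want to be stuck with routine investigation tasks
✓ We want to add users without costs exploding
✓ Things should improve as more people are able to write SQL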
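17 Phase 3: Make the data usable by members in every job role
• Made the data queryable through re:dash
• Engineers write template queries based on requests
• Also serves as a prototype for report screens
• A per-query cost limit setting (a re:dash feature) prevents expensive queries from running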
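The idea behind a per-query limit can be sketched as a guard that compares a query's estimated scan size with a threshold before running it. This is not re:dash's implementation; the function name, the exception, and the MB unit are assumptions for illustration, and `estimated_bytes` is assumed to come from a BigQuery dry run.

```python
class QueryTooExpensive(Exception):
    """Raised when a query's estimated scan size exceeds the limit."""


def check_scan_limit(estimated_bytes, limit_mb):
    """Return the estimated scan size in MB, or raise if it exceeds limit_mb.

    `estimated_bytes` is assumed to be the byte count reported by a dry run.
    """
    estimated_mb = estimated_bytes / (1024 * 1024)
    if estimated_mb > limit_mb:
        raise QueryTooExpensive(
            f"query would scan {estimated_mb:.0f} MB, limit is {limit_mb} MB"
        )
    return estimated_mb
```

Calling the guard before submitting the real query means a rejected query is never billed.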
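18 The required level of data quality changes too
✓ As use cases multiply, data quality becomes an issue
✓ "I want to run my job after the 23:00 log ingestion has finished, but...?"
• Retry mechanisms are mandatory for Stream Insert, Batch Insert, and queries alike
• About one day a month, BigQuery is having a bad day
• Run batches that check for missed ingestion and duplicate ingestion
• A mechanism that lets ingestion status be checked from outside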
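One common way to build the retry mechanism mentioned above is exponential backoff with jitter. The sketch below assumes that approach; the slides do not say how the retries were implemented, and `with_retry` and its parameters are hypothetical names.

```python
import random
import time


def with_retry(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
               retry_on=(Exception,), sleep=time.sleep):
    """Call func() and retry on failure with exponential backoff and jitter.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so clients do not retry in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

A call such as `with_retry(lambda: run_insert(rows))` would wrap an insert, where `run_insert` stands for whatever client call is in use. Retrying an insert can itself create duplicate rows, which is one reason to pair retries with the duplicate-ingestion check listed on this slide.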
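19 Side benefits
• Engineers can investigate delivery logs at any time
• Decisions can be based on data of a size MySQL could not handle
• All kinds of batch jobs can use the data
• Reports can be created freely just by writing SQL
• Every member of the project can access the data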
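20 Miscellaneous
• BigQuery is not suited to processing that looks up data online
• Better to make the data retrievable by Key-Value and use Bigtable
• There are also cases of putting a cache layer in front of BigQuery
• Cloud Dataproc or Cloud Dataflow……
• Spotify reportedly found Spark too complex to use and uses Dataflow from Scala
• https://github.com/spotify/scio
• Cloud Datalab has reportedly been renewed, so I have hopes for it
• A cloud version of Jupyter Notebook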
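21 Summary
• Aiming straight at the hard target means a long wait for results, so we are advancing data utilization while laying the groundwork
• Rule-based and anomaly-detection-based processing that can be written in SQL yields results sooner than machine learning
• Integration with Cloud Storage, Cloud Logging, and Cloud Dataproc has been strengthened, and BigQuery's use cases have grown
• If you do not need response times of a few hundred msec, concurrent query execution, or stability, BigQuery can be used at a reasonable cost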
22 Appendix
Notes on queries for computing statistics in BigQuery
http://qiita.com/hagino3000/items/e9ed62638ebe54391188
23 Thank You