Rebuilding Batch ETL with Cloud Composer & Dataflow #data_ml_engineering / 20190719
Slides presented at the second "Thinking about Data and ML-adjacent Engineering" meetup (データとML周辺エンジニアリングを考える会 #2).
https://data-engineering.connpass.com/event/136756/
yuzutas0
July 19, 2019
Transcript
Rebuilding Batch ETL with Cloud Composer & Dataflow
2019-07-19 #data_ml_engineering, presented by @yuzutas0
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/
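Notes and disclaimer
・These slides are already published on the web (#data_ml_engineering). There is no need to take photos or notes; please relax and listen.
・There are 70+ slides. I will focus on agenda item 4 and cover the rest at lightning-talk pace.
・The content assumes Q&A follow-up during the networking time and on social media.
・I think of technology as a tool: choose it to fit your goals and constraints. This talk does not recommend any particular technology.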
Agenda
1. Introduction
2. Context
3. Measurement, evaluation, and consensus building
4. Rebuild & release
5. Closing
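@yuzutas0
Past talks: I share know-how and insights on data platforms.
・PyCon JP: Best Talk Award (Excellence Award)
・Developers Summit (Summer): No. 1 in attendee survey satisfaction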
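Purpose of today's talk
To present a case study of a "rebuild". It is only one case, so please compare it with your own technology stack and organizational situation, and draw your own lessons from it.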
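Intended audience
Software engineers who do the hands-on work of building and operating log collection and ETL systems, and their clients and managers (or people who are about to become one).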
Agenda (now at 2. Context)
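Mercari (C2C flea-market app)
Business growth (= growing data volume). Source: FY2019.6 3Q earnings briefing material, https://pdf.irpocket.com/C4385/eHSm/vwwn/oECA.pdf
Global expansion and new businesses
Active use of data. Source: https://speakerdeck.com/hik0107/mercari-bi-team-data-analytics-summit-2018
Summary of characteristics
・The product is growing.
・Data volume is increasing rapidly.
・The organization is being built to grow global and new businesses.
・Data is actively used for analysis, ML, and more.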
The problem on the ground
"The data in BQ isn't being updated!" (Some tables had been stalled for months.)
Pain
Product ↑ (good), data ↑ (good), load ↑ (bad), users ↑ (bad)
❌ "This system is useless! It isn't even being updated!"
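Historical background
ETL System: ETL for US, ETL for JP, ETL for UK
・US Team: "We built it! We maintain it!" "We are going to make the US app better!" They support JP on the side of their main job, out of goodwill (which honestly has its limits).
・JP BI: "We want it for JP too! Let us ride along!" (request) "We are going to concentrate on analysis work!"
・JP SRE: "The JP app's production environment is the top priority!"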
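Scope of this project ①
Diagram elements: product, users, DB / logs, BigQuery, initiatives / operations.
Stages: collection → delivery → use → value, the objective variable to maximize in DataOps.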
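Scope of this project ②
Diagram elements: other product DBs; the Monolith (APP / BE) with its DB; Micro services, each with its own DB (migration planned in stages); a Read Only Replica; masking of confidential information; BigQuery.
Locations: Ishikari DC and GCP.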
Today's Issue
・Data that has not been updated for months.
・What would you do?
Agenda (now at 3. Measurement, evaluation, and consensus building)
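Stakeholder interviews
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"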
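Measure
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"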
We built a BQ update-delay bot
Everyone involved: "This is worse than we expected."
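BQ update checker / implementation
・Hourly run: SELECT dataset.__TABLES__ and save the metadata as a snapshot, accumulated so that it can be analyzed as panel data.
・pandas.read_csv() loads the check times, target tables, and notification channels.
・pandas.read_gbq() fetches each table's name and last-modified time.
・The checker decides whether each table has been updated.
・slackweb.Slack().notify() posts a notification to the specified channel.
Icon: flaticon.com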
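The check described on this slide can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: `find_stale_tables`, `format_alert`, `run_check`, the 24-hour threshold, and the message wording are all hypothetical. It assumes that `__TABLES__` reports `last_modified_time` in epoch milliseconds, and it leaves out the CSV-driven configuration and the snapshot storage.

```python
from datetime import datetime, timedelta, timezone

# Metadata query against BigQuery's __TABLES__ view for one dataset.
TABLES_QUERY = "SELECT table_id, last_modified_time FROM `{dataset}.__TABLES__`"


def find_stale_tables(rows, now, max_age):
    """Return (table_id, last_modified) pairs older than max_age, oldest first.

    rows: iterable of (table_id, last_modified_time in epoch milliseconds).
    """
    stale = []
    for table_id, last_modified_ms in rows:
        modified = datetime.fromtimestamp(last_modified_ms / 1000, tz=timezone.utc)
        if now - modified > max_age:
            stale.append((table_id, modified))
    return sorted(stale, key=lambda item: item[1])


def format_alert(stale):
    """Build the Slack message text (hypothetical wording)."""
    lines = [
        "{} last updated {:%Y-%m-%d %H:%M} UTC".format(table_id, modified)
        for table_id, modified in stale
    ]
    return "BQ tables not updated:\n" + "\n".join(lines)


def run_check(dataset, project_id, webhook_url, max_age=timedelta(hours=24)):
    """Query BigQuery and notify Slack; needs pandas, pandas-gbq and slackweb."""
    import pandas as pd
    import slackweb

    df = pd.read_gbq(
        TABLES_QUERY.format(dataset=dataset),
        project_id=project_id,
        dialect="standard",
    )
    rows = zip(df["table_id"], df["last_modified_time"])
    stale = find_stale_tables(rows, datetime.now(timezone.utc), max_age)
    if stale:
        slackweb.Slack(url=webhook_url).notify(text=format_alert(stale))
```

Keeping the staleness decision in a pure function means it can be tested without BigQuery or Slack credentials; only `run_check` touches external services.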
BQ update checker / design: http://yuzutas0.hatenablog.com/entry/2017/05/23/073000 (BigQuery)
BQ update checker / docs for user (1)
BQ update checker / docs for user (2)
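Visualize → build consensus
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"
Raise the priority and deal with it!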
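Extend its life
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"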
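Stopgap
Together with the analysts, we tried "just retry it for now". That dragged down even the syncs of tables that had not been delayed, and everything went down (a secondary disaster).
It became visible that "the situation is not as simple as the users assume".
Photo: pexels.com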
Groping in the dark
We had temporary US permissions issued and started investigating. The admin UI was too heavy to open, so we started by being taught the tricks:
http://{ip_or_domain}/admin/airflow/tree?dag_id={id}&num_runs=1
Photo: pexels.com
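Investigation
・Timeouts were occurring frequently as data grew.
・All jobs ran serially, so one failure took the downstream processing with it. (The design was intended to limit the access load of JDBC → DB.)
・The US team used the same mechanism but had worked out how to split its jobs.
・JP had not got that far. (Riding along on someone else's system with side-job goodwill support has its limits.)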
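Tuning?
The orthodox approach is the same tuning the US team did. (Don't run away to an easy rebuild!)
However:
・We would have to start by catching up on how the system works.
・We would have to work while considering the impact on the existing system, which was already throwing errors under high load.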
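Rebuild?
A proposal came from the Merpay DataplatformTeam: "We built something like this. Would you like to roll it out on your side too?"
Reference: "Large-scale batch processing at Merpay", Mercari Engineering Blog, https://tech.mercari.com/entry/2019/06/05/120000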
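Comparison
US ETL System (Airflow on GKE, Spark early)
・System ◯: should meet the functional requirements if tuned.
・Support ◯: there is geographic distance and a time difference; asynchronous consultation is possible.
Merpay Batch Pipeline (Cloud Composer, Dataflow lately)
・System ◎: meets the functional requirements; fully managed, so it should be relatively easy to handle.
・Support ◎: the office is geographically close; easy to consult.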
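Clearly, this is not a question of "ETL system design". The issues that truly need solving are "the long-term absence of a dedicated JP maintainer" and "the organizational dynamics that led to it".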
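The point of the decision
"Data delivery has stopped" is only the tip of the iceberg. So that it takes up as little mindshare as possible, the axis of the decision becomes "how to deal with it with the least technical effort".
Illustration: irasutoya.com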
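Rebuild!
We judged that rebuilding plus switching users over would be finished sooner. (We ran away to the easy rebuild!)
The punchline, by the way:
① Merpay's pipeline is built on the assumption of full GCP, so it could not be rolled out as-is.
② The US team, for its part, responded to the delay visualization and fixed the JP jobs for us.
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/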
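Visualize → build consensus
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"
Focus: spend no time or effort on stopgaps.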
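Only possible thanks to the analysts' cooperation
・Make do with estimates from alternative tables.
・Reference data that is not in BQ through scripts.
・Actively share knowledge and tools with each other.
They have the grit to get their work done without over-depending on an unstable system! (There is more than one means or route to reach a goal.)
Photo: https://www.pexels.com/photo/group-hand-fist-bump-1068523/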
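Consensus building: summary
・Analysts: problem "It's stopped!"; solution "We need it now! Give us a stopgap!"
・Developers: problem "Is it really that bad?"; solution "We should rebuild it!"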
Agenda (now at 4. Rebuild & release)
System architecture (diagram with Replica DB)
・One part of the diagram: "@siroken3 took care of this part nicely."
・The other part: "This is the part I will talk about."
Cloud Composer: DAG Runs
① Validation
② Run Dataflow
③ Fetch files from GCS
④ BQ Load (differential or full)
Composer → Dataflow
Composer specifies a Template (to be precise, one that has been deployed on GCS) and sends Cloud Dataflow an execution command.
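To make "specify a Template on GCS and send an execution command" concrete, the sketch below builds the request for the Dataflow REST API's `templates:launch` method. It is only an illustration under assumptions: the project, region, bucket paths, job name, and parameter names are hypothetical, and the slides do not say whether the DAG calls the API directly or goes through an Airflow operator. Authentication and the actual HTTP call are omitted.

```python
import json
from urllib.parse import quote, urlencode

DATAFLOW_API = "https://dataflow.googleapis.com/v1b3"


def build_launch_request(project_id, region, template_gcs_path,
                         job_name, parameters, temp_location):
    """Return (url, json_body) for a projects.locations.templates.launch call."""
    url = "{}/projects/{}/locations/{}/templates:launch?{}".format(
        DATAFLOW_API,
        quote(project_id),
        quote(region),
        # The template is referenced by where it is staged on GCS.
        urlencode({"gcsPath": template_gcs_path}),
    )
    body = {
        "jobName": job_name,
        "parameters": parameters,  # template-specific runtime parameters
        "environment": {"tempLocation": temp_location},
    }
    return url, json.dumps(body, sort_keys=True)


# Hypothetical values, for illustration only.
url, body = build_launch_request(
    project_id="my-project",
    region="asia-northeast1",
    template_gcs_path="gs://my-bucket/templates/etl",
    job_name="etl-20190719",
    parameters={"input": "gs://my-bucket/dump/users.tsv"},
    temp_location="gs://my-bucket/tmp",
)
```

The resulting URL and body would be sent as an authenticated POST. Inside Cloud Composer, the Dataflow template operator that ships with Airflow would normally handle this step.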
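Cloud Dataflow: ETL
① Read the dump files from GCS.
② Modify the data with heavily hacked transform logic.
③ Write BQ-loadable files to GCS.
The transform logic was built up by squashing errors found during operation checks.
※ Because of later enhancements, this differs from the current state.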
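Why Dataflow?
・mysqldump's TSV format cannot be loaded into BigQuery as-is → it needs reformatting:
　・double-quotation-marks escaped by double-quotation-marks in double-quotation-marks
　・new-line escaped by double backslashes
・The data volume is large, so from the standpoint of DB load and performance we moved the processing to Dataflow, which scales well.
・Dataflow's role is deliberately limited to being a conversion device, so it does not load into BigQuery directly; it places the converted files on GCS.
・The execution environment is Python 3.5 (supported at Apache Beam 2.11.0 / Mar 5, 2019).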
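The kind of per-record reformatting this slide refers to might look like the sketch below. It is not the talk's transform. It assumes the dump uses tab-separated fields with backslash escapes and `\N` for NULL (the defaults of `mysqldump --tab`), doubles embedded quotes as CSV requires, and rewrites embedded newlines as a literal token. The function names and the choice of token are hypothetical, and the exact escaping used in the real pipeline is not recoverable from the slide.

```python
import csv
import io

# Backslash escapes assumed for the dump; any other escaped character
# is taken literally (for example an escaped tab or backslash).
ESCAPES = {"n": "\n", "t": "\t", "0": "\0"}


def parse_dump_line(line):
    """Split one backslash-escaped TSV record into values (None for \\N)."""
    fields, text, raw = [], [], []

    def flush():
        # A field whose raw form is exactly \N is a SQL NULL.
        fields.append(None if "".join(raw) == "\\N" else "".join(text))
        text.clear()
        raw.clear()

    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            text.append(ESCAPES.get(nxt, nxt))
            raw.append(ch + nxt)
            i += 2
        elif ch == "\t":
            flush()
            i += 1
        else:
            text.append(ch)
            raw.append(ch)
            i += 1
    flush()
    return fields


def to_bq_csv_line(fields, newline_token="\\n"):
    """Render one record as a CSV line; embedded quotes are doubled by csv."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(
        [None if f is None else f.replace("\n", newline_token) for f in fields]
    )
    return out.getvalue()
```

In a Beam pipeline, these two functions would typically be chained inside a `Map` step between reading the dump from GCS and writing the converted file back to GCS.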
Dataflow Onboard by @rilmayer_jp
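Test Code for Transform
・Data patterns that raised errors during debugging are used as test cases.
・Data from tables that raised errors during debugging is used as test data.
・The beam module is replaced with MagicMock, so that only the logic part is tested in code.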
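A minimal sketch of the "replace beam with MagicMock, test only the logic" idea follows. The transform `clean_record` and the test data are hypothetical stand-ins. The point is that once a mock is registered under the name `apache_beam`, a module that imports Beam can be loaded and its pure functions tested without the Beam SDK installed.

```python
import sys
import unittest
from unittest.mock import MagicMock

# Register a stand-in before anything imports apache_beam.
sys.modules["apache_beam"] = MagicMock()
import apache_beam as beam  # noqa: E402  (resolves to the mock above)


def clean_record(record):
    """Pure transform logic (hypothetical): trim strings, '' becomes None."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
            if value == "":
                value = None
        cleaned[key] = value
    return cleaned


def expand(pcoll):
    """Pipeline wiring; not exercised by the unit tests below."""
    return pcoll | beam.Map(clean_record)


class CleanRecordTest(unittest.TestCase):
    def test_pattern_that_failed_while_debugging(self):
        # Hypothetical data pattern standing in for one found in debugging.
        record = {"id": 1, "name": "  taro ", "memo": ""}
        self.assertEqual(
            clean_record(record), {"id": 1, "name": "taro", "memo": None}
        )
```

The tests can be run with `python -m unittest`. The trade-off is that the pipeline wiring itself stays untested, so it still needs an operation check against real data.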
Composer → BQ: full refresh
GCS → BQ Load
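Composer → BQ: differential update
・Load the differential data into a tmp table.
・Original table + tmp table → UNION ALL → remove duplicates → overwrite.
・Delete the tmp table.
For details, see the following article: "Syncing hundreds of GB of data from MySQL to BigQuery", https://tech.mercari.com/entry/2018/06/28/100000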
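The "UNION ALL → remove duplicates → overwrite" step can be illustrated with a small query builder. This is a sketch under assumptions, not the query from the talk or the linked article: it assumes each table has a primary key and a column that orders versions of a row, and the table and column names below are hypothetical.

```python
def build_merge_query(target, tmp, key_columns, version_column):
    """Build a BigQuery standard SQL query keeping the newest row per key."""
    keys = ", ".join(key_columns)
    return (
        "SELECT * EXCEPT(row_num) FROM (\n"
        "  SELECT *, ROW_NUMBER() OVER (\n"
        f"    PARTITION BY {keys} ORDER BY {version_column} DESC\n"
        "  ) AS row_num\n"
        "  FROM (\n"
        f"    SELECT * FROM `{target}`\n"
        "    UNION ALL\n"
        f"    SELECT * FROM `{tmp}`\n"
        "  )\n"
        ")\n"
        "WHERE row_num = 1"
    )


# Hypothetical table and column names, for illustration only.
query = build_merge_query(
    target="warehouse.items",
    tmp="warehouse.items_tmp",
    key_columns=["id"],
    version_column="updated_at",
)
```

The query result would then be written back over the original table (for example with a truncating write disposition) before the tmp table is deleted. If a row can appear in both tables with the same `updated_at`, a tie-breaker is needed in the `ORDER BY`.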
Rebuilt BQ / docs for user (1)
Rebuilt BQ / docs for user (2)
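Rebuilt BQ / docs for user (3)
Agree with the data users on a level that is "better than" the current state (tables left untouched for months):
・Do not over-engineer quality.
・State the measurement (delay monitoring) and the support explicitly.
・State explicitly that ambiguous things are ambiguous.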
Canary Release
Provide to some teams → environment-dependent failures → put detection, firefighting, and response flows in place.
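Staged releases every other week
Sprint + Increment: create a rhythm of continuous improvement, with each step followed by an improvement.
・Ops: provide to some teams first → provide to the next team → ...; each version of the user guide is followed by Q&A and feedback.
・Data: tables for which a full refresh is enough → tables that are painful without differential updates → ...; tables whose CSV files mysqldump splits into pieces of [?] GB or less → tables where BQ Load fails unless Dataflow splits the CSV → ...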
Result
The purchase-data sync, which had been timing out at 7 h, now completes successfully in 2.5 h!
Timeline (01:00 to 09:00): Before ❌, After ✅, shown together with the preceding processing.
Agenda (now at 5. Closing)
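What I kept in mind
This is an organization with many distinctive, talented players, so rather than pushing through my own style and design philosophy, I moved forward by
・changing shape flexibly like a cloud (Cloud),
・looking over the whole like a conductor (Composer),
・organizing the flow of information (Dataflow).
Truly "Rebuilding batch ETL with Cloud Composer & Dataflow".
Photo: https://www.pexels.com/photo/hd-457881/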
Special Thanks (account names in team Slack)
[BI / PM] @mattsun, @shoei, @hase-ryo, @hikaru, @nakatomo, @natsume, @igachan-san, @tsudar, @anboo, @hiza
[JP Dev] @siroken3, @shoe116, @ichirin2501, @bokko, @catatsuy, @shinpei
[Merpay Dev] @laughingman7743, @syucream, @cocoiti, @kazegusuri, @sfujjiwara
[US Dev/ML] @hatone, @yu
[JP ML / Search] @furusawa, @tairosan
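Future challenges of Batch ETL in Mercari JP
・Short term: polishing the platform into one that actually gets used. Product management and system development, with BI / SRE / DataPlatform.
・Mid term: shifting from "destruction and creation" to "measurement and improvement". Service management (ITIL) and data management (DMBOK), with hase-ryo san.
・Long term: breaking away from "local optimization". Company-wide data strategy planning (DataOps), with tairo san.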
Summary
Sound analysis stands on sound data. Sound data stands on sound processes and systems. Let's start with the small first step in front of us and get our data in order!
Promotion
Along with a wealth of data-utilization case studies, I introduce how to hack projects, processes, systems, teams, and culture into a healthy state.
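For the networking time
To those on the front line who are working hard and fighting alone, and to managers who are uneasy about the current state: please come and talk to @yuzutas0. I will help you sort out how to climb from AsIs to ToBe.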
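Caution: build the right thing, the right way
For example, Cloud Dataflow scales easily, but it also costs money. Depending on your business scale and how you use it, it may not pay off in terms of ROI.
・Aren't there plenty of things to do before building a scalable system?
・Has adopting surface-level technology become the goal in itself?
・Can that data delivery really solve a management problem?
Before jumping into easy system development, please stop and think once.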
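Purpose of today's talk (repeated)
To present a case study of a "rebuild". It is only one case, so please compare it with your own technology stack and organizational situation, and draw your own lessons from it.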
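This is what I did. What would you do?
Compared with the business, processes, systems, team, and culture you are responsible for, what was the same? What was different? Why do those similarities and differences arise?
Is the current state of your own workplace the best it can be? Or does there seem to be room for improvement? Is there anything you can change, however small?
If you were to take one action right now, what could you do?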
Thank you for your attention.
Photo: https://www.pexels.com/photo/architecture-blur-building-colourful-392031/