Hiroka Zaitsu
October 25, 2019
trinity で Cloud Composer に ワークフローを簡単デプロイ / Easy workflow deployment to Cloud Composer with trinity
2019.10.25 Fukuoka.go#14+Umeda.go
https://fukuokago.connpass.com/event/146447/
Transcript
Hiroka Zaitsu / Pepabo R&D Institute, GMO Pepabo, Inc. 2019.10.25 Fukuoka.go#14+Umeda.go
Easy workflow deployment to Cloud Composer with trinity
Hiroka Zaitsu / @zaimy
Data scientist
Researcher, Pepabo R&D Institute
Agenda
1. What is Cloud Composer?
2. Pain points when deploying to Cloud Composer
3. An attempt at a solution with trinity
4. Future work
1. What is Cloud Composer?
Overview of Cloud Composer
• GCP's "fully managed workflow orchestration service"
• Builds Apache Airflow on GCP
• Pepabo is migrating its log platform (DWH) from Treasure Data to GCP
• The workflow service is also being migrated, from Treasure Workflow (managed Digdag) to Cloud Composer
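Workflow codebase
repository
└ dags
  ├ workflowA
  │ ├ main.py
  │ └ hoge.sql
  └ workflowB
    ├ main.py
    └ piyo.sql
• Create one subdirectory per workflow under the dags directory, containing:
  • The Python code for the workflow itself (the DAG)
  • Queries used by the workflow
  • Configuration files, etc.
Note: this layout applies when the directory structure is kept the same as in Cloud Storage.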
Deploying a workflow (adding and updating)
$ gcloud composer environments storage dags import \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    --source LOCAL_FILE_TO_UPLOAD
[Diagram: codebase, Cloud Storage, Airflow; gcloud composer import]
Deleting a workflow, part 1: delete from Cloud Storage
$ gcloud composer environments storage dags delete \
    --environment ENVIRONMENT_NAME \
    --location LOCATION \
    DAG_NAME.py
[Diagram: codebase, Cloud Storage, Airflow; gcloud composer delete]
Deleting a workflow, part 2: delete from Airflow
$ gcloud composer environments run --location LOCATION \
    ENVIRONMENT_NAME delete_dag -- DAG_NAME
[Diagram: codebase, Cloud Storage, Airflow; gcloud composer delete_dag]
2. Pain points when deploying to Cloud Composer
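Using the gcloud commands as-is for operations is painful
• Adding and updating workflows
  • import runs per workflow
    • It has to be run individually for every workflow that has changes
  • import overwrites files in Cloud Storage
    • Files deleted from the codebase remain in Cloud Storage unless they are deleted individually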
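Using the gcloud commands as-is for operations is painful
• Deleting workflows
  • Two commands have to be run: delete, and Airflow's delete_dag
  • delete runs per file, while delete_dag runs per workflow
  • Each has to be run individually for every file or workflow that has changes
• Development produces changes across dozens of workflows every day
• We want to detect the changes mechanically and sync them to Cloud Composer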
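What about gsutil rsync?
• A Cloud Storage command that syncs files between buckets and directories
• A file is treated as a sync target whenever its modification time differs
  • Files get processed even when their contents have not changed
• It depends on Cloud Storage
  • Airflow can also be built outside GCP, so we want to support other storage backends too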
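What about Airflow sync?
• An Airflow feature that syncs with a specific git repository
• Only a single branch can be specified
• Looks like a good fit for syncing the master code to the production environment
• For the test environment, we want to deploy feature branch code from CI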
3. An attempt at a solution with trinity
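trinity's approach
• Synchronize three things: the codebase, Cloud Storage, and Airflow
• For each workflow, compute a hash from its directory structure and file contents
  • The hash represents the workflow definition at a point in time
• Workflows whose hash computed from the codebase differs from the hash stored in Cloud Storage become targets of the sync operation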
trinity's implementation
• https://github.com/zaimy/trinity
  • A tool to synchronize workflows between Codebase, Cloud Storage and Airflow metadata.
• Why Go?
  • Cross-compilation makes it possible to support Mac, Linux, and Windows
  • Processing can be done per workflow, so we want to parallelize it
$ trinity --bucket=BUCKET_NAME \
    --composer-env=COMPOSER_ENV_NAME
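Processing flow
1. Compute a hash in the codebase and save it for each workflow
2. List the workflows in the codebase and in Cloud Storage, and compare them
   i. If a workflow exists only in the codebase, upload it to Cloud Storage (add)
   ii. If it exists only in Cloud Storage, delete it from Cloud Storage and Airflow
   iii. If it exists in both, compare the codebase hash with the Cloud Storage hash
      a. If they differ, replace the workflow in Cloud Storage (update)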
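The comparison step of this flow can be sketched as a pure function over two maps from workflow name to hash. The names `Plan` and `planSync` and the map representation are assumptions for illustration; the sketch only decides what to do and leaves the Cloud Storage and Airflow calls out.

```go
package main

import (
	"fmt"
	"sort"
)

// Plan (hypothetical type) lists the workflows to add, update, and delete.
type Plan struct {
	Add    []string // only in the codebase: upload to Cloud Storage
	Update []string // in both, but hashes differ: replace in Cloud Storage
	Delete []string // only in Cloud Storage: delete from Cloud Storage and Airflow
}

// planSync compares workflow hashes computed from the codebase (local)
// with those stored in Cloud Storage (remote).
func planSync(local, remote map[string]string) Plan {
	var p Plan
	for name, localHash := range local {
		remoteHash, ok := remote[name]
		switch {
		case !ok:
			p.Add = append(p.Add, name)
		case remoteHash != localHash:
			p.Update = append(p.Update, name)
		}
	}
	for name := range remote {
		if _, ok := local[name]; !ok {
			p.Delete = append(p.Delete, name)
		}
	}
	// Map iteration order is random, so sort for reproducible output.
	sort.Strings(p.Add)
	sort.Strings(p.Update)
	sort.Strings(p.Delete)
	return p
}

func main() {
	local := map[string]string{"workflowA": "h1", "workflowB": "h2-new", "workflowC": "h3"}
	remote := map[string]string{"workflowA": "h1", "workflowB": "h2-old", "workflowD": "h4"}
	p := planSync(local, remote)
	fmt.Println("add:", p.Add)       // add: [workflowC]
	fmt.Println("update:", p.Update) // update: [workflowB]
	fmt.Println("delete:", p.Delete) // delete: [workflowD]
}
```

Keeping the decision separate from the side effects would also make a dry-run mode straightforward: print the plan instead of executing it.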
Synchronized deployment is now easy!
Future work
• Add tests and refactor
  • We want to follow Go's conventions and way of thinking
• Add features
  • Airflow has plugins in addition to dags, so support those too
  • dry-run