Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
trinity で Cloud Composer に ワークフローを簡単デプロイ / Easy...
Search
Hiroka Zaitsu
October 25, 2019
Technology
0
900
trinity で Cloud Composer に ワークフローを簡単デプロイ / Easy workflow deployment to Cloud Composer with trinity
2019.10.25 Fukuoka.go#14+Umeda.go
https://fukuokago.connpass.com/event/146447/
Hiroka Zaitsu
October 25, 2019
Tweet
Share
More Decks by Hiroka Zaitsu
See All by Hiroka Zaitsu
GMOペパボのデータ基盤とデータ活用の現在地 / Current State of GMO Pepabo's Data Infrastructure and Data Utilization
zaimy
3
320
ビジネス職が分析も担う事業部制組織でのデータ活用の仕組みづくり / Enabling Data Analytics in Business-Led Divisional Organizations
zaimy
1
650
Vertex AI Matching Engine と CLIP を使って EC サービスの類似画像検索機能を作る / Development of similar image search function for EC services using Vertex AI Matching Engine and CLIP
zaimy
0
770
BigQuery の日本語データを Dataflow と Vertex AI でトピックモデリング / Topic modeling of Japanese data in BigQuery with Dataflow and Vertex AI
zaimy
1
6.1k
データサイエンティストの仕事紹介 / Data Scientist Job Introduction
zaimy
1
630
GMOペパボのサービスと研究開発を支えるデータ基盤の裏側 / Inside Story of Data Infrastructure Supporting GMO Pepabo's Services and R&D
zaimy
1
1.8k
正則化とロジスティック回帰/machine-learning-lecture-regularization-and-logistic-regression
zaimy
0
8.9k
ECサイトにおける閲覧履歴を用いた購買に繋がる行動の変化検出 / Change Detection in Behavior Followed by Possible Purchase Using Electronic Commerce Site Browsing History
zaimy
1
960
ハンドメイド作品を対象としたECサイトにおける大量生産品の検出 / Detection of Mass-produced Goods at EC Site to Trade Handmade Goods
zaimy
3
4.9k
Other Decks in Technology
See All in Technology
Oracle Database@Google Cloud:サービス概要のご紹介
oracle4engineer
PRO
0
660
21st ACRi Webinar - Univ of Tokyo Presentation Slide (Shinya Takamaeda)
nao_sumikawa
0
110
[JAWS-UG 横浜支部 #91]DevOps Agent vs CloudWatch Investigations -比較と実践-
sh_fk2
1
170
プロダクトマネジメントの分業が生む「デリバリーの渋滞」を解消するTPMの越境
recruitengineers
PRO
3
580
なぜフロントエンド技術を追うのか?なぜカンファレンスに参加するのか?
sakito
9
2k
直接メモリアクセス
koba789
0
230
著者と読み解くAIエージェント現場導入の勘所 Lancers TechBook#2
smiyawaki0820
11
5.2k
.NET 10 のパフォーマンス改善
nenonaninu
2
4.9k
Modern Data Stack大好きマンが語るSnowflakeの魅力
sagara
0
290
【pmconf2025】PdMの「責任感」がチームを弱くする?「分業型」から全員がユーザー価値に本気で向き合う「共創型開発チーム」への変遷
toshimasa012345
0
140
AI時代の開発フローとともに気を付けたいこと
kkamegawa
0
950
freeeにおけるファンクションを超えた一気通貫でのAI活用
jaxx2104
3
1.4k
Featured
See All Featured
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
36
6.2k
Typedesign – Prime Four
hannesfritz
42
2.9k
Being A Developer After 40
akosma
91
590k
How to Think Like a Performance Engineer
csswizardry
28
2.3k
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
1
86
Intergalactic Javascript Robots from Outer Space
tanoku
273
27k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.8k
The Cult of Friendly URLs
andyhume
79
6.7k
Practical Orchestrator
shlominoach
190
11k
Visualization
eitanlees
150
16k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.1k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
Transcript
ࡒେՆ / Pepabo R&D Institute, GMO Pepabo, Inc. 2019.10.25 Fukuoka.go#14+Umeda.go
trinity Ͱ Cloud Composer ʹ ϫʔΫϑϩʔΛ؆୯σϓϩΠ
σʔλαΠΤϯςΟετ ࡒ େՆ / @zaimy 2 Hiroka Zaitsu ϖύϘݚڀॴ ݚڀһ
1. Cloud Composer ͱ 2. Cloud Composer ͷσϓϩΠ࣌ͷࠔΓ͝ͱ 3. trinity
ʹΑΔղܾͷࢼΈ 4. ࠓޙΔ͜ͱ 3 ࣍
1. Cloud Composer ͱ
• GCP ͷ "ϑϧϚωʔδυͷϫʔΫϑϩʔ ΦʔέετϨʔγϣϯ αʔϏε" • Apache Airflow Λ
GCP ্ʹߏங͢Δ • ϖύϘͷϩάج൫ʢDWHʣΛ Treasure Data ͔Β GCP Ҡߦத • ϫʔΫϑϩʔαʔϏε Treasure Workflow (Ϛωʔδυ Digdag) ͔Β Cloud Composer Ҡߦத 5 Cloud Composer ͷ֓ཁ
ϫʔΫϑϩʔͷίʔυϕʔε repository └ dags ɹ ├ workflowA ɹ │ ├
main.py ɹ │ └ hoge.sql ɹ └ workflowB ɹ ɹ ├ main.py ɹ ɹ └ piyo.sql 6 • dags σΟϨΫτϦԼʹϫʔΫϑϩʔ୯ҐͰ αϒσΟϨΫτϦΛΔ • ϫʔΫϑϩʔຊମʢDAGʣͷ python ίʔυ • ϫʔΫϑϩʔͰར༻͢ΔΫΤϦ • ઃఆϑΝΠϧɹͳͲ ※σΟϨΫτϦߏΛ Cloud Storage ͱ߹ΘͤΔ߹
ϫʔΫϑϩʔͷσϓϩΠʢՃͱߋ৽ʣ $ gcloud composer environments storage dags import \ --environment
ENVIRONMENT_NAME \ --location LOCATION \ --source LOCAL_FILE_TO_UPLOAD 7 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSJNQPSU
ϫʔΫϑϩʔͷআ ͦͷ1 - Cloud Storage ͔Βআ $ gcloud composer environments
storage dags delete \ --environment ENVIRONMENT_NAME \ --location LOCATION \ DAG_NAME.py 8 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSEFMFUF
ϫʔΫϑϩʔͷআ ͦͷ2 - Airflow ͔Βআ $ gcloud composer environments run
--location LOCATION \ ENVIRONMENT_NAME delete_dag -- DAG_NAME 9 ίʔυϕʔε $MPVE4UPSBHF "JSqPX HDMPVEDPNQPTFSEFMFUF@EBH
2. Cloud Composer ͷ σϓϩΠ࣌ͷࠔΓ͝ͱ
• ϫʔΫϑϩʔͷՃͱߋ৽ • import ϫʔΫϑϩʔ୯ҐͰͷ࣮ߦ • ࠩͷ͋ΔϫʔΫϑϩʔʹରͯ͠ݸผʹ࣮ߦ͢Δඞཁ͕͋Δ • import
Cloud Storage ͷϑΝΠϧΛ্ॻ͖͢Δ • ίʔυϕʔεͰআͨ͠ϑΝΠϧ ݸผʹআ͠ͳ͍ݶΓ Cloud Storage ʹΔ 11 gcloud ίϚϯυΛͦͷ··ӡ༻ʹ͏ͱେม
• ϫʔΫϑϩʔͷআ • delete ͱ Airflow ͷ dag_delete ͷ2ճίϚϯυΛ࣮ߦ͢Δඞཁ͕͋Δ •
delete ϑΝΠϧ୯Ґ, dag_delete ϫʔΫϑϩʔ୯ҐͰͷ࣮ߦ • ࠩͷ͋ΔϑΝΠϧ/ϫʔΫϑϩʔʹରͯ͠ݸผʹ࣮ߦ͢Δඞཁ͕͋Δ • ։ൃʹΑΓेݸͷϫʔΫϑϩʔʹʑ͕ࠩੜ·Ε͍ͯ͘ • ࠩΛػցతʹݕग़ͯ͠ Cloud Composer ʹಉظ͍ͨ͠ 12 gcloud ίϚϯυΛͦͷ··ӡ༻ʹ͏ͱେม
• όέοτ/σΟϨΫτϦؒͰϑΝΠϧΛಉظ͢Δ Cloud Storage ͷίϚϯυ • ϑΝΠϧͷߋ৽࣌ࠁʹࠩҟ͕͋Εಉظରͱఆ͞ΕΔ • ༰͕มߋ͞Ε͍ͯͳͯ͘ॲཧରʹͳͬͯ͠·͏ •
Cloud Storage ʹґଘ͢Δ • Airflow GCP Ҏ֎ͰߏஙͰ͖ΔͷͰଞͷετϨʔδʹରԠ͍ͨ͠ 13 gsutil rsync Ͳ͏͔ͳ
• ಛఆͷ git ϦϙδτϦͱಉظ͢Δ Airflow ͷػೳ • ୯ҰͷϒϥϯνͷΈࢦఆՄೳ • ຊ൪ڥʹ
master ͷίʔυΛಉظ͢Δʹྑͦ͞͏ • ςετڥ CI Ͱ feature branch ͷίʔυΛσϓϩΠ͍ͨ͠ 14 Airflow sync Ͳ͏͔ͳ
3. trinity ʹΑΔղܾͷࢼΈ
• ίʔυϕʔεͱ Cloud Storage ͱ Airflow ͷ3ͭΛಉظ͢Δ • ϫʔΫϑϩʔ୯ҐͰɺσΟϨΫτϦߏͱϑΝΠϧ༰͔ΒϋογϡΛܭࢉ •
͋Δ࣌ͷϫʔΫϑϩʔఆٛΛද͢ϋογϡ • ίʔυϕʔε͔Βܭࢉͨ͠ϋογϡͱ Cloud Storage ʹอଘ͞Ε͍ͯΔ ϋογϡ͕ҟͳΔϫʔΫϑϩʔΛಉظૢ࡞ͷରʹ͢Δ 16 trinity ͷํ
• https://github.com/zaimy/trinity • A tool to synchronize workflows between Codebase,
Cloud Storage and Airflow metadata. • ͳͥ Goʁ • ΫϩείϯύΠϧͰ Mac, Linux, Windows ʹରԠͰ͖Δ • ϫʔΫϑϩʔ୯ҐͰॲཧ͕ՄೳͳͷͰฒྻԽ͍ͨ͠ 17 trinity ͷ࣮ $ trinity --bucket=BUCKET_NAME \ --composer-env=COMPOSER_ENV_NAME
1. ίʔυϕʔεͰϋογϡΛܭࢉͯ͠ϫʔΫϑϩʔ͝ͱʹอଘ 2. ίʔυϕʔεͱ Cloud Storage ͷϫʔΫϑϩʔΛϦετͯ͠ൺֱ i. ίʔυϕʔεʹ͔͠ͳ͚Ε Cloud
Storage ʹΞοϓϩʔυʢՃʣ ii. Cloud Storage ʹ͔͠ͳ͚Ε Cloud Storage ͱ Airflow ͔Βআ iii. ྆ํʹ͋Είʔυϕʔεͱ Cloud Storage ͷϋογϡΛൺֱ a. ࠩҟ͕͋Ε Cloud Storage ͷϫʔΫϑϩʔΛஔʢߋ৽ʣ 18 ॲཧͷྲྀΕ
؆୯ʹಉظతͳσϓϩΠ͕ Ͱ͖ΔΑ͏ʹͳͬͨ !
• ςετՃͱϦϑΝΫλϦϯά • Go ͷ࡞๏ߟ͑ํʹԊ͍͖͍ͬͯͨ • ػೳՃ • Airflow ʹ
dags Ҏ֎ʹ plugins ͋ΔͷͰରԠ͢Δ • dry-run 20 ࠓޙΔ͜ͱ
None