CNDF2023 Pre-Event - Cloud-Native Data Platform Operations in Practice at Genkai-nada

A talk on SRE for data platforms, given at the CNDF2023 pre-event.

Kazuhiko Yamashita

August 02, 2023

Transcript

  1. System architecture

    Diagram labels: Ingest, Pipeline, Dataflow, Cloud Composer, Extract, Analytics, ML, BigQuery, Vertex AI, source, monitor, Cloud Logging, Cloud Monitoring, Pub/Sub.
    DB data and logs are aggregated into BigQuery.
  2. System architecture (diagram repeated from slide 1)
  3. DAGs that keep failing and chronic toil

    DAG = Directed Acyclic Graph. Quoted from https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html
  4. Defining SLIs/SLOs: how do we decide?

    SRE Workbook - Ch 13. Data Processing Pipelines: https://sre.google/workbook/data-processing/
    Google Cloud operations suite, data processing services: https://cloud.google.com/stackdriver/docs/solutions/slo-monitoring/sli-metrics/data-proc-metrics?hl=ja
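  5. Lately my job is just to make a request and press TAB

    ## Input specification
    - The input is an Airflow DAG file.
    ## Sample implementation 1
    ```python
    {sample1}
    ```
    ## Sample implementation 2
    ```python
    {sample2}
    ```
    ## Instructions
    - Implement test code for the DAG passed as input, using pytest.
    - Mock classes using spy or mock.
    - Two sample implementations are shown. Use them as a reference.
    - The test you generate will be saved to a file as-is and executed. No explanation of the output is needed.
    ## Input
    File name:{dag_file_path}
    ```python
    {dag_file_content}
    ```
    ...
    $ poetry run python bin/test_generator.py dags/example.py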
  6. System architecture (diagram repeated from slide 1)
  7. Monitoring the data pipeline: injecting golden data

    <source>
      @type dummy
      @label @INPUT
      tag example
      dummy { "accessed_at": "2022-01-01T00:00:00Z", "account_id": 1, "client_id": "12345abcde", "event": "example_event" }
    </source>
    https://docs.fluentd.org/v/0.12/input/dummy
    Dummy data is sent into the pipeline every second and monitored at the endpoint.
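One way to turn the golden-data idea into a check is to compare the newest golden record visible at the end of the pipeline with the current time. The sketch below is an assumption-heavy illustration, not the monitoring described in the talk: it assumes the caller has already fetched rows from the destination, that golden records can be recognized by the `event` value `example_event` from the slide, and that each row has a hypothetical `arrived_at` timestamp. The function names and the five-minute threshold are also made up for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Mapping, Optional

GOLDEN_EVENT = "example_event"  # marker taken from the slide's dummy record


def latest_golden_arrival(rows: Iterable[Mapping]) -> Optional[datetime]:
    """Return the newest arrival time among golden records, or None."""
    arrivals = [row["arrived_at"] for row in rows
                if row.get("event") == GOLDEN_EVENT]
    return max(arrivals) if arrivals else None


def pipeline_is_healthy(rows: Iterable[Mapping], now: datetime,
                        max_lag: timedelta = timedelta(minutes=5)) -> bool:
    """Healthy means a golden record arrived within max_lag of now."""
    latest = latest_golden_arrival(rows)
    if latest is None:
        return False  # no golden data at all counts as unhealthy
    return now - latest <= max_lag


# Hypothetical rows as they might look after being read from the destination.
now = datetime(2022, 1, 1, 0, 10, tzinfo=timezone.utc)
rows = [
    {"event": "example_event",
     "arrived_at": datetime(2022, 1, 1, 0, 8, tzinfo=timezone.utc)},
    {"event": "other_event",
     "arrived_at": datetime(2022, 1, 1, 0, 9, tzinfo=timezone.utc)},
]
print(pipeline_is_healthy(rows, now))                          # True
print(pipeline_is_healthy(rows, now + timedelta(minutes=10)))  # False
```

Because the dummy records enter at the very start of the pipeline, a stale result points to a problem somewhere along the path, even when each component reports itself as up.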
  8. System architecture (diagram repeated from slide 1)
  9. Abstract Syntax Tree

    from airflow.operators.empty import EmptyOperator
    EmptyOperator(task_id="child_task3")

    % python -m ast example.py
    Module(
       body=[
          ImportFrom(
             module='airflow.operators.empty',
             names=[
                alias(name='EmptyOperator')],
             level=0),
          Expr(
             value=Call(
                func=Name(id='EmptyOperator', ctx=Load()),
                args=[],
                keywords=[
                   keyword(
                      arg='task_id',
                      value=Constant(value='child_task3'))]))],
       type_ignores=[])

    What grep cannot do, static analysis can solve.
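As an illustration of the point on this slide, Python's standard `ast` module can pull every `task_id` out of a DAG file without importing Airflow, which is awkward with grep once a call spans several lines. This is a minimal sketch under stated assumptions: the helper name `find_task_ids` is hypothetical, and it only handles direct calls by name with a literal string `task_id`.

```python
import ast
from typing import List

# Same two lines as the slide's example.py; parsing does not import Airflow.
SOURCE = '''
from airflow.operators.empty import EmptyOperator

EmptyOperator(task_id="child_task3")
'''


def find_task_ids(source: str, operator_name: str = "EmptyOperator") -> List[str]:
    """Collect literal task_id values passed to calls of operator_name."""
    task_ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only match plain names such as EmptyOperator(...), not module.attr(...)
        if not (isinstance(func, ast.Name) and func.id == operator_name):
            continue
        for kw in node.keywords:
            value = kw.value
            if (kw.arg == "task_id"
                    and isinstance(value, ast.Constant)
                    and isinstance(value.value, str)):
                task_ids.append(value.value)
    return task_ids


print(find_task_ids(SOURCE))  # ['child_task3']
```

The same walk can be extended to other checks, for example flagging operators that are missing a required keyword argument.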