Taro Hirose
December 12, 2017
Getting Started Casually with a Container Scheduler Using ECS Events & Lambda / 20171212_jawsug-container-lt
JAWS-UG Container Branch #10 - connpass
https://jawsug-container.connpass.com/event/71130/
Transcript
ECS Events & Lambda: Getting Started Casually with a Container Scheduler
JAWS-UG Container Branch #10
2017.12.12 (Tue) LT
whoami
Taro Hirose
▸ OPENREC.tv / CyberZ, Inc.
▸ Backend Engineer
▸ id: @uorat
▸ http://uorat.hatenablog.com
ECS Events Introduction
What is an ECS Event?
CloudWatch Events that are delivered in response to state changes of ECS Cluster resources
▸ State change events can be received for:
▸ Container Instance
▸ Task
▸ Launched 2016.11.25
▸ Monitor Cluster State with Amazon ECS Event Stream | Amazon Web Services Blog
▸ https://aws.amazon.com/jp/blogs/news/monitor-cluster-state-with-amazon-ecs-event-stream/
[Diagram labels: Amazon ECS, Event Stream, CloudWatch Events, Events, Lambda, SNS, Kinesis]
What is an ECS Event?
e.g. Task start (ECS Task Event)

    {
      "version": "0",
      "id": "451dda85-ca1a-9045-5121-7a12dfb9317f",
      "detail-type": "ECS Task State Change",
      "source": "aws.ecs",
      "account": "123456789012",
      "time": "2017-09-07T08:28:04Z",
      "region": "ap-northeast-1",
      "resources": [
        "arn:aws:ecs:ap-northeast-1:123456789012:task/b280d725-7382-43b8-a50d-ef909a36cb80"
      ],
      "detail": {
        "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
        "containerInstanceArn": "arn:aws:ecs:ap-northeast-1:123456789012:container-instance/ff83c4a8-67fc-4a13-8134-897c6dd2195a",
        ...
        "desiredStatus": "RUNNING",
        ...
        "lastStatus": "PENDING",
        ...
        "taskDefinitionArn": "arn:aws:ecs:ap-northeast-1:123456789012:task-definition/uorat-ecs-event-test:35",
        ...
      }
    }
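To make the event shape above concrete, here is a minimal sketch of selecting ECS task events and pulling out the fields the later slides rely on. The pattern dictionary uses the CloudWatch Events pattern form (each key lists its allowed values); `matches_pattern` and `summarize` are hypothetical helpers written for this illustration, and the local matcher only imitates top-level matching, not the full rule engine.

```python
# Event pattern in the CloudWatch Events style: each key lists allowed values.
ECS_TASK_EVENT_PATTERN = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change"],
}


def matches_pattern(event, pattern=ECS_TASK_EVENT_PATTERN):
    """Hypothetical local check: every top-level pattern key must list the event's value."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


def summarize(event):
    """Extract the fields used for scheduling decisions (names follow the sample event)."""
    detail = event["detail"]
    return {
        "task_arn": event["resources"][0],
        "cluster_arn": detail["clusterArn"],
        "desired_status": detail["desiredStatus"],
        "last_status": detail["lastStatus"],
    }


# Trimmed copy of the sample event from the slide.
sample_event = {
    "detail-type": "ECS Task State Change",
    "source": "aws.ecs",
    "time": "2017-09-07T08:28:04Z",
    "resources": [
        "arn:aws:ecs:ap-northeast-1:123456789012:task/b280d725-7382-43b8-a50d-ef909a36cb80"
    ],
    "detail": {
        "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
        "desiredStatus": "RUNNING",
        "lastStatus": "PENDING",
    },
}

if matches_pattern(sample_event):
    print(summarize(sample_event))
```

A task moving from PENDING toward RUNNING shows up as `desired_status == "RUNNING"` with `last_status == "PENDING"`, which is the combination the Lambda function later in the deck keys on.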
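What is an ECS Event?
What can you do with it?
▸ Quoted from "Monitor Cluster State with Amazon ECS Event Stream"
▸ https://aws.amazon.com/jp/blogs/news/monitor-cluster-state-with-amazon-ecs-event-stream/
▸ "Using this information, you can also automate container placement and scaling, and 'right-size' your cluster at a very fine-grained level. By sending cluster state information in near real time, event-driven rather than pull-based, the ECS event stream feature opens up a very wide range of possibilities for monitoring and scaling container infrastructure."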
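What is an ECS Event?
Event-driven task upkeep and system integration
▸ For example:
▸ Store task run history in Elasticsearch or DynamoDB and use it for analysis
▸ Run some processing whenever a task or container instance starts or stops
▸ Integrate with a monitoring system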
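As an illustration of the "store task run history" idea, the sketch below flattens one task event into a single record that could be written to a store such as DynamoDB or Elasticsearch. The record layout and the choice of `task_arn` / `event_time` as keys are assumptions made for this example, not something the deck specifies, and the actual write call is left out.

```python
import json


def to_history_record(event):
    """Flatten an ECS Task State Change event into one storable item."""
    detail = event["detail"]
    return {
        "task_arn": event["resources"][0],  # hypothetical partition key
        "event_time": event["time"],        # hypothetical sort key
        "cluster_arn": detail["clusterArn"],
        "last_status": detail["lastStatus"],
        "desired_status": detail["desiredStatus"],
        # Only present on events for tasks that are being stopped.
        "stopped_reason": detail.get("stoppedReason"),
    }


# Trimmed copy of the task start event shown earlier in the deck.
event = {
    "time": "2017-09-07T08:28:04Z",
    "resources": [
        "arn:aws:ecs:ap-northeast-1:123456789012:task/b280d725-7382-43b8-a50d-ef909a36cb80"
    ],
    "detail": {
        "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
        "desiredStatus": "RUNNING",
        "lastStatus": "PENDING",
    },
}

record = to_history_record(event)
print(json.dumps(record, indent=2))
```

Keeping the record flat means the same dictionary can be handed to either kind of store without reshaping.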
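What is an ECS Event?
e.g.
▸ Container Scheduler for Amazon ECS
▸ Open-source software written in golang, released at re:Invent 2016
▸ Lets you implement a custom scheduler for an ECS Cluster
▸ Detecting ECS Cluster events
▸ Tracking ECS Cluster state
▸ Running the custom scheduler
▸ Providing a REST API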
OPENREC.tv Case
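Case: OPENREC.tv
A video streaming service specialized in games
▸ Low latency, high picture quality
▸ 90% of content is UGC
▸ Centered on user-driven live streams
▸ The number of concurrent streams and their duration depend on the broadcasters
▸ Audience draw also depends on the broadcasters
▸ ∴ Load is hard to predict
▸ Concurrent viewers: unlimited
▸ There is no so-called “[illegible]”
▸ Every user should be able to watch live on equal terms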
Architecture of live transcoding system
CloudWatch Events (scheduled/1min, ECS Event Stream) + Lambda + API
[Diagram labels: Broadcaster (x3); ECS Cluster with Container Instances, three of them running a Task (Container); LIVE API (ELB, EC2); Aurora (RDS); CloudWatch Events & Lambda, fed by ECS Events, calling ec2:RunInstances, ecs:RunTask, ecs:StopTask, ec2:StopInstances, …]
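The combination on this slide (a one-minute schedule plus the ECS event stream, both feeding Lambda) can be sketched as a single handler that dispatches on the event envelope. The `source` and `detail-type` values are the ones CloudWatch Events uses for scheduled rules and ECS state changes; the returned action names are placeholders invented for this sketch, and the real handler in the deck is only partially shown.

```python
def handle(event, context=None):
    """Dispatch on the CloudWatch Events envelope; returns a hypothetical action name."""
    source = event.get("source")
    detail_type = event.get("detail-type")

    # Periodic trigger (the "scheduled/1min" rule): reconcile capacity with demand.
    if source == "aws.events" and detail_type == "Scheduled Event":
        return "reconcile_capacity"

    # Event-driven triggers from the ECS event stream.
    if source == "aws.ecs" and detail_type == "ECS Task State Change":
        return "on_task_state_change"
    if source == "aws.ecs" and detail_type == "ECS Container Instance State Change":
        return "on_container_instance_state_change"

    # Anything else is not ours to handle.
    return "ignored"


print(handle({"source": "aws.events", "detail-type": "Scheduled Event"}))
print(handle({"source": "aws.ecs", "detail-type": "ECS Task State Change"}))
```

Handling both kinds of trigger in one place lets the periodic run act as a safety net for anything the event-driven path missed.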
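Architecture of live transcoding system
▸ A poor fit for EC2/ECS Auto Scaling
▸ RTMP = persistent connection
▸ Even when load is low, an instance cannot be scaled in while it is streaming
▸ Needs a scheduler that scales on a custom metric: "streaming status"
▸ The pain of Rolling Deploy
▸ When a stream ends is up to the user
▸ A task basically cannot be taken down until its stream ends
▸ Some streams run for 24 hours…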
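A minimal sketch of what "scaling on streaming status" could look like, assuming capacity is counted in streams per instance and that a small spare pool is kept warm. The function names, the `spare_instances` parameter and the instance IDs are all hypothetical; the deck does not show its actual scaling rule.

```python
import math


def desired_instance_count(active_streams, streams_per_instance, spare_instances=1):
    """Instances needed for the current streams, plus a warm spare pool (assumed policy)."""
    if streams_per_instance <= 0:
        raise ValueError("streams_per_instance must be positive")
    needed = math.ceil(active_streams / streams_per_instance)
    return needed + spare_instances


def removable_instances(streams_by_instance):
    """Only instances with no live stream may be stopped: RTMP connections are persistent."""
    return sorted(
        instance_id
        for instance_id, live_streams in streams_by_instance.items()
        if live_streams == 0
    )


print(desired_instance_count(active_streams=7, streams_per_instance=3))
print(removable_instances({"i-aaa": 2, "i-bbb": 0, "i-ccc": 0}))
```

The second function captures the constraint from the slide: CPU load says nothing about whether an instance is safe to remove, only the absence of a live stream does.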
Architecture of live transcoding system
Stateful application, but disposability
[Diagram: same architecture components as the previous slide]
Architecture of live transcoding system
Expire >> Drain >> Stop Container
[Diagram, three builds of the same architecture: (1) the highlighted Container Instance still runs its Task (Container) with a Broadcaster attached; (2) the Broadcaster is gone and only the Task (Container) remains; (3) the Task is gone and only the Container Instance remains]
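The Expire >> Drain >> Stop sequence can be read as a small state machine. The sketch below is one interpretation, assuming that "expire" means a container has passed a maximum age and stops receiving new streams, and that "drain" means waiting for its current stream to end; the state names, `max_age_seconds` and `live_streams` are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"      # accepts new streams
    EXPIRED = "expired"    # too old: no new streams are assigned
    DRAINING = "draining"  # waiting for the current stream to end
    STOPPED = "stopped"    # safe to stop the container


def next_state(state, age_seconds, max_age_seconds, live_streams):
    """Advance one step through Expire >> Drain >> Stop (assumed semantics)."""
    if state is State.ACTIVE and age_seconds >= max_age_seconds:
        return State.EXPIRED
    if state is State.EXPIRED:
        return State.DRAINING
    if state is State.DRAINING and live_streams == 0:
        return State.STOPPED
    return state


# Walk one container through its lifecycle; the stream ends before the last step.
state = State.ACTIVE
for age, streams in [(10, 1), (3600, 1), (3660, 1), (3720, 1), (3780, 0)]:
    state = next_state(state, age, max_age_seconds=3600, live_streams=streams)
    print(age, state.name)
```

Because a container is only stopped once its stream has ended, a new image reaches the fleet gradually without cutting off anyone who is live.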
Architecture of live transcoding system
Auto-scale Container Instances according to the number of streams and the streaming load
▸ To release, just run `docker image push`; the new image then spreads through the cluster on its own
[Diagram: same architecture components as the previous slides]
w/ ECS Events Case
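Case1: Monitoring Tasks
Automatic setup of Container/Application monitoring
▸ Turn monitoring ON/OFF according to Task state changes
1. Task started → start monitoring JVM and Wowza
2. Task stopped normally → stop monitoring
3. Task stopped abnormally → keep monitoring and raise an alert
▸ The monitoring system reuses the existing Zabbix
▸ Call the Zabbix API in cases 1 and 2 above
▸ Cheap, because it uses what is already there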
Case1: Monitoring Tasks
e.g. ecs:StopTask executed (ECS Task Event)

    {
      "version": "0",
      "id": "41f02974-8365-f955-8465-264ef8b189ca",
      "detail-type": "ECS Task State Change",
      "source": "aws.ecs",
      "account": "123456789012",
      "time": "2017-09-07T08:10:36Z",
      "region": "ap-northeast-1",
      "resources": [
        "arn:aws:ecs:ap-northeast-1:123456789012:task/55a0cfde-a377-4b97-…"
      ],
      "detail": {
        "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
        "containerInstanceArn": "arn:aws:ecs:ap-northeast-1:123456789012:container-instance/a04...",
        ...
        "desiredStatus": "STOPPED",
        ...
        "lastStatus": "RUNNING",
        ...
        "stoppedReason": "Task stopped by user",
        ...
      }
    }

[Diagram labels: ECS Cluster, Container Instance, Task (Container), ecs:StopTask]
Case1: Monitoring Tasks
e.g. ecs:StopTask executed
Lambda function: main.py

    def handle(event, context):
        ...
        # Task is starting: register the host with Zabbix.
        if desire_status == "RUNNING" and last_status == "PENDING":
            logger.info("Enable monitoring by Zabbix: host=%s" % (tag_name))
            zabbix_register(tag_name, private_ip)
        # Task is stopping: decide between a normal stop and a failure.
        elif desire_status == "STOPPED":
            logger.info("Found the stopped task: task_arn=%s, last_status=%s" % (
                task_arn, last_status
            ))
            stopped_reason = event["detail"]["stoppedReason"]
            if stopped_reason == "Task stopped by user" and last_status == "RUNNING":
                # Normal stop: disable monitoring.
                logger.info("Disable monitoring by Zabbix: host=%s" % (tag_name))
                zabbix_disable(tag_name)
            elif stopped_reason != "Task stopped by user":
                # Abnormal stop: keep monitoring enabled and respawn the task.
                logger.warning("Found the failed task: host=%s, task_arn=%s, stopped_reason=%s" % (
                    tag_name, task_arn, stopped_reason
                ))
                respawn(task_arn, ec2_instance_id)
                logger.warning("Respawned and locked the task: host=%s, task_id=%s" % (
                    tag_name, task_arn
                ))

[Diagram labels: ECS Cluster, Container Instance, Task (Container), ecs:StopTask]
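The branching in the handler can be isolated as a pure function, which makes the three cases easy to check without Zabbix or AWS. This is a sketch that mirrors the conditions shown on the slide; `decide_action` and the returned action names are hypothetical, and treating a missing `stoppedReason` as "ignore" is a choice of this sketch (the slide's code indexes the key directly).

```python
USER_STOP = "Task stopped by user"


def decide_action(detail):
    """Map an ECS task event's detail to an action name (hypothetical names)."""
    desired = detail.get("desiredStatus")
    last = detail.get("lastStatus")

    # Task is starting: begin monitoring.
    if desired == "RUNNING" and last == "PENDING":
        return "enable_monitoring"

    if desired == "STOPPED":
        reason = detail.get("stoppedReason")
        if reason is None:
            return "ignore"  # assumption of this sketch
        if reason == USER_STOP:
            # Normal stop: disable monitoring once, while the task is still RUNNING.
            return "disable_monitoring" if last == "RUNNING" else "ignore"
        # Any other reason counts as a failure: keep monitoring and respawn.
        return "respawn"

    return "ignore"


# The three situations shown in the deck's sample events.
print(decide_action({"desiredStatus": "RUNNING", "lastStatus": "PENDING"}))
print(decide_action({"desiredStatus": "STOPPED", "lastStatus": "RUNNING",
                     "stoppedReason": "Task stopped by user"}))
print(decide_action({"desiredStatus": "STOPPED", "lastStatus": "STOPPED",
                     "stoppedReason": "Essential container in task exited"}))
```

A stopped task generally produces more than one event, so the `last == "RUNNING"` check keeps the normal-stop path from firing on every follow-up event; the failure path has no such guard in the slide's code either.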
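Case2: Respawn failed tasks
Kick abnormally terminated Tasks back to life
▸ Because connections are persistent, a Task that is mid-stream must be rescued
▸ If "StoppedReason" is not an expected one, re-run the Task
▸ Needed because these are RunTask tasks, not Service tasks
▸ Run the processing needed for the follow-up
▸ The abnormal-termination alert must fire; monitoring is not disabled automatically
▸ Notify internal staff about the affected streams
▸ Lock the Task / Instance for investigation and a permanent fix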
Case2: Respawn failed tasks
e.g. Task terminated abnormally (ECS Task Event)

    {
      "version": "0",
      "id": "faeb52d8-e2ef-a726-655a-80f2373046b9",
      "detail-type": "ECS Task State Change",
      "source": "aws.ecs",
      "account": "123456789012",
      "time": "2017-09-07T08:40:33Z",
      "region": "ap-northeast-1",
      "resources": [
        "arn:aws:ecs:ap-northeast-1:123456789012:task/d56d76f1-eb2a-42e5-..."
      ],
      "detail": {
        "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
        "containerInstanceArn": "arn:aws:ecs:ap-northeast-1:123456789012:container-instance/a04...",
        ...
        "desiredStatus": "STOPPED",
        ...
        "lastStatus": "STOPPED",
        ...
        "stoppedReason": "Essential container in task exited",
        ...
      }
    }

[Diagram labels: ECS Cluster, Container Instance, Task (Container)]
Case2: Respawn failed tasks
e.g. Task terminated abnormally
Lambda function: main.py (the same handle(event, context) shown under Case1; for this event the final branch, stopped_reason != "Task stopped by user", logs the failed task and calls respawn(task_arn, ec2_instance_id))
[Diagram labels: ECS Cluster, Container Instance, Task (Container), ecs:StartTask]
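The body of `respawn` is not shown in the deck, so the following is only a sketch of how a restart request could be derived from the stopped task's event, assuming the task should come back on the same container instance with the same task definition. `build_start_task_request` and the `started_by` label are hypothetical; the dictionary keys match the parameters of the ECS StartTask API as exposed by boto3's `start_task`, which is not called here.

```python
def build_start_task_request(detail, started_by="respawn-lambda"):
    """Build StartTask parameters from a stopped task's event detail (hypothetical helper)."""
    return {
        "cluster": detail["clusterArn"],
        "taskDefinition": detail["taskDefinitionArn"],
        # StartTask places the task on explicitly named container instances.
        "containerInstances": [detail["containerInstanceArn"]],
        # Free-form label so respawned tasks can be told apart later.
        "startedBy": started_by,
    }


# Detail fields taken from the complete sample event earlier in the deck.
detail = {
    "clusterArn": "arn:aws:ecs:ap-northeast-1:123456789012:cluster/uorat-ecs-event-test",
    "containerInstanceArn": "arn:aws:ecs:ap-northeast-1:123456789012:container-instance/ff83c4a8-67fc-4a13-8134-897c6dd2195a",
    "taskDefinitionArn": "arn:aws:ecs:ap-northeast-1:123456789012:task-definition/uorat-ecs-event-test:35",
    "desiredStatus": "STOPPED",
    "lastStatus": "STOPPED",
    "stoppedReason": "Essential container in task exited",
}

request = build_start_task_request(detail)
print(request)
# With AWS credentials available, the request could be sent with:
#   boto3.client("ecs").start_task(**request)
```

Building the request separately from sending it keeps the decision testable, and makes it straightforward to log exactly what was restarted when notifying internal staff.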
The kinds of StoppedReason are documented
▸ Checking Stopped Tasks for Errors
▸ docs.aws.amazon.com/ja_jp/AmazonECS/latest/developerguide/stopped-task-errors.html
Summary
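Summary
ECS Events & Lambda is recommended for you if:
▸ ECS's standard Task Placement leaves you wanting a little more
▸ k8s or Blox feels like overkill
▸ You want to rescue abnormally terminated Tasks
▸ You want fine-grained, event-driven control of resources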
Summary
How about Fargate?
▸ Task Events flowed through without any problems
[Logo: AWS Fargate]
RAGE Shadowverse World Grand Prix