Design, implementation, and operation of OSS for launching as many test environments as you like on Amazon ECS / YAPC::Hiroshima 2024

FUJIWARA Shunichiro

February 09, 2024

Transcript

  1. Self-introduction
    @fujiwara, SRE team at KAYAC Inc.
    ISUCON 1, 2, 5, 11: winner (4 times)
    ISUCON 3, 8, 12, 13: organizer and problem setter (4 times)
    github.com/kayac/ecspresso: deployment tool for Amazon ECS
    github.com/fujiwara/lambroll: deployment tool for AWS Lambda
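  2. Approach (2): "Build everything with IaC when you need it"
    At the moment you want to use it, create the whole set of required server resources in one go
    Example: ALB + ECS + RDS + ElastiCache on AWS
    Prepare code (Terraform, CFn, CDK, or similar) that can create the ALB, the ECS service/tasks, RDS, and ElastiCache, then fire it off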
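  3. Conceptual diagram
    Assign *.example.com to the ALB and forward every request to mirage-ecs
    - foo.example.com is proxied to environment foo
    - bar.example.com is proxied to environment bar
    mirage.example.com serves the WebUI / API
    - Launch an environment → ECS RunTask
    - Delete an environment → ECS StopTask
    What is mirage-ecs? Software that
    1. launches and stops "environments" as ECS tasks
    2. reverse-proxies requests for an "environment" to its task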
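  4. What do you gain by using mirage-ecs?
    You can have as many environments as you want, right away
    - As many as you like if the ECS tasks run on Fargate
    - Startup takes about 1 minute at best and about 10 minutes at most
    Low cost
    - 1 environment = 1 ECS task (1 to 2 vCPU, 4 GB of memory or more)
    - Running on ECS on EC2 is also possible
    Environments can be launched and deleted from the WebUI and the API
    - WebUI: friendly to non-engineers as well
    - API: launch/delete can be implemented in a Slack bot, for example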
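  5. Launching a web application with mirage-ecs
    In the container's entrypoint...
    Switch branches
    - git (clone|fetch) && switch to move to the branch you want to use
    - git switch origin/$GIT_BRANCH ← the branch specified at launch
    Prepare external resources such as the DB
    - Resources that are quick to create and billed pay-as-you-go are created on the spot
    - Include $SUBDOMAIN in the name so that each environment gets its own
    - DynamoDB tables, SQS queues, and so on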
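  6. Launching a web application with mirage-ecs
    Resources that are slow to start or carry a high minimum charge need some ingenuity so they can be shared
    Example: RDS (MySQL)
    - Create a database named after $SUBDOMAIN
    - DBNAME=$(echo $SUBDOMAIN | tr - _)
      mysql -e "CREATE DATABASE IF NOT EXISTS $DBNAME ..."
    Example: ElastiCache Redis
    - Set key_prefix to ${SUBDOMAIN}: for logical separation (a client library feature)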
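
The naming rules on slide 6 can be sketched in Go, the language mirage-ecs itself is written in. This is an illustration only: `DatabaseName` and `RedisKey` are hypothetical helper names, and the slides do this work in the container's shell entrypoint and through a client library option rather than in Go.

```go
package main

import (
	"fmt"
	"strings"
)

// DatabaseName mirrors the slide's `echo $SUBDOMAIN | tr - _`:
// every hyphen in the subdomain becomes an underscore.
func DatabaseName(subdomain string) string {
	return strings.ReplaceAll(subdomain, "-", "_")
}

// RedisKey prepends "<subdomain>:" so that environments sharing one
// Redis instance do not collide. Client libraries often provide this
// as a key prefix option; it is written out here for illustration.
func RedisKey(subdomain, key string) string {
	return subdomain + ":" + key
}

func main() {
	sub := "feature-login" // hypothetical environment name
	fmt.Println(DatabaseName(sub))
	fmt.Println(RedisKey(sub, "session:1"))
}
```

Because both names derive from the same subdomain, every environment gets its own database and key space while the RDS instance and the Redis node stay shared.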
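  7. Launching a web application with mirage-ecs
    Start up once the external resources are ready
    - DB migration and import of initial data
    - Update libraries (CPAN, Gem...), build
    - Start the server (plackup, rails s...)
    mirage-ecs automatically defines the reverse proxy to the launched environment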
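  8. mirage → mirage-ecs
    LevelDB → storageless
    - The reverse proxy needs a mapping from "environment name" to "IP address"
    - ECS tasks can carry tags
    - Metadata such as the environment name goes into tags; ECS already knows the IP address
    - Periodically fetch tasks and IP addresses via the ECS API and update the reverse proxy definitions
    Sidecars are possible
    - "environment = container" → "environment = ECS task (up to 10 containers)"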
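
A minimal sketch of the storageless idea on slide 8: the routing table is rebuilt from what ECS reports, so nothing has to be persisted. The `Task` struct, the `Subdomain` tag key, and the rule of keeping only running tasks are assumptions made for this illustration; they are not taken from the mirage-ecs source or the AWS SDK.

```go
package main

import "fmt"

// Task is a hypothetical, trimmed-down view of one ECS task.
type Task struct {
	Tags      map[string]string
	PrivateIP string
	Status    string
}

// subdomainTag is an assumed tag key that holds the environment name.
const subdomainTag = "Subdomain"

// BuildRoutes derives "environment name -> IP address" purely from
// task metadata. Tasks without the tag, without an IP address, or not
// in RUNNING state are skipped.
func BuildRoutes(tasks []Task) map[string]string {
	routes := make(map[string]string)
	for _, t := range tasks {
		name, ok := t.Tags[subdomainTag]
		if !ok || t.Status != "RUNNING" || t.PrivateIP == "" {
			continue
		}
		routes[name] = t.PrivateIP
	}
	return routes
}

func main() {
	tasks := []Task{
		{Tags: map[string]string{"Subdomain": "foo"}, PrivateIP: "10.0.1.10", Status: "RUNNING"},
		{Tags: map[string]string{"Subdomain": "bar"}, Status: "PENDING"},
	}
	fmt.Println(BuildRoutes(tasks))
}
```

Calling a function like this on a timer is enough to keep the proxy definitions current, which is why the LevelDB store of the original mirage is no longer needed.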
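  9. mirage → mirage-ecs
    Single-host setup → can become a multi-instance setup
    - Environments launch independently, not on the host where mirage runs
    - mirage-ecs itself is storageless, so it can run on multiple instances
      That said, it only proxies, so a single instance is enough even for 100 environments
      Rolling deployments are fine too
    - It talks to environments inside the VPC, so it can run on multiple instances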
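  10. mirage-ecs = a Web API server and reverse proxy written in Go
    Main HTTP server: receives every request and looks at the Host header
    - mirage.* → run the application handler
    - anything else → run the ReverseProxy for the environment matching Host
    Other goroutine workers
    - A worker that calls the ECS API and updates the ReverseProxy set
      When an "environment = task" appears, create a ReverseProxy and add it to the map
      When one disappears, delete its ReverseProxy from the map
    - A worker that sends aggregated request counts to CloudWatch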
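
The dispatch described on slide 10 might look roughly like the following, using only the Go standard library. `Router`, `Update`, and `Demo` are names invented for this sketch, and the `mirage` prefix is hard-coded for brevity; the actual mirage-ecs implementation may be organized quite differently.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
	"sync"
)

// Router dispatches on the Host header: "mirage.*" goes to the
// application handler, anything else to the matching environment.
type Router struct {
	mu      sync.RWMutex
	targets map[string]string                 // environment name -> "ip:port"
	proxies map[string]*httputil.ReverseProxy // environment name -> proxy
	app     http.Handler                      // WebUI / API handler
}

func NewRouter(app http.Handler) *Router {
	return &Router{
		targets: map[string]string{},
		proxies: map[string]*httputil.ReverseProxy{},
		app:     app,
	}
}

// Update is what a background worker would call after polling ECS:
// new or moved environments get a proxy, vanished ones are dropped.
func (r *Router) Update(latest map[string]string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for name := range r.proxies {
		if _, ok := latest[name]; !ok {
			delete(r.proxies, name)
			delete(r.targets, name)
		}
	}
	for name, addr := range latest {
		if r.targets[name] == addr {
			continue
		}
		r.targets[name] = addr
		r.proxies[name] = httputil.NewSingleHostReverseProxy(&url.URL{Scheme: "http", Host: addr})
	}
}

func (r *Router) ServeHTTP(w http.ResponseWriter, req *http.Request) {
	host := req.Host
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h // strip an explicit port if present
	}
	name, _, _ := strings.Cut(host, ".")
	if name == "mirage" {
		r.app.ServeHTTP(w, req)
		return
	}
	r.mu.RLock()
	proxy, ok := r.proxies[name]
	r.mu.RUnlock()
	if !ok {
		http.Error(w, "unknown environment: "+name, http.StatusNotFound)
		return
	}
	proxy.ServeHTTP(w, req)
}

// Demo puts one fake environment behind the router and returns
// "<status> <body>" for three different Host headers.
func Demo() []string {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "hello from foo")
	}))
	defer backend.Close()

	router := NewRouter(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "mirage UI")
	}))
	router.Update(map[string]string{"foo": backend.Listener.Addr().String()})

	var out []string
	for _, host := range []string{"mirage.example.com", "foo.example.com", "bar.example.com"} {
		req := httptest.NewRequest(http.MethodGet, "http://"+host+"/", nil)
		rec := httptest.NewRecorder()
		router.ServeHTTP(rec, req)
		out = append(out, fmt.Sprintf("%d %s", rec.Code, strings.TrimSpace(rec.Body.String())))
	}
	return out
}

func main() {
	for _, line := range Demo() {
		fmt.Println(line)
	}
}
```

The read-write mutex lets request handling and the update worker share the map safely: requests take only a read lock, so the periodic refresh rarely blocks traffic.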
  11. On ECS, configuration for launching tasks is (unavoidably) required
    ecs:
      region: "ap-northeast-1"
      cluster: mycluster
      default_task_definition: myapp
      enable_execute_command: true
      launch_type: FARGATE
      network_configuration:
        awsvpc_configuration:
          subnets:
            - subnet-aaaa0000
            - subnet-bbbb1111
            - subnet-cccc2222
          security_groups:
            - sg-11112222
            - sg-aaaagggg
          assign_public_ip: ENABLED
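  12. mirage link
    An ECS task can contain up to 10 containers, but that stopped being enough, so...
    1. Pass multiple task definitions (A, B) at launch (user)
    2. Launch tasks a and b from the respective task definitions (mirage-ecs)
    3. Define names in Route53 from the container names and the task IP addresses (mirage-ecs)
    nginx.foo.example.com      IP address of task a
    webapp.foo.example.com     IP address of task a
    backend.foo.example.com    IP address of task b
    sidecar.foo.example.com    IP address of task b
    Multiple tasks can be handled as a single "environment"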
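
Step 3 on slide 12 amounts to a small name-building rule, sketched below. `LinkedTask` and `LinkRecords` are hypothetical names and the IP addresses are made up; actually registering the records in Route53 is outside the scope of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// LinkedTask is a hypothetical summary of one launched ECS task:
// its IP address and the names of the containers it runs.
type LinkedTask struct {
	IP         string
	Containers []string
}

// LinkRecords maps "<container>.<subdomain>.<domain>" to the IP
// address of the task that runs the container.
func LinkRecords(subdomain, domain string, tasks []LinkedTask) map[string]string {
	records := make(map[string]string)
	for _, t := range tasks {
		for _, c := range t.Containers {
			records[c+"."+subdomain+"."+domain] = t.IP
		}
	}
	return records
}

func main() {
	records := LinkRecords("foo", "example.com", []LinkedTask{
		{IP: "10.0.1.10", Containers: []string{"nginx", "webapp"}},    // task a
		{IP: "10.0.2.20", Containers: []string{"backend", "sidecar"}}, // task b
	})
	names := make([]string, 0, len(records))
	for name := range records {
		names = append(names, name)
	}
	sort.Strings(names) // stable output order for printing
	for _, name := range names {
		fmt.Println(name, "->", records[name])
	}
}
```

With names like these, containers in task a can reach containers in task b by hostname, much as they would if all of them ran in one task.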
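  13. POST /api/purge
    {
      "excludes": ["foo", "bar"],
      "exclude_tags": ["branch:preview"],
      "duration": 86400
    }
    excludes: exclude specific environment names
    exclude_tags: exclude environments that carry the specified tags
    duration: delete environments that have had no access within the given number of seconds
    Some environments should not be stopped (customer-facing ones, for example), so exclusion conditions can be specified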
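
One way to read the three purge parameters on slide 13 is as a filter over the running environments. The sketch below assumes that each environment records a last-access time, that `exclude_tags` entries have the form `key:value`, and that an idle time equal to `duration` already qualifies for deletion; none of these details are confirmed by the slides, and all type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Environment is a hypothetical record of one running environment.
type Environment struct {
	Name       string
	Tags       map[string]string
	LastAccess time.Time
}

// PurgeRequest mirrors the JSON body shown on the slide.
type PurgeRequest struct {
	Excludes    []string
	ExcludeTags []string // assumed "key:value" form
	Duration    time.Duration
}

// hasExcludedTag reports whether e carries any of the excluded tags.
func hasExcludedTag(e Environment, excludeTags []string) bool {
	for _, kv := range excludeTags {
		key, value, ok := strings.Cut(kv, ":")
		if ok && e.Tags[key] == value {
			return true
		}
	}
	return false
}

// SelectForPurge returns the names of environments that are neither
// excluded by name nor by tag and have been idle for at least
// req.Duration as of now.
func SelectForPurge(envs []Environment, req PurgeRequest, now time.Time) []string {
	skip := make(map[string]bool)
	for _, name := range req.Excludes {
		skip[name] = true
	}
	var targets []string
	for _, e := range envs {
		if skip[e.Name] || hasExcludedTag(e, req.ExcludeTags) {
			continue
		}
		if now.Sub(e.LastAccess) >= req.Duration {
			targets = append(targets, e.Name)
		}
	}
	return targets
}

// DemoPurge applies the request from the slide to four made-up
// environments.
func DemoPurge() []string {
	now := time.Date(2024, 2, 9, 12, 0, 0, 0, time.UTC)
	envs := []Environment{
		{Name: "foo", LastAccess: now.Add(-72 * time.Hour)},
		{Name: "demo", Tags: map[string]string{"branch": "preview"}, LastAccess: now.Add(-72 * time.Hour)},
		{Name: "old-feature", LastAccess: now.Add(-48 * time.Hour)},
		{Name: "fresh", LastAccess: now.Add(-1 * time.Hour)},
	}
	req := PurgeRequest{
		Excludes:    []string{"foo", "bar"},
		ExcludeTags: []string{"branch:preview"},
		Duration:    86400 * time.Second,
	}
	return SelectForPurge(envs, req, now)
}

func main() {
	fmt.Println(DemoPurge())
}
```

In the demo only `old-feature` is selected: `foo` is excluded by name, `demo` by its tag, and `fresh` was accessed within the last day.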
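  14. I want to know why an environment fails to start
    It is a pain when an ECS task will not come up, isn't it...
    github.com/fujiwara/tracer
    "I made tracer, which outputs Amazon ECS task events and logs in chronological order" 4
    - Events related to the task (created, start begun, pull started and stopped, stop begun, stop completed, etc.)
    - Logs that containers in the task wrote to CloudWatch Logs
    - (For tasks launched by an ECS service) the service's event log
    These can now be viewed right away from the mirage-ecs WebUI
    4 https://techblog.kayac.com/ecs-task-tracer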
  15. Example tracer output (start → stop)
    2024-01-31T07:05:17.529Z TASK Created
    2024-01-31T07:05:32.718Z CONTAINER:nginx LastStatus:PENDING HealthStatus:UNKNOWN
    2024-01-31T07:05:32.718Z TASK LastStatus:PENDING
    2024-01-31T07:05:22.775Z TASK Connected
    2024-01-31T07:05:32.391Z TASK Pull started
    2024-01-31T07:05:39.561Z TASK Pull stopped
    2024-01-31T07:05:39.590Z TASK Started
    2024-01-31T07:05:40.070Z CONTAINER:nginx LastStatus:PENDING HealthStatus:UNKNOWN
    2024-01-31T07:05:40.070Z TASK LastStatus:PENDING
    2024-01-31T07:05:40.070Z CONTAINER:nginx LastStatus:RUNNING HealthStatus:UNKNOWN
    2024-01-31T07:05:40.070Z TASK LastStatus:RUNNING
    2024-01-31T07:05:39.573Z CONTAINER:nginx /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
    2024-01-31T07:05:39.573Z CONTAINER:nginx /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
    2024-01-31T07:05:39.576Z CONTAINER:nginx /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
    -- (omitted) --
    2024-02-04T01:03:04.797Z TASK LastStatus:STOPPED
    2024-02-04T01:02:50.022Z TASK Stopping
    2024-02-04T01:02:50.022Z TASK StoppedReason:Terminate requested by Mirage
    2024-02-04T01:02:50.022Z TASK StoppedCode:UserInitiated
    2024-02-04T01:02:52.015Z CONTAINER:nginx 2024/02/04 01:02:52 [notice] 1#1: signal 15 (SIGTERM) received, exiting
    2024-02-04T01:02:52.018Z CONTAINER:nginx 2024/02/04 01:02:52 [notice] 34#34: exiting
    2024-02-04T01:02:52.018Z CONTAINER:nginx 2024/02/04 01:02:52 [notice] 35#35: exiting
    2024-02-04T01:02:52.018Z CONTAINER:nginx 2024/02/04 01:02:52 [notice] 34#34: exit
  16. Example tracer output (failed start)
    2024-02-04T01:09:02.676Z TASK Created
    2024-02-04T01:09:08.180Z TASK LastStatus:STOPPED
    2024-02-04T01:09:08.180Z TASK LastStatus:DEPROVISIONING
    2024-02-04T01:09:08.180Z TASK LastStatus:PROVISIONING
    2024-02-04T01:09:06.503Z TASK Connected
    2024-02-04T01:09:21.109Z TASK Execution stopped
    2024-02-04T01:09:31.147Z TASK Stopping
    2024-02-04T01:09:31.147Z TASK StoppedReason:CannotPullContainerError: pull image manifest has been retried 1 time(s): failed to resolve ref docker.io/library/nginx:lates: docker.io/library/nginx:lates: not found
    2024-02-04T01:09:31.147Z TASK StoppedCode:TaskFailedToStart
    It is immediately clear that the task ended with StoppedCode:TaskFailedToStart because the image could not be pulled