
Challenges of a multi-tenant environment that runs customers' application code, and the road to EKS




shogomuranushi

March 20, 2020



Transcript

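  1. Pipeline stages: data acquisition, data storage, data validation, training-data creation, model design, training, evaluation, deployment, inference, retraining

    • Preparing and managing a data warehouse
    • Validating the data (checking its accuracy)
    • Designing a model from scratch
    • Preparing a GPU environment and advanced distribution of the workload
    • Version management for data, models, and results
    • Accounting for the fact that, statistically, accuracy starts to degrade the moment a model is deployed to production
    • APIs and load-balancing mechanisms needed to acquire large volumes of data, their preparation, and security assurance
    • Preparing the tools and people needed to create training data
    • Handing over from the development environment to production
    • Ensuring redundancy and GPU resources, and building the process for integrating with the edge side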
  2. Same pipeline as slide 1 (data acquisition through retraining, with the per-stage challenges listed there)

    Takeaway: many challenges stand in the way before AI can actually be put to use.
  3. Ref: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf

    “As the machine learning (ML) community continues to accumulate years of experience with live systems”
    “Developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive.”
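  4. First-generation architecture of Datalake

    • A service for accumulating raw data
    • Storage is S3
    • The initial version was offered as a REST API built on API Gateway and Lambda
    • At first it supported only file operations such as save, get, and list
    • Built with the Serverless Framework
    • Uploads worked by issuing a Signed URL for the client to upload to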
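  5. First-generation architecture of Datalake

    Good
    • Maintenance-free
    Bad
    • API Gateway was painful
    • Poor local reproducibility, so development was inefficient
    • Payload size limits
    • The Serverless Framework was fairly painful too
    • Do we really want to build every future service in the same style?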
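  6. Second-generation architecture of Datalake

    • AWS API Gateway was too painful, so we built our own API gateway in-house
    • Development efficiency was poor, and because this is a critical endpoint we wanted control over it during outages
    • With the API gateway split out, authentication and routing became shared across services
    • Goodbye to a pile of separate API Gateways
    • Datalake itself is now just Lambda and S3
    • We stayed on this configuration for the time being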
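  7. Third-generation architecture of Datalake

    • "We want search and counts"
      • Built a database for metadata search and count features
      • Search is implemented with PostgreSQL (Aurora) JSON + GIN index
      • We made metadata tagging and search too flexible, and the index exploded
      • Cassandra...? Or reduce the flexibility? Under consideration
      • S3 Event + SQS + Lambda call the Datalake API
      • The backend could not withstand the load and CW Logs became expensive, so the backend moved from Lambda to ECS
    • "S3 Signed URLs are a hassle; we want to upload to Datalake in a single call"
      • Changed the spec so that the API gateway performs the Put to S3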
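  8. Third-generation architecture of Datalake

    Good
    • Signed URLs are no longer needed, which improved UX
    • Metadata search became possible
    Bad
    • It is no longer serverless, so maintenance cost went up
    • The metadata feature is too free-form, and the bloated index is painful
    Note: the backend moved to ECS because of backend load and rising CWLogs cost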
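  9. Zeroth-generation architecture of Serving

    Good
    • Simple
    • Related resources can be created and deleted in one batch
    Bad
    • One ELB per Service is wasteful
    • Creation through CFn takes several minutes, which is slow
    • CFn is asynchronous, so detecting errors is hard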
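  10. First-generation architecture of Serving

    • Changed to 1 load balancer = multiple Services
      • ALB has a hard limit of 100 routing rules
      • Customer APIs keep piling on, so it was obvious this would break quickly
      • Once again we built our own gateway (ECS + Lambda + DynamoDB) orz
    • Customer applications that error constantly: containers restart over and over and eat up the EBS burst credits
      • An ECS Service is designed to always keep tasks running, and there is no circuit breaker
      • A circuit breaker for system-caused errors was implemented later, but there is still none for application-caused errors
      • So the restart loop consumes all the IO (an overwhelmingly noisy neighbor)
      • We run our own health checks (Lambda) and, after a certain number of failures, record in the gateway's DynamoDB table that the service should not be routed to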
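The home-grown health check in the last bullet of this slide can be sketched roughly as follows. This is a minimal in-memory sketch, not the actual implementation: a plain dict stands in for the DynamoDB routing table, and `FAILURE_THRESHOLD`, `RoutingTable`, and `record_check` are hypothetical names. The slide does not give the real threshold or schema, and re-enabling routing on the first success is an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the slide only says "a certain number of failures".
FAILURE_THRESHOLD = 3


@dataclass
class RoutingTable:
    """In-memory stand-in for the gateway's DynamoDB routing table."""
    consecutive_failures: dict = field(default_factory=dict)
    routable: dict = field(default_factory=dict)

    def record_check(self, service: str, healthy: bool) -> None:
        """Record one health-check result and update the routing flag."""
        if healthy:
            # Assumption: a single success resets the counter and re-enables routing.
            self.consecutive_failures[service] = 0
            self.routable[service] = True
            return
        failures = self.consecutive_failures.get(service, 0) + 1
        self.consecutive_failures[service] = failures
        if failures >= FAILURE_THRESHOLD:
            # Stop routing to a service that keeps failing.
            self.routable[service] = False

    def is_routable(self, service: str) -> bool:
        # Services that have not tripped the threshold are treated as routable.
        return self.routable.get(service, True)
```

Keeping the flag in a table that the gateway reads on each request means a crashing customer container stops receiving traffic without any change to the ECS Service itself.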
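  11. First-generation architecture of Serving

    • Implemented a Blue/Green feature
      • Makes it easy to switch which model an API endpoint points at
    • Problem: an ECS task can reach the host's IAM role
      • Cut off access to the instance metadata endpoint with iptables
      • These days, with the awsvpc network mode, it is as simple as the following
      • ECS_AWSVPC_BLOCK_IMDS: true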
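The Blue/Green switch described at the top of this slide, where an endpoint points at one of two model deployments and can be flipped, can be illustrated with a small sketch. `Endpoint`, `target`, and `switch` are hypothetical names for illustration only; the slide does not describe how the real feature is implemented.

```python
class Endpoint:
    """An API endpoint whose traffic targets exactly one of two model deployments."""

    def __init__(self, blue: str, green: str) -> None:
        self.deployments = {"blue": blue, "green": green}
        self.active = "blue"

    def target(self) -> str:
        """Return the model version currently receiving traffic."""
        return self.deployments[self.active]

    def switch(self) -> str:
        """Flip traffic to the other color and return the new target."""
        self.active = "green" if self.active == "blue" else "blue"
        return self.target()
```

Because both deployments stay registered, switching back is the same operation, which is what makes rollback cheap in this style.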
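  12. First-generation architecture of Serving

    Good
    • No longer 1 inference API = 1 ELB, so costs went down
    • Blue/Green became possible
    • The EBS burst problem was solved
    • EC2 metadata can no longer be accessed from tasks
    Bad
    • More and more is home-built, and maintenance cost keeps climbing. That said, for problems AWS features do not cover, building it ourselves is the only option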
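  13. Second-generation architecture of Serving

    • ECS and EC2 do not get along well
      • When Auto Scaling kicks in, it terminates EC2 instances without regard for the containers on them. Please drain first (maybe this is fixed now?)
      • Auto Scaling triggers are basically based on reserved CPU, so scaling by container size is a lot of work and the calculation logic is tedious
      • Free resources are not consolidated even when they exist
      • Blue/Green deployment for replacing instances has to be home-built
    • Hosting many customer inference APIs makes costs balloon
      • Looked into whether spot instances could be used well
      • You need to spread across instance types and zones, and switch to on-demand when spot instances are sold out
      • After evaluation, we decided to adopt a service called Spotinst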
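The spot-instance requirement on this slide, spreading across instance types and zones and falling back to on-demand when spot capacity is sold out, can be sketched as below. This is not how Spotinst works internally; `launch_with_fallback`, `try_spot`, and `launch_on_demand` are hypothetical names, and the two callables stand in for whatever cloud API calls would actually be made.

```python
from typing import Callable, Sequence, Tuple

Pool = Tuple[str, str]  # (instance_type, availability_zone)


def launch_with_fallback(
    spot_pools: Sequence[Pool],
    try_spot: Callable[[Pool], bool],
    launch_on_demand: Callable[[Pool], None],
) -> Tuple[str, Pool]:
    """Try each spot pool in order; fall back to on-demand if all are sold out."""
    if not spot_pools:
        raise ValueError("at least one pool is required")
    for pool in spot_pools:
        if try_spot(pool):  # True means spot capacity was obtained
            return ("spot", pool)
    # Every spot pool was unavailable: use on-demand in the first pool.
    fallback = spot_pools[0]
    launch_on_demand(fallback)
    return ("on-demand", fallback)
```

Listing several (type, zone) pools is what keeps one sold-out pool from stalling capacity; the on-demand branch is the last resort and therefore the expensive path.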
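  14. Second-generation architecture of Serving

    Good
    • Using Spotinst covers the areas where ECS falls short
    • Blue/Green deployment is enforced
    • Consolidation efficiency improved, cutting costs by 60-70%
    Bad
    • Applying spot instances to the wrong workloads causes unintended shutdowns (Jupyter Notebook, for example)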
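  15. Third-generation architecture of Serving

    • A heavy hitter appeared, running 300 containers in a single Service
      • DynamoDB partitioning became skewed and throttling occurred in large volume
    • Momentum for Kubernetes was building. By adopting Kubernetes:
      • Ambassador (Envoy) can stand in for the API gateway functionality
      • However, with a large number of Services/Pods, updating Envoy's routing rules can take tens of seconds
      • Service discovery and health checks can be left to standard features
      • Along with moving the Service environment to Kubernetes, we are heading toward reducing the home-built parts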
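  16. Third-generation architecture of Serving

    Good
    • Health checks and service discovery moved from home-built code to standard Kubernetes features
    • Gateway functionality moved to a Kubernetes extension (Ambassador)
    Bad
    • You now need to understand Ambassador (Envoy)
    • It is better thought out than our home-built code in many respects, so maintenance cost goes down, but customizability drops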
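  17. Third-generation architecture of Serving

    • Current challenges
      • We do not know how to measure outbound transfer volume per customer Pod
        • GKE apparently has this. Time for Istio...?
      • To visualize cost per customer we have to hook Kubernetes events all over the place
        • GKE seems able to do this with a feature called usage metering, so we would love to see it in EKS too. I think it is also needed for multi-tenancy across business divisions
      • The difficulty of building our own system on top of Kubernetes' eventually consistent behavior
        • For example:
        • A Pod's STATUS becomes Running. You expect it to be reachable, but Ready (readinessProbe) is 0/1
        • Ready becomes 1/1. You expect it to be reachable, but some Ambassador (Envoy) Pods can reach it while others are still waiting for their update and cannot. Because this is eventual consistency in a distributed system, it is hard to tell the user exactly when it became usable
        • Because each feature guarantees only eventual consistency on its own, we have to fill in the places where we want them to act in lockstep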
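One way to fill the gap described on this slide is to poll until both conditions hold: the Pod reports Ready and every gateway replica has picked up the route. The sketch below illustrates that pattern under stated assumptions. `wait_until_serviceable`, `pod_ready`, and `gateway_checks` are hypothetical names, and the callables stand in for real probes (for example a Kubernetes API query and a request through each Envoy Pod); the slide does not say the platform does this.

```python
import time
from typing import Callable, Iterable


def wait_until_serviceable(
    pod_ready: Callable[[], bool],
    gateway_checks: Iterable[Callable[[], bool]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> bool:
    """Return True once the pod is Ready and every gateway replica can route to it."""
    checks = list(gateway_checks)
    deadline = time.monotonic() + timeout_s
    while True:
        # Both layers are eventually consistent, so require all of them at once.
        if pod_ready() and all(check() for check in checks):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Returning False on timeout, rather than raising, lets the caller decide whether to report "still starting" to the user or treat it as a failure.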
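  18. Third-generation architecture of Serving

    • Current challenges
      • IP address exhaustion
        • Each instance uses 20 IPs for Serving and 30 for Training. With 20 * 400 Services created, that is 8,000 IPs. To our surprise, a /16 subnet ran out
      • Hard to be declarative
        • We deploy through an SDK, so we are not in a (declarative) state where "apply the YAML again and it reproduces." Sad
      • Hard to monitor
        • It is hard to separate user-caused problems from platform problems
        • Some pods are dead for customer-caused reasons; not everything is running normally
        • We cannot monitor 5xx/4xx at our own gateway. The backends depend on the customer, so 5xx is entirely to be expected
      • The usual microservice problems
        • LBs/proxies are stacked in multiple tiers, so investigation is hard. It is hard to decide how much logging and metrics to expose to customers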
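The Serving figure on this slide is simple multiplication, worked through below. The per-instance count, the Service count, and the /16 prefix length come from the slide; the CIDR `10.0.0.0/16` is a hypothetical placeholder. Serving alone accounts for 8,000 addresses, and the slide does not break down how the rest of the range was consumed.

```python
import ipaddress

IPS_PER_SERVING_INSTANCE = 20  # from the slide
SERVICES = 400                 # from the slide

serving_ips = IPS_PER_SERVING_INSTANCE * SERVICES

# 10.0.0.0/16 is a hypothetical CIDR; only the /16 prefix length is from the slide.
subnet = ipaddress.ip_network("10.0.0.0/16")
share = serving_ips / subnet.num_addresses

print(serving_ips)           # 8000
print(subnet.num_addresses)  # 65536
print(f"{share:.1%}")        # 12.2%
```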
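  19. First-generation architecture of Training

    • The platform that runs training jobs
    • Built on Kubernetes from the initial version
      • Re-implementing things like Job and Pod (with a container for saving artifacts) ourselves on ECS was inefficient
      • But EKS did not exist yet, so we ran Kubernetes on EC2
    • Customer containers and management containers run in separate Pods
      • Putting the container that auto-saves the trained model to S3, and the container that collects logs, alongside user code is undesirable from an IAM permissions standpoint, so they run as separate agent-style Pods
      • We use kube2iam to attach an IAM role per Pod (is there an official solution now?)
    • privileged is off, no matter what
    • GPU driver handling is a struggle, but we push through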
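  20. Second-generation architecture of Training

    Good
    • Provides Jupyter Notebook and TensorBoard
    • Provides shared storage
    Bad
    • Operating Kubernetes on EC2 is quite painful
    • The running cost of Jupyter while it sits unused is wasted
    • Shared storage is expensive and, being NFS, slow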
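  21. Third-generation architecture of Training

    • Current challenges
      • Revenue and cost of goods in the annual plan
        • These need to be made more precise, but cost depends on the customers, so forecasting it is honestly beyond us. How does AWS manage this, I wonder
      • Bugs
        • There were times a p3.16xlarge failed to stop and ran up a bill of roughly P万 yen
      • OS
        • It must be an OS that nvidia-driver supports, which means Ubuntu or Amazon Linux 2. Hoping for nvidia-driver support in "Bottlerocket"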
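  22. Third-generation architecture of Training

    • Current challenges
      • Dependencies between containers
        • To collect logs without gaps, we use Affinity to start Pods in the priority order log collector pod -> platform agent pod -> training job. As a result, the "insufficient resources" Event fires late, the notification to the Autoscaler is delayed, and instances start late
      • Docker image and package compatibility, and support
        • Maintaining compatibility of the Docker images we provide is hard
        • Support targets
          • DL library type x version x Python version x CUDA version …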
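The support matrix in the last bullet grows multiplicatively, which a short enumeration makes concrete. Every library name and version below is a hypothetical placeholder chosen for illustration, not the platform's actual support list.

```python
from itertools import product

# All values below are hypothetical placeholders, not the platform's real matrix.
libraries = {
    "tensorflow": ["1.15", "2.1"],
    "pytorch": ["1.3", "1.4"],
}
python_versions = ["3.6", "3.7"]
cuda_versions = ["10.0", "10.1"]

# One image per (library, library version, Python version, CUDA version).
images = [
    (lib, lib_ver, py, cuda)
    for lib, versions in libraries.items()
    for lib_ver, py, cuda in product(versions, python_versions, cuda_versions)
]

print(len(images))  # 16
```

Even this small placeholder matrix yields 16 images; adding one more version on any axis multiplies the count rather than adding to it.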
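  23. First-generation architecture of Trigger

    • The platform that runs inference jobs
    • Triggered when data is loaded into Datalake
      • S3 -> SNS -> SQS queues the event first
      • A subscriber pulls from the SQS queue and submits a task to AWS Batch
      • When the job finishes, the instance stops automatically
    • In practice, hundreds of jobs get submitted per minute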
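In the S3 -> SNS -> SQS chain on this slide, the subscriber has to unwrap two layers of JSON before it knows which object arrived. The sketch below shows that unwrapping for the standard envelope (SNS delivering to SQS without raw message delivery). `extract_s3_objects` is a hypothetical helper name, and the step of submitting to AWS Batch is deliberately left out.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an SQS body that wraps an SNS-delivered S3 event."""
    sns_envelope = json.loads(sqs_body)
    # Without raw message delivery, SNS puts the original payload in "Message" as a JSON string.
    s3_event = json.loads(sns_envelope["Message"])
    objects = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

A subscriber would call this on each received message body and submit one job per returned pair.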
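  24. First-generation architecture of Trigger

    Good
    • SQS absorbs sudden load spikes
    • Subscribers scale with the number of messages in the queue
    • With AWS Batch you just submit and forget
    Bad
    • Slow startup, no metrics, and logs are aggregated so they cannot be viewed per Job
    • Our workload did not match AWS Batch's intended use case
    • Cross-AZ network transfer volume was alarming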
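  25. First-generation architecture of Logging for Customer

    Good
    • Serverless components (CWLogs / Kinesis / Lambda / DynamoDB), so no operational management needed
    Bad
    • When log volume spikes, throttling occurs in Lambda or DynamoDB
    • CWLogs is fairly expensive considering we barely use it in practice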
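  26. Second-generation architecture of Logging for Customer

    • Changed the log backend to Datadog Logs
      • Service and Training logs are stored in Datadog Logs
      • Logs are fetched from the Datadog Logs API and provided to customers
    • We considered using ElasticSearch, but:
      • At the time of evaluation, log volume was 500 million records/month
      • Given business growth and plans to log more targets, volume doubles roughly every 3 months
      • We did not want to operate an ElasticSearch cluster that keeps doubling, and since logging is not a core competence we did not want to spend operational cost on it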
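The growth estimate on this slide (500 million records per month, doubling every 3 months) can be projected with a few lines of arithmetic. The two constants are from the slide; `projected_records` is a hypothetical helper, and using stepwise doubling at each 3-month boundary rather than continuous growth is an assumption.

```python
BASE_RECORDS_PER_MONTH = 500_000_000  # 500 million/month, from the slide
DOUBLING_PERIOD_MONTHS = 3            # from the slide


def projected_records(months_from_now: int) -> int:
    """Monthly log volume after the given number of months, doubling every 3 months."""
    doublings = months_from_now // DOUBLING_PERIOD_MONTHS
    return BASE_RECORDS_PER_MONTH * 2 ** doublings


for months in (0, 6, 12, 24):
    print(months, projected_records(months))
```

At this rate the monthly volume reaches 8 billion records after one year and 128 billion after two, which is the scale behind the reluctance to self-operate a search cluster.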
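  27. Second-generation architecture of Logging for Customer

    Good
    • Fully managed, so operations-free
    Bad
    • It costs money. That said, it is cheaper than CWLogs and better than operating it ourselves
    • We are constrained by Datadog's specifications
      • Eventual consistency, no ordering guarantee
      • The behavior and specifications of datadog-agent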
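  28. First-generation architecture of Logging for System & Application

    • The collection platform for system logs and application logs used internally
    • Logging is not a core competence, so we want to avoid operating it ourselves as much as possible
    • We mainly used ECS and Lambda, so logs were already being stored in CloudWatch Logs
    • For the time being, we simply used CW Logs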
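  29. First-generation architecture of Logging for System & Application

    Good
    • No need to think about infrastructure
    Bad
    • You cannot tell where anything is stored
    • Zero searchability. Your eyeball-grep skills improve
    • Investigating logs across microservices is nothing but pain
    • As a result, nobody looks at logs unless there is a problem
    • CWLogs is surprisingly expensive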
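  30. Second-generation architecture of Logging for System & Application

    Good
    • No need to think about infrastructure
    • Everything is searchable in one place
    • Tags and Attributes can be indexed and made searchable
    • You can see at a glance how many records each field has
    • Since this is a microservice architecture, we embed things like a Tracing ID to make requests easier to follow
    • Usage expanded to analyzing API trends and investigating the scope of an impact
    Bad
    • It costs money. That said, it is cheaper than CWLogs and better than operating it ourselves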
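Embedding a tracing ID in every log line, as mentioned on this slide, is what lets an indexed log backend stitch one request together across microservices. Below is a minimal sketch using only Python's standard `logging` module. The field name `trace_id`, the `JsonFormatter` class, and `new_trace_id` are hypothetical; the slide does not show the actual log schema or how IDs are generated.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so fields can be indexed."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # "trace_id" is a hypothetical attribute name.
            "trace_id": getattr(record, "trace_id", None),
        })


def new_trace_id() -> str:
    """Generate an ID for one request (32 hex characters)."""
    return uuid.uuid4().hex


logger = logging.getLogger("example")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

trace_id = new_trace_id()
# Pass the same trace_id on every log line that belongs to one request.
logger.info("request received", extra={"trace_id": trace_id})
```

Each service would forward the ID it received (for example in a request header) instead of generating a new one, so a single search on `trace_id` returns the whole request path.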