LT: The Reality of Developing and Operating Generative AI Foundation Models and Large Language Models (LLMs) in the Cloud / Reality of FM/LLM Development

ML@Loft #14. ~The Reality of LLM Operations~
https://ml-loft.connpass.com/event/328160/

Yoshitaka Haribara

August 29, 2024

Transcript

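  1. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.

    ML@Loft #14 ~The Reality of LLM Operations~
    The Reality of Developing and Operating Generative AI Foundation Models and Large Language Models (LLMs) in the Cloud
    Yoshitaka Haribara
    Senior Startup Machine Learning Solutions Architect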
  2. Foundation model (FM) and large language model (LLM) development

    • AWS blog: "Companies taking on foundation model development share their results: AWS LLM Development Support Program results presentation"
    • AWS ML Blog: "Unlocking Japanese LLMs with AWS Trainium: Innovators Showcase from the AWS LLM Development Support Program"
  3. Foundation model (FM) and large language model (LLM) development

    • AWS blog: "AWS has been selected as a compute resource provider for GENIAC"
    Amazon EC2 P5 instances (NVIDIA H100 Tensor Core GPU)
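  4. Foundation model (FM) and large language model (LLM) development

    • AWS and METI (Japan's Ministry of Economy, Trade and Industry) support development by providing technology, compute resources, grants, and more
    • → Put another way, FM/LLM development costs an enormous amount of money
    • → How to bring down the unit cost is a serious challenge
    • FMs/LLMs are large, so the cost of inference is enormous as well
    • → How to bring down the unit cost is a serious challenge
    • And on top of that, is it really profitable?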
  5. The savior: accelerator chips for FMs/LLMs

    AWS Inferentia2
    High performance at the lowest cost per inference for LLMs and diffusion models
    Up to 40% better price performance than comparable Amazon EC2 instances

    AWS Trainium
    The most cost-efficient, high-performance training of LLMs and diffusion models
    Up to 50% savings on training costs over comparable Amazon EC2 instances
  6. The AWS generative AI stack

    GPUs, Inferentia, Trainium, SageMaker, EC2 Capacity Blocks, Neuron, UltraClusters, EFA, Nitro
    Amazon Bedrock: Agents, Guardrails, Customization Capabilities
    Amazon Q Business, Amazon Q Developer, Amazon Q in QuickSight, Amazon Q in Connect
  7. The AWS generative AI stack

    GPUs, Inferentia, Trainium, SageMaker, EC2 Capacity Blocks, Neuron, UltraClusters, EFA, Nitro
  8. Support for models such as Llama and Mistral

    • AWS blog: "AWS Trainium and AWS Inferentia deliver high performance and low cost for Llama models on AWS"
    https://ai.meta.com/blog/meta-llama-3-1/
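  9. Responsible AI

    • Fairness: consider the impact on different groups of stakeholders
    • Explainability: understand and evaluate system outputs
    • Privacy and security: appropriately obtain, use, and protect data and models
    • Safety: prevent harmful system outputs and misuse
    • Controllability: have mechanisms to monitor and control the behavior of AI systems
    • Veracity and robustness: produce correct system outputs even with unexpected or adversarial inputs
    • Governance: build best practices into the AI supply chain, including providers and deployers
    • Transparency: enable stakeholders to make well-informed choices about their engagement with an AI system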
  10. Thank you!

    Yoshitaka Haribara, Ph.D.
    Sr. Startup ML Solutions Architect
  11. Roundtable discussion

    • Cost
    • Responsible AI
    • Ops
    • Model eval
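
The two headline figures on slide 5 ("up to 40% better price performance" and "up to 50% savings on training costs") are different kinds of numbers, which matters when estimating unit cost. The sketch below is only an illustration: it assumes "price performance" means work done per unit of spend, treats both figures as upper bounds, and uses a hypothetical baseline of 100 cost units that does not come from the talk. The function names are also made up for this example.

```python
def cost_after_price_performance_gain(baseline_cost: float, gain: float) -> float:
    """Unit cost after a price-performance improvement.

    Assumption: a gain of 0.40 means 1.4x as much work per unit of spend,
    so the same work costs baseline_cost / 1.4, not baseline_cost * 0.6.
    """
    if gain <= -1.0:
        raise ValueError("gain must be greater than -1")
    return baseline_cost / (1.0 + gain)


def cost_after_savings(baseline_cost: float, savings: float) -> float:
    """Cost after a direct fractional saving (0.50 means 50% cheaper)."""
    if not 0.0 <= savings <= 1.0:
        raise ValueError("savings must be between 0 and 1")
    return baseline_cost * (1.0 - savings)


# Hypothetical baseline: 100.0 cost units on a comparable EC2 instance.
BASELINE = 100.0

# Best case for inference under the "up to 40%" price-performance claim.
inference_cost = cost_after_price_performance_gain(BASELINE, 0.40)

# Best case for training under the "up to 50% savings" claim.
training_cost = cost_after_savings(BASELINE, 0.50)

print(f"inference: {inference_cost:.1f}, training: {training_cost:.1f}")
# prints "inference: 71.4, training: 50.0"
```

Under these assumptions, a 40% price-performance gain lowers the cost per inference by roughly 29%, not 40%, while a 50% saving halves the training cost directly.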