Upgrade to Pro — share decks privately, control downloads, hide ads and more …

DeepSeek on AWS

DeepSeek on AWS

Agenda

• DeepSeek Models on AWS: Hosting, Fine-tuning, and Training
§ Amazon Bedrock
§ Amazon SageMaker JumpStart
§ Amazon SageMaker Endpoint
§ Amazon SageMaker HyperPod

• Key Benefits of Leveraging DeepSeek Models on AWS
§ Security
§ Operational Excellence
§ Cost

Sungmin Kim

March 19, 2025
Tweet

More Decks by Sungmin Kim

Other Decks in Technology

Transcript

  1. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. Amazon Bedrock과 SageMaker AI를 활용한 DeepSeek R1 모델 배포 및 운영 방법 김성민 Sr. AI/ML Specialist Solutions Architect AWS
  2. © 2025, Amazon Web Services, Inc. or its affiliates. Agenda

    • DeepSeek Models on AWS: Hosting, Fine-tuning, and Training § Amazon Bedrock § Amazon SageMaker JumpStart § Amazon SageMaker Endpoint § Amazon SageMaker HyperPod • Key Benefits of Leveraging DeepSeek Models on AWS § Security § Operational Excellence § Cost
  3. © 2025, Amazon Web Services, Inc. or its affiliates. DeepSeek-R1

    Performance & Evaluation (source: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf)
  4. © 2025, Amazon Web Services, Inc. or its affiliates. DeepSeek-R1

    models DeepSeek-R1-Distill models Model #Total Params #Activated Params Context Length DeepSeek-R1-Zero 671B 37B 128K DeepSeek-R1 671B 37B 128K Model Base Model DeepSeek-R1-Distill-Qwen-1.5B Qwen2.5-Math-1.5B DeepSeek-R1-Distill-Qwen-7B Qwen2.5-Math-7B DeepSeek-R1-Distill-Llama-8B Llama-3.1-8B DeepSeek-R1-Distill-Qwen-14B Qwen2.5-14B DeepSeek-R1-Distill-Qwen-32B Qwen2.5-32B DeepSeek-R1-Distill-Llama-70B Llama-3.3-70B-Instruct https://huggingface.co/deepseek-ai/DeepSeek-R1
  5. © 2025, Amazon Web Services, Inc. or its affiliates. Prompts

    Responses Distilled model Advanced model (teacher) Fine-tuned cost-efficient model (student) Match the performance of advanced models with cost- efficient models for your use case with Model Distillation
  6. © 2025, Amazon Web Services, Inc. or its affiliates. Challenges

    Security Cost Operational Excellence ML App interface
  7. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Securely Use DeepSeek Models on AWS Amazon Bedrock Amazon SageMaker AI API Layer Amazon Bedrock Foundation Models Prompt / text embeddings Fine-tune SageMaker Training and Inference Prompt / text embeddings API Layer SageMaker Endpoint Foundation Models SageMaker Jumpstart Model hub, deploy, fine-tune Accelerated Computing Trn1(n), Inf2, P4d, P5 Fine-tune
  8. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. Amazon Bedrock The easiest way to build and scale generative AI applications with powerful tools and foundation models
  9. © 2025, Amazon Web Services, Inc. or its affiliates. AMAZON

    NOVA JAMBA CLAUDE COMMAND EMBED RERANK LLAMA LUMA RAY 2 Effective reasoning & rapid analysis for long context windows High-quality AI image generation, easily deployable at scale Advanced image & language reasoning Knowledge summarization, expert agents, & code completion High-quality video generation from text & images Software engineering AI for large enterprises STABLE DIFFUSION STABLE IMAGE MISTRAL MIXTRAL MALIBU POINT Frontier multimodal intelligence at low- latency, Agent & RAG Applications, high-quality image & video generation Advanced reasoning & coding capabilities, including computer use skills Multimodal search & advanced retrieval powering multilingual knowledge agents Amazon Bedrock B R O A D C H O I C E O F M O D E L S Coming soon Amazon Bedrock Marketplace enables developers to discover, test, and use over 100 popular, emerging, and specialized foundation models (FMs) alongside the current selection of industry-leading models in Amazon Bedrock. DeepSeek-R1 model is now available in Amazon Bedrock Marketplace. As of 10 March 2025, the fully managed DeepSeek-R1 model is now generally available in Amazon Bedrock.
  10. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    Bedrock APIs for Model Invocation API 명칭 주요 기능 스트리밍 지원 InvokeModel 단일 프롬프트 기반의 응답 생성 X Converse 대화 기반의 응답 생성 X InvokeModelWithResponseStream 스트리밍 방식의 단일 프롬프트 응답 생성 O ConverseStream 스트리밍 방식의 대화 기반 응답 생성 O
  11. © 2025, Amazon Web Services, Inc. or its affiliates. Step

    1: Request access for DeepSeek-R1 model
  12. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on Amazon Bedrock Amazon Bedrock Marketplace Custom Model Import SageMaker JumpStart 2 3 4 Fully Managed Serverless Model 1
  13. © 2025, Amazon Web Services, Inc. or its affiliates. Bedrock

    Marketplace implementation • Bedrock Marketplace enables core DeepSeek-R1 deployment in managed endpoints • Complete code samples and step-by-step deployment guides provided for quick implementation • Standard Bedrock security and monitoring features
  14. © 2025, Amazon Web Services, Inc. or its affiliates. Bedrock

    Marketplace delivers 100+ models from 30+ providers EVOLUTIONARY SCALE WIDN CAMB.AI GRETEL ARCEE AI PREFERRED NETWORKS WRITER UPSTAGE NCSOFT STOCKMARK KARAKURI JOHN SNOW LABS LIQUID DATABRICKS CYBERAGENT HUGGING FACE STABILITY AI LG AI RESEARCH M I S T R A L AI SNOWFLAKE N V I D I A DEEPSEEK
  15. © 2025, Amazon Web Services, Inc. or its affiliates. Prerequisite:

    Increase your ml.p5e.48xlarge limits before deployment
  16. © 2025, Amazon Web Services, Inc. or its affiliates. Step

    1: Find the DeepSeek-R1 model on the catalog
  17. © 2025, Amazon Web Services, Inc. or its affiliates. Step

    1: Find the DeepSeek-R1 model on the catalog (cont’d)
  18. © 2025, Amazon Web Services, Inc. or its affiliates. Step

    2: Set options (ml.p5e.48xl by default) and deploy
  19. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on Amazon Bedrock Amazon Bedrock Marketplace Custom Model Import SageMaker JumpStart 2 3 4 Fully Managed Serverless Model 1
  20. © 2025, Amazon Web Services, Inc. or its affiliates. Custom

    Model Import implementation • Bedrock Custom Model Import enables DeepSeek deployment • Support for Llama 8B and 70B distilled DeepSeek R1 variants • Complete code samples and step-by-step deployment guides provided for quick implementation • Standard Bedrock security and monitoring features • Pricing is on-demand in 5-minute window from first successful invocation • There is a cold-start and scaling up/down time
  21. © 2025, Amazon Web Services, Inc. or its affiliates. Step

    2: Import DeepSeek-R1-Distill model (cont’d)
  22. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on Amazon Bedrock Amazon Bedrock Marketplace Custom Model Import SageMaker JumpStart 2 3 4 Fully Managed Serverless Model 1
  23. © 2025, Amazon Web Services, Inc. or its affiliates. You

    are always in control of your data None of the customer’s data is used to train the underlying model Data remains in the Region where the API is processed Support for GDPR, SOC, ISO, CSA compliance, and HIPAA eligibility
  24. © 2025, Amazon Web Services, Inc. or its affiliates. Critical

    Concerns • Models hosted by AWS without any communication with DeepSeek servers or APIs • No customer data used to improve base models • Enterprise-grade data protection capabilities • Privacy control through AWS services
  25. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    Region network Client account Customer VPC Corporate network Client API endpoint Client Amazon Bedrock service account Amazon Bedrock service Amazon Bedrock Client connectivity
  26. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    Region network Client account Customer VPC Corporate network Internet Client API endpoint Client Amazon Bedrock service account Amazon Bedrock service Internet gateway Amazon Bedrock Client connectivity
  27. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    Region network Client account Customer VPC AWS PrivateLink aka VPC endpoint Corporate network Client AWS Direct Connect API endpoint Client Amazon Bedrock service account Amazon Bedrock service Amazon Bedrock Client connectivity
  28. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    Bedrock Integration Choice Customization Security and governance
  29. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Securely Use DeepSeek Models on AWS Amazon Bedrock Amazon SageMaker AI API Layer Amazon Bedrock Foundation Models Prompt / text embeddings Fine-tune SageMaker Training and Inference Prompt / text embeddings API Layer SageMaker Endpoint Foundation Models SageMaker Jumpstart Model hub, deploy, fine-tune Accelerated Computing Trn1(n), Inf2, P4d, P5 Fine-tune
  30. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. Amazon SageMaker AI Build, train, and deploy ML models at scale, including FMs
  31. © 2025, Amazon Web Services, Inc. or its affiliates. Deploy

    DeepSeek models with Amazon SageMaker AI SageMaker Endpoint SageMaker JumpStart 1 2
  32. © 2025, Amazon Web Services, Inc. or its affiliates. Model

    Deployment on Amazon SageMaker AI Single model deployment Single container Multi-container Invoke Response Inference Pipelines Real-time synchronous response Serverless GPUs CPUs Near real-time asynchronous response Invoke Response Offline batch inference Submit Complete Amazon SageMaker AI Multi-Model deployment Model Container Infrastructure
  33. © 2025, Amazon Web Services, Inc. or its affiliates. A

    strong partnership between AWS and Hugging Face Hugging Face is the most popular Open Source company providing state of the art NLP technology Hugging Face SageMaker offers high performance resources to train and use NLP Models AWS https://huggingface.co/ https://aws.amazon.com/sagemaker/
  34. © 2025, Amazon Web Services, Inc. or its affiliates. Large

    Model Inference (LMI) container Large ML models with 100 billion + parameters Easily parallelize models across multiple GPUs to fit models into the instance and achieve low latency Deploy models on the most performant and cost- effective GPU-based instances or on AWS Inferentia Leverage 500GB of Amazon EBS volume per endpoint
  35. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment Hosting Services Inference Image Training Image Training Data Model artifacts Amazon SageMaker Amazon S3 Amazon ECR
  36. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment Hosting Services Inference Image Training Image Training Data Model artifacts Amazon SageMaker Amazon S3 Amazon ECR Model artifacts
  37. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment Hosting Services Inference Image Training Image Training Data Model artifacts Amazon SageMaker Amazon S3 Amazon ECR Model artifacts Inference Image
  38. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment Hosting Services Inference Image Training Image Training Data Model artifacts Endpoint Amazon SageMaker Amazon S3 Amazon ECR Model artifacts Inference Image
  39. © 2025, Amazon Web Services, Inc. or its affiliates. Deploy

    model to SageMaker Real-time Endpoint
  40. © 2025, Amazon Web Services, Inc. or its affiliates. Deploy

    model to SageMaker Real-time Endpoint model.tar.gz ├ model.py └ serving.properties
  41. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment SageMaker Endpoints (Private API) Auto Scaling group Availability Zone 1 Availability Zone 2 Availability Zone 3 Elastic Load Balancing Model Endpoint Client Deployment / Hosting Amazon SageMaker ML Compute Instances Input Data (Request) Prediction (Response)
  42. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker Deployment SageMaker Endpoints (Public API) Auto Scaling group Availability Zone 1 Availability Zone 2 Availability Zone 3 Elastic Load Balancing Model Endpoint Amazon API Gateway Client Deployment / Hosting Amazon SageMaker ML Compute Instances Input Data (Request) Prediction (Response)
  43. © 2025, Amazon Web Services, Inc. or its affiliates. Deploy

    DeepSeek models with Amazon SageMaker AI SageMaker Endpoint SageMaker JumpStart 1 2
  44. © 2025, Amazon Web Services, Inc. or its affiliates. Machine

    learning (ML) hub with foundation models (FMs), built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks Amazon SageMaker Jumpstart − Publicly available FMs − Built-in ML algorithms − Customizable solutions − Supports collaboration
  45. © 2025, Amazon Web Services, Inc. or its affiliates. 60

    Deploy from SageMaker JumpStart (cont’d)
  46. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on Amazon Bedrock Amazon Bedrock Marketplace Custom Model Import SageMaker JumpStart 2 3 4 Fully Managed Serverless Model 1
  47. © 2025, Amazon Web Services, Inc. or its affiliates. Use

    Bedrock tooling with SageMaker JumpStart Models Client Amazon SageMaker JumpStart Amazon Bedrock access using SageMaker SDK, Boto3 access using Bedrock API
  48. © 2025, Amazon Web Services, Inc. or its affiliates. Use

    Bedrock tooling with SageMaker JumpStart Models
  49. © 2025, Amazon Web Services, Inc. or its affiliates. Use

    Bedrock tooling with SageMaker JumpStart Models
  50. © 2025, Amazon Web Services, Inc. or its affiliates. Use

    Bedrock tooling with SageMaker JumpStart Models
  51. © 2025, Amazon Web Services, Inc. or its affiliates. Benefits

    of using Bedrock tooling with SageMaker JumpStart Models Client Amazon SageMaker JumpStart Amazon Bedrock access using SageMaker SDK, Boto3 access using Bedrock API Amazon Bedrock Guardrails Knowledge Bases for Amazon Bedrock Amazon Bedrock Agents …
  52. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on AWS: A One-Page Guide (2) Custom Model Import 3 (3) Bedrock + SageMaker JumpStart (1) SageMaker JumpStart 1 (2) SageMaker Endpoint 2 Amazon Bedrock Amazon Bedrock Marketplace Amazon S3 Client Amazon S3 Hugging Face Amazon SageMaker Endpoint Client Amazon SageMaker JumpStart Client Amazon SageMaker JumpStart Amazon Bedrock 4 (1) AWS Marketplace 2 (1) Fully Managed Serverless Model 1 User
  53. © 2025, Amazon Web Services, Inc. or its affiliates. Challenges

    Security Cost Operational Excellence ML App interface
  54. • AWS protects model tuner/consumer’s data • AWS protects model

    provider’s IP • Proprietary model package and endpoint is hosted in SageMaker/Bedrock owned escrow account • Containers have no outbound network access Security
  55. © 2025, Amazon Web Services, Inc. or its affiliates. Challenges

    Security Cost Operational Excellence ML App interface
  56. © 2025, Amazon Web Services, Inc. or its affiliates. Save

    costs by deploying on Amazon SageMaker Infrastructure cost Operations cost Infrastructure cost Operations cost Security and compliance cost • Compute instances • Storage • Network Operating, managing, and maintaining infrastructure Security and compliance for ML features, encrypt data and models, access policies, track and trace Deploy on SageMaker Self-managed deployment on Amazon EKS or Amazon ECS
  57. © 2025, Amazon Web Services, Inc. or its affiliates. Challenges

    Security Cost Operational Excellence ML App interface
  58. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    AI Chips AWS for generative AI AWS Inferentia AWS Trainium AWS Trainium2 Lowest cost Best price performance Lowest cost to train up to 70b models Highest performance for frontier models AWS Inferentia2
  59. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    Inferentia2: High performance, less power, lower cost R E A L - T I M E D E P L O Y M E N T B E R T - L A R G E W I T H A W S I N F E R E N T I A 2 50% Fewer instances GPU Instances Inf2.2xl Instances Number of instances 50% Less energy GPU Instances Inf2.2xl Watts Power 65% Lower cost GPU Instances Inf2.2xl USD Inference cost
  60. © 2025, Amazon Web Services, Inc. or its affiliates. AWS

    Trainium2: Highest performance for frontier models L L M T R A I N I N G P E R F O R M A N C E Lower cost-to-train Step Time 12.9 Amazon EC2 P5 Instances F u j i 7 0 B T R A I N I N G Amazon EC2 Trn2 Instances JAX/AXLearn framework, 64 node cluster 8.7
  61. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. Amazon SageMaker HyperPod Scale and accelerate generative AI model development across thousands of AI accelerators
  62. © 2025, Amazon Web Services, Inc. or its affiliates. Model

    Builder Model Consumer Model Tuner Use FMs out-of-the-box Finetune FMs for specific domain/workload Build FMs or retrain open source FMs from scratch Low entry cost and complexity, faster TTM Strong control and flexibility Customer Pathways with Foundation Models
  63. © 2025, Amazon Web Services, Inc. or its affiliates. Low

    entry cost and complexity, faster TTM Strong control and flexibility Re-train new FM models using DeepSeek R1 Finetune DeepSeek R1 and R1 distilled Deploy DeepSeek R1 and R1 distilled Use FMs out-of-the-box Finetune FMs for specific domain/workload Build FMs or retrain open source FMs from scratch Customer Pathways with SageMaker AI for DeepSeek
  64. © 2025, Amazon Web Services, Inc. or its affiliates. Build

    models will require Scale this Single instance to this . . . . . . . . . . . . . .
  65. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker AI for Large Scale FM Training SageMaker HyperPod SageMaker Training Jobs Customer is proficient with SLURM & EKS AND Customer wants to persistent cluster & ability to customize and manage orchestration Customer wants a managed user experience with ephemeral clusters & focus on ML (to accelerate time to market) OR Customer needs access to flexible on- demand GPU cluster v You can fine-tune DS R1 and R1 distilled models using your choice of libraries
  66. © 2025, Amazon Web Services, Inc. or its affiliates. SageMaker

    HyperPod example architecture Amazon S3 Head Node VPC Amazon FSx for Lustre Training data Training data Copy Once mount EFA(*) Slurm or EKS(k8s) Compute Nodes * EFA: Elastic Fabric Adapter
  67. © 2025, Amazon Web Services, Inc. or its affiliates. Benefits

    of SageMaker HyperPod YEAR 1957 2012 2014 2016 2018 2019 2020 2021 … … … Model size (# of parameters) VGG16 138M YOLO, GNMT 210M BERT-L 340M GPT-2 1.5B GPT-3 175B 2023 Perceptron 1 Alexnet 62M SWITCH-C 1.6T Ease of use & flexibility Resilience Performance • 훈련 시간 단축 및 대규모 분산 훈련 • 클러스터의 컴퓨팅, 메모리 및 네트워크 리소스 활용 최적화 • 자동 클러스터 상태 확인 및 복구 • Checkpoint 저장 및 자동 복구 기능 • 대규모 훈련 클러스터의 분산 훈련 간소화 • Slurm 또는 Amazon EKS를 통한 유연한 워크로드 관리
  68. © 2025, Amazon Web Services, Inc. or its affiliates. Amazon

    SageMaker HyperPod Recipes R U N F M P R E - T R A I N I N G A N D F I N E - T U N I N G W I T H A S I N G L E L I N E O F C O D E Open Source implementation Launcher scripts and recipes collection Built on NVIDIA NeMo foundations (launcher, configuration hierarchy) Over 30 recipes to get started SageMaker-optimized models (GPU) Neuron-optimized models (Trainium) Native NeMo models Custom models
  69. © 2025, Amazon Web Services, Inc. or its affiliates. Training

    plans Today (3/5) Segment 1 10 instances 7 days Segment 2 10 instances 7 days “Create a training plan with 10 instances of ml.p5.48xlarge for 14 days starting 3/10” 3/10 3/16 3/20 3/26
  70. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. Summary
  71. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on AWS: A One-Page Guide (2) Custom Model Import 3 (3) Bedrock + SageMaker JumpStart (1) SageMaker JumpStart 1 (2) SageMaker Endpoint 2 Amazon Bedrock Amazon Bedrock Marketplace Amazon S3 Client Amazon S3 Hugging Face Amazon SageMaker Endpoint Client Amazon SageMaker JumpStart Client Amazon SageMaker JumpStart Amazon Bedrock 4 (1) AWS Marketplace 2 (1) Fully Managed Serverless Model 1 User
  72. © 2025, Amazon Web Services, Inc. or its affiliates. How

    to Use DeepSeek Models on AWS Model Consumer Model Tuner or Builder Pre-trained Model Custom Model Amazon SageMaker JumpStart Amazon Bedrock Amazon SageMaker Training Job Amazon SageMaker Endpoint Amazon SageMaker HyperPod
  73. © 2025, Amazon Web Services, Inc. or its affiliates. Key

    Benefits of Leveraging DeepSeek Models on AWS Security ML App interface Operational Excellence through Separation of Concerns Cost saving opportunity in production
  74. © 2025, Amazon Web Services, Inc. or its affiliates. (AWS

    Blog) DeepSeek R1 on AWS Hosting DeepSeek models on SageMaker Call to Action SageMaker HyperPod Workshop
  75. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. 여러분의 소중한 피드백을 기다립니다. 강연 종료 후, 강연 평가에 참여해주세요!
  76. © 2025, Amazon Web Services, Inc. or its affiliates. ©

    2025, Amazon Web Services, Inc. or its affiliates. 감사합니다