Adarsh Shah
November 19, 2020

Continuous Delivery for Machine Learning Systems - DevOpsDays Warsaw

A machine learning workflow includes data management, experiment management (model development and training), model deployment, serving, and retraining. Training a model takes hours, sometimes days, and typically involves a large dataset. Training and serving a model also require specialized resources such as high-density cores and GPUs.

In this talk, we will use anecdotes to look at what Continuous Delivery for Machine Learning looks like and how to use cloud-native technologies to perform the various steps in a machine learning workflow. We will also discuss how it differs from deploying other software, which aspects to consider, and which tools are available to enable Continuous Delivery for machine learning.

Transcript

  1. Continuous Delivery for Machine Learning Systems: Deploying ML Systems to Production safely and quickly in a sustainable way. Adarsh Shah, Engineering Leader, Coach, Hands-on Architect, Independent Consultant. @shahadarsh, https://shahadarsh.com. Deck: http://bit.ly/ml-dod-pl
  2. Hidden Technical Debt in ML Systems. From the paper Hidden Technical Debt in Machine Learning Systems
  3. Traditional Software Development: Data and Program produce Results. Machine Learning: Training takes Training Data and Desired Results and produces a Model (the program); Prediction applies the Model to Live Data to produce Results.
  4. ML workflow diagram. Phases: Data Management, Experimentation, Production Deployment. Steps: Data Acquisition, Data Preparation, Model Development, Training, Prediction, Accuracy Evaluation, Validation, Monitoring / Alerting. Transitions: Accuracy not reached, Retrain, Data Drift, Fix, Accuracy reached.
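
The decision points in the workflow on slide 4 can be sketched in a few lines of Python. This is an illustrative sketch, not code from the talk: the function names, the `NextStep` enum, and the numeric thresholds are all assumptions, and it assumes that detected data drift is handled by retraining.

```python
from enum import Enum


class NextStep(Enum):
    """Hypothetical outcomes of the two decision points in the workflow."""
    DEPLOY = "deploy"
    RETRAIN = "retrain"
    KEEP_SERVING = "keep serving"


def after_evaluation(accuracy: float, target: float) -> NextStep:
    """Accuracy gate: deploy when the target is reached, otherwise retrain."""
    return NextStep.DEPLOY if accuracy >= target else NextStep.RETRAIN


def after_monitoring(drift_score: float, drift_limit: float) -> NextStep:
    """Monitoring loop: drift above the limit sends the model back to training."""
    return NextStep.RETRAIN if drift_score > drift_limit else NextStep.KEEP_SERVING


if __name__ == "__main__":
    # Illustrative numbers only.
    print(after_evaluation(accuracy=0.93, target=0.90).value)
    print(after_monitoring(drift_score=0.40, drift_limit=0.20).value)
```

Keeping both gates as small pure functions makes them easy to unit test and to call from whichever pipeline tool triggers training.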
  5. #2: Experimentation. Code Quality; Research & Experimentation; Tracking experiments; Training Time & Troubleshooting; Infrastructure Requirements; Model Accuracy Evaluation
  6. What is Continuous Delivery? Continuous Delivery is the ability to get changes of all types, including new features, configuration changes, bug fixes and experiments, into production, or into the hands of users, safely and quickly in a sustainable way. - Jez Humble & Dave Farley (Continuous Delivery Book Authors)
  7. Continuous Integration. Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. - Martin Fowler
  8. Principles of Continuous Delivery ๏ Build quality in ๏ Work in small batches ๏ Computers perform repetitive tasks, people solve problems ๏ Relentlessly pursue continuous improvement (Kaizen) ๏ Everyone is responsible
  9. Data pipeline. Data Source A, Data Source B, and Data Source C each pass through their own Data Acquisition, Data Validation, and Data Preparation steps (A, B, C). Other labels: Training Dataset, Versioned, Training Process, Testing, Bias & Fairness, Security & Compliance
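
The validation and versioning steps of the data pipeline on slide 9 might look roughly like the sketch below. It is an assumption-heavy illustration rather than the speaker's implementation: the `SCHEMA` columns, the row format (a list of dicts), and the choice of a truncated SHA-256 content hash as the dataset version are all hypothetical.

```python
import hashlib
import json

# Hypothetical schema: column name -> expected Python type.
SCHEMA = {"age": int, "income": float, "label": int}


def validate(rows, schema=SCHEMA):
    """Return a list of (row_index, problem) pairs; an empty list means the batch passed."""
    problems = []
    for i, row in enumerate(rows):
        for column, expected in schema.items():
            if column not in row:
                problems.append((i, f"missing column {column!r}"))
            elif not isinstance(row[column], expected):
                problems.append((i, f"{column!r} is not {expected.__name__}"))
    return problems


def dataset_version(rows):
    """Content hash of the prepared rows, usable as a training dataset version."""
    canonical = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]


if __name__ == "__main__":
    rows = [{"age": 30, "income": 52000.0, "label": 1}]
    if not validate(rows):
        print("dataset version:", dataset_version(rows))
```

Deriving the version from the content means the same prepared data always maps to the same version, so a trained model can be traced back to the exact dataset it saw.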
  10. Continuous Integration (Training Code). Training Code; Static Analysis; Unit Tests; Linting etc.; Build Artifact; Artifact Repository; Dev Environment; Validation Tests; Merge to Main Branch
  11. Training. Data Pipeline; Continuous Integration (Training Code); Configuration; Training Dataset; Training Environment; Accuracy Evaluation; Monitoring / Alerting; Testing (Bias & Fairness); Model; Trigger; Log Aggregation; Automated Provisioning/De-provisioning; Data Scientist
  12. Continuous Integration (Application Code). Application Code; Static Analysis; Unit Tests; Linting, Security Scan etc.; Build Artifact; Artifact Repository; Ephemeral Environment; Integration Tests; Tag as Tested; Model; Training
  13. Bringing it all together. Phases: Data Management, Experimentation, Production Deployment. Components: Data Pipeline, Training Dataset, Continuous Integration (Training Code), Configuration, Training, Model, Continuous Integration (Application Code), Deployment, Production Environment, Smoke Tests, Monitoring / Alerting. Roles: Data Scientist, Application Developer
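
The smoke-test step that follows deployment on slide 13 could be sketched as below. The slide only names the step, so everything here is an assumption: the `predict` callable standing in for the deployed model endpoint, the sample inputs, the allowed label set, and the "promote" or "rollback" outcome.

```python
def smoke_test(predict, samples, allowed_labels):
    """Return True if every sample yields a prediction from the allowed label set."""
    for sample in samples:
        try:
            prediction = predict(sample)
        except Exception:
            # Any failure to answer counts as a failed smoke test.
            return False
        if prediction not in allowed_labels:
            return False
    return True


def release_decision(predict, samples, allowed_labels):
    """Hypothetical gate: promote the release only when the smoke test passes."""
    return "promote" if smoke_test(predict, samples, allowed_labels) else "rollback"


if __name__ == "__main__":
    # Stand-in for a deployed model: classifies a number by its sign.
    def toy_model(x):
        return "positive" if x >= 0 else "negative"

    print(release_decision(toy_model, [1.5, -2.0, 0.0], {"positive", "negative"}))
```

A smoke test like this checks only that the deployed model responds with plausible output; accuracy is still judged by the evaluation and monitoring steps.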
  14. References • continuousdelivery.com • Dr. Deming’s 14 Points for Management • Challenges Deploying Machine Learning Models to Production • State of DevOps Report • martinfowler.com • Large image datasets: A pyrrhic win for computer vision?
  15. Adarsh Shah, Engineering Leader, Coach, Hands-on Architect, Independent Consultant. @shahadarsh, https://shahadarsh.com