Deploying Models to Production with Azure ML | Scottish Summit

Deploying models is a crucial aspect of building end-to-end ML applications. I will first show how models can be registered with Azure ML so that they are accessible and can be loaded for deployment. I will then show how to build configurations for deploying models with Azure ML. Azure ML makes it easy to deploy models for the low-latency, real-time inference that many applications require, so I will focus primarily on this and also show how to consume the deployed models. I will further show how to build batch inference pipelines. If time permits, I will also show demos of all of this.

Rishit Dagli

May 11, 2021

Transcript

  1. #ScottishSummit2021 · Rishit Dagli · Deploying Models to Production with Azure ML · Methven, Sat 17:30 · @rishit_dagli · Rishit-dagli · www.rishit.tech
  2. Source: Laurence Moroney
  3. $whoami • High School Student • TEDx and TED-Ed Speaker • ♡ Hackathons and competitions • ♡ Research • My coordinates: www.rishit.tech
  4. Ideal Audience • Devs who have worked on creating Machine Learning models • Devs looking for ways to put their models into production
  5. What things to take care of? • Package the model • Post the model on a server • Maintain the server
  6. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale
  7. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale o Global Availability
  8. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale o Global Availability o Latency
  9. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API
  10. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API • Model Versioning
  11. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API • Model Versioning • Batch Predictions
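Azure ML covers much of this checklist for you. As a rough sketch (assuming the v1 azureml-core Python SDK; the replica counts and sizes are illustrative placeholders), auto-scaling becomes a deployment-configuration flag rather than servers you resize yourself:

```python
# Hedged sketch with the v1 azureml-core SDK; numbers are placeholders.
from azureml.core.webservice import AksWebservice

# Auto-scale: AKS adds and removes replicas with load
aks_config = AksWebservice.deploy_configuration(
    autoscale_enabled=True,
    autoscale_min_replicas=1,   # keep one replica warm for latency
    autoscale_max_replicas=4,   # cap cost
    cpu_cores=1,
    memory_gb=2,
)
# Model versioning comes from the registry: re-registering a model under
# the same name bumps its version, and a deployment pins a specific version.
```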
  12. Simple Deployments: why are they inefficient? • No consistent API • No model versioning • No mini-batching • Inefficient for large models (Source: Hannes Hapke)
  13. What do we need? • Register Your Model • Load the Model • Perform Inference • Deploy the Model
  14. What do we need? • Register Your Model • Load the Model • Perform Inference • Do it at Scale
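A minimal sketch of that register → load → infer → deploy loop with the v1 azureml-core Python SDK (the model, service, file, and script names are placeholders, and score.py is a conventional entry script you write yourself):

```python
from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()  # reads workspace details from config.json

# Register: upload the serialized model into the workspace's model registry
model = Model.register(ws, model_path="outputs/model.pkl", model_name="my-model")

# Load: resolve a registered model back to a local path when you need it
local_path = Model.get_model_path("my-model", _workspace=ws)

# Deploy: score.py defines init() (load the model) and run(data) (inference)
env = Environment.from_conda_specification("inference-env", "env.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)
service = Model.deploy(
    ws,
    name="my-service",
    models=[model],
    inference_config=inference_config,
    deployment_config=AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1),
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)  # REST endpoint to POST JSON payloads to
```

Swapping the ACI configuration for the AKS one sketched earlier is what takes the same flow from dev/test to an auto-scaled production cluster.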
  15. Set up an environment: Customizable • Can use a Docker image directly • Can manage the dependencies yourself too • Can specify a custom interpreter • Customizable Spark settings
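A sketch of those customization points with azureml.core.Environment (v1 SDK; the image name and packages are examples only):

```python
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

env = Environment("custom-env")

# Use a Docker image directly
env.docker.base_image = "mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04"

# Or manage the dependencies yourself
env.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["scikit-learn==1.0.2", "azureml-defaults"]
)

# Or point at a custom interpreter and take over dependency management
# env.python.user_managed_dependencies = True
# env.python.interpreter_path = "/usr/bin/python3.8"

# Spark settings are likewise exposed for customization via env.spark
```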
  16. Inference with gRPC • Better connections • Data converted to protocol buffers • Requests have a designated type • Payload converted to base64 • Use gRPC stubs
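The deck doesn't pin down a specific serving stack here, so as an assumption this sketch uses TensorFlow Serving's gRPC API, one common way to get the stub-based flow described above (host, model, and tensor names are placeholders; requires the tensorflow-serving-api package):

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# One channel, reused across calls: the "better connections" part
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Requests have a designated protobuf type: PredictRequest
request = predict_pb2.PredictRequest()
request.model_spec.name = "my-model"
request.model_spec.signature_name = "serving_default"

# The payload is serialized into the protocol buffer as a TensorProto
request.inputs["input"].CopyFrom(
    tf.make_tensor_proto([[1.0, 2.0, 3.0]], dtype=tf.float32)
)

response = stub.Predict(request, timeout=10.0)
print(response.outputs["output"])
```

(The base64 step the slide mentions applies when binary payloads travel over JSON/REST; over gRPC the bytes ride inside the protobuf directly.)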
  17. Batch Inferences • Use hardware efficiently • Save costs and compute resources • Take multiple requests and process them together • Super cool 😎 for large models
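A sketch of a batch inference pipeline with Azure ML's ParallelRunStep (azureml-pipeline-steps, v1 SDK), which fans mini-batches out across a cluster; the dataset, compute target, environment file, and batch_score.py script are placeholders:

```python
from azureml.core import Environment, Workspace
from azureml.data import OutputFileDatasetConfig
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep

ws = Workspace.from_config()
env = Environment.from_conda_specification("batch-env", "env.yml")

config = ParallelRunConfig(
    source_directory=".",
    entry_script="batch_score.py",   # defines init() and run(mini_batch)
    mini_batch_size="10",            # requests processed together per call
    error_threshold=5,               # tolerated failures before aborting
    output_action="append_row",      # collect predictions into one output
    environment=env,
    compute_target=ws.compute_targets["cpu-cluster"],
    node_count=2,
)

step = ParallelRunStep(
    name="batch-inference",
    parallel_run_config=config,
    inputs=[ws.datasets["scoring-data"].as_named_input("scoring_data")],
    output=OutputFileDatasetConfig(name="predictions"),
)

Pipeline(workspace=ws, steps=[step]).submit("batch-inference-exp")
```

Because each node scores a whole mini-batch per run() call, the cluster stays saturated and a large model is loaded once per node instead of once per request.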