Anaconda Project and JupyterLab

Data science encapsulation and deployment, JupyterCon 2017

Christine Doig

August 25, 2017

Transcript

  1. Data Science encapsulation and deployment with Anaconda Project and JupyterLab
    Christine Doig, Senior Product Manager and Data Scientist, Continuum Analytics
    © 2017 Continuum Analytics - Confidential & Proprietary
  2. Agenda
    • Challenges in data science reproducibility and deployment
    • Encapsulating your data science with Anaconda Project
    • Using Anaconda Project with JupyterLab
    • Anaconda Project & JupyterLab powering Anaconda Enterprise v5
  3. Data Science Development: Reproducibility
    [Diagram] Laptop running Python and R
    Libraries: scikit-learn, Bokeh, TensorFlow, Jupyter, pandas, matplotlib, seaborn, dask, numba
    Files: script 1, script 2, script 3, notebook A, dataset Z
  4. Workflows: Deployment
    [Diagram] Workflow stages: Data; Query; Visualize; Clean & Tidy; Predict, Simulate, & Optimize
    Deployed outputs: interactive data visualizations and dashboards, Jupyter notebooks, scripts, predictive models, processed data
  5. Challenges in Data Science reproducibility and deployment
    • Data scientists work on different platforms: Windows, macOS, Linux
    • Data science development environments differ from deployment environments
    • Data science dependencies are more than just software packages: data, variables, commands, services
    • Managing software packages: versions, build, channel
    • Data scientists are not necessarily software developers. Current deployment tools are very focused on serving developers
    • There is a variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
  6. Anaconda Distribution & conda make data science reproducibility and development easier
    [Diagram] Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Anaconda Distribution; conda env 1, 2, 3; Analysis 1, 2, 3
    Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): Anaconda Distribution; Docker container; conda env 1, 2, 3; Analysis 1, 2, 3
  7. With Anaconda and conda, Data Scientists can:
    • Manage software packages across platforms: Windows, macOS, Linux
    • Isolate and recreate environments
  8. Introducing Anaconda Project, now available in Anaconda Distribution
  9. Anaconda Project: Data science portable encapsulation
    anaconda-project.yml
    • Define and manage:
      • deployment commands
      • downloads and data
      • project package dependencies
      • multiple environments
      • environment variables (with encryption)
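The items on this slide map onto sections of an anaconda-project.yml file. A minimal sketch follows, written as a Python script that holds the file text and can write it to disk. The project name, file names, URL, and package choices are hypothetical placeholders and do not come from the talk. The section names (packages, commands, downloads, variables, env_specs) follow the Anaconda Project file format as commonly documented and should be checked against the official reference before use.

```python
"""Sketch: a minimal anaconda-project.yml (all names and values are hypothetical)."""
from pathlib import Path

PROJECT_YML = """\
name: iris-demo                # hypothetical project name

packages:                      # project package dependencies
  - python=3.6
  - pandas
  - bokeh

commands:                      # deployment commands
  notebook:
    notebook: analysis.ipynb   # hypothetical notebook file
  dashboard:
    bokeh_app: dashboard       # hypothetical Bokeh app directory

downloads:                     # downloads and data
  IRIS_CSV:
    url: https://example.com/iris.csv   # placeholder URL
    filename: data/iris.csv

variables:                     # environment variables the project requires
  - DB_PASSWORD

env_specs:                     # multiple environments
  default:
    packages:
      - bokeh
  test:
    packages:
      - pytest
"""


def top_level_keys(text):
    """Return the unindented `key:` names found in the YAML text, in order."""
    keys = []
    for line in text.splitlines():
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0])
    return keys


def write_project_file(directory="."):
    """Write the sketch to <directory>/anaconda-project.yml and return its path."""
    path = Path(directory) / "anaconda-project.yml"
    path.write_text(PROJECT_YML)
    return path
```

Calling `write_project_file()` in a project directory would leave a file that the `anaconda-project` command-line tool is meant to read. Whether this exact content prepares and runs cleanly depends on the installed version and on the placeholder files actually existing.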
  10. Anaconda Project: Data science portable encapsulation
    • Lock your environments:
      • package versions, down to the build numbers
      • platforms
      • packages by platform
    Note: This file is automatically generated for you by Anaconda Project
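The lock file itself is not reproduced in the transcript. As an illustration of what pinning "down to the build numbers" and "packages by platform" could look like, the sketch below models locked packages as conda-style exact specs (`name=version=build`), grouped into pins shared by all platforms and pins for a single platform. The grouping keys, package names, versions, and build strings are illustrative assumptions, not the contents of any real lock file.

```python
from typing import Dict, List, NamedTuple


class Pin(NamedTuple):
    """One locked package: name, exact version, and build string."""
    name: str
    version: str
    build: str


def parse_pin(spec: str) -> Pin:
    """Split a conda-style exact spec `name=version=build` into its parts."""
    name, version, build = spec.split("=")
    return Pin(name, version, build)


# Hypothetical locked packages: pins shared by every platform under "all",
# plus pins that apply to one platform only. Values are illustrative.
LOCKED: Dict[str, List[str]] = {
    "all": ["six=1.10.0=py36_0"],
    "linux-64": ["readline=6.2=2"],
    "win-64": ["vs2015_runtime=14.0.25420=0"],
}


def pins_for(platform: str) -> List[Pin]:
    """Return the shared pins followed by the pins specific to `platform`."""
    specs = LOCKED["all"] + LOCKED.get(platform, [])
    return [parse_pin(spec) for spec in specs]
```

For example, `pins_for("linux-64")` combines the shared pin with the Linux-only pin, while a platform with no entry of its own receives only the shared pins.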
  11. Anaconda Project brings additional capabilities for data science reproducibility and development
    [Diagram] Data Science Development (Laptop / Desktop; Windows, macOS, Linux): Project 1, Project 2, Project 3
    Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Windows, macOS, Linux): Project 1, Project 2, Project 3
  12. With Anaconda Project, Data Scientists can:
    • Export environments (with pinned versions) cross-platform
    • Manage other dependencies: data, variables, commands, services
    • Get the right abstraction for data scientists (e.g. Docker)
  13. Anaconda Enterprise makes project collaboration and deployment secure and scalable
    [Diagram] Data Science Development (Laptop / Desktop): Project 1, Project 2, Project 3
    Data Science Reproducibility, Development and Deployment (Anaconda Enterprise): Project 1, Project 2, Project 3; Container 1, Container 2, Container 3, Container 4
  14. [Diagram] Project 1, Project 2. Deploy: Notebooks, Models (REST APIs), Dashboards, Applications
  15. With Anaconda Enterprise, Data Scientists can:
    • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
    • Collaborate with other data scientists
    • Securely share deployments with other applications and users
  16. JupyterLab is the default experience in Anaconda Enterprise
  17. Anaconda Project Lab extension
    • Manage your Anaconda Project dependencies from a graphical interface inside JupyterLab
    Nicholas Bollweg, GitHub: bollwyvl
  18. Anaconda helps Data Scientists reproduce and deploy their projects
    • Anaconda Distribution and conda:
      • Manage software packages across platforms: Windows, macOS, Linux
      • Isolate and recreate environments
    • Anaconda Project:
      • Export environments (with pinned versions) cross-platform
      • Manage other dependencies: data, variables, commands, services
      • Get the right abstraction for data scientists (e.g. Docker)
    • Anaconda Enterprise:
      • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
      • Collaborate with other data scientists
      • Securely share deployments with other applications and users