Data Science encapsulation and deployment with Anaconda Project and JupyterLab
Christine Doig, Senior Product Manager and Data Scientist, Continuum Analytics
Continuum Analytics - Confidential & Proprietary
Agenda
• Challenges in data science reproducibility and deployment
• Encapsulating your data science with Anaconda Project
• Using Anaconda Project with JupyterLab
• Anaconda Project & JupyterLab powering Anaconda Enterprise v5
Data Science Development
[Figure: tools (scikit-learn, Bokeh, TensorFlow, Jupyter, pandas, matplotlib, seaborn, dask, numba) and project artifacts (script 1, script 2, script 3, notebook A, dataset Z), written in Python and R]
Reproducibility
Deployment
[Figure: data science workflow steps (Data, Query, Visualize, Clean & Tidy, Predict, Simulate, & Optimize) and deployable outputs (interactive data visualizations and dashboards, Jupyter notebooks, scripts, predictive models, processed data)]
Challenges in Data Science reproducibility and deployment
• Data Scientists work on different platforms: Windows, macOS, Linux
• Data science development environments differ from deployment environments
• Data science dependencies are more than just software packages: data, variables, commands, services
• Managing software packages: versions, build, channel
• Data scientists are not necessarily software developers. Current deployment tools are very focused on serving developers
• There is a variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
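To make the "versions, build, channel" point concrete, here is a minimal sketch of how a single conda-style package string carries all three pieces. It assumes the simplified form `channel::name=version=build`; the `PackageSpec` type and `parse_spec` helper are hypothetical names invented for this illustration and are not part of conda or Anaconda Project.

```python
from typing import NamedTuple, Optional


class PackageSpec(NamedTuple):
    """The pieces that identify one exact package build (illustrative type)."""
    channel: Optional[str]
    name: str
    version: Optional[str]
    build: Optional[str]


def parse_spec(spec: str) -> PackageSpec:
    """Split a simplified 'channel::name=version=build' string.

    Assumption: only single '=' separators are handled; richer
    conda match expressions (>=, wildcards, ==) are out of scope here.
    """
    channel = None
    if "::" in spec:
        channel, spec = spec.split("::", 1)
    parts = spec.split("=")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return PackageSpec(channel, name, version, build)


if __name__ == "__main__":
    # Illustrative spec strings, not taken from the talk.
    print(parse_spec("conda-forge::numpy=1.13.1=py36_0"))
    print(parse_spec("pandas"))
```

A bare name such as `pandas` leaves version, build, and channel open, which is where two machines can end up with different environments.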
Anaconda Project
Data science portable encapsulation
• Lock your environments:
  • package versions, down to the build numbers
  • platforms
  • packages by platform
Note: This file is automatically generated for you by Anaconda Project
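The locking idea above can be sketched in a few lines. The dictionary below loosely mirrors the shape of a generated lock file (locked platforms, plus packages grouped under `all`, `unix`, `win`, or a specific platform); the exact keys and every package string are assumptions made for illustration, and `locked_packages` is a hypothetical helper, not an Anaconda Project API.

```python
# Illustrative lock data; keys and values are assumptions, not copied from a real lock file.
LOCK = {
    "locking_enabled": True,
    "env_specs": {
        "default": {
            "locked": True,
            "platforms": ["linux-64", "osx-64", "win-64"],
            "packages": {
                "all": ["python=3.6.2=0"],                # shared by every platform
                "unix": ["readline=6.2=2"],               # Linux and macOS only
                "win": ["vs2015_runtime=14.0.25420=0"],   # Windows only
            },
        }
    },
}


def locked_packages(lock: dict, env_name: str, platform: str) -> list:
    """Return the pinned name=version=build strings for one platform."""
    env = lock["env_specs"][env_name]
    if platform not in env["platforms"]:
        raise ValueError(f"{platform} is not a locked platform for {env_name}")
    family = "win" if platform.startswith("win") else "unix"
    selected = []
    # Combine the shared pins, the platform-family pins, and any exact-platform pins.
    for key in ("all", family, platform):
        selected.extend(env["packages"].get(key, []))
    return selected


if __name__ == "__main__":
    print(locked_packages(LOCK, "default", "linux-64"))
    print(locked_packages(LOCK, "default", "win-64"))
```

Grouping pins by platform is what lets one lock file describe the same project on Windows, macOS, and Linux even when the underlying package sets differ.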
[Figure: Data Science Development (Desktop; Project 1, Project 2, Project 3; Windows, macOS, Linux) alongside Data Science Reproducibility & Deployment (Laptop / Desktop / Server; Project 1, Project 2, Project 3; Windows, macOS, Linux)]
Anaconda Project brings additional capabilities for data science reproducibility and development
With Anaconda Projects, Data Scientists can:
• Export environments (with pinned versions) cross-platform
• Manage other dependencies: data, variables, commands, services
• Get the right abstraction for data scientists (e.g. Docker)
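As a rough sketch of what "other dependencies" means in practice, the dictionary below models a project description with packages, variables, data downloads, services, and commands, and a small check reports which requirements are not yet satisfied. The section names follow the general shape of an Anaconda Project file as an assumption; the project contents, the example.com URL, and the `missing_requirements` helper are all hypothetical.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical project description; names, URL, and values are illustrative only.
PROJECT = {
    "name": "example-project",
    "packages": ["python=3.6", "pandas", "bokeh"],
    "variables": ["DB_PASSWORD"],                                # must be supplied by the user
    "downloads": {"IRIS_CSV": "https://example.com/iris.csv"},  # data exposed via an env var
    "services": {"REDIS_URL": "redis"},                          # service address exposed via an env var
    "commands": {
        "dashboard": {"bokeh_app": "dashboard"},
        "train": {"unix": "python train.py"},
    },
}


def missing_requirements(project: dict,
                         environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the environment variable names the project needs but does not have yet."""
    env = os.environ if environ is None else environ
    needed = list(project.get("variables", []))
    needed += list(project.get("downloads", {}))
    needed += list(project.get("services", {}))
    return [name for name in needed if name not in env]


if __name__ == "__main__":
    # With only the password set, the data file and the service are still missing.
    print(missing_requirements(PROJECT, {"DB_PASSWORD": "secret"}))
```

Treating data, variables, and services as named requirements alongside packages is what allows a command such as `dashboard` to be run the same way on another machine.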
With Anaconda Enterprise, Data Scientists can:
• Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
• Collaborate with other data scientists
• Securely share deployments with other applications and users
Anaconda helps Data Scientists reproduce and deploy their projects
• Anaconda Distribution and conda:
  • Manage software packages across platforms: Windows, macOS, Linux
  • Isolate and recreate environments
• Anaconda Project:
  • Export environments (with pinned versions) cross-platform
  • Manage other dependencies: data, variables, commands, services
  • Get the right abstraction for data scientists (e.g. Docker)
• Anaconda Enterprise:
  • Easily deploy projects as a wide variety of “deployment” types: notebooks, REST APIs, dashboards, web apps…
  • Collaborate with other data scientists
  • Securely share deployments with other applications and users