

Reducing the Cost of your Data Science Workloads on the Cloud

By leveraging cloud computing resources, you can pay for just the computing power you need, when you need it. Additionally, GPU acceleration can significantly decrease the amount of time you need computing resources, reducing your overall cost.

We'll discuss RAPIDS, an open-source collection of Python libraries that offer exceptional speed in data science tasks. RAPIDS provides familiar APIs from popular PyData libraries, making it simple to run your data science workloads in the cloud. By using RAPIDS, you can scale your workload and get things done faster and more efficiently.

Jacob Tomlinson

March 19, 2024


Transcript

  1. Reducing the Cost of your Data Science Workloads on the Cloud. Jacob Tomlinson, Senior Software Engineer, RAPIDS. GTC 2024.
  2. Types of Cost. Many types of cost can be reduced with accelerated computing:
     • Infrastructure cost: the hardware and infrastructure required to perform business operations. Measured in dollars.
     • Human cost: the time and effort people need to put in to achieve a goal. Measured in hours and dollars.
     • Computational cost: how much compute power is required to perform a specific operation. Measured in watts.
     • Environmental cost: the environmental costs associated with all of the above. Measured in grams of CO2.
  3. Example: Accelerating pandas with cudf.pandas
     • pandas is the most popular PyData dataframe library
     • pandas is great (but slow)
     • Why is it slow?
       o Largely single-threaded
       o Not a query engine
     • Many alternatives:
       o Faster underlying implementations (C++, Rust, CUDA)
       o Query engines
       o SQL-inspired interfaces
       o Distributed computing
       o Hardware acceleration (GPUs)
     Results of the H2O.ai benchmark maintained by DuckDB: https://duckdblabs.github.io/db-benchmark/
  4. What is cudf.pandas?
     • Lets you keep using pandas
       o Accelerates it on the GPU with no changes
     • 100% of the pandas API
       o Uses the GPU for supported operations
       o Falls back to the CPU otherwise
     • Third-party code acceleration
       o Everything is accelerated; no one changes their code
     Jupyter/IPython: %load_ext cudf.pandas
     Command line: python -m cudf.pandas script.py
     Direct import: import cudf.pandas; cudf.pandas.install()
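All three entry points wrap unchanged pandas code. A minimal sketch of the zero-code-change workflow: the script below is plain pandas, and running it with `python -m cudf.pandas` accelerates it on a machine with cudf installed and an NVIDIA GPU, while it still runs as ordinary CPU pandas everywhere else. The data here is made up for illustration.

```python
# script.py: plain pandas code, no GPU-specific changes.
# Run as `python script.py` for regular CPU pandas, or as
# `python -m cudf.pandas script.py` to accelerate on an NVIDIA GPU
# (requires the cudf package; unsupported operations fall back to CPU).
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "b", "a", "b", "a"],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# A typical groupby aggregation: one of the operation classes that
# cudf.pandas dispatches to the GPU when supported.
means = df.groupby("group")["value"].mean()
print(means["a"])  # 3.0, i.e. (1 + 3 + 5) / 3
```

The same pattern applies to notebooks: `%load_ext cudf.pandas` before the first `import pandas` leaves the rest of the notebook untouched.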
  5. 150x Faster pandas with Zero Code Change
     DuckDB Data Benchmark, 5 GB. Performance comparison between traditional pandas v1.5 on an Intel Xeon Platinum 8480CL CPU and pandas v1.5 with RAPIDS cuDF on NVIDIA Grace Hopper.
     Source: https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/
  6. RAPIDS Deployment Models. Scales from sharing GPUs to leveraging many GPUs at once:
     • Single Node: scale up interactive data science sessions with NVIDIA accelerated tools like cudf.pandas
     • Multi Node: scale out processing and training by leveraging GPU acceleration in distributed frameworks like Dask and Spark
     • Shared Node: scale out AI/ML APIs and model serving with NVIDIA Triton Inference Server and the Forest Inference Library
  7. RAPIDS in the Cloud: Current Focus Areas
     • NVIDIA DGX™ Cloud
     • Kubernetes: Helm Charts, Operator, Kubeflow
     • Cloud AI/ML Platforms: Amazon SageMaker Studio, Google Vertex AI
     • Cloud Compute: Amazon EC2, ECS, Fargate, EKS; Google Compute Engine, Dataproc, GKE
     • AI and Machine Learning examples gallery
     RAPIDS Deployment documentation website: docs.rapids.ai/deployment/stable
  8. RAPIDS on Managed Notebook Platforms: serverless Jupyter in the cloud. Example screenshot from the Vertex AI documentation: https://docs.rapids.ai/deployment/stable/cloud/gcp/vertex-ai/
  9. RAPIDS on Compute Pipelines: data processing services. Example from the AWS EMR documentation: https://docs.nvidia.com/spark-rapids/user-guide/latest/getting-started/aws-emr.html
  10. RAPIDS on Virtual Machines: servers and workstations in the cloud. Example from the Azure Virtual Machine documentation: https://docs.rapids.ai/deployment/stable/cloud/azure/azure-vm/
  11. RAPIDS on Kubernetes: Unified Cloud Deployments. [Diagram: a Kubernetes cluster with the GPU Operator managing many GPUs.]
  12. RAPIDS runs your workloads faster. How do you want to spend those gains?
     • Reduce cost: run servers for less time. Beneficial for reducing cloud costs.
     • Do more work: run more workloads for the same time/cost. Process things that were not possible before.
     • Performance boost: get work done faster. May give a competitive advantage or reduce pressure on SLAs.
     • Environmental impact: reduce the power needed to perform the same calculation. Using less power produces less CO2.
     • Reduce context switching: shorter waits for calculations help people avoid switching to a different task.
     • Improve accuracy: acceleration can allow more iterations or more data, leading to improved model accuracy.
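A back-of-the-envelope sketch of the first two options. Every number below (instance prices, speedup, runtime) is hypothetical; real cloud rates and speedups vary widely.

```python
# Hypothetical trade-off: a 10x speedup on an instance that costs
# 3x more per hour. All figures are illustrative, not quoted prices.
cpu_rate = 1.00     # $/hour for a CPU instance (assumed)
gpu_rate = 3.00     # $/hour for a GPU instance (assumed)
speedup = 10.0      # workload runs 10x faster on the GPU (assumed)

cpu_hours = 8.0                      # original runtime on the CPU
gpu_hours = cpu_hours / speedup      # 0.8 hours on the GPU

cpu_cost = cpu_rate * cpu_hours      # $8.00
gpu_cost = gpu_rate * gpu_hours      # about $2.40

# "Reduce cost": same work for less money.
savings = cpu_cost - gpu_cost        # about $5.60 per run
# "Do more work": workloads you can run for the original budget.
extra_runs = cpu_cost / gpu_cost     # roughly 3.3x the work, same spend
print(round(savings, 2), round(extra_runs, 1))
```

The point of the sketch: even when the accelerated instance is several times more expensive per hour, a larger speedup factor means you pay for far fewer hours, and you choose whether to pocket the savings or reinvest them in more work.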
  13. Lightning-Fast End-to-End Performance: Reducing Data Science Processes from Hours to Seconds
     • 16 A100s provide more power than 100 CPU nodes
     • 20x more cost-effective than a similar CPU configuration
     • 70x faster performance than a similar CPU configuration
     *CPU approximate to n1-highmem-8 (8 vCPUs, 52 GB memory) on Google Cloud Platform. TCO calculations based on cloud instance costs.
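The 70x speedup and 20x cost-effectiveness figures are mutually consistent if the GPU configuration's price per unit time is about 3.5x that of the CPU configuration. That price ratio is an assumption for illustration, not a figure from the slide:

```python
# Back-of-the-envelope consistency check with an assumed price ratio:
# if a job finishes 70x faster but the hardware costs 3.5x more per
# unit time, the total cost of the job drops by 70 / 3.5 = 20x.
speedup = 70.0       # from the benchmark slide
price_ratio = 3.5    # GPU-config price / CPU-config price (assumed)

cost_effectiveness = speedup / price_ratio
print(cost_effectiveness)  # 20.0
```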
  14. Accelerated Analytics Cuts Costs and Carbon
     “A 2023 benchmark showed that the RAPIDS Accelerator can reduce a company’s carbon footprint by as much as 80% while delivering 5x average speedups and 4x reductions in computing costs.”
     RAPIDS Accelerator for Apache Spark: https://blogs.nvidia.com/blog/spark-rapids-energy-efficiency/
  15. Sharing Resources with Multi-Tenancy: Smoothing Out Demand Peaks with Shared Capacity
     Using Kubernetes, we created an autoscaling cluster for interactive Jupyter sessions. Users only consume GPUs while they are running computations, and the cluster keeps some reserved GPU capacity so that user computations start quickly. An overhead of 30% meant that 60% of user computations started within 2 seconds, and 90% within 60 seconds. This can be tuned to suit your needs: more overhead capacity results in reduced wait times. Whatever your preference, your cost is always correlated with your compute demand.
     https://docs.rapids.ai/deployment/stable/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook/
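A sketch of the cost behaviour described above, with hypothetical numbers: the cluster bills for current demand plus the reserved-overhead fraction, so cost tracks demand instead of a statically sized peak.

```python
# Illustrative model of autoscaling with reserved overhead capacity.
# The 30% overhead matches the example above; the demand figures are
# made up.
def gpu_hours_billed(demand_gpu_hours: float, overhead: float = 0.30) -> float:
    """GPU-hours paid for: actual demand plus reserved overhead capacity."""
    return demand_gpu_hours * (1.0 + overhead)

quiet_day = gpu_hours_billed(10.0)    # about 13 GPU-hours
busy_day = gpu_hours_billed(100.0)    # about 130 GPU-hours

# A static cluster sized for the busy day would bill ~130 GPU-hours on
# the quiet day too; with autoscaling, cost stays proportional to demand.
print(round(quiet_day, 1), round(busy_day, 1))
```

Raising the overhead fraction lowers wait times at the cost of more idle reserved capacity, which is the tuning knob the example describes.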
  16. Recap: Reducing the Cost of your Data Science Workloads on the Cloud
     • Accelerated RAPIDS libraries can give up to a 150x speedup with zero code changes
     • Using NVIDIA accelerated hardware on the cloud can reduce costs
     • Different businesses prefer to reduce capital, environmental, and human costs differently
     • RAPIDS + GPU cloud computing allows you to tune the benefits to suit your goals