
GPU Acceleration in the PyData community

Jacob Tomlinson

November 12, 2024

Transcript

  1. GPU Acceleration in the PyData community

    Jacob Tomlinson, Dask Maintainer and RAPIDS Developer
    Pangeo CNES 2024
  2. Modern Applications Need Accelerated Computing

    Petabyte scale data | Massive models | Real-time performance
    LLMs, Forecasting, Fraud Detection, Genomic Analysis, Cybersecurity, Recommenders
    [Chart: single-threaded perf (1.5X per year, 1.1X per year) versus accelerated computing; log-scale axis from 10^1 to 10^7]
  3. Accelerated Computing Swim Lanes

    RAPIDS makes accelerated computing more seamless while enabling specialization for maximum performance
  4. RAPIDS Adopted Across Industries

    • RAPIDS | Dask | XGBoost: 100x faster feature engineering, 20x faster model training, increased forecast accuracy
    • cuGraph: processing relationships between 10 million biological entities through more than a billion edges
    • RAPIDS Accelerator for Apache Spark: 70% cost savings, 33% performance improvement
  5. Bringing NVIDIA accelerated computing to Polars

    Polars GPU Engine Powered by RAPIDS cuDF
    https://developer.nvidia.com/blog/polars-gpu-engine-powered-by-rapids-cudf-now-available-in-open-beta/
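As a rough sketch of what using the GPU engine might look like: Polars lazy queries accept an `engine` argument on `collect`, and the snippet below assumes the GPU engine is provided by an importable `cudf_polars` module. The helper names `gpu_engine_available`, `example_query`, and `collect_query` are hypothetical, as are the column names; none of them come from the slide.

```python
import importlib.util


def gpu_engine_available():
    # Assumption: the GPU engine is installed as the `cudf_polars` module.
    return importlib.util.find_spec("cudf_polars") is not None


def example_query():
    """Build a small lazy group-by query (illustrative data only)."""
    import polars as pl

    return (
        pl.LazyFrame({"key": ["a", "a", "b"], "value": [1, 2, 3]})
        .group_by("key")
        .agg(pl.col("value").sum())
        .sort("key")
    )


def collect_query(lazy_frame, use_gpu=False):
    """Execute a LazyFrame on the GPU engine if requested, else on the CPU."""
    if use_gpu:
        return lazy_frame.collect(engine="gpu")
    return lazy_frame.collect()
```

The query definition is identical for both engines; only the `collect` call changes. A caller could write `collect_query(example_query(), use_gpu=gpu_engine_available())` to opt in only when the GPU engine is present.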
  6. Accelerated Dask

    Just set “cudf” and “cupy” as the backend and use Dask-CUDA Workers
    • Configurable Backend and GPU-Aware Workers
    • Memory Spilling (GPU->CPU->Disk)
    • Optimized Memory Management
    • Accelerated RDMA and Networking (UCX)
    • Community tools like xarray-cupy
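The backend switch described on this slide could be sketched roughly as follows. This assumes Dask's `dataframe.backend` and `array.backend` configuration keys and the `LocalCUDACluster` class from the `dask-cuda` package; the helper names `configure_gpu_backends` and `start_gpu_cluster` are hypothetical and used only for illustration.

```python
import importlib.util

# Backend names taken from the slide, keyed by Dask's backend config options.
GPU_BACKENDS = {"dataframe.backend": "cudf", "array.backend": "cupy"}


def configure_gpu_backends():
    """Point Dask collections at cuDF and CuPy. Returns False if Dask is missing."""
    if importlib.util.find_spec("dask") is None:
        return False
    import dask

    dask.config.set(GPU_BACKENDS)
    return True


def start_gpu_cluster():
    """Start a local cluster of Dask-CUDA workers (needs dask-cuda and a GPU)."""
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    cluster = LocalCUDACluster()  # by default, one worker per visible GPU
    return Client(cluster)
```

Setting the configuration does not import cuDF or CuPy by itself; they are only needed once a collection is actually created, so existing Dask code can stay unchanged apart from these two calls.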
  7. Accelerated Apache Spark

    Zero code change acceleration for Spark DataFrames and SQL
    CPU Spark:
        spark.sql(""" select order count(*) as order_count from orders""" )
    GPU Spark:
        spark.conf.set("spark.plugins", "com.nvidia.spark.SQLPlugin")
        spark.sql(""" select order count(*) as order_count from orders""" )
    Average Speed-Ups: >5x
    • Operates as a software plugin to the popular Apache Spark platform
    • Automatically accelerates supported operations (with CPU fallback if needed)
    • Requires no code changes
    • Works with Spark standalone, YARN clusters, Kubernetes clusters
    • Deploy on:
    Apache Spark 3.4.1, RAPIDS Spark release 24.04
    See GTC session S62257 for details
    NVIDIA Decision Support Benchmark 3TB (Public Cloud)
    Amazon EMR
    Google Cloud Dataproc
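A minimal sketch of enabling the plugin from PySpark, assuming the settings are supplied when the session is created and that the RAPIDS Accelerator jar is already on the cluster's classpath. The plugin class name comes from the slide; `spark.rapids.sql.enabled` is the accelerator's on/off switch as I understand it. The helper name `build_session` and the application name are hypothetical.

```python
# Settings assumed for illustration; only the plugin class appears on the slide.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
}


def build_session(app_name="gpu-spark-sketch", conf=None):
    """Create a SparkSession with the given settings applied at startup.

    Requires pyspark; the plugin only loads if the accelerator jar is available.
    """
    from pyspark.sql import SparkSession

    settings = RAPIDS_CONF if conf is None else conf
    builder = SparkSession.builder.appName(app_name)
    for key, value in settings.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session built this way, the same `spark.sql(...)` calls run unchanged, and operations the plugin does not support fall back to the CPU as the slide describes.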
  8. cuML

    Accelerated machine learning with a scikit-learn API
    CPU (scikit-learn):
        >>> from sklearn.ensemble import RandomForestClassifier
        >>> clf = RandomForestClassifier()
        >>> clf.fit(x, y)
    GPU (cuML):
        >>> from cuml.ensemble import RandomForestClassifier
        >>> clf = RandomForestClassifier()
        >>> clf.fit(x, y)
    50+ GPU-Accelerated Algorithms: Time Series, Preprocessing, Classification, Tree Models, Cross Validation, Clustering, Explainability, Dimensionality Reduction, Regression
    A100 GPU vs. AMD EPYC 7642 (96 logical cores)
    cuML 23.04, scikit-learn 1.2.2, umap-learn 0.5.3
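Because the two imports on this slide differ only in the package name, one way to write code that runs on either library is to pick the module at runtime. The sketch below is an illustration under that assumption, not something shown in the talk; `pick_ensemble_module` and `make_classifier` are hypothetical helper names.

```python
import importlib
import importlib.util


def _installed(package):
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(package) is not None


def pick_ensemble_module(is_installed=_installed):
    """Prefer cuML when it is installed, otherwise fall back to scikit-learn."""
    candidates = (("cuml", "cuml.ensemble"), ("sklearn", "sklearn.ensemble"))
    for package, module in candidates:
        if is_installed(package):
            return module
    return None


def make_classifier(**params):
    """Build a RandomForestClassifier from whichever library is available."""
    module_name = pick_ensemble_module()
    if module_name is None:
        raise ImportError("neither cuml nor scikit-learn is installed")
    module = importlib.import_module(module_name)
    return module.RandomForestClassifier(**params)
```

The returned object is then used exactly as on the slide, for example `clf = make_classifier()` followed by `clf.fit(x, y)`.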
  9. Accelerated NetworkX

    nx-cugraph: the zero-code change GPU backend for NetworkX
    • Zero-code-change GPU acceleration for NetworkX code
    • Accelerates algorithms up to 600x, based on algorithm and graph size
    • Support for 60 popular graph algorithms and growing
    • Falls back to using CPU NetworkX for unsupported algorithms
    NetworkX 3.2, CPU: Intel(R) Xeon(R) Platinum 8480CL 2TB, GPU: NVIDIA H100 80GB
    pip install nx-cugraph-cu12 --extra-index-url https://pypi.nvidia.com
    conda install -c rapidsai -c conda-forge -c nvidia nx-cugraph
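Besides the zero-code-change mode described on this slide, NetworkX also lets a caller request a backend explicitly per function call through the `backend` keyword. The following is a minimal sketch of that explicit route, assuming nx-cugraph is importable as `nx_cugraph` and registers itself under the backend name `cugraph`; the helper names `gpu_backend_installed` and `betweenness` are hypothetical.

```python
import importlib.util


def gpu_backend_installed():
    # Assumption: the nx-cugraph package is importable as `nx_cugraph`.
    return importlib.util.find_spec("nx_cugraph") is not None


def betweenness(graph, use_gpu=False):
    """Compute betweenness centrality, optionally dispatching to the GPU backend."""
    import networkx as nx

    if use_gpu:
        # Explicit dispatch to the backend for this call only.
        return nx.betweenness_centrality(graph, backend="cugraph")
    return nx.betweenness_centrality(graph)
```

A caller could pass `use_gpu=gpu_backend_installed()` so the same script runs on machines with and without the GPU backend.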