cudf.pandas: the Zero Code Change GPU Accelerator for pandas

Jacob Tomlinson, Senior Software Engineer | PyData Exeter Meetup Feb 2023

NVIDIA RAPIDS PyData Team
OSS PyData maintainers hired by NVIDIA to make GPU acceleration ubiquitous
and many more…

Outline
• Pandas and alternatives
• Demo
• How it works
• FAQs and limitations
• Conclusion

Pandas and alternatives
• Pandas is great (but slow)
• Why is it slow?
  o Largely single-threaded
  o Not a query engine!
• Many alternatives:
  o Faster underlying implementation (C++, Rust, CUDA)
  o Query engines
  o SQL-inspired
  o Distributed computing
  o Hardware accelerated (GPUs)
Results of the H2O.ai benchmark maintained by DuckDB: https://duckdblabs.github.io/db-benchmark/

cuDF: GPU DataFrames
• Pandas-like API, runs on the GPU
• Powered by CUDA and libcudf, a C++ DataFrame library for GPUs
• Operations are ~10-100x faster than pandas
• Provides 60-75% of the pandas API
• Not what this talk is about!
Figure: cuDF speedups relative to Pandas for a number of different operations (NVIDIA A100, AMD EPYC 7642 48-Core Processor)

"Should I switch from pandas to something else?"
• Reasons to use something other than pandas:
  o Performance above all
  o Data size
  o Rewriting code ≠ problem
• Reasons to use pandas:
  o API flexibility
  o Collaboration
  o Ecosystem built on pandas
  o pandas is getting faster

What is cudf.pandas?
• Lets you keep using pandas
  o Accelerates it on the GPU with no changes
• 100% of the pandas API
  o Uses the GPU for supported operations
  o Falls back to using the CPU otherwise
• 3rd-party code acceleration
  o Everything is accelerated. No one changes their code
Jupyter/IPython:
  %load_ext cudf.pandas
Command line:
  python -m cudf.pandas script.py
Direct import:
  import cudf.pandas
  cudf.pandas.install()
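
The direct-import route can be wrapped so that one script runs on machines with or without a GPU. The following is a minimal sketch under stated assumptions: `enable_cudf_pandas` is a hypothetical helper name, not part of cuDF, and the only cuDF calls used are the `import cudf.pandas` and `cudf.pandas.install()` shown on the slide. As far as the RAPIDS documentation describes it, the install call should run before pandas is first imported.

```python
import importlib.util


def enable_cudf_pandas() -> bool:
    """Try to switch on cudf.pandas; return True if it was installed.

    Hypothetical helper. Call it before the first `import pandas`.
    """
    if importlib.util.find_spec("cudf") is None:
        return False  # cuDF is not installed on this machine
    try:
        import cudf.pandas

        cudf.pandas.install()
    except Exception:
        return False  # for example, no usable GPU or driver
    return True


accelerated = enable_cudf_pandas()
print("GPU acceleration enabled:", accelerated)
```

After this call, `import pandas as pd` and the rest of the script stay unchanged, which is the "zero code change" point of the slide.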

Demo Time https://github.com/shwina/pydata-global-2023-demo

Demo Recap
• Accelerates your code on GPUs with no changes
• Key to getting good performance is to minimize CPU execution
• 3rd-party libraries written to use pandas can be accelerated on GPUs

Under the hood
• How does it work?
  o Proxy objects that dispatch to cudf or pandas
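
The proxy idea can be illustrated with a toy that runs without a GPU. This is a sketch of the general technique only, not cudf.pandas's actual implementation: `GpuSeries`, `CpuSeries`, and `ProxySeries` are hypothetical stand-ins for cudf, pandas, and the proxy type, and the "fast" backend is ordinary Python that merely implements fewer methods.

```python
class CpuSeries:
    """Stand-in for the CPU library: slower, but implements everything."""

    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(self.values)

    def median(self):
        ordered = sorted(self.values)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2


class GpuSeries:
    """Stand-in for the GPU library: fast, but only part of the API."""

    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(self.values)


class ProxySeries:
    """Tries the fast backend first and falls back to the slow one."""

    def __init__(self, values):
        self._fast = GpuSeries(values)
        self.calls = []  # which backend served each call, for inspection

    def __getattr__(self, name):
        # Reached only for names not defined on the proxy itself.
        if name.startswith("_"):
            raise AttributeError(name)

        def method(*args, **kwargs):
            fast_method = getattr(self._fast, name, None)
            if fast_method is not None:
                self.calls.append((name, "fast"))
                return fast_method(*args, **kwargs)
            # Unsupported on the fast path: copy the data over and retry.
            slow = CpuSeries(self._fast.values)
            self.calls.append((name, "slow"))
            return getattr(slow, name)(*args, **kwargs)

        return method


series = ProxySeries([3, 1, 2])
print(series.total(), series.median(), series.calls)
```

The `calls` log plays the role the profiler plays in the talk: it shows which operations stayed on the fast path and which fell back.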

• How does it work? (continued)
  o Deep import customization to hijack pandas imports
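
Import customization of this kind can be sketched with Python's standard import hooks. The example below is an assumption-laden illustration, not the code cudf.pandas ships: `RedirectFinder` and the module name `slowlib` are hypothetical, and the replacement module is just a dictionary of functions. It shows how a finder placed at the front of `sys.meta_path` decides what `import slowlib` returns.

```python
import importlib.util
import sys


class RedirectFinder:
    """Serve `import <target>` with a replacement namespace."""

    def __init__(self, target, namespace):
        self.target = target
        self.namespace = namespace

    def find_spec(self, fullname, path, target=None):
        if fullname == self.target:
            # Claim this import and name ourselves as its loader.
            return importlib.util.spec_from_loader(fullname, self)
        return None  # everything else uses the normal import machinery

    def create_module(self, spec):
        return None  # let Python create an empty module object

    def exec_module(self, module):
        # Fill the fresh module with the replacement implementation.
        module.__dict__.update(self.namespace)


def fast_total(values):
    return ("fast", sum(values))


# Hooks at the front of sys.meta_path are consulted first.
sys.meta_path.insert(0, RedirectFinder("slowlib", {"total": fast_total}))

import slowlib  # nothing on disk; the finder supplies the module

print(slowlib.total([1, 2, 3]))
```

Because the hook sits in the import system rather than in user code, third-party libraries that import the same module name receive the replacement as well.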

• What about...
  o Duck typing?
    ▪ Doesn't work for free functions like pd.read_csv
    ▪ Lots of code doing hard isinstance checks
  o DataFrame Standard API?
    ▪ Solves a different problem (developer-focused API)
    ▪ Exciting possibilities!
• Fallback to a faster DataFrame library like Polars?
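
The two duck-typing objections can be made concrete with a small toy. All names here (`RealFrame`, `LookalikeFrame`, `read_rows`, `third_party_summary`) are hypothetical stand-ins, with `RealFrame` playing the role of `pd.DataFrame` and `read_rows` the role of a free function such as `pd.read_csv`.

```python
class RealFrame:
    """Stand-in for the original DataFrame type."""

    def __init__(self, rows):
        self.rows = rows

    def head(self, n=5):
        return RealFrame(self.rows[:n])


class LookalikeFrame:
    """Duck-typed alternative: same methods, unrelated type."""

    def __init__(self, rows):
        self.rows = rows

    def head(self, n=5):
        return LookalikeFrame(self.rows[:n])


def read_rows(text):
    """Free function: it always builds a RealFrame, never the lookalike."""
    return RealFrame([line.split(",") for line in text.splitlines()])


def third_party_summary(frame):
    """Library code with a hard type check."""
    if not isinstance(frame, RealFrame):
        raise TypeError("expected a RealFrame")
    return len(frame.head(2).rows)


lookalike = LookalikeFrame([["a", "1"], ["b", "2"], ["c", "3"]])
try:
    third_party_summary(lookalike)
    accepted = True
except TypeError:
    accepted = False  # the isinstance check rejects the duck-typed object

loaded = read_rows("a,1\nb,2")
print(accepted, type(loaded).__name__)
```

The lookalike has every method the library needs and is still rejected, and data loaded through the free function never reaches the alternative type at all. Both problems sit outside the object itself, which is why the talk pairs proxy objects with import-level replacement.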

FAQs
• Will my code run up to 100x faster with no code changes?
  o Yes, with idiomatic pandas usage
  o The profiler helps you identify where it's falling back to the CPU
    ▪ As a bonus, you'll likely improve performance on CPUs

• How much of the pandas API does this support?
  o 100%, with the following caveats
    ▪ Some operations fall back to using the CPU via pandas
    ▪ There may be small differences from pandas
  o We test against the pandas unit test suite (94% tests passing)

• Will cudf.pandas work with <insert third party library>?
  o Yes, if the library uses pandas in a standard way
  o Some known limitations:
    ▪ isinstance() checks for NumPy arrays
    ▪ Use of the C-API to talk to NumPy or pandas
    ▪ Subclassing pd.DataFrame (this kinda works)

• What about working with data larger than GPU memory?
  o Right now, this will fall back to using the CPU

Get started with cudf.pandas
• Code for today's talk:
  o https://github.com/shwina/pydata-global-2023-demo
• Try it on Google Colab:
  o https://nvda.ws/rapids-cudf
• Report issues or feedback on our GitHub repo!
  o https://github.com/rapidsai/cudf

Thank you! Social links at https://jacobtomlinson.dev
