

MLtraq: Track your ML/AI experiments at hyperspeed

With every second spent waiting on slow initializations, with obscure delays hindering high-frequency logging, and with limits on what you can track, an experiment dies. Wouldn't it be nice to load and start tracking in nearly zero time? What if we could track more, and faster, handling even arbitrarily large and complex Python objects with ease?

In this talk, I will present the results of comparative benchmarks covering Weights & Biases, MLflow, FastTrackML, Neptune, Aim, Comet, and MLtraq. You will learn their strengths and weaknesses, what makes each of them slow or fast, and what sets MLtraq apart, making it 100x faster and capable of handling tens of thousands of experiments.

This presentation will not only be enlightening for those involved in AI/ML experimentation but will also be invaluable for anyone interested in the efficient and safe serialization of Python objects.

Michele Dallachiesa

July 15, 2024

Transcript

  1. Why tracking? To explore and understand the impact on performance of varying
     algorithms, parameters, and datasets. The experimentation process: hypothesis,
     design model, train model, evaluate model.
  2. Tracking: code, notebooks, scripts, environment setup, parameters, configurations,
     inputs, metrics, model weights, system stats, outputs, LLM prompts, cost, metadata,
     predictions, git commit, version, author, images, generated text, audio, video,
     debug data.
  3. What is an experiment? (Diagram: an Experiment containing Run 1, Run 2, ..., Run N.)
     • Experiment: a collection of runs
     • Run: an instantiation of the experiment with varying inputs
       results1 = train_eval(inputs1)
       results2 = train_eval(inputs2)
       ...
  4. Example of experiment (see the sketch below):
     • Classifier: DummyClassifier, LogisticRegression, DecisionTreeClassifier,
       RandomForestClassifier
     • Dataset: Iris, Digits, Wine
     • Random seed: 1, 2, ..., 10
     4 x 3 x 10 configurations ⇒ 120 runs
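     A minimal sketch of how this grid could be expressed with the MLtraq calls shown
     on slide 8 plus scikit-learn; it assumes that experiment.run() opens a fresh run
     on each call, which the deck does not confirm:

     import itertools
     from sklearn.datasets import load_iris, load_digits, load_wine
     from sklearn.dummy import DummyClassifier
     from sklearn.linear_model import LogisticRegression
     from sklearn.tree import DecisionTreeClassifier
     from sklearn.ensemble import RandomForestClassifier
     from sklearn.model_selection import cross_val_score
     from mltraq import create_session

     session = create_session("sqlite:///local.db")
     experiment = session.create_experiment("classifiers")

     classifiers = [DummyClassifier, LogisticRegression,
                    DecisionTreeClassifier, RandomForestClassifier]
     loaders = [load_iris, load_digits, load_wine]
     seeds = range(1, 11)

     # 4 classifiers x 3 datasets x 10 seeds = 120 runs
     for cls, loader, seed in itertools.product(classifiers, loaders, seeds):
         X, y = loader(return_X_y=True)
         with experiment.run() as run:  # assumption: each call opens a new run
             run.fields.classifier = cls.__name__
             run.fields.dataset = loader.__name__
             run.fields.seed = seed
             run.fields.accuracy = cross_val_score(
                 cls(random_state=seed), X, y, cv=5).mean()

     experiment.persist()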
  5. Solutions for experiment tracking: MLflow 51%, W&B 45%, Comet 3%, Aim 1%,
     Neptune 1%, others < 1% (percentage of PyPI monthly downloads as a proxy for
     market share). 😕 Slowness and type limitations.
  6. Beyond float and bytes (see the sketch below):
     • Containers: dict, list, set, tuple
     • Scalars: int, str, time, bool
     • Arrays: NumPy, data frames
     • ...
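     A sketch of what tracking such types might look like with the API from slide 8;
     it assumes run.fields accepts arbitrary serializable Python objects, as the deck
     claims, and that create_session() without arguments uses a default local database:

     import numpy as np
     import pandas as pd
     from mltraq import create_session

     session = create_session()  # assumption: defaults to a local SQLite database
     experiment = session.create_experiment("rich_types")

     with experiment.run() as run:
         # Containers and scalars, not just floats
         run.fields.params = {"lr": 0.1, "layers": [64, 32], "stages": ("fit", "eval")}
         # NumPy arrays and data frames as first-class values
         run.fields.confusion = np.array([[50, 2], [3, 45]])
         run.fields.history = pd.DataFrame({"epoch": [1, 2], "loss": [0.9, 0.4]})

     experiment.persist()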
  7. MLtraq is flexible: execute and persist locally or remotely. (Architecture diagram:
     tracking and execution on compute nodes against a private DB, with copies to a
     team DB and a public DB for reporting.)
  8. from mltraq import create_session

     session = create_session("sqlite:///local.db")
     experiment = session.create_experiment("test")

     with experiment.run() as run:
         run.fields.accuracy = 0.9

     experiment.persist()

     session.db.query("SELECT * FROM experiment_test")

     ╭────────────────┬──────────────┬──────────╮
     │ id_experiment  │ id_run       │ accuracy │
     ├────────────────┼──────────────┼──────────┤
     │ 4d4c4f7a...    │ 457f89c38... │ 0.9      │
     ╰────────────────┴──────────────┴──────────╯
  9. Let's experiment! Three benchmarks: start-up time, high-frequency tracking, and
     large-object tracking (see the profiling sketch below).
     • Full analysis at https://mltraq.com/benchmarks/speed
     • Tracking speed of floats (scalars and NumPy arrays)
     • Statistical profiling with Pyinstrument
     • Results averaged over 10 repeated runs
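     A minimal sketch of how the statistical profiling could be wired up with
     Pyinstrument; track_floats is a hypothetical stand-in for a benchmark body:

     from pyinstrument import Profiler

     def track_floats(n=100_000):
         # Hypothetical benchmark body: accumulate n float values.
         values = []
         for i in range(n):
             values.append(float(i))
         return values

     profiler = Profiler()  # sampling profiler: low overhead, call-stack statistics
     profiler.start()
     track_floats()
     profiler.stop()
     print(profiler.output_text(unicode=True))  # where did the time go?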
 10. Start-up time: how much time to track 1-10 float values? (Benchmark chart: up to
     1.6s.) What takes most of the time?
     • W&B: threading, IPC
     • MLflow: Alembic migration
     • Aim: threading, RocksDB
     • Comet: threading
     • MLtraq: SQLite operations
     • Neptune: direct writes to FS
     A timing sketch follows below.
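     A simple way to measure this kind of start-up cost, assuming mltraq is installed;
     the same pattern applies to any of the other trackers:

     import time

     t0 = time.perf_counter()
     import mltraq  # cold import is part of the start-up cost

     session = mltraq.create_session()  # assumption: default local database
     experiment = session.create_experiment("startup")
     with experiment.run() as run:
         run.fields.accuracy = 0.9
     experiment.persist()

     print(f"start-up + 1 tracked float: {time.perf_counter() - t0:.3f}s")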
 11. Start-up time, exponentially worse with more runs: how much time to track 1 float
     on each of 100 runs? Up to 208s.
 12. High-frequency tracking: how much time to track 100-100K float values? (Benchmark
     chart: up to 5.8s.) See the sketch below.
     • MLflow uses an entity-attribute-value model: one row per value, keyed by Run ID
       and attribute name (e.g., "accuracy" = 0.85)
     • DB INSERT at every .log_metric(...) call
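     The access pattern the slide describes, written out with MLflow's public API; at
     high frequency, each call can translate into a round trip to the backing store:

     import mlflow

     with mlflow.start_run():
         for step in range(100_000):
             # One entity-attribute-value record per call: (run, "accuracy", value, step)
             mlflow.log_metric("accuracy", 0.85, step=step)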
 13. Tracking large objects: how much time to track 1 million float64 values?
     (Benchmark chart: up to 2.4s.) Storage formats:
     • MLtraq: safe Pickle, NumPy
     • W&B: JSON
     • Neptune: JSON, binary blob
     • MLflow: binary blob
     • Aim: binary blob
     • Comet: binary blob
     Binary blob ⇒ weak semantics. A comparison sketch follows below.
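     A small sketch of why text formats such as JSON lose to binary serialization on
     1 million float64 values, in both size and encoding time:

     import json
     import pickle
     import time

     import numpy as np

     values = np.random.default_rng(0).random(1_000_000)  # 1M float64 values

     t0 = time.perf_counter()
     as_json = json.dumps(values.tolist())  # text: drops dtype and shape, inflates size
     t_json = time.perf_counter() - t0

     t0 = time.perf_counter()
     as_pickle = pickle.dumps(values, protocol=5)  # binary: keeps dtype and shape
     t_pickle = time.perf_counter() - t0

     print(f"JSON:   {len(as_json) / 1e6:.1f} MB in {t_json:.2f}s")
     print(f"pickle: {len(as_pickle) / 1e6:.1f} MB in {t_pickle:.3f}s")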
 14. Safe-pickling and safe-unpickling. (Diagram: Python objects ⇄ Pickle binary
     format.) If dangerous opcodes are encountered, an exception is raised. A sketch
     of the idea follows below.
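     A minimal sketch of restricted unpickling using only the standard library; it
     illustrates the idea, not MLtraq's actual implementation. Opcodes that resolve
     globals (the usual path to arbitrary code execution) raise instead of executing:

     import io
     import pickle

     class SafeUnpickler(pickle.Unpickler):
         # Allow-list of globals that may be resolved; everything else raises.
         ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}

         def find_class(self, module, name):
             if (module, name) in self.ALLOWED:
                 return super().find_class(module, name)
             raise pickle.UnpicklingError(f"forbidden global: {module}.{name}")

     def safe_loads(data: bytes):
         return SafeUnpickler(io.BytesIO(data)).load()

     # Plain containers and scalars round-trip fine...
     print(safe_loads(pickle.dumps({"accuracy": 0.9, "tags": ["a", "b"]})))
     # ...while a payload referencing, e.g., os.system raises UnpicklingError.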
 15. Tracking large objects: how much time to track 1 billion int8 values? (A timing
     sketch follows below.)
     • Write speed of np.zeros(size, dtype=np.int8)
     • MLtraq-fs: direct write to filesystem
     • MLtraq-db-mem: in-memory SQLite DB
     • MLtraq-db-fs: SQLite DB stored on filesystem
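     A rough sketch of the filesystem-vs-SQLite comparison, scaled down to 100 million
     int8 values so it runs comfortably on a laptop:

     import sqlite3
     import time

     import numpy as np

     data = np.zeros(100_000_000, dtype=np.int8)  # scaled down from 1 billion

     # Direct write to the filesystem (the "MLtraq-fs" variant)
     t0 = time.perf_counter()
     data.tofile("blob.bin")
     t_fs = time.perf_counter() - t0

     # BLOB insert into an on-disk SQLite database (the "MLtraq-db-fs" variant)
     t0 = time.perf_counter()
     con = sqlite3.connect("blob.db")
     con.execute("CREATE TABLE IF NOT EXISTS t (b BLOB)")
     con.execute("INSERT INTO t VALUES (?)", (data.tobytes(),))
     con.commit()
     con.close()
     t_db = time.perf_counter() - t0

     print(f"filesystem: {t_fs:.2f}s  sqlite-on-fs: {t_db:.2f}s")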
 16. Conclusion:
     • No one size fits all: threading or IPC, web API or DB, storage design, batching
       and streaming
     • Use native SQL and Python types and PyArrow: uuencoding and JSON-like formats
       are slow, with poor semantics
     • Impact: contributed to making the new W&B SDK 36-88% faster!