How to guarantee your machine learning model will fail on first contact with the real world.

Recently I had my PhD thesis rejected. As a failure, I am uniquely positioned to recognize failure. While I am unwaveringly enthusiastic about machine learning, I aim to share my insights into failed machine learning modelling, drawn from real-world examples in science and industry. This talk is for you if you have an introductory understanding of machine learning and would like to avoid common pitfalls.

Jesper Dramsch

November 15, 2020
Transcript

  1. How to guarantee
    your machine learning model
    will fail on first contact
    with the real world
    Jesper Dramsch
    PyData Global 2020

  2. 99.8 %
    Accuracy

  3. Chihuahua or Blueberry Muffin?
    Real-World Applications are rarely 100%

  4. Augmentations build better models
    Yet decrease metrics

  5. Ignore the possibility of Overfitting
    and Data Leakage

  6. Are you overfitting?
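A quick way to answer this is to compare training and held-out scores. A minimal pure-Python sketch (hypothetical random data) of the most extreme overfit: a model that memorizes its training set.

```python
import random

random.seed(0)
# Hypothetical data: random inputs with random binary labels.
train = [(random.random(), random.randint(0, 1)) for _ in range(50)]
test = [(random.random(), random.randint(0, 1)) for _ in range(50)]

# The "model" memorizes every training example in a lookup table.
lookup = {x: y for x, y in train}

train_acc = sum(lookup[x] == y for x, y in train) / len(train)
# Unseen inputs miss the table, so we fall back to a constant guess of 0.
test_acc = sum(lookup.get(x, 0) == y for x, y in test) / len(test)

print(train_acc, test_acc)  # perfect on train, near chance on held-out data
```

The gap between the two numbers, not the training score alone, is what signals overfitting.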

  7. Is there Data Leakage? [1]

  8. Assume everything is IID

  9. I I D
    Independent
    and
    Identically Distributed

  10. The real world is rarely independent or identically distributed.

  11. Did you account for class imbalances? [1]

  12. Always use Accuracy

  13. Imbalanced Metrics [1]
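The accuracy trap on imbalanced data takes only a few lines to demonstrate. A hypothetical sketch with 95% negatives:

```python
# Hypothetical toy data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A useless "model" that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Recall on the minority class: how many positives did we actually find?
recall = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1) / 5

print(accuracy)  # 0.95 -- looks impressive
print(recall)    # 0.0  -- the model never detects a positive
```

Metrics that account for class balance (recall, precision, F1, balanced accuracy) expose what plain accuracy hides.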

  14. Cities Dataset for Semantic Segmentation [1]

  15. Losses for Semantic Segmentation [1]

  16. Collecting more Data is always Better

  17. A Good Data Scientist is Data Critical
    ● CERN throws away most of its collected data, at 25 GB/s [1]
    ● Geophysical data has to be reprocessed for many different use cases [2]
    ● Someone decides on social taxonomies: ImageNet assigned the class "loser / failure" to persons. [3]
    ● GPT-2 was trained on Reddit comments. Try asking it about Earth Science. [4]

  18. Strategies That Work (Sometimes)
    ● Multiple Interpreters (Inter-interpreter)
    ● Repeat Interpretations (Intra-interpreter)
    ● Take Responsibility to Change Questionable Taxonomies
    ● Collect Representative Samples

  19. Cross-Validation solves Everything

  20. Cross
    Validation
    to the rescue?

  21. Class Imbalances call for Stratification
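Stratification means each fold keeps every class at its original proportion, so a rare class is never missing from a split. A minimal sketch in pure Python (in practice, scikit-learn's StratifiedKFold does this):

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Split indices so each class keeps its proportion in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = max(1, round(len(idxs) * test_frac))
        test_idx += idxs[:n_test]
        train_idx += idxs[n_test:]
    return train_idx, test_idx

labels = [0] * 90 + [1] * 10  # hypothetical 90/10 imbalance
train_idx, test_idx = stratified_split(labels)
# The minority class shows up in the test split at its original 10% rate.
```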

  22. Cross-Validation for Time Series Data
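Shuffled folds leak the future into the past on temporal data. Time-series cross-validation keeps chronology: train only on earlier samples, test on later ones. A sketch of expanding-window splits (scikit-learn's TimeSeriesSplit implements the same idea):

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train, test) index lists where training data always
    precedes test data in time -- no peeking into the future."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        yield list(range(k * fold)), list(range(k * fold, (k + 1) * fold))

for train_idx, test_idx in expanding_window_splits(12, 3):
    assert max(train_idx) < min(test_idx)  # chronology preserved
```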

  23. Cross Validation for Spatial Data

  24. Are you Cross-Validating your data preparation? [6]
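Preprocessing must be fitted inside each fold, on training data only; fitting a scaler on the full dataset lets test statistics leak into training and quietly inflates scores. A minimal sketch with a hypothetical feature column (scikit-learn's Pipeline automates this):

```python
def fit_scaler(values):
    """Compute standardization statistics on the training fold only."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    return mean, std

def transform(values, mean, std):
    return [(v - mean) / std for v in values]

data = [float(v) for v in range(10)]  # hypothetical feature column
train, test = data[:8], data[8:]

# Correct: statistics come from the training fold alone.
mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)

# Leaky: fit_scaler(data) would let the test rows shape the statistics,
# which is exactly the data leakage cross-validation is meant to catch.
```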

  25. Even Cross-Validation has its Flaws [5]

  26. Absolutely ignore Model Simplicity

  27. News Item on AI for Earthquake Aftershock Prediction [8]

  28. One Neuron outperforms a Deep Neural Network [9]
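Before reaching for a deep network, try the simplest possible baseline. In the spirit of the aftershock example, a one-parameter "model" (a single threshold on a single feature) can go surprisingly far. A sketch with hypothetical separable data:

```python
def best_threshold(xs, ys):
    """One-parameter model: predict 1 when x exceeds a threshold,
    choosing the threshold that maximizes training accuracy."""
    def acc(t):
        return sum((x > t) == bool(y) for x, y in zip(xs, ys)) / len(xs)
    return max(sorted(set(xs)), key=acc)

# Hypothetical data: small x -> class 0, large x -> class 1.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = best_threshold(xs, ys)  # a single number is the whole model
```

If a deep network cannot beat this kind of baseline, the added complexity is buying nothing.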

  29. Can we Crash Test our Machine Learning? []

  30. Trust any Counter-Intuitive Results

  31. Extraordinary
    Claims
    Require
    Extraordinary
    Evidence

  32. Inferring the face of a person from their speech patterns is surely extraordinary [1]

  33. “AI” hiring decisions directly from video [1]

  34. Research into Bias in ML
    Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification - Buolamwini and Gebru [1]
    Critical Perspectives on Computer Vision - Denton [2]
    Excavating AI - Crawford and Paglen [3]
    Tutorial on Fairness, Accountability, Transparency and Ethics in Computer Vision at CVPR 2020 [4]
    The Uncanny Valley of ML - Andrews [5]
    Bias in Facial Recognition [6]

  35. Subject Matter Experts have often forgotten more about a Subject than a Data Scientist has learned during a Project

  36. Data can often be explained by many hypotheses. [1]

  37. Explainability shows how a
    Machine Learning Model thinks

  38. Post-Hoc Explainability will explain "Why?" even for wrong decisions made with 99% confidence [1]

  39. Calibration of Classifiers [1]
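Calibration asks whether predicted probabilities match observed frequencies: among predictions near 0.8, roughly 80% should actually be positive. A minimal reliability-binning sketch with hypothetical data (scikit-learn's calibration_curve offers the same diagnostic):

```python
def reliability_bins(probs, labels, n_bins=5):
    """Group predictions by confidence and compare the mean predicted
    probability to the observed fraction of positives in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b))
        for b in bins if b
    ]

# Hypothetical overconfident classifier: says 0.9 but is right half the time.
probs = [0.9, 0.9, 0.9, 0.9]
labels = [1, 0, 1, 0]
bins = reliability_bins(probs, labels)
# The gap between predicted 0.9 and observed 0.5 is the calibration error.
```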

  40. Shap Library for machine learning explainability [1]

  41. Interpretability: Explainable Forests, Linear Models, RuleFit
    Explainability: SHAP, Partial Dependence Plots, LIME, Feature Importance

  42. A Machine Learning Model can
    outperform your Assumptions
    and Baseline Data

  43. How we extract information and establish relationships limits the machine learning model.

  44. Speedround: When your
    Machine Learning Model isn’t
    scoring perfectly, you can
    still Spice Up Your Results

  45. It is uncomfortably common to hand-select “good” results [1]

  46. It is uncomfortably common to overfit on benchmarks to “sell” a method [1]

  47. Committing the “Inverse Crime” [1]

  48. Measuring what’s easy to measure rather than meaningful

  49. Main Takeaways
    ● Use nothing else but accuracy
    ● Under no circumstances spend extensive time on validation
    ● Blindly trust counter-intuitive results because the model converged
    ● Explainability is overrated but has all the answers
    ● Take all these points as gospel
