Machine Learning and Performance Evaluation @ DataPhilly 2016

Every day, in scientific research and business applications, we rely on statistics and machine learning as support tools for predictive modeling. To satisfy our desire to model uncertainty, predict trends, and detect patterns that may occur in the future, we have developed a vast library of tools for decision making. In other words, we have learned to take advantage of computers to replicate the real world, making intuitive decisions more quantitative, labeling unlabeled data, predicting trends, and ultimately trying to predict the future. Now, whether we are applying predictive modeling techniques to our research or business problems, we want to make "good" predictions!

With modern machine learning libraries at hand, choosing a machine learning algorithm and fitting a model to our training data has never been easier. However, making sure that our model generalizes well to unseen data is still up to us, the machine learning practitioners and researchers. In this talk, we will discuss the two most important components of the various estimators of generalization performance: bias and variance. We will discuss how to make the best use of the data at hand through proper (re)sampling, and how to pick appropriate performance metrics. Then, we will compare various techniques for algorithm selection and model selection to find the right tool and approach for the task. In the context of the "bias-variance trade-off," we will go over potential weaknesses of common modeling techniques, and we will learn how to take uncertainty into account to build predictive models that perform well on unseen data.

Sebastian Raschka

December 01, 2016

Transcript

  1. Machine Learning and Performance Evaluation, Sebastian Raschka, DataPhilly. Wednesday, November 30, 2016, 6:30 PM to 7:00 PM, 441 N 5th St, Suite 301, Philadelphia, PA.
  2. Bias and variance of an estimator: $\text{Bias} = E[\hat{\theta}] - \theta$ and $\text{Variance} = E[(\hat{\theta} - E[\hat{\theta}])^2]$, where $\hat{\theta}$ is the estimated value and $E[\hat{\theta}]$ its expected value. Dartboard illustration: low variance (precise) vs. high variance (not precise), and low bias (accurate) vs. high bias (not accurate). (A small simulation sketch follows the transcript.)
  3. Resampling: repeated samples (Sample 1, Sample 2, Sample 3) are drawn from the real-world distribution, and each is split into Train (70%) and Test (30%), shown for sample sizes n=1000 and n=100. (A code sketch follows the transcript.)
  4. K-fold cross-validation: over K iterations (K folds, here the 1st through 5th), each fold serves once as the validation fold while the remaining folds form the training fold, yielding $\text{Performance}_1, \dots, \text{Performance}_5$, which are averaged: $\text{Performance} = \frac{1}{5}\sum_{i=1}^{5}\text{Performance}_i$. (A code sketch follows the transcript.)
  5. Inside each of the K iterations, the learning algorithm with fixed hyperparameter values is fit to the training-fold data and labels to produce a model; the model's predictions on the validation-fold data are compared to the validation-fold labels to measure that fold's performance. Repeating this for the 1st through 5th folds and averaging gives $\text{Performance} = \frac{1}{5}\sum_{i=1}^{5}\text{Performance}_i$, as on the previous slide.
  6. Logistic regression diagram: inputs $x_1, \dots, x_m$ with weights $w_0, w_1, w_2, \dots, w_m$ feed the net input function (weighted sum), followed by the logistic (sigmoid) function and a quantizer that outputs the predicted class label $\hat{y}$; the logistic cost compares the prediction against the true class label $y$ and updates the model parameters $w$ over a number of iterations, with $\lambda$ setting the L2-regularization strength. (A NumPy sketch follows the transcript.)
  7. K-fold for model selection, step by step. Step 1: split the data and labels into training data/labels and test data/labels. (A sketch covering all five steps follows the transcript.)
  8. Step 2: feed the training data and labels to the learning algorithm with several candidate hyperparameter values and record a performance estimate for each candidate (the step 1 split is repeated for context).
  9. Step 3: fit a model to the training data and labels using the learning algorithm with the best hyperparameter values (the step 2 diagram is repeated for context).
  10. Step 4: let that model predict the test data and compare the predictions with the test labels to estimate its performance (the step 3 diagram is repeated for context).
  11. Step 5: fit the final model to all data and labels using the learning algorithm with the best hyperparameter values (the step 4 diagram is repeated for context).
  12. Nested cross-validation for algorithm selection: in the outer loop (iterations 1st through 5th), each outer validation fold is held out while the outer training fold is passed to an inner loop; the inner loop splits it into inner training and inner validation folds and averages the inner performances, e.g. $\frac{1}{2}\sum_{j=1}^{2}\text{Performance}_{5,j}$, to pick the best algorithm and best model. The outer performances $\text{Performance}_1, \dots, \text{Performance}_5$ are then combined into the overall estimate, shown in the figure as $\text{Performance} = \frac{1}{10}\sum_{i=1}^{10}\text{Performance}_i$. (A code sketch follows the transcript.)
  13. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Knowledge Discovery and Data Mining (KDD).
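
Code sketches

A minimal NumPy sketch of the definitions on slide 2 (the Gaussian data, the sample size, and the choice of the biased maximum-likelihood variance estimator are illustrative assumptions, not from the talk): it approximates the bias and variance of an estimator empirically by repeating the estimate over many samples.

    import numpy as np

    rng = np.random.default_rng(0)
    theta = 1.0                       # true parameter: the population variance
    n, n_repetitions = 20, 10_000

    # theta_hat: the (biased) maximum-likelihood variance estimator (ddof=0)
    estimates = np.array([
        np.var(rng.normal(loc=0.0, scale=1.0, size=n), ddof=0)
        for _ in range(n_repetitions)
    ])

    bias = estimates.mean() - theta                           # E[theta_hat] - theta
    variance = np.mean((estimates - estimates.mean()) ** 2)   # E[(theta_hat - E[theta_hat])^2]
    print(f"bias ~ {bias:.3f}, variance ~ {variance:.3f}")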
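
A sketch of the resampling idea on slide 3, assuming a large synthetic population from scikit-learn's make_classification and a logistic regression classifier (both illustrative choices): each of three samples of size n is split into Train (70%) and Test (30%), and the test-set accuracy varies noticeably more across samples for n=100 than for n=1000.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # stand-in for the "real-world distribution": a large synthetic population
    X_pop, y_pop = make_classification(n_samples=100_000, n_features=20,
                                       random_state=0)
    rng = np.random.default_rng(0)

    for n in (1000, 100):
        scores = []
        for _ in range(3):                          # Sample 1, Sample 2, Sample 3
            idx = rng.choice(len(X_pop), size=n, replace=False)
            X_train, X_test, y_train, y_test = train_test_split(
                X_pop[idx], y_pop[idx], test_size=0.3, random_state=1)
            model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
            scores.append(model.score(X_test, y_test))   # accuracy on the 30% test set
        print(f"n={n}: {np.round(scores, 3)}")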
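
A sketch of the 5-fold cross-validation loop on slides 4 and 5, assuming synthetic data and a logistic regression with fixed hyperparameters (illustrative choices): each fold serves once as the validation fold, the model is refit on the remaining training folds, and the five performance values are averaged.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import KFold

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    kfold = KFold(n_splits=5, shuffle=True, random_state=0)

    performances = []
    for train_idx, valid_idx in kfold.split(X):
        model = LogisticRegression(max_iter=1000)      # fixed hyperparameter values
        model.fit(X[train_idx], y[train_idx])          # training folds
        y_pred = model.predict(X[valid_idx])           # validation fold
        performances.append(np.mean(y_pred == y[valid_idx]))

    print("average performance:", np.mean(performances))   # 1/5 * sum_i Performance_i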
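
A NumPy sketch of the logistic-regression building blocks on slide 6; this is a minimal stand-in implementation (the function names, learning rate eta, and plain gradient-descent update are assumptions, not the speaker's code): net input, sigmoid, an L2-regularized gradient update repeated for a number of iterations, and a quantizer that thresholds the sigmoid output into a class label.

    import numpy as np

    def sigmoid(z):
        """Logistic (sigmoid) function."""
        return 1.0 / (1.0 + np.exp(-z))

    def fit_logreg(X, y, eta=0.1, n_iter=100, lam=0.01):
        """Minimize the L2-regularized logistic cost by gradient descent."""
        w = np.zeros(X.shape[1])                  # model parameters w_1 ... w_m
        b = 0.0                                   # bias unit w_0
        for _ in range(n_iter):                   # number of iterations
            z = X @ w + b                         # net input (weighted sum)
            p = sigmoid(z)                        # predicted probabilities
            grad_w = X.T @ (p - y) / len(y) + lam * w   # lam = L2-regularization strength
            grad_b = np.mean(p - y)
            w -= eta * grad_w                     # update model parameters
            b -= eta * grad_b
        return w, b

    def predict(X, w, b):
        """Quantizer: threshold the sigmoid output into a predicted class label."""
        return (sigmoid(X @ w + b) >= 0.5).astype(int)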
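
An end-to-end sketch of the five model-selection steps on slides 7-11, assuming scikit-learn, a logistic regression classifier, and its regularization parameter C as the tuned hyperparameter (all illustrative choices; the talk does not prescribe a particular estimator or grid).

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    # Step 1: split the data and labels into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    # Step 2: compare candidate hyperparameter values via 5-fold CV
    #         on the training data only
    param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
    search = GridSearchCV(LogisticRegression(max_iter=1000),
                          param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)

    # Step 3: a model with the best hyperparameter values, refit on the
    #         whole training set (GridSearchCV does this automatically)
    best_model = search.best_estimator_

    # Step 4: estimate generalization performance on the held-out test set
    print("test accuracy:", best_model.score(X_test, y_test))

    # Step 5: fit the final model with the best hyperparameters to all data
    final_model = LogisticRegression(max_iter=1000, **search.best_params_)
    final_model.fit(X, y)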
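
A sketch of nested cross-validation for algorithm selection as on slide 12, using the common scikit-learn idiom of wrapping GridSearchCV (inner loop) inside cross_val_score (outer loop); the 5-fold outer and 2-fold inner split mirror the figure, while the two candidate algorithms and their hyperparameter grids are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    inner_cv = KFold(n_splits=2, shuffle=True, random_state=0)   # inner loop
    outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)   # outer loop

    candidates = {
        "logistic regression": GridSearchCV(
            LogisticRegression(max_iter=1000),
            {"C": [0.01, 0.1, 1.0, 10.0]}, cv=inner_cv),
        "decision tree": GridSearchCV(
            DecisionTreeClassifier(random_state=0),
            {"max_depth": [2, 4, 8, None]}, cv=inner_cv),
    }

    # each outer iteration: the inner loop tunes on the outer training fold,
    # and the tuned model is scored on the outer validation fold
    for name, clf in candidates.items():
        outer_scores = cross_val_score(clf, X, y, cv=outer_cv)
        print(f"{name}: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")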