
A Brief Introduction to Hyperparameter Optimization (with a focus on medical data)

This talk walks through a case study of building a sepsis prediction model, and discusses 3 techniques for sampling hyperparameters:
1) grid search
2) random search
3) sequential model-based optimization

Jill Cates

November 18, 2018

Transcript

  1. A Typical ML Pipeline (R.J. Urbanowicz et al. 2018): pre-processing → modeling → post-processing, with hyperparameter optimization during modeling. Bad hyperparameters = bad model = bad predictions.
  2. Defining Sepsis. What is sepsis? A “life-threatening condition that arises when the body's response to infection causes injury to its own tissues and organs” [1]. 750,000 patients are diagnosed with severe sepsis in the United States each year, with a 30% mortality rate [2]. Sepsis costs $20.3 billion each year ($55.6 million per day) in U.S. hospitals [3]. For every hour that passes before treatment begins, a patient's risk of death from sepsis increases by 8% [4].
  3. An Overview of Our Pipeline: EMR data (past medical history, blood test results, microbiology results, imaging (MRI, US, CT), demographics (age, gender, ethnicity)) → feature engineering & feature selection (create new features, select best features) → modeling (model selection, hyperparameter tuning) → evaluation → predict sepsis.
  4. Our Data: 50,000 hospital admissions and 40,000 patients.
     - Admissions information: diagnosis upon admission, time of admission/discharge
     - Patient demographics: age, gender, religion, marital status
     - Prescriptions: which drugs were they prescribed and when?
     - Unit transfers: did they move from the medical ward to the ICU?
     - Vital signs: heart rate, blood pressure, respiratory rate, SpO2
     - Lab results: blood tests, urine tests
     - Diagnoses: ICD-10 codes
     - Chest X-ray images: DICOM format
  5. Data Pre-processing.
     Generate new features from imaging data (this is a separate model in itself; the NIH CXR dataset contains 100,000+ annotated X-ray images):
     • identify lung opacities in the X-ray image (e.g. pneumonia, pulmonary abscess)
     • lung_abnormality = (0,1)
     • infection_size = [x, y, width, height]
     Clean up inconsistencies in medical terms using the Unified Medical Language System:
     • Aspirin vs. ASA (acetylsalicylic acid)
     • NS (normal saline) vs. 0.9% sodium chloride
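     As an illustration of the term clean-up step, here is a minimal sketch assuming a simple lookup table; the TERM_MAP contents, the column names, and the use of pandas are my own illustration, not the talk's actual UMLS-based pipeline.

```python
# Hypothetical sketch: normalize inconsistent medication names with a lookup table
import pandas as pd

TERM_MAP = {
    "asa": "aspirin",
    "acetylsalicylic acid": "aspirin",
    "ns": "0.9% sodium chloride",
    "normal saline": "0.9% sodium chloride",
}

prescriptions = pd.DataFrame({"drug": ["ASA", "Aspirin", "NS", "0.9% sodium chloride"]})
prescriptions["drug_normalized"] = (
    prescriptions["drug"].str.lower().map(TERM_MAP).fillna(prescriptions["drug"].str.lower())
)
print(prescriptions)
```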
  6. Creating a sepsis score. How do we identify sepsis in a patient?
     • ICD-10 codes [4], [5]:
       - Bacteremia - R78.81
       - Sepsis, unspecified - A41.9
       - Acute hepatic failure without coma - K72.00
     • Severity scores based on lab results and vitals:
       - SOFA: Sequential Organ Failure Assessment [6]
       - SIRS: Systemic Inflammatory Response Syndrome [7]
       - LODS: Logistic Organ Dysfunction System [8]
     * ICD = International Statistical Classification of Diseases and Related Health Problems, 10th revision, developed by the World Health Organization (WHO)
     * ICD codes are listed for billing patients at the end of their stay
  7. Creating a sepsis score. SOFA (Sequential Organ Failure Assessment): a mortality prediction score based on the degree of dysfunction of six organ systems (Jones et al. 2010, Crit Care Med), computed from vitals, blood test results, and urine test results. Sepsis = an acute change in total SOFA score ≥ 2 points upon infection (regardless of baseline) [9].
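     A minimal sketch of how that labelling rule might be applied, assuming a table of per-admission baseline and at-infection SOFA scores; the column names and values are hypothetical, not the talk's dataset.

```python
# Hypothetical sketch: label sepsis as an acute rise of >= 2 points in total SOFA score
import pandas as pd

admissions = pd.DataFrame({
    "admission_id": [1001, 1002, 1003, 1004],
    "baseline_sofa": [1, 0, 3, 2],
    "sofa_at_infection": [4, 1, 3, 5],
})

# Sepsis-3-style rule: acute change in total SOFA score >= 2 points upon infection
admissions["sepsis"] = (
    admissions["sofa_at_infection"] - admissions["baseline_sofa"] >= 2
).astype(int)
print(admissions)
```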
  8. Picking a Model: Random Forest Classifier. A binary classification problem; the training labels look like admission_id → sepsis (1001 → 0, 1002 → 1, 1003 → 0, 1004 → 1). A random forest is a forest of decision trees: each tree votes (e.g. sepsis, sepsis, no sepsis → final prediction: SEPSIS, prob = 0.667). Output: a probability score between 0 and 1 representing a patient's likelihood of sepsis.
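     A minimal sketch of such a classifier with scikit-learn; the synthetic data stands in for the engineered features and sepsis labels described earlier, and the 0.9/0.1 class weights and hyperparameter values are just illustrative.

```python
# Sketch of the binary sepsis classifier: a random forest outputting a probability per admission
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered features (X) and sepsis labels (y)
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
clf.fit(X_train, y_train)

# Probability of sepsis (class 1) for each admission in the test set
sepsis_prob = clf.predict_proba(X_test)[:, 1]
```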
  9. No Free Lunch Theorem: “all optimization problem strategies perform equally well when averaged over all possible problems.” (No free lunch; see Seinfeld's Soup Nazi episode.)
  10. Evaluating the Quality of Our Model.
     • RMSE = sqrt( (1/N) · Σ_{i=1..N} (y_i − ŷ_i)² )
     • Area Under the Receiver Operating Characteristic curve (AUROC)
     • precision = TP / (TP + FP)
     • recall = TP / (TP + FN)
     • F1 = 2 · precision · recall / (precision + recall)
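     These metrics are available in scikit-learn. The snippet below continues the random forest sketch above (y_test and sepsis_prob come from that snippet); the 0.5 decision threshold is an illustrative choice, not something prescribed by the talk.

```python
# Classification metrics for the sepsis model, using scikit-learn
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

y_pred = (sepsis_prob >= 0.5).astype(int)  # threshold the probability score

print("AUROC:    ", roc_auc_score(y_test, sepsis_prob))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:   ", recall_score(y_test, y_pred))
print("F1:       ", f1_score(y_test, y_pred))
```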
  11. What is a hyperparameter? Model hyperparameters are configuration that is external to the model, set to a pre-determined value before model training.
  12. What is a hyperparameter? Example: clinical trials. The goal is to maximize drug effectiveness; the knobs to tune are the active ingredients and their concentrations, and the outcome is whether it cured the patient.
  13. What is a hyperparameter? Example: drug discovery. Candidate compounds are screened as toxic or therapeutic based on their binding affinities, e.g. 0174413 (Cdk4/D: 0.210 μM, Cdk2/A: 0.012 μM), 0204661 (Cdk4/D: 0.092 μM, Cdk2/A: 0.002 μM), 0205783 (Cdk4/D: 0.145 μM, Cdk2/A: 5.010 μM).
  14. Hyperparameter Examples (see the sketch after this list):
     • Random Forest Classifier: n_estimators (# of decision trees), max_depth
     • Singular Value Decomposition: n_components (# of latent factors)
     • Support Vector Machine: regularization (C), tolerance threshold (ε), kernel
     • Gradient descent: learning rate, regularization (λ)
     • K-means clustering: K clusters
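     For concreteness, here is how a few of those hyperparameters are passed to scikit-learn estimators before training; the specific values are arbitrary, and TruncatedSVD is used as a stand-in for SVD-style factorization.

```python
# Hyperparameters are set when the estimator is constructed, before any training happens
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rf = RandomForestClassifier(n_estimators=50, max_depth=5)  # number of trees, tree depth
svd = TruncatedSVD(n_components=20)                        # number of latent factors
svm = SVC(C=1.0, tol=1e-3, kernel="rbf")                   # regularization, tolerance, kernel
km = KMeans(n_clusters=8)                                  # K clusters
```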
  15. Sampling Techniques: 1) grad student descent, 2) grid search, 3) random search, 4) sequential model-based optimization.
  16. Grid Search: provide a discrete set of hyperparameter values and train a model for every combination.
     Search space for sklearn.ensemble.RandomForestClassifier(): n_estimators = [5, 10, 50], max_depth = [3, 5].
     Models:
     1) n_estimators=5, max_depth=3
     2) n_estimators=5, max_depth=5
     3) n_estimators=10, max_depth=3
     4) n_estimators=10, max_depth=5
     5) n_estimators=50, max_depth=3
     6) n_estimators=50, max_depth=5
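     The same search space can be written with scikit-learn's GridSearchCV, which fits all 3 × 2 = 6 combinations and keeps the best one. X_train and y_train follow from the earlier sketch; the roc_auc scoring and cv=5 are my own choices.

```python
# Exhaustive grid search over the discrete hyperparameter grid
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [5, 10, 50],
    "max_depth": [3, 5],
}

grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, scoring="roc_auc", cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.best_score_)
```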
  17. Random Search. “for most data sets only a few of the hyper-parameters really matter…” “…different hyper-parameters are important on different data sets.”
     • Based on the assumption that not all hyperparameters are equally important
     • Works by sampling hyperparameter values from a distribution
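     A corresponding sketch with scikit-learn's RandomizedSearchCV, which samples a fixed number of configurations from distributions rather than enumerating a grid; the distributions and n_iter=20 are illustrative choices, not values from the talk.

```python
# Random search: sample hyperparameter values from distributions
from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

param_distributions = {
    "n_estimators": randint(5, 200),
    "max_depth": randint(2, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,           # number of sampled configurations
    scoring="roc_auc",
    cv=5,
    random_state=0,
)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```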
  18. No Free Lunch Theorem (revisited): “all optimization problem strategies perform equally well when averaged over all possible problems.”
  19. What is Overfitting?! The bias-variance trade-off: learning from noise vs. signal; the model is tightly bound to the training set. How to detect overfitting: high performance on the training set, poor performance on the test set.
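     Continuing the earlier random forest sketch, a simple way to see this in practice is to compare the same metric on the training and test sets; what counts as a "large gap" is a judgment call rather than something the talk specifies.

```python
# Compare training-set and test-set performance to spot overfitting
from sklearn.metrics import roc_auc_score

train_auc = roc_auc_score(y_train, clf.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"train AUROC: {train_auc:.3f}  test AUROC: {test_auc:.3f}")
# A large gap (e.g. 0.99 on train vs. 0.75 on test) suggests the model has learned noise
```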
  20. Regularization: a penalty term.
     • L1 norm (Lasso regression): good for feature selection; sets the weight of irrelevant features to 0
     • L2 norm (Ridge regression): handles multicollinearity; reduces the weight of less important features
     • ElasticNet: combination of L1 and L2; define the “mixture ratio” λ
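     In scikit-learn these penalties appear as separate linear-model estimators, shown here for a regression setting; alpha is the penalty strength, l1_ratio the L1/L2 mixture, and the values are arbitrary.

```python
# L1, L2, and ElasticNet regularization as scikit-learn linear models
from sklearn.linear_model import ElasticNet, Lasso, Ridge

lasso = Lasso(alpha=0.1)                    # L1: drives weights of irrelevant features to 0
ridge = Ridge(alpha=0.1)                    # L2: shrinks weights of less important features
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)  # mix of L1 and L2
```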
  21. Cross-validation: divide the training data into k subsets (“folds”), train the model on k−1 folds over k iterations (validating on the held-out fold each time), and calculate the average score. Example with k=4: fold scores 0.81, 0.79, 0.80, 0.73 → average 0.78.
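     With scikit-learn this is a one-liner via cross_val_score, shown here with the k=4 folds from the slide; the estimator and scoring choice carry over from the earlier sketches.

```python
# 4-fold cross-validation: train on 3 folds, score on the held-out fold, average the k scores
from sklearn.model_selection import cross_val_score

scores = cross_val_score(clf, X_train, y_train, cv=4, scoring="roc_auc")
print(scores, "average:", scores.mean())
```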
  22. Imbalanced Data: inflated accuracy. Example: 90% of patients did not have sepsis, so predicting that no patient has sepsis gives 90% accuracy. How to overcome it:
     • Upsampling/downsampling: bootstrapping, e.g. the Synthetic Minority Over-sampling Technique (SMOTE)
     • Use information retrieval metrics (recall, precision, F1, confusion matrix) rather than accuracy
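     A minimal sketch of the oversampling option; it assumes the separate imbalanced-learn package (imblearn), which the talk does not name beyond the SMOTE technique itself.

```python
# Oversample the minority (sepsis) class with SMOTE before training
from imblearn.over_sampling import SMOTE

X_resampled, y_resampled = SMOTE(random_state=0).fit_resample(X_train, y_train)
# Train on the balanced data, but still evaluate with precision/recall/F1
# on the original, imbalanced test set.
```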
  23. A Word of Caution.
     • Biased datasets: “Fluctuating hormones and differences between male and female study subjects could all complicate the design of the study”
     • Defining the “ground truth”: is SOFA a reliable indicator of sepsis?
     • Selecting the appropriate evaluation metric: false positives vs. false negatives
  24. References
     1) Sepsis. Wikipedia.
     2) Stevenson EK et al. Two decades of mortality trends among patients with severe sepsis: a comparative meta-analysis. Crit Care Med. 2014;42:625.
     3) Healthcare Cost and Utilization Project (HCUP) Statistical Briefs. Rockville, MD: Agency for Healthcare Research and Quality (US); 2006.
     4) Angus DC et al. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Crit Care Med. 2001;1303-10.
     5) Martin GS et al. The epidemiology of sepsis in the United States from 1979 through 2000. N Engl J Med. 2003;348:1546-1554.
     6) Vincent JL et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. Intensive Care Med. 1996;22:707-710.
     7) Bone RC, Balk RA, Cerra FB, et al. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. Chest. 1992;101:1644-55.
     8) Le Gall JR et al. The Logistic Organ Dysfunction system. A new way to assess organ dysfunction in the intensive care unit. ICU Scoring Group. JAMA. 1996;276(10):802-10.
     9) Seymour CW, Rea TD, Kahn JM, Walkey AJ, Yealy DM, Angus DC. Severe sepsis in pre-hospital emergency care: analysis of incidence, care, and outcome. Am J Respir Crit Care Med. 2012;186(12):1264-1271.