
機械学習と解釈可能性 (Machine Learning and Interpretability)

Sinhrks
January 22, 2018


Transcript

  1. About me • R • Package development, etc. • #1 in Japan on Git Awards • Python • http://git-awards.com/users/search?login=sinhrks
  2. What is interpretability? • There is no precise definition, but: • Explaining not only a model's output (the what) but also the reason behind it (the why) • To that end, interpreting the model using some method or criterion — The Mythos of Model Interpretability (Lipton, 2016)
  3. Why it matters • Trust • Causality • Transferability • Informativeness • Fair and Ethical Decision Making — The Mythos of Model Interpretability (Lipton, 2016)
  4. Interpretability • Global Interpretability • Interprets trends across the whole model or dataset • Relies on approximations and summary statistics => may be locally inaccurate • Local Interpretability • Interprets a limited region of the model or data • Allows more accurate explanations
  5. Interpretability • The appropriate method depends on what you want to interpret:
     • Global Interpretability — Model Specific: Regression Coefficients, Feature Importance, … / Model Agnostic: Surrogate Models, Sensitivity Analysis, …
     • Local Interpretability — Model Specific: Maximum Activation Analysis, … / Model Agnostic: LIME, LOCO, SHAP, …
  6. Regression Coefficients • Standardized partial regression coefficients

     library(dplyr)
     library(mlbench)
     data(BostonHousing)
     df <- BostonHousing %>%
       mutate_if(is.factor, as.numeric) %>%
       mutate_at(-14, scale)
     head(df, 3)
             crim         zn      indus       chas        nox        rm        age
     1 -0.4193669  0.2845483 -1.2866362 -0.2723291 -0.1440749 0.4132629 -0.1198948
     2 -0.4169267 -0.4872402 -0.5927944 -0.2723291 -0.7395304 0.1940824  0.3668034
     3 -0.4169290 -0.4872402 -0.5927944 -0.2723291 -0.7395304 1.2814456 -0.2655490
            dis        rad        tax    ptratio         b      lstat medv
     1 0.140075 -0.9818712 -0.6659492 -1.4575580 0.4406159 -1.0744990 24.0
     2 0.556609 -0.8670245 -0.9863534 -0.3027945 0.4406159 -0.4919525 21.6
     3 0.556609 -0.8670245 -0.9863534 -0.3027945 0.3960351 -1.2075324 34.7
  7. Regression Coefficients • Standardized partial regression coefficients • The coefplot package can visualize the partial regression coefficients

     lm.fit <- lm(medv ~ ., data = df)
     coef(lm.fit)
     (Intercept)        crim          zn       indus        chas         nox          rm
     22.53280632 -0.92906457  1.08263896  0.14103943  0.68241438 -2.05875361  2.67687661
             age         dis         rad         tax     ptratio           b       lstat
      0.01948534 -3.10711605  2.66485220 -2.07883689 -2.06264585  0.85010886 -3.74733185
  8. Feature Importance • Feature Importance (Random Forest)

     library(dplyr)
     library(mlbench)
     data(Sonar)
     head(Sonar, 3)
           V1     V2     V3     V4     V5     V6     V7     V8     V9
     1 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109
     2 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337
     3 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598
     …   V55    V56    V57    V58    V59    V60 Class
     1 0.0072 0.0167 0.0180 0.0084 0.0090 0.0032     R
     2 0.0094 0.0191 0.0140 0.0049 0.0052 0.0044     R
     3 0.0180 0.0244 0.0316 0.0164 0.0095 0.0078     R
  9. Feature Importance • Feature Importance (Random Forest)

     library(caret)
     ca.fit <- train(Class ~ ., data = Sonar, method = "rf", ntree = 100)
     varImp(ca.fit)
     rf variable importance

       only 20 most important variables shown (out of 60)

         Overall
     V11  100.00
     V48   96.25
     V45   75.45
     V13   71.74
     V10   60.50
     …
  10. Feature Importance • By default, caret's varImp uses a method-dependent calculation • see help(varImp)

     varImp(ca.fit, scale = F)
         Overall
     V11   3.793
     V48   3.684
     V45   3.078
     V13   2.970
     V10   2.643
     …

     randomForest::importance(ca.fit$finalModel) %>%
       as.data.frame %>%
       tibble::rownames_to_column(var = 'variable') %>%
       arrange(desc(MeanDecreaseGini)) %>%
       head(20)
       variable MeanDecreaseGini
     1      V11         3.792926
     2      V48         3.683850
     3      V45         3.077996
     4      V13         2.970039
     5      V10         2.642881
     …
  11. Feature Importance • randomForest::importance computes feature importance from the following combinations:
      • Mean decrease in accuracy computed from permuting OOB data — Classification: Error Rate (MeanDecreaseAccuracy) / Regression: MSE (%IncMSE)
      • Mean decrease in node impurity — Classification: Gini index (MeanDecreaseGini) / Regression: RSS (IncNodePurity)
  12. Reference: computing feature importance from OOB data • Each tree is trained on a bootstrap sample (sampling with replacement), and the left-out rows form its out-of-bag (OOB) set • Each tree predicts on its own OOB data • The values of each column are shuffled in turn, and the resulting OOB error rate is compared against the original [diagram]
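The shuffle-and-compare step above can be sketched in a few lines of Python. This is an illustrative toy, not the randomForest implementation: the dataset, the stand-in predict rule, and the use of plain accuracy on the full data (rather than per-tree OOB error) are all assumptions made for the example.

```python
import random

# Toy dataset: the label depends on column 0 only; column 1 is noise.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

# Stand-in "trained model": a fixed rule that happens to match the labels.
def predict(row):
    return 1 if row[0] > 0.5 else 0

def accuracy(X, y):
    return sum(predict(r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(X, y, col):
    """Mean decrease in accuracy when column `col` is shuffled."""
    base = accuracy(X, y)
    shuffled = [row[col] for row in X]
    random.shuffle(shuffled)
    X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
    return base - accuracy(X_perm, y)

print(permutation_importance(X, y, 0))  # large drop: informative feature
print(permutation_importance(X, y, 1))  # zero drop: the model ignores it
```

Shuffling the informative column destroys the link between input and prediction, while shuffling the noise column changes nothing; randomForest averages this kind of drop over the OOB data of every tree.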
  13. Feature Importance (Model Agnostic) • caret also supports model-independent importance

     varImp(ca.fit, useModel = F)
     ROC curve variable importance

       only 20 most important variables shown (out of 60)

         Importance
     V11     100.00
     V12      86.30
     V10      82.64
     V49      82.12
     V9       81.97
     …
  14. Interpretability
      • Global Interpretability — Model Specific: Regression Coefficients, Feature Importance, … / Model Agnostic: Surrogate Models, Sensitivity Analysis, …
      • Local Interpretability — Model Specific: Maximum Activation Analysis, … / Model Agnostic: LIME, LOCO, SHAP, …
  15. Surrogate Models • Examine a simple model that stands in for a complex one — Ideas on interpreting machine learning, https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning (Figure 14. An illustration of surrogate models for explaining a complex neural network. Figure courtesy of Patrick Hall and the H2O.ai team.)
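As a minimal sketch of the idea (illustrative only: the black_box function, the grid, and the one-split stump are invented for this example; real surrogates are typically decision trees or linear models trained on the complex model's predictions):

```python
def black_box(x):
    # Stand-in for a complex model whose internals we treat as unknown.
    return 1 if 0.4 * x ** 3 - x > 10 else 0

# Query the black box over a grid of inputs.
X = [i / 10 for i in range(-50, 51)]
preds = [black_box(x) for x in X]

def fit_stump(X, yhat):
    """Fit a depth-1 surrogate: the threshold best matching the predictions."""
    best_t, best_acc = None, -1.0
    for t in X:
        acc = sum((1 if x > t else 0) == p for x, p in zip(X, yhat)) / len(X)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

t, acc = fit_stump(X, preds)
print(t, acc)  # the single readable rule "x > t" that mimics the black box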
  16. Sensitivity Analysis • Examine the model's output as a given feature is varied — Sensitivity analysis for neural networks, https://beckmw.wordpress.com/2013/10/07/sensitivity-analysis-for-neural-networks/
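A bare-bones version of that idea: sweep one input over a grid while holding the others at a baseline, and record how much the output moves. The model, baseline, and grid here are invented for the example; the linked post does this per input of a neural network.

```python
import math

def model(x1, x2):
    # Stand-in model: strongly nonlinear in x1, weakly linear in x2.
    return math.tanh(3 * x1) + 0.1 * x2

baseline = {"x1": 0.0, "x2": 0.0}
grid = [i / 10 for i in range(-10, 11)]

def sensitivity(name):
    """Output range when `name` sweeps the grid, others fixed at baseline."""
    outputs = []
    for v in grid:
        args = dict(baseline, **{name: v})
        outputs.append(model(args["x1"], args["x2"]))
    return max(outputs) - min(outputs)

print(sensitivity("x1"))  # large swing: the model reacts strongly to x1
print(sensitivity("x2"))  # small swing
```

Plotting the swept outputs (rather than just their range) also exposes the *shape* of the response, which is what the sensitivity plots in the linked post show.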
  17. Interpretability
      • Global Interpretability — Model Specific: Regression Coefficients, Feature Importance, … / Model Agnostic: Partial Dependence, Surrogate Models, Sensitivity Analysis, …
      • Local Interpretability — Model Specific: Maximum Activation Analysis, … / Model Agnostic: LIME, LOCO, SHAP, …
  18. LIME • Build a model that stands in for the original model in the neighborhood of a given data point, then examine that model's features • Local Surrogate Model — "Why Should I Trust You?" Explaining the Predictions of Any Classifier (Ribeiro, Singh, Guestrin 2016)
  19. LIME • LIME derives an interpretation of a data point x from the following objective • G: the set of interpretable learners • L: the discrepancy, under Πx, between the model being explained and the interpretable learner • f: the model being explained • Πx: the similarity to the data point x • Ω: a penalty on the complexity of the interpretable learner • The concrete choices depend on the domain
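Written out, the objective from the Ribeiro et al. paper that these symbols describe is:

    ξ(x) = argmin_{g ∈ G}  L(f, g, Π_x) + Ω(g)

That is: among the interpretable learners g, pick the one that best matches f in the neighborhood weighted by Π_x (small L) while remaining simple (small Ω).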
  20. Example: tabular data, classification • Sample around the data point x • 5,000 samples by default • The sampling method depends on the type of each variable • Weight the samples with an exponential kernel • Variable selection • Forward/Backward, LARS, etc. • Fit with Ridge regression, etc. ※ Based on the Python implementation (described later)
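The steps above can be sketched as a toy one-dimensional version. This is illustrative only, not the lime package: the black-box model, the kernel width, and plain weighted least squares in place of Ridge regression are all assumptions made for the example.

```python
import math
import random

random.seed(0)

def black_box(z):
    # Stand-in model to explain: nonlinear in its single feature.
    return math.sin(z)

x = 1.0       # the instance to explain
width = 0.25  # kernel width (assumed value)

# 1. Sample perturbed points around x.
zs = [x + random.gauss(0, 0.5) for _ in range(5000)]
ys = [black_box(z) for z in zs]

# 2. Exponential kernel: nearby samples get higher weight.
ws = [math.exp(-((z - x) ** 2) / width ** 2) for z in zs]

# 3. Weighted least-squares fit of y ≈ a + b * z (closed form).
sw = sum(ws)
mz = sum(w * z for w, z in zip(ws, zs)) / sw
my = sum(w * y for w, y in zip(ws, ys)) / sw
b = sum(w * (z - mz) * (y - my) for w, z, y in zip(ws, zs, ys)) \
    / sum(w * (z - mz) ** 2 for w, z in zip(ws, zs))
a = my - b * mz

print(b)  # local slope: for sin near x = 1, roughly cos(1) ≈ 0.54
```

The fitted coefficient b is the "explanation": the locally dominant direction of the black box's response, even though the global model is nonlinear.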
  21. Packages • Python • Written by the paper's authors • https://github.com/marcotcr/lime • R • A port of the above • https://github.com/thomasp85/lime

     install.packages('lime')
  22. LIME (R) • Sample

     library(caret)
     library(lime)
     # Train the learner
     model <- train(iris[-5], iris[[5]], method = 'rf')
     # Create the explainer
     explainer <- lime(iris[-5], model)
     # Output the explanation
     explanations <- explain(iris[1, -5], explainer, n_labels = 1, n_features = 2)
     explanations
           model_type case  label label_prob  model_r2 model_intercept
     1 classification    1 setosa          1 0.3776584       0.2544468
     2 classification    1 setosa          1 0.3776584       0.2544468
       model_prediction      feature feature_value feature_weight         feature_desc
     1        0.7113922  Sepal.Width           3.5     0.02101138    3.3 < Sepal.Width
     2        0.7113922 Petal.Length           1.4     0.43593404 Petal.Length <= 1.60
                     data prediction
     1 5.1, 3.5, 1.4, 0.2    1, 0, 0
     2 5.1, 3.5, 1.4, 0.2    1, 0, 0
  23. LIME (R)

           model_type case  label label_prob  model_r2 model_intercept
     1 classification    1 setosa          1 0.3776584       0.2544468
     2 classification    1 setosa          1 0.3776584       0.2544468
       model_prediction      feature feature_value feature_weight         feature_desc
     1        0.7113922  Sepal.Width           3.5     0.02101138    3.3 < Sepal.Width
     2        0.7113922 Petal.Length           1.4     0.43593404 Petal.Length <= 1.60

     Explanation fields:
     model_type       — The type of the model used for prediction
     case             — The case being explained (the rowname in cases)
     model_r2         — The quality of the model used for the explanation
     model_intercept  — The intercept of the model used for the explanation
     model_prediction — The prediction of the observation based on the model used for the explanation
     feature          — The feature used for the explanation
     feature_value    — The value of the feature used
     feature_weight   — The weight of the feature in the explanation
     feature_desc     — A human readable description of the feature importance
  24. Example: text data • Generate data with words randomly removed • Weight by cosine similarity through an exponential kernel • The rest is the same as for tabular data — LIME - Local Interpretable Model-Agnostic Explanations, https://homes.cs.washington.edu/~marcotcr/blog/lime/ ※ Based on the Python implementation
  25. Example: image data • Generate data with random subsets of segments masked • Segmentation methods from skimage.segmentation can be used (quickshift, etc.) • Weight by cosine similarity • The rest is the same as for tabular data ※ Not implemented in R — Ideas on interpreting machine learning, https://www.oreilly.com/ideas/ideas-on-interpreting-machine-learning (Figure 15. An illustration of the LIME process in which a weighted linear model is used to explain a single prediction from a complex neural network. Figure courtesy of Marco Tulio Ribeiro; image used with permission.) ※ Based on the Python implementation
  26. Summary
      • Global Interpretability — Model Specific: Regression Coefficients, Feature Importance, … / Model Agnostic: Surrogate Models, Sensitivity Analysis, …
      • Local Interpretability — Model Specific: Maximum Activation Analysis, … / Model Agnostic: LIME, LOCO, SHAP, …