
LiNGAM Python package

Shohei SHIMIZU
November 05, 2021


Explains what the LiNGAM Python package can do; presented at a seminar with causal discovery users.


Transcript

  1. LiNGAM model is identifiable (Shimizu, Hyvarinen, Hoyer & Kerminen, 2006)

    • Linear Non-Gaussian Acyclic Model:
      x_i = Σ_{k(j) < k(i)} b_ij x_j + e_i,   or in matrix form   x = Bx + e
      – k(i) (i = 1, …, p): causal (topological) order of x_i
      – Error variables e_i are independent and non-Gaussian
    • Coefficients and causal orders are identifiable
    • Causal graph is identifiable
    [Figure: example causal graph over x1, x2, x3 with coefficients b_ij and errors e_i]
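The model above can be simulated directly: since x = Bx + e with acyclic B, we have x = (I − B)⁻¹ e. A minimal sketch with hypothetical coefficients (not from the slides), using uniform (non-Gaussian) errors:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000

# Hypothetical strictly lower-triangular B in the causal order x3, x1, x2
# (x3 -> x1 -> x2), satisfying the acyclicity assumption of LiNGAM
B = np.array([[0.0, 0.0, 0.0],
              [1.5, 0.0, 0.0],
              [0.0, -1.3, 0.0]])

# Independent, non-Gaussian (uniform) error variables e
e = rng.uniform(-1.0, 1.0, size=(n, 3))

# Solve x = Bx + e  =>  x = (I - B)^{-1} e
X = e @ np.linalg.inv(np.eye(3) - B).T

# Regressing column 1 (x1) on column 0 (x3) should recover b = 1.5
b_hat = np.cov(X[:, 1], X[:, 0])[0, 1] / np.var(X[:, 0])
print(round(b_hat, 2))
```

With non-Gaussian errors like these, the coefficients and causal order are identifiable from observational data alone, which is the point of the slide.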
  2. Statistical reliability assessment

    • Bootstrap probability (bp) of directed paths and edges
    • Interpret causal effects having bp larger than a threshold, say 5%
    [Figure: e.g., directed path x3 → … → x1 with bp 99%; directed edge x3 → x1 with bp 96%; total effect 20.9; another path with bp 10%]

    LiNGAM Python package: https://github.com/cdt15/lingam
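The bootstrap-probability idea can be sketched without the package: resample the data with replacement, re-estimate an effect each time, and report the fraction of resamples in which the effect exceeds a threshold. A toy single-coefficient illustration (not the package's bootstrap API):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical data: x causes y with coefficient 0.8, non-Gaussian errors
x = rng.uniform(-1, 1, n)
y = 0.8 * x + rng.uniform(-1, 1, n)
data = np.column_stack([x, y])

n_sampling, threshold = 100, 0.05
hits = 0
for _ in range(n_sampling):
    idx = rng.integers(0, n, n)           # resample rows with replacement
    xb, yb = data[idx, 0], data[idx, 1]
    coef = np.cov(xb, yb)[0, 1] / np.var(xb)
    if abs(coef) > threshold:             # edge "present" in this resample
        hits += 1

bp = hits / n_sampling                    # bootstrap probability of the edge
print(bp)
```

The package does the same at the level of whole graphs: it refits the causal discovery on each resample and counts how often each edge, path, and DAG appears.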
  3. Before estimating causal graphs

    • Assess the assumptions by:
      – Gaussianity tests
      – Histograms: are the variables continuous?
      – Too high correlation? (multicollinearity?)
      – Background knowledge
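These pre-checks can be scripted; a minimal sketch using SciPy's normality test and a correlation screen (data and variable names are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 1000

# Hypothetical data: one uniform (non-Gaussian) column and two nearly
# collinear columns
x1 = rng.uniform(-1, 1, n)
x2 = rng.normal(0, 1, n)
x3 = x2 + 0.01 * rng.normal(0, 1, n)    # almost a copy of x2
X = np.column_stack([x1, x2, x3])

# Gaussianity check: LiNGAM needs non-Gaussian errors, so rejecting
# normality for observed variables is reassuring rather than alarming
for j in range(X.shape[1]):
    stat, p = stats.normaltest(X[:, j])
    print(f"x{j + 1}: p = {p:.3g}")

# Multicollinearity screen: flag pairs with very high correlation
corr = np.corrcoef(X, rowvar=False)
high = [(i, j) for i in range(3) for j in range(i + 1, 3)
        if abs(corr[i, j]) > 0.95]
print("highly correlated pairs:", high)
```

Pairs flagged here would deserve a closer look (merge, drop, or explain one of the variables) before running a causal discovery method.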
  4. After estimating causal graphs

    • Assess the assumptions by:
      – Testing independence of error variables, for example with HSIC (Gretton et al., 2005)
      – Prediction accuracy using the Markov boundary (Biza et al., 2020)
      – Comparing with results on other datasets whose causal graphs are expected to be similar
      – Checking against background knowledge
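A biased-estimator HSIC statistic is short enough to sketch directly; this is a simplified version of the kernel independence measure of Gretton et al., without the significance threshold of the full test:

```python
import numpy as np

def rbf_gram(z, sigma=1.0):
    """Gaussian-kernel Gram matrix of a 1-D sample."""
    d2 = (z[:, None] - z[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(a, b):
    """Biased HSIC statistic trace(K H L H) / n^2 (larger = more dependent)."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    K, L = rbf_gram(a), rbf_gram(b)
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(3)
n = 200
e1 = rng.uniform(-1, 1, n)
e2 = rng.uniform(-1, 1, n)

# Residuals of a correctly specified model should look like independent
# errors (small HSIC); a dependent pair scores clearly higher
print(hsic(e1, e2))                   # independent pair
print(hsic(e2, e2 ** 2 + 0.1 * e1))   # dependent pair
```

In practice one would apply such a test to the estimated error variables (residuals) of the fitted graph; nonzero dependence suggests a violated assumption, such as a hidden common cause.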
  5. DirectLiNGAM algorithm (Shimizu et al., 2011)

    • Repeat linear regression and independence evaluation
      – https://lingam.readthedocs.io/en/latest/tutorial/lingam.html
    • p > n cases (Wang & Drton, 2020)
      – https://github.com/ysamwang/highDNG
    • Example (causal order x3 → x1 → x2):

      [x3]   [ 0     0    0] [x3]   [e3]
      [x1] = [ 1.5   0    0] [x1] + [e1]
      [x2]   [ 0    −1.3  0] [x2]   [e2]

      Regressing x1 and x2 on the exogenous x3 leaves residuals r1(3), r2(3) that follow a smaller LiNGAM:

      [r1(3)]   [ 0    0] [r1(3)]   [e1]
      [r2(3)] = [−1.3  0] [r2(3)] + [e2]
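The regress-and-evaluate loop can be sketched with a simple nonlinear-correlation proxy for independence. Everything below is a toy illustration: the actual DirectLiNGAM implementation uses a more careful entropy- or kernel-based independence measure.

```python
import numpy as np

def dependence(a, b):
    """Toy dependence proxy: higher-order cross-correlations that are
    (approximately) zero when a and b are independent."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return abs(np.mean(a * b ** 3)) + abs(np.mean(a ** 3 * b))

def direct_lingam_order(X):
    """Repeatedly pick the variable whose regression residuals look most
    independent of it (the DirectLiNGAM idea), then regress it out."""
    X = X.copy()
    remaining = list(range(X.shape[1]))
    order = []
    while remaining:
        scores = []
        for j in remaining:
            total = 0.0
            for i in remaining:
                if i == j:
                    continue
                # residual of x_i after regressing on candidate root x_j
                b = np.cov(X[:, i], X[:, j])[0, 1] / np.var(X[:, j])
                total += dependence(X[:, j], X[:, i] - b * X[:, j])
            scores.append(total)
        root = remaining[int(np.argmin(scores))]
        order.append(root)
        for i in remaining:             # regress the chosen root out
            if i != root:
                b = np.cov(X[:, i], X[:, root])[0, 1] / np.var(X[:, root])
                X[:, i] = X[:, i] - b * X[:, root]
        remaining.remove(root)
    return order

rng = np.random.default_rng(4)
n = 5000
x3 = rng.uniform(-1, 1, n)
x1 = 1.5 * x3 + rng.uniform(-1, 1, n)
x2 = -1.3 * x1 + rng.uniform(-1, 1, n)
X = np.column_stack([x3, x1, x2])       # columns 0, 1, 2 = x3, x1, x2

order = direct_lingam_order(X)
print(order)
```

On strongly non-Gaussian data like this, the loop typically puts column 0 (the exogenous x3) first; the key point is that each iteration needs only regressions and an independence score, never a search over graphs.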
  6. Prior knowledge
     https://lingam.readthedocs.io/en/latest/tutorial/pk_direct.html

    • Prior knowledge about topological orders: k(3) < k(1) < k(2)
    • Use prior knowledge in estimating topological causal orders and in pruning redundant edges
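The package encodes prior knowledge as a matrix: roughly, entry (i, j) is 1 if an effect of x_j on x_i is required, 0 if it is forbidden, and −1 if unknown (the exact convention is documented in the pk_direct tutorial; the sketch below is my reading of it). A matrix consistent with k(3) < k(1) < k(2), with columns ordered [x1, x2, x3]:

```python
import numpy as np

# Variables in column order [x1, x2, x3] (indices 0, 1, 2)
NO, YES, UNKNOWN = 0, 1, -1

pk = np.full((3, 3), UNKNOWN)
np.fill_diagonal(pk, NO)        # no self-loops

# Causal order k(3) < k(1) < k(2): a later variable cannot cause an
# earlier one, so forbid the corresponding entries
pk[2, 0] = NO   # x1 cannot cause x3
pk[2, 1] = NO   # x2 cannot cause x3
pk[0, 1] = NO   # x2 cannot cause x1

print(pk)
```

A matrix like this would then be passed to the estimator (e.g., as a `prior_knowledge` argument to DirectLiNGAM), constraining both the estimated order and the surviving edges.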
  7. Multiple datasets

    • Simultaneously analyze different datasets to exploit their similarity (Ramsey et al., 2011; Shimizu, 2012)
      – Similarity: causal orders are the same; distributions and coefficients may differ
      – https://lingam.readthedocs.io/en/latest/tutorial/multiple_dataset.html
    [Figure: Datasets 1 and 2 share the causal graph over x1, x2, x3 but with different coefficients]
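The "same order, different coefficients" notion can be illustrated by fitting the same regression in two simulated datasets (a toy sketch, not the MultiGroupDirectLiNGAM estimator itself; the coefficients 4.0 and −0.5 are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate(b, n=4000):
    """x3 -> x1 with dataset-specific coefficient b, uniform errors."""
    x3 = rng.uniform(-1, 1, n)
    x1 = b * x3 + rng.uniform(-1, 1, n)
    return x3, x1

def fit_coef(x, y):
    """OLS slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x)

# Same causal order (x3 before x1) in both datasets, different strengths
b1 = fit_coef(*simulate(4.0))
b2 = fit_coef(*simulate(-0.5))
print(round(b1, 1), round(b2, 1))
```

A multi-dataset method estimates a single shared causal order while leaving coefficients (and error distributions) free to differ per dataset, which is exactly what this sketch recovers.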
  8. Multiple datasets: Longitudinal data

    • Longitudinal data consist of multiple samples collected over a period of time (Kadowaki et al., 2013)
    • https://lingam.readthedocs.io/en/latest/tutorial/longitudinal.html
  9. Analysis of predictive mechanisms

    • Combine the causal model and the predictive model to model the prediction mechanism
      – Causal model: e.g., x4 = f4(y, e4)
      – Predictive model: ŷ = f(x1, x2, x3, x4)
      – Prediction mechanism model: E[ŷ | do(xi = c)]
    • https://lingam.readthedocs.io/en/latest/tutorial/causal_effect.html#identification-of-feature-with-greatest-causal-influence-on-prediction
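The quantity E[ŷ | do(x_i = c)] can be sketched for a linear case by simulating the intervention: fix x_i = c, regenerate its descendants from the causal model, and average the predictive model's output. All structures and coefficients below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20000

def simulate(do_x1=None):
    """Hypothetical causal model: x1 -> x2; do(x1 = c) fixes x1 and
    regenerates the descendant x2 from the structural equation."""
    x1 = rng.uniform(-1, 1, n) if do_x1 is None else np.full(n, do_x1)
    x2 = 2.0 * x1 + rng.uniform(-1, 1, n)
    return x1, x2

def predict(x1, x2):
    """Hypothetical fitted predictive model y_hat = f(x1, x2)."""
    return 0.5 * x1 + 1.0 * x2

# E[y_hat | do(x1 = c)] = 0.5*c + 1.0*(2.0*c) = 2.5*c analytically
c = 1.0
effect = predict(*simulate(do_x1=c)).mean()
print(round(effect, 1))
```

Comparing such intervention effects across features identifies the feature with the greatest causal influence on the prediction, which is what the linked tutorial computes.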
  10. Illustrative example

    • Auto-MPG (miles per gallon) dataset
    • Linear regression
    • Which variable has the greatest intervention effect on the MPG prediction?
    • Which variable should be intervened on to obtain a given MPG prediction? (Control)
    [Figure: causal graph over Cylinders, Displacement, Weight, Horsepower, Acceleration, MPG; desired MPG predictions (15, 21, 30) paired with suggested interventions on Cylinders (8, 6, 4)]
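The control question can be sketched in the linear case by inverting the intervention-to-prediction map; the package's CausalEffect class provides this kind of functionality from a fitted model. The models and numbers below are made up to echo the Auto-MPG illustration:

```python
# Hypothetical linear models (not fitted to the real Auto-MPG data):
#   causal:     weight = 300 * cylinders + e
#   predictive: mpg_hat = 40 - 0.005 * weight - 1.0 * cylinders
# Then E[mpg_hat | do(cylinders = c)] = 40 - 0.005*300*c - 1.0*c = 40 - 2.5*c
def expected_mpg(c):
    return 40 - 2.5 * c

desired = 30.0
c = (40 - desired) / 2.5      # invert the linear map
print(c, expected_mpg(c))     # 4.0 30.0
```

So a desired prediction of 30 MPG maps back to an intervention of 4 cylinders in this toy model, mirroring the "desired MPG → suggested cylinders" pairs on the slide.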
  11. Time series model

    • Subsampled data:
      – SVAR: Structural Vector Autoregressive model (Swanson & Granger, 1997)
      – Identifiability using non-Gaussianity (Hyvarinen et al., 2010)
        https://lingam.readthedocs.io/en/latest/tutorial/var.html
      – VARMA instead of VAR (Kawahara et al., 2011)
        https://lingam.readthedocs.io/en/latest/tutorial/varma.html
    • Nonstationarity
      – Assumption: differences are stationary (Moneta et al., 2013)
    • Model: x(t) = Σ_{τ=0}^{k} B_τ x(t − τ) + e(t)
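The SVAR model can be simulated and its reduced form estimated by ordinary least squares; a minimal lag-1 sketch with hypothetical matrices. Note OLS recovers only the reduced-form matrix — separating it into B0 and B1 is what the non-Gaussian identification (VARLiNGAM) adds:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 20000

# Hypothetical structural matrices: contemporaneous B0 and lagged B1
B0 = np.array([[0.0, 0.0],
               [0.4, 0.0]])
B1 = np.array([[0.3, 0.0],
               [0.0, 0.2]])

# x(t) = B0 x(t) + B1 x(t-1) + e(t)  =>  x(t) = (I-B0)^{-1}(B1 x(t-1) + e(t))
inv = np.linalg.inv(np.eye(2) - B0)
X = np.zeros((T, 2))
for t in range(1, T):
    e = rng.uniform(-1, 1, 2)
    X[t] = inv @ (B1 @ X[t - 1] + e)

# OLS on lagged values recovers the reduced-form matrix M = (I-B0)^{-1} B1
M_hat = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T
print(np.round(M_hat, 2))
```

The OLS residuals here are mixtures (I − B0)⁻¹ e(t) of the structural errors; applying LiNGAM to those residuals is how the contemporaneous matrix B0 is then identified.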
  12. Hidden common cause (1)

    • Assumption: only exogenous variables may have hidden common causes
    • https://lingam.readthedocs.io/en/latest/tutorial/bottom_up_parce.html
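Why hidden common causes matter can be shown in a few lines: a latent f driving both x and y produces a clearly nonzero regression coefficient even though neither variable causes the other (a toy sketch with hypothetical variances):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 10000

f = rng.uniform(-1, 1, n)               # hidden common cause
x = f + 0.5 * rng.uniform(-1, 1, n)
y = f + 0.5 * rng.uniform(-1, 1, n)     # no causal effect of x on y

# Naive regression of y on x attributes the confounder's effect to x
coef = np.cov(x, y)[0, 1] / np.var(x)
print(round(coef, 2))
```

Methods like BottomUpParceLiNGAM and RCD are designed to either tolerate such confounding (under the stated assumption) or flag the affected pairs instead of outputting a spurious edge.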
  13. Hidden common cause (2): RCD

    • For unconfounded pairs with no hidden common causes, estimate the causal directions
    • For confounded pairs with hidden common causes, leave the directions unknown
    [Figure: underlying model with latent confounders f1, f2 over x1–x4, and the corresponding RCD output]
    • https://lingam.readthedocs.io/en/latest/tutorial/rcd.html
  14. Time series model with hidden common causes

    • SVAR with hidden common causes
      – Malinsky and Spirtes (2018)
      – Gerhardus and Runge (2020)
      – Nonparametric, based on conditional independence
      – Python: https://github.com/jakobrunge/tigramite
  15. Methods based on conditional independencies

    • GUI: Tetrad
      – https://github.com/cmu-phil/tetrad
    • Python: causal-learn (including LiNGAM variants)
      – https://github.com/cmu-phil/causal-learn
    • R: pcalg
      – https://cran.r-project.org/web/packages/pcalg/index.html
  16. Future plans

    • A nonlinear version of RCD: CAM-UV
    • Latent factors
    • Mixed data with continuous and discrete variables
    • An overcomplete-ICA-based method for hidden common cause cases, under development
  17. LiNGAM for latent factors (Shimizu et al., 2009)

    • Model: f = Bf + ε,  x = Gf + e
      – Two pure measurement variables per latent factor are needed to identify the measurement model (Silva et al., 2006; Xie et al., 2020)
    • Estimate the latent factors and then their causal graph
    [Figure: latent factors f1 → f2 with measurement variables x1, x2 and x3, x4]
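The measurement-model structure f = Bf + ε, x = Gf + e can be simulated directly; with two pure indicators per factor, indicators of the same factor correlate much more strongly than indicators of different factors, which is the signal used to identify the measurement model (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 20000

# Two latent factors with f1 -> f2:  f = Bf + eps
B = np.array([[0.0, 0.0],
              [0.8, 0.0]])
eps = rng.uniform(-1, 1, (n, 2))
F = eps @ np.linalg.inv(np.eye(2) - B).T

# Two pure measurement variables per factor: x1, x2 load only on f1;
# x3, x4 load only on f2:  x = Gf + e
G = np.array([[1.0, 0.0],
              [0.7, 0.0],
              [0.0, 1.0],
              [0.0, 0.6]])
X = F @ G.T + 0.3 * rng.uniform(-1, 1, (n, 4))

# Same-factor indicator pairs correlate more strongly than cross-factor pairs
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

Once the factors are estimated from this block structure, a LiNGAM-style analysis on the factor scores can then estimate the causal graph among the latent factors themselves.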
  18. Find common and unique factors across multiple datasets (Zeng et al., 2021)

    • Model: f(m) = B(m) f(m) + ε(m),  x(m) = G(m) f(m) + e(m),  m = 1, …, M
    • Score function: likelihood + DAGness (Zheng et al., 2018)
    • Feature extraction across multiple datasets + causal discovery of latent factors