model_pipeline_final.pdf
Maxwell
September 18, 2018
Model pipeline and other material from the Home Credit Default Risk competition.
Thanks to my teammates.
Transcript
Slide 1: ikiri_DS Model Pipeline, Home Credit Default Risk (diagram; only the text labels survive in the transcript, listed here by type)

Feature sets (numbered 1 to 13 in the diagram; the transcript does not preserve which of the numbers 1 to 7 belongs to which of the first seven labels):
  600+1 features (LB804); 1000+1 features (LB803); meta app; meta bur; Kernel GP; Nejumi features; Tereka features
  8: Branden features
  9: takuoko features
  10: Angus features
  11: Tam features
  12: RK features
  13: Ireko DAE
  also shown: takuoko nejumi feature; Ireko8; Nejumi prediction

Names on the diagram: ONODERA, Maxwell, Nejumi, Tereka, RK, tosh, Branden, takuoko, Angus, Tam, Ireko, Giba

Model nodes (model type, then the numbers printed beside it, which appear to refer to the feature sets above):
  + LGBM 5 3; + CatBoost 5 2 1; + LGBM * 4 3 1 (* model drawn in next page); + CNN 7; + ExtTree 4 3 1; + NN 1 3
  + LGBM 8; + LGBM 9; + LGBM 10; + LGBM 11; + LGBM 12; + LGBM 5; + RGF 1; + RNN 7 1
  + NN 1 13; + NN 1
  + LGBM 3 1 5 or 3 2 5; + LGBM 8 1 12 or 8 2 12; + LGBM 6 1; + LGBM 1 6' or 6' 1; + LGBM 6' 2
  several nodes take "1 or 2" (or "1 or 2 or 5") as alternative inputs and feed + LGBM or + CatBoost; three nodes are marked "partial"

Residual models:
  Residual 1 (corrected with residual regression); Residual 2; Residual 3
  + Res1 + LGBM 1 6; + Res2 + LGBM 1 6; + hidden + Res3 + LGBM 1 6
  * using hidden layer as additional features to correct residuals.

Blending (CV scores as shown):
  Blending: CV 0.8094
  Adversarial Stochastic Blending: CV 0.8096
  Adversarial Stochastic Blending: CV 0.81050
  Adversarial Stochastic Blending: CV 0.8061 (29.Aug.2018)

Leaderboard results shown next to blends (which blend each belongs to is not recoverable):
  Public 0.8085 (17th), Private 0.8017 (18th)
  Public 0.8093 (10th), Private 0.8016 (18th)
  Public 0.8080 (23rd), Private 0.8028 (14th)
  Public 0.8110 (3rd), Private 0.8042 (5th)
  After Giba Post Processing: Public 0.81241 (2nd), Private 0.80561 (2nd)
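The slide labels "Residual 1 (corrected with residual regression)" and notes that a hidden layer is used as additional features to correct residuals, but it does not say how. The following is a minimal sketch of the general idea only, assuming a single extra feature and an ordinary least-squares fit; the function names `fit_line`, `residual_correct` and `mse` and the toy numbers are illustrative and do not come from the deck, which uses LGBM on many features.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b * x with one feature."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return my, 0.0  # constant feature: fall back to the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def residual_correct(base_pred, target, extra_feature):
    """Fit a regressor to the residuals of base_pred and add its output back.

    This fits and predicts on the same rows, which is only acceptable for
    an illustration; a real pipeline would fit on out-of-fold residuals.
    """
    residuals = [t - p for t, p in zip(target, base_pred)]
    a, b = fit_line(extra_feature, residuals)
    corrected = [p + a + b * f for p, f in zip(base_pred, extra_feature)]
    # Clip so the result can still be read as a probability.
    return [min(1.0, max(0.0, c)) for c in corrected]


def mse(pred, target):
    return mean((p - t) ** 2 for p, t in zip(pred, target))


# Toy data (hypothetical): binary target, a base prediction, one extra feature.
target = [0, 0, 1, 1, 0, 1]
base_pred = [0.2, 0.3, 0.6, 0.5, 0.4, 0.7]
extra = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7]

corrected = residual_correct(base_pred, target, extra)
print(mse(base_pred, target), mse(corrected, target))
```

On the training rows the corrected error can never exceed the base error, because a zero correction is one of the candidate fits; whether it helps on unseen rows has to be checked with cross-validation.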
Slide 2: ikiri_DS Model Pipeline, Home Credit Default Risk (diagram with the same layout of labels, different scores, and an added stacking branch)

Feature sets (numbered 1 to 13 in the diagram; the transcript does not preserve which of the numbers 1 to 7 belongs to which of the first seven labels):
  600+1 features (LB804); 1000+1 features (LB803); meta app; meta bur; Kernel GP; Nejumi features; Tereka features
  8: Branden features
  9: takuoko features
  10: Angus features
  11: Tam features
  12: RK features
  13: Ireko DAE
  also shown: takuoko nejumi feature; Ireko8; Nejumi prediction

Names on the diagram: ONODERA, Maxwell, Nejumi, Tereka, RK, tosh, Branden, takuoko, Angus, Tam, Ireko, Giba

Model nodes (model type, then the numbers printed beside it, which appear to refer to the feature sets above):
  + LGBM * 4 3 1 (* model drawn in next page); + LGBM 4 3 2; + XGB 4 3 1; + ExtTree 4 3 1; + CNN 7; + RNN 7 1
  + NN 1 3; + NN 1; + NN 1 13; + RGF 1
  + LGBM 5; + LGBM 8; + LGBM 9; + LGBM 10; + LGBM 11; + LGBM 12
  + LGBM 8 1 12 or 8 2 12; + LGBM 3 1 or 2 3; + LGBM 6 1; + LGBM 1 6' or 6' 1; + LGBM 6' 2
  several nodes take "1 or 2" as alternative inputs; two nodes are marked "partial"

Residual models:
  Residual 1 (corrected with residual regression); Residual 2; Residual 3
  + Res1 + LGBM 1 6; + Res2 + LGBM 1 6; + hidden + Res3 + LGBM 1 6; + hidden + Res4 + LGBM 1 6
  * using hidden layer as additional features to correct residuals.

Stacking:
  Stacking with LGBM: CV 0.8080, Public 0.8070 / Private 0.8015 (labeled "Stacking prediction")

Blending (CV scores as shown):
  Blending: CV 0.8085
  Adversarial Stochastic Blending: CV 0.8085
  Adversarial Stochastic Blending: CV 0.8097
  Adversarial Stochastic Blending: CV 0.8061 (29.Aug.2018)

Leaderboard results shown next to blends (which blend each belongs to is not recoverable):
  Public 0.8071 (26th), Private 0.8009 (37th)
  Public 0.8082 (23rd), Private 0.8022 (18th)
  Public 0.8080 (23rd), Private 0.8028 (14th)
  Public 0.8099 (7th), Private 0.8040 (6th)
  Final step shown: Giba Post Processing
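The diagrams combine many model outputs by blending and score each blend by AUC. The deck does not define its "adversarial stochastic blending", so the sketch below shows only a plain weighted average of two prediction vectors and a pairwise AUC, as a stand-in for the general idea. The names `auc` and `blend` and the toy predictions are assumptions made for illustration.

```python
def auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Quadratic in the number of rows, so this is
    meant for small examples only.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def blend(pred_lists, weights):
    """Weighted average of several prediction vectors of equal length."""
    total = sum(weights)
    n_rows = len(pred_lists[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, pred_lists)) / total
        for i in range(n_rows)
    ]


# Toy data (hypothetical): two models that make different mistakes.
y = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.6, 0.7, 0.4, 0.2, 0.9]
model_b = [0.3, 0.2, 0.5, 0.6, 0.7, 0.8]

blended = blend([model_a, model_b], weights=[1.0, 1.0])
print(auc(y, model_a), auc(y, model_b), auc(y, blended))
```

In this toy case each model misorders a different pair of rows, so the equal-weight blend ranks every positive above every negative. Real blends rarely gain this much, and the weights would be chosen on out-of-fold predictions rather than by hand.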
Slide 3: Maxwell, stacking-like LightGBM (diagram; only the text labels survive in the transcript, listed here by type)

Source tables named: application; bureau, bureau balance; previous, inst, POS_CASH, credit

Meta features (XGBoost models whose outputs are used as features):
  XGBoost app meta feature; XGBoost bureau meta feature; XGBoost prev meta feature
  Labels shown beside them: 229 features; 300 features; all data
  AUCs shown beside them (the pairing with the three models is not preserved):
    0.683 (SEED71), 0.683 (SEEDs avg)
    0.772 (SEED71), 0.773 (SEEDs avg)
    0.710 (SEED71), 0.712 (SEEDs avg)

Base features:
  ONODERA BASIC FEATURES: 600 features
  NEJUMI FEATURES (interest rate): 1 feature

Stacking-like LightGBM settings:
  5 stratified folds (shuffle = True)
  5 / 8 SEEDs, rank averaged
  SEED 71 for model fit
  SEEDs 710, 711, 712, 713, 714 (715, 716, 717) for OOF prediction
  Hyperparameters tuned for 603 features (reflected on meta features)

Results:
  603 (604) features, selected based on ONODERA criteria: Local CV 0.80641; Public LB / Private LB 0.80569 / 0.79853 (100th / 105th)
  952 features, w/o feature selection: Local CV 0.80646; LB 0.804 (~0.805)
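The slide mentions 5 stratified folds with shuffling and rank averaging across seeds. As a rough illustration of those two steps, the sketch below assigns rows to class-balanced folds for a given seed and averages predictions from several seeds after converting them to ranks. It uses only the standard library; the function names and the toy predictions are assumptions, and the actual deck uses LightGBM models between these two steps.

```python
import random


def stratified_folds(labels, n_splits=5, seed=710):
    """Return a fold index per row, keeping the class ratio in every fold."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)  # corresponds to shuffle = True
        for pos, i in enumerate(idx):
            fold_of[i] = pos % n_splits  # deal rows out round-robin
    return fold_of


def to_ranks(scores):
    """Average ranks scaled to [0, 1]; tied scores share the same rank."""
    n = len(scores)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0  # mean zero-based position of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return [r / (n - 1) for r in ranks]


def rank_average(pred_lists):
    """Average several prediction vectors after converting each to ranks."""
    ranked = [to_ranks(p) for p in pred_lists]
    return [sum(col) / len(ranked) for col in zip(*ranked)]


# Toy predictions (hypothetical) from two seeds for the same four rows.
seed_710 = [0.10, 0.40, 0.35, 0.80]
seed_711 = [0.20, 0.30, 0.50, 0.90]
print(rank_average([seed_710, seed_711]))

# Fold assignment for 10 positives and 40 negatives.
folds = stratified_folds([1] * 10 + [0] * 40, n_splits=5, seed=710)
print(sorted(set(folds)))
```

Rank averaging discards the scale of each model's scores and keeps only their ordering, which is all that AUC depends on.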