Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Christophe Bourguignat
April 15, 2016
Technology
0
450
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
350
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
130
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.1k
Building a Data Science Team
kriss
2
410
Lean Machine Learning
kriss
5
760
Kaggle Criteo Challenge and Online Learning
kriss
1
270
The #FrenchData landscape
kriss
0
480
Other Decks in Technology
See All in Technology
Ninno LT
kawaguti
PRO
1
110
Oracle Cloud Infrastructure:2025年4月度サービス・アップデート
oracle4engineer
PRO
0
380
Part2 GitHub Copilotってなんだろう
tomokusaba
1
550
雑に疎通確認だけしたい...せや!CloudShell使ったろ!
alchemy1115
0
130
Previewでもここまで追える! Azure AI Foundryで始めるLLMトレース
tomodo_ysys
2
430
AI-in-the-Enterprise|OpenAIが公開した「AI導入7つの教訓」——ChatGPTで変わる企業の未来とは?
customercloud
PRO
0
150
Terraform にコントリビュートしていたら Azure のコストをやらかした話 / How I Messed Up Azure Costs While Contributing to Terraform
nnstt1
1
260
MySQL InnoDB Data Recovery - The Last Resort
lefred
0
110
正式リリースされた Semantic Kernel の Agent Framework 全部紹介!
okazuki
1
790
本当に必要なのは「QAという技術」だった!試行錯誤から生まれた、品質とデリバリーの両取りアプローチ / Turns Out, "QA as a Discipline" Was the Key!
ar_tama
9
2.9k
2025-04-14 Data & Analytics 井戸端会議 Multi tenant log platform with Iceberg
kamijin_fanta
1
180
3D生成AIのための画像生成
kosukeito
2
600
Featured
See All Featured
Fantastic passwords and where to find them - at NoRuKo
philnash
51
3.2k
Speed Design
sergeychernyshev
29
930
The World Runs on Bad Software
bkeepers
PRO
68
11k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Designing for Performance
lara
608
69k
Visualization
eitanlees
146
16k
Product Roadmaps are Hard
iamctodd
PRO
53
11k
How STYLIGHT went responsive
nonsquared
100
5.5k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
160
15k
Rebuilding a faster, lazier Slack
samanthasiow
81
9k
Building Better People: How to give real-time feedback that sticks.
wjessup
368
19k
Done Done
chrislema
184
16k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ