"Haute Couture" and "Prêt-à-Porter" Data Science
Christophe Bourguignat
April 15, 2016
Talk given at Telecom ParisTech in April 2016
Transcript
Christophe Bourguignat zelros.com / [email protected] / @zelrosHQ
Agenda: model interpretation, models in production, a short history of Kaggle
MODELS INTERPRETATION
WHY? Model opacity is a major cause of rejection by users. Unfortunately, the most powerful predictive models are usually the least interpretable.
FEATURE IMPORTANCE
AEROSOLVE (Airbnb). Prior = general belief, held before looking at the data. Inform the model of our prior beliefs by adding them to a text configuration file during training.
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
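The idea behind the treeinterpreter code linked above is that a tree prediction can be written as a bias term (the mean at the root) plus one contribution per feature met along the decision path. The following is only a minimal sketch of that decomposition on a hand-built toy tree; the `Node` class, the tree and the feature names are hypothetical and are not the library's API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    value: float                    # mean target of the training samples in this node
    feature: Optional[str] = None   # split feature (None for a leaf)
    threshold: float = 0.0
    left: Optional["Node"] = None   # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None  # branch taken otherwise


def decompose(root: Node, x: Dict[str, float]) -> Tuple[float, float, Dict[str, float]]:
    """Return (prediction, bias, per-feature contributions) for one sample."""
    bias = root.value
    contributions: Dict[str, float] = {}
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        # The change in node mean is credited to the feature used at this split.
        delta = child.value - node.value
        contributions[node.feature] = contributions.get(node.feature, 0.0) + delta
        node = child
    return node.value, bias, contributions


# Hypothetical toy tree; all values are invented for illustration.
toy_tree = Node(
    value=22.0, feature="rooms", threshold=6.5,
    left=Node(value=18.0, feature="crime", threshold=5.0,
              left=Node(value=20.0), right=Node(value=14.0)),
    right=Node(value=30.0),
)

prediction, bias, contributions = decompose(toy_tree, {"rooms": 6.0, "crime": 7.5})
print(prediction, bias, contributions)
```

By construction, the bias plus the sum of the contributions equals the prediction, which is what makes the breakdown readable for a single prediction. For a forest, the same quantities would be averaged over the trees.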
EXAMPLE ON BOSTON DATASET
Prediction Intervals for Random Forests: http://blog.datadive.net/prediction-intervals-for-random-forests/
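One common way to get such an interval is to look at the spread of the individual trees' predictions for a sample and take percentiles of that distribution. The sketch below assumes the per-tree predictions are already available as a plain list; the numbers are invented, and the `percentile` and `prediction_interval` helpers are hypothetical names, not part of any library.

```python
import statistics


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def prediction_interval(tree_predictions, coverage=90):
    """Central interval covering `coverage` percent of the per-tree predictions."""
    lower_q = (100 - coverage) / 2
    upper_q = 100 - lower_q
    return percentile(tree_predictions, lower_q), percentile(tree_predictions, upper_q)


# Hypothetical outputs of 101 trees for a single sample (invented values 1..101).
tree_predictions = [float(v) for v in range(1, 102)]

point_estimate = statistics.mean(tree_predictions)  # what the forest would return
low, high = prediction_interval(tree_predictions, coverage=90)
print(point_estimate, low, high)
```

The interval is only as good as the diversity of the trees, so it is worth checking its empirical coverage on held-out data before relying on it.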
PRODUCTION
TRADITIONAL B.I. DEPARTMENT: data analysts, ETL engineers, DBAs
“INFINITE LOOP OF SADNESS”: data scientists, IT / data engineers, software engineers, business
http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT: underutilized features, undeclared consumers, pipeline jungles (preparing data in an ML-friendly format)
http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS: unseen categories; non-reproducible feature engineering workflow (PMML); leakage in database fields (churn); monitoring
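The "unseen category" failure can be illustrated with a small encoder that reserves a code for values it never saw at training time, and counts them so that monitoring can pick up drift. This is a sketch under assumptions: the `SafeCategoryEncoder` class and the sample values are hypothetical and do not come from the talk.

```python
class SafeCategoryEncoder:
    """Maps categories to integer codes; code 0 is reserved for unseen values."""

    UNKNOWN = 0

    def __init__(self):
        self.codes = {}
        self.unseen_count = 0  # simple counter that a monitoring job could export

    def fit(self, values):
        # Sorting makes the mapping reproducible from one training run to the next.
        self.codes = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
        return self

    def transform(self, values):
        encoded = []
        for v in values:
            code = self.codes.get(v, self.UNKNOWN)
            if code == self.UNKNOWN:
                self.unseen_count += 1
            encoded.append(code)
        return encoded


# Training data contains three categories; production data brings a new one.
encoder = SafeCategoryEncoder().fit(["gold", "silver", "bronze"])
encoded = encoder.transform(["silver", "platinum", "gold"])
print(encoded, encoder.unseen_count)
```

Mapping unknown values to a dedicated code keeps the scoring pipeline from crashing, while the counter gives an early signal that the training data no longer reflects production.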
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
Refinements:
- hashing function
- adaptive learning rate (different flavours)
- Vowpal Wabbit
- Dropout
- PyPy
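The first two refinements above can be sketched together as an online logistic regression: the hashing trick maps raw categorical values to weight indices without a dictionary, and a per-weight counter shrinks the step size for frequently seen features. This is one possible flavour, written under assumptions: the table size, the `alpha` value, the use of `zlib.crc32` as hash function and the toy click stream are all hypothetical choices, not taken from the slides.

```python
import math
import zlib

D = 2 ** 16  # number of hashed weight slots (hypothetical size)


def hash_features(row, dim=D):
    """Hashing trick: map each 'field=value' string to an index in [0, dim)."""
    return [zlib.crc32(f"{field}={value}".encode("utf-8")) % dim
            for field, value in sorted(row.items())]


class OnlineLogisticRegression:
    def __init__(self, dim=D, alpha=0.1):
        self.alpha = alpha
        self.w = [0.0] * dim  # weights
        self.n = [0.0] * dim  # number of updates received by each weight

    def predict(self, indices):
        z = sum(self.w[i] for i in indices)
        z = max(min(z, 35.0), -35.0)  # bound the logit to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, indices, y):
        p = self.predict(indices)
        for i in indices:
            # Adaptive rate: the step shrinks as index i is updated more often.
            self.w[i] -= (p - y) * self.alpha / (math.sqrt(self.n[i]) + 1.0)
            self.n[i] += 1.0
        return p


# Hypothetical click stream, processed one row at a time (single pass, no batch).
stream = [({"site": "a", "device": "mobile"}, 1),
          ({"site": "b", "device": "desktop"}, 0)] * 50

model = OnlineLogisticRegression()
for row, label in stream:
    model.update(hash_features(row), label)

p_click = model.predict(hash_features({"site": "a", "device": "mobile"}))
p_no_click = model.predict(hash_features({"site": "b", "device": "desktop"}))
print(p_click, p_no_click)
```

Because memory use is fixed by the table size and each row is seen once, this kind of loop runs in pure Python on data that does not fit in memory, which is where an interpreter such as PyPy can help with speed.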
QUESTIONS? zelros.com / [email protected] / @zelrosHQ