Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Christophe Bourguignat
April 15, 2016
Technology
0
410
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
330
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
130
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.1k
Building a Data Science Team
kriss
2
400
Lean Machine Learning
kriss
5
730
Kaggle Criteo Challenge and Online Learning
kriss
1
260
The #FrenchData landscape
kriss
0
460
Other Decks in Technology
See All in Technology
ついに出た!OpenAIの最新モデル「o1」って何がすごいの?
minorun365
PRO
3
1.3k
OSTという文化を組織に根付かせてみた
sansantech
PRO
2
460
効果的なオンコール対応と障害対応
ryuichi1208
6
3.1k
2ヶ月かかるDBアップグレード検証を最大2週間に短縮した自作Go製CLIツール「Platinum」を紹介する / Introducing Go CLI tool "Platinum" for shortened DB upgrade validation
vtryo
2
130
AIで変わるテスト自動化:最新ツールの多様なアプローチ/ 20240910 Takahiro Kaneyama
shift_evolve
0
260
エムスリーエビデンス創出プロダクトチーム紹介資料 / Introduction of M3 Create Evidence Team
m3_engineering
0
520
OCI で始める!! Red Hat OpenShift / Get Started OpenShift on OCI
oracle4engineer
PRO
1
210
内製化を目指す事業会社が、システム開発会社と共に進める「開発生産性改善」の取り組み事例 #devsumi
yuwji
1
190
やってやろうじゃないかメカアジャイル! / Let's do it, mechanical agile!
psj59129
1
770
JTCや セキュリティチェックリストが夢の跡
nikinusu
1
830
株式会社EventHub・エンジニア採用資料
eventhub
0
3k
モノづくり品質珍道中
kzkhrk
1
100
Featured
See All Featured
How to Think Like a Performance Engineer
csswizardry
16
960
Ruby is Unlike a Banana
tanoku
96
11k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
363
22k
VelocityConf: Rendering Performance Case Studies
addyosmani
322
23k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
36
2.1k
Navigating Team Friction
lara
183
13k
Automating Front-end Workflow
addyosmani
1365
200k
The Illustrated Children's Guide to Kubernetes
chrisshort
47
48k
No one is an island. Learnings from fostering a developers community.
thoeni
18
2.9k
How to Ace a Technical Interview
jacobian
274
23k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
248
20k
Designing Experiences People Love
moore
138
23k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ