Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Christophe Bourguignat
April 15, 2016
Technology
0
470
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
360
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
140
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.2k
Building a Data Science Team
kriss
2
410
Lean Machine Learning
kriss
5
780
Kaggle Criteo Challenge and Online Learning
kriss
1
290
The #FrenchData landscape
kriss
0
490
Other Decks in Technology
See All in Technology
Oracle Database@Google Cloud:サービス概要のご紹介
oracle4engineer
PRO
1
820
Digitization部 紹介資料
sansan33
PRO
1
6.4k
Cloud WAN MCP Serverから考える新しいネットワーク運用 / 20251228 Masaki Okuda
shift_evolve
PRO
0
130
コールドスタンバイ構成でCDは可能か
hiramax
0
130
Java 25に至る道
skrb
3
120
『君の名は』と聞く君の名は。 / Your name, you who asks for mine.
nttcom
1
140
Eight Engineering Unit 紹介資料
sansan33
PRO
0
6.1k
歴史から学ぶ、Goのメモリ管理基礎
logica0419
7
1.6k
AIエージェントを5分で一気におさらい!AIエージェント「構築」元年に備えよう
yakumo
1
130
あの夜、私たちは「人間」に戻った。 ── 災害ユートピア、贈与、そしてアジャイルの再構築 / 20260108 Hiromitsu Akiba
shift_evolve
PRO
0
330
Oracle Database@Azure:サービス概要のご紹介
oracle4engineer
PRO
3
260
フルカイテン株式会社 エンジニア向け採用資料
fullkaiten
0
10k
Featured
See All Featured
Building a A Zero-Code AI SEO Workflow
portentint
PRO
0
220
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
1
1.4k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.8k
Into the Great Unknown - MozCon
thekraken
40
2.2k
<Decoding/> the Language of Devs - We Love SEO 2024
nikkihalliwell
1
110
Typedesign – Prime Four
hannesfritz
42
2.9k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
196
71k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.3k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
48
9.8k
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
1
330
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
690
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ