"Haute Couture" and "Prêt-à-Porter" Data Science
Christophe Bourguignat
April 15, 2016
Technology
Talk given at Telecom ParisTech in April 2016
Transcript
Christophe Bourguignat
zelros.com / [email protected] / @zelrosHQ
AGENDA
- Model interpretation
- Models in production
- A short history of Kaggle
MODEL INTERPRETATION
WHY?
Model opacity is a major cause of rejection by users.
Unfortunately, the most powerful predictive models are usually the least interpretable.
FEATURE IMPORTANCE
AEROSOLVE (Airbnb)
Prior = general belief held before looking at the data.
Inform the model of our prior beliefs by adding them to a text configuration file during training.
Scikit-learn
March 2014
April 2015
https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
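The treeinterpreter module linked above decomposes a tree prediction into a bias term (the value at the root) plus one contribution per feature met along the decision path. The sketch below illustrates that idea in plain Python on a hand-built toy tree. It does not call treeinterpreter or scikit-learn, and the tree, feature names, thresholds, and values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy regression tree. Each node stores the mean target of the
# training samples that reached it ("value"); internal nodes also store a split.
TOY_TREE = {
    "value": 22.0,
    "feature": "rooms", "threshold": 6.5,
    "left": {
        "value": 19.0,
        "feature": "crime", "threshold": 5.0,
        "left": {"value": 21.0},
        "right": {"value": 14.0},
    },
    "right": {"value": 33.0},
}


def interpret(tree, sample):
    """Return (prediction, bias, contributions) for one sample.

    The bias is the root value; each split on the path credits the change
    in node value to the feature it tested.
    """
    bias = tree["value"]
    contributions = defaultdict(float)
    node = tree
    while "feature" in node:
        feature = node["feature"]
        branch = "left" if sample[feature] <= node["threshold"] else "right"
        child = node[branch]
        contributions[feature] += child["value"] - node["value"]
        node = child
    return node["value"], bias, dict(contributions)


prediction, bias, contributions = interpret(TOY_TREE, {"rooms": 5.8, "crime": 7.2})
print(prediction, bias, contributions)
```

By construction, the bias plus the sum of the contributions equals the prediction, which is what makes a single prediction explainable feature by feature.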
EXAMPLE ON THE BOSTON DATASET
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
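One common way to get such an interval, sketched here under stated assumptions, is to collect the prediction of every individual tree for a sample and take percentiles of that distribution. The per-tree values and the helper names below are hypothetical, and a real forest would supply the predictions.

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def prediction_interval(tree_predictions, coverage=90):
    """Interval covering the central `coverage` percent of per-tree predictions."""
    ordered = sorted(tree_predictions)
    tail = (100 - coverage) / 2.0
    return percentile(ordered, tail), percentile(ordered, 100 - tail)


# Hypothetical predictions made by the 11 trees of a forest for one sample.
per_tree = [18.0, 19.5, 20.0, 20.5, 21.0, 21.0, 21.5, 22.0, 23.0, 24.5, 30.0]
low, high = prediction_interval(per_tree, coverage=80)
print(low, high)
```

A wide interval signals that the trees disagree on the sample, which is useful to surface alongside the point prediction.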
PRODUCTION
TRADITIONAL B.I. DEPARTMENT
- Data analysts
- ETL engineers
- DBAs
“INFINITE LOOP OF SADNESS”
- Data scientists
- IT / data engineers
- Software engineers
- Business
http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE
http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT
- Underutilized features
- Undeclared consumers
- Pipeline jungles: preparing data in an ML-friendly format
http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS
- Unseen category
- Irreproducible feature engineering workflow (PMML)
- Leakage in database fields (churn)
- Monitoring
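The "unseen category" failure can be illustrated with a minimal sketch: an encoder fitted on training data meets a value in production that it never saw. One possible guard, assumed here and not taken from the talk, is to reserve an id for unknown values. The class name and the sample values are hypothetical.

```python
class SafeCategoryEncoder:
    """Map categories to integer ids, reserving id 0 for unseen values."""

    UNKNOWN_ID = 0

    def __init__(self):
        self.ids = {}

    def fit(self, values):
        # Assign ids 1, 2, 3, ... in order of first appearance.
        for value in values:
            if value not in self.ids:
                self.ids[value] = len(self.ids) + 1
        return self

    def transform(self, values):
        # Unknown categories fall back to the reserved id instead of raising.
        return [self.ids.get(value, self.UNKNOWN_ID) for value in values]


encoder = SafeCategoryEncoder().fit(["chrome", "firefox", "chrome", "safari"])
encoded = encoder.transform(["firefox", "edge", "chrome"])
print(encoded)
```

Counting how often the reserved id appears is also a cheap monitoring signal for drift between training and production data.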
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
Refinements:
- hashing function
- adaptive learning rate (different flavours)
- Vowpal Wabbit
- Dropout
- PyPy
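Two of these refinements, the hashing function and an adaptive learning rate, can be sketched together as an online logistic regression in plain Python (which is also why PyPy helps). This is one common flavour, not necessarily the one used in the talk. The bucket count, base rate, field names, and the toy click stream are hypothetical.

```python
import math
import zlib

NUM_BUCKETS = 2 ** 16  # hypothetical size of the hashed feature space
ALPHA = 0.1            # hypothetical base learning rate


def hashed_indices(row):
    """Hashing trick: map each "field=value" string to a bucket index."""
    return [zlib.crc32(f"{field}={value}".encode("utf-8")) % NUM_BUCKETS
            for field, value in sorted(row.items())]


def predict(indices, weights):
    """Logistic prediction; every hashed feature has an implicit value of 1."""
    score = sum(weights[i] for i in indices)
    score = max(min(score, 20.0), -20.0)  # bound the score to avoid overflow
    return 1.0 / (1.0 + math.exp(-score))


def update(indices, weights, counts, p, y):
    """One SGD step; the per-weight rate decays with the feature's past count."""
    for i in indices:
        weights[i] -= (p - y) * ALPHA / (math.sqrt(counts[i]) + 1.0)
        counts[i] += 1.0


weights = [0.0] * NUM_BUCKETS
counts = [0.0] * NUM_BUCKETS

# Hypothetical click log: (categorical features, clicked or not).
stream = [({"site": "news", "device": "mobile"}, 1),
          ({"site": "shop", "device": "desktop"}, 0)] * 50

for row, label in stream:
    idx = hashed_indices(row)
    update(idx, weights, counts, predict(idx, weights), label)

p_click = predict(hashed_indices({"site": "news", "device": "mobile"}), weights)
p_no_click = predict(hashed_indices({"site": "shop", "device": "desktop"}), weights)
print(p_click, p_no_click)
```

The model reads one row at a time and keeps only two fixed-size arrays, so memory use does not grow with the number of rows or distinct categories.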
QUESTIONS?
zelros.com / [email protected] / @zelrosHQ