"Haute Couture" and "Prêt-à-Porter" Data Science
Christophe Bourguignat
April 15, 2016
Talk given at Telecom ParisTech in April 2016
Transcript
Christophe Bourguignat, zelros.com / [email protected] / @zelrosHQ
Agenda
- Models interpretation
- Models production
- A short history of Kaggle
MODELS INTERPRETATION
WHY?
Model opacity is a major cause of rejection by users.
Unfortunately, the most powerful predictive models are usually the least interpretable.
FEATURE IMPORTANCE
AEROSOLVE (Airbnb)
Prior = general belief, held before looking at the data.
Inform the model of our prior beliefs by adding them to a text configuration file during training.
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
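The treeinterpreter project linked above decomposes a tree-based prediction into a bias term (the mean target at the root) plus one contribution per feature met along the decision path. The following is a minimal pure-Python sketch of that idea on a hand-built toy tree. The `Node` class, the `decompose` function, the feature names, and all numbers are hypothetical and are not treeinterpreter's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: float                      # mean target of the training samples reaching this node
    feature: Optional[str] = None     # feature tested at this node (None for a leaf)
    threshold: float = 0.0
    left: Optional["Node"] = None     # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None    # branch taken otherwise


def decompose(root, x):
    """Return (prediction, bias, contributions) for one sample x given as a dict."""
    bias = root.value
    contributions = {}
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        # The change in node mean caused by this split is credited to its feature.
        delta = child.value - node.value
        contributions[node.feature] = contributions.get(node.feature, 0.0) + delta
        node = child
    return node.value, bias, contributions


# Toy tree with made-up values (hypothetical feature names).
toy_tree = Node(
    value=20.0, feature="rooms", threshold=6.0,
    left=Node(value=15.0),
    right=Node(
        value=30.0, feature="crime", threshold=1.0,
        left=Node(value=35.0),
        right=Node(value=25.0),
    ),
)

prediction, bias, contributions = decompose(toy_tree, {"rooms": 7.0, "crime": 0.5})
print(prediction, bias, contributions)
```

By construction, the bias plus the sum of the contributions equals the prediction. For a forest, the same decomposition can be averaged over all trees.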
EXAMPLE ON BOSTON DATASET
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
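One way to obtain such an interval, sketched here under the assumption that the per-tree predictions of a forest are available for a given sample, is to take percentiles of those predictions. The helper names and the numbers below are hypothetical and only illustrate the arithmetic.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of an already sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def prediction_interval(tree_predictions, coverage=90.0):
    """Central interval that contains `coverage` percent of the per-tree predictions."""
    ordered = sorted(tree_predictions)
    tail = (100.0 - coverage) / 2.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)


# Made-up predictions from 11 hypothetical trees for a single sample.
per_tree = [18.0, 19.0, 20.0, 20.0, 21.0, 21.0, 22.0, 22.0, 23.0, 24.0, 30.0]
low, high = prediction_interval(per_tree, coverage=80.0)
print(low, high)
```

The interval widens when the trees disagree, which makes it a rough indicator of how much to trust a single prediction.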
PRODUCTION
TRADITIONAL B.I. DEPARTMENT: DATA ANALYSTS, ETL ENGINEER, DBAs
“INFINITE LOOP OF SADNESS”: DATA SCIENTISTS, IT / DATA ENGINEERS, SOFTWARE ENGINEERS, BUSINESS
http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT
- Underutilized features
- Undeclared consumers
- Pipeline jungles: preparing data in an ML-friendly format
http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS
- Unseen category
- Non-reproducible feature engineering workflow (PMML)
- Leakage in database fields (churn)
- Monitoring
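The "unseen category" failure can be illustrated with a small sketch: an encoder fitted on training data meets a value in production that it has never seen. The class below is a hypothetical illustration, not taken from the talk or from any library. It reserves one id for unknown values and counts them so that monitoring can pick them up.

```python
class SafeCategoryEncoder:
    """Maps categories to integer ids, reserving id 0 for values unseen at fit time."""

    UNKNOWN_ID = 0

    def __init__(self):
        self.ids = {}
        self.unseen_count = 0   # simple counter that a monitoring job could track

    def fit(self, values):
        for value in values:
            if value not in self.ids:
                self.ids[value] = len(self.ids) + 1   # ids start at 1; 0 is reserved
        return self

    def transform(self, values):
        encoded = []
        for value in values:
            if value in self.ids:
                encoded.append(self.ids[value])
            else:
                # Do not crash in production: fall back to the reserved id and count it.
                self.unseen_count += 1
                encoded.append(self.UNKNOWN_ID)
        return encoded


encoder = SafeCategoryEncoder().fit(["paris", "lyon", "paris"])
encoded = encoder.transform(["lyon", "marseille"])
print(encoded, encoder.unseen_count)
```

Whether a fallback id, a rejection, or an alert is the right response depends on the application; the point is only that the case is decided explicitly before deployment.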
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
Refinements:
- hashing function
- adaptive learning rate (different flavours)
- Vowpal Wabbit
- Dropout
- PyPy
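The slides list these refinements without code. As a rough illustration of the first two items, the sketch below combines a hashing function for categorical features with a per-weight adaptive learning rate. It assumes an online logistic regression, which the slides do not state; the class, function, and constant names are hypothetical, and the AdaGrad-style step is only one of the possible flavours.

```python
import math
import zlib

NUM_BUCKETS = 2 ** 20   # size of the hashed feature space (arbitrary choice)
ALPHA = 0.1             # base learning rate (arbitrary choice)


def hash_features(row):
    """Hashing trick: map each 'field=value' string to a bucket index."""
    # zlib.crc32 is used because Python's built-in hash() of str varies between runs.
    return [zlib.crc32(f"{field}={value}".encode("utf-8")) % NUM_BUCKETS
            for field, value in sorted(row.items())]


class OnlineLogisticRegression:
    def __init__(self):
        self.weights = {}   # sparse weights, keyed by bucket index
        self.grad_sq = {}   # per-weight running sum of squared gradients

    def predict(self, indices):
        score = sum(self.weights.get(i, 0.0) for i in indices)
        score = max(min(score, 35.0), -35.0)   # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, indices, label):
        # Gradient of the log loss with respect to the score; every feature value is 1.
        gradient = self.predict(indices) - label
        for i in indices:
            self.grad_sq[i] = self.grad_sq.get(i, 0.0) + gradient * gradient
            # Adaptive step: weights that have accumulated more gradient move less.
            step = ALPHA / (1.0 + math.sqrt(self.grad_sq[i]))
            self.weights[i] = self.weights.get(i, 0.0) - step * gradient


model = OnlineLogisticRegression()
clicked = hash_features({"site": "news", "device": "mobile"})
ignored = hash_features({"site": "shop", "device": "desktop"})
for _ in range(200):
    model.update(clicked, 1.0)
    model.update(ignored, 0.0)
print(model.predict(clicked), model.predict(ignored))
```

Because the model only stores weights for buckets it has actually seen, memory stays bounded regardless of how many distinct category values appear in the stream.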
QUESTIONS? zelros.com / [email protected] / @zelrosHQ