Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Christophe Bourguignat
April 15, 2016
Technology
0
480
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
360
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
140
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.2k
Building a Data Science Team
kriss
2
410
Lean Machine Learning
kriss
5
780
Kaggle Criteo Challenge and Online Learning
kriss
1
290
The #FrenchData landscape
kriss
0
490
Other Decks in Technology
See All in Technology
バイブコーディングで作ったものを紹介
tatsuya1970
0
120
『誰の責任?』で揉めるのをやめて、エラーバジェットで判断するようにした ~感情論をデータで終わらせる、PMとエンジニアの意思決定プロセス~
coconala_engineer
0
240
Cosmos World Foundation Model Platform for Physical AI
takmin
0
1k
[CV勉強会@関東 World Model 読み会] Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (Mousakhan+, NeurIPS 2025)
abemii
0
170
AWS DevOps Agent x ECS on Fargate検証 / AWS DevOps Agent x ECS on Fargate
kinunori
3
410
Context Engineeringが企業で不可欠になる理由
hirosatogamo
PRO
3
810
【Ubie】AIを活用した広告アセット「爆速」生成事例 | AI_Ops_Community_Vol.2
yoshiki_0316
1
140
なぜAIは チーム開発を 速くしないのか
tan_go238
6
2.8k
AIエージェントのメモリについて
shibuiwilliam
0
140
AgentCore RuntimeをVPCにデプロイして 開発ドキュメント作成AIエージェントを作った
alchemy1115
3
210
生成AIと余白 〜開発スピードが向上した今、何に向き合う?〜
kakehashi
PRO
1
250
Claude_CodeでSEOを最適化する_AI_Ops_Community_Vol.2__マーケティングx_AIはここまで進化した.pdf
riku_423
2
650
Featured
See All Featured
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.2k
Ruling the World: When Life Gets Gamed
codingconduct
0
150
HDC tutorial
michielstock
1
420
Being A Developer After 40
akosma
91
590k
The Pragmatic Product Professional
lauravandoore
37
7.2k
Information Architects: The Missing Link in Design Systems
soysaucechin
0
790
Test your architecture with Archunit
thirion
1
2.2k
What's in a price? How to price your products and services
michaelherold
247
13k
Ethics towards AI in product and experience design
skipperchong
2
210
So, you think you're a good person
axbom
PRO
2
1.9k
A Guide to Academic Writing Using Generative AI - A Workshop
ks91
PRO
0
210
Google's AI Overviews - The New Search
badams
0
920
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ