Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Christophe Bourguignat
April 15, 2016
Technology
0
480
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
370
Software Engineers, the New Data Scientists
kriss
1
150
Machine Learning for Chief Future Officers
kriss
1
140
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.2k
Building a Data Science Team
kriss
2
420
Lean Machine Learning
kriss
5
780
Kaggle Criteo Challenge and Online Learning
kriss
1
300
The #FrenchData landscape
kriss
0
500
Other Decks in Technology
See All in Technology
互換性のある(らしい)DBへの移行など考えるにあたってたいへんざっくり
sejima
PRO
0
290
開発チームとQAエンジニアの新しい協業モデル -年末調整開発チームで実践する【QAリード施策】-
qa
0
400
イベントで大活躍する電子ペーパー名札を作る(その2) 〜 M5PaperとM5PaperS3 〜 / IoTLT @ JLCPCB オープンハードカンファレンス
you
PRO
0
210
FastMCP OAuth Proxy with Cognito
hironobuiga
3
220
ハーネスエンジニアリング×AI適応開発
aictokamiya
1
550
Oracle AI Database@AWS:サービス概要のご紹介
oracle4engineer
PRO
3
2k
【社内勉強会】新年度からコーディングエージェントを使いこなす - 構造と制約で引き出すClaude Codeの実践知
nwiizo
28
14k
「AIエージェントで変わる開発プロセス―レビューボトルネックからの脱却」
lycorptech_jp
PRO
0
180
AI時代のオンプレ-クラウドキャリアチェンジ考
yuu0w0yuu
0
600
Zephyr(RTOS)でOpenPLCを実装してみた
iotengineer22
0
150
FASTでAIエージェントを作りまくろう!
yukiogawa
4
160
Oracle Cloud Infrastructure:2026年3月度サービス・アップデート
oracle4engineer
PRO
0
160
Featured
See All Featured
Efficient Content Optimization with Google Search Console & Apps Script
katarinadahlin
PRO
1
440
Typedesign – Prime Four
hannesfritz
42
3k
Visual Storytelling: How to be a Superhuman Communicator
reverentgeek
2
480
Ethics towards AI in product and experience design
skipperchong
2
240
Believing is Seeing
oripsolob
1
99
Visualization
eitanlees
150
17k
A brief & incomplete history of UX Design for the World Wide Web: 1989–2019
jct
1
330
A Tale of Four Properties
chriscoyier
163
24k
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
260
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
Highjacked: Video Game Concept Design
rkendrick25
PRO
1
330
The #1 spot is gone: here's how to win anyway
tamaranovitovic
2
1k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ