szilard
October 16, 2018

Better than Deep Learning: Gradient Boosting Machines (GBM) - Crunch Conference - Budapest, Oct 2018


Transcript

  1. Better than Deep Learning: Gradient Boosting Machines (GBM)
    Szilárd Pafka, PhD
    Chief Scientist, Epoch USA
    Crunch Conference, Budapest, Oct 2018
  2. Disclaimer: I am not representing my employer (Epoch) in this talk.
    I can neither confirm nor deny whether Epoch is using any of the methods, tools, results, etc. mentioned in this talk.
  3. ...

  4. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
  5. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends
  6. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all
  7. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning
  8. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning / ensembles
  9. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning / ensembles
    feature engineering
  10. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning / ensembles
    feature engineering / other goals, e.g. interpretability
  11. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning / ensembles
    feature engineering / other goals, e.g. interpretability
    the title of this talk was misguided
  12. structured/tabular data: GBM (or RF)
    very small data: LR
    very large sparse data: LR with SGD (+L1/L2)
    images/videos, speech: DL
    it depends / try them all / hyperparam tuning / ensembles
    feature engineering / other goals, e.g. interpretability
    the title of this talk was misguided
    but so is recently almost every use of the term AI
  13. "I usually use other people’s code [...] I can find open source code for what I want to do, and my time is much better spent doing research and feature engineering" -- Owen Zhang
    http://blog.kaggle.com/2015/06/22/profiling-top-kagglers-owen-zhang-currently-1-in-the-world/
  14. 10x

  15. 10x
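
The slides above recommend GBM as the default for structured/tabular data but do not show how boosting works. As a rough illustration only, and not code from the talk, the sketch below fits a tiny gradient boosting regressor in plain Python. It assumes squared-error loss and depth-1 trees (stumps); the function names `fit_stump`, `fit_gbm`, and `predict_gbm`, the toy data, and the default `n_trees` and `learning_rate` values are all made up for this example.

```python
from statistics import mean


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        # No feature has two distinct values: fall back to a constant.
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_gbm(X, y, n_trees=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared error)."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Toy data (hypothetical): a step function of a single feature.
X_demo = [[float(i)] for i in range(10)]
y_demo = [0.0] * 5 + [1.0] * 5
demo_model = fit_gbm(X_demo, y_demo)
print(predict_gbm(demo_model, [2.0]), predict_gbm(demo_model, [7.0]))
```

Each round fits a stump to what the current ensemble still gets wrong and adds a shrunken copy of it, which is the core idea behind GBM. Production libraries add deeper trees, other loss functions, regularization, and much faster split finding, so this sketch is for intuition and not for real workloads.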