
KDD 2017

GDP Labs
October 25, 2017



Transcript

  1. KDD
    • Held by the Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD)
    • Part of the Association for Computing Machinery (ACM)
    • SIGKDD has been an official ACM SIG since 1998
    • ACM SIGKDD has hosted an annual conference since 1995
    • Considered the most influential forum for knowledge discovery and data mining research
    GDP Labs Confidential
  2. KDD
    • 2012: Beijing
    • 2013: Chicago
    • 2014: New York
    • 2015: Sydney
    • 2016: San Francisco
    • 2017: Halifax
  3. KDD 2017
    • Held in Halifax, Canada
    • August 13-17, 2017
    • Consists of:
      ◦ Tutorial
      ◦ Workshop
      ◦ Main Conference
        ▪ Keynote
        ▪ Paper presentations
    • Sponsored by many well-known tech companies from around the world
  4. KDD 2017
    • From Theory to Data Product: Applying Data Science Methods to Effect Business Change
    • A/B Testing at Scale: Accelerating Software Innovation
    • Workshop on Causal Discovery
    • Anomaly Detection in Finance
    • Platforms and Infrastructure
    • Applied Machine Learning
    • Deep Learning
    • Intelligent Systems and Data Science
    • KDD Panel: The Future of Artificially Intelligent Assistants
    • Clustering
    • Web Applications
  5. From Theory to Data Product: Applying Data Science Methods to Effect Business Change
    Danielle Leighton, Lindsay Brin, Janet Forbes (T4G Limited)
    • Advanced analytics entry points
      ◦ The Technology Directive
      ◦ The Field of Dreams
      ◦ The Ambitious Executive
      ◦ The Smart Competitor
    • Are you asking the right questions?
      ◦ Questions with business value
    • Agile approach to data-driven decision making
      ◦ What is agile?
      ◦ Managing uncertainty
    • Data science is expensive, but if you don’t do it, you will be left behind
  6. Google Vizier: A Service for Black-Box Optimization
    Daniel Golovin, Google Research
    • Used in many applications
      ◦ A/B testing
      ◦ Machine learning
      ◦ Physical design
      ◦ Robotics
    • Vizier
      ◦ Easy to use
      ◦ Reliable
      ◦ Scalable
      ◦ Flexible
      ◦ State of the art
    • They used Vizier to tune a chocolate chip cookie recipe
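To make "black-box optimization" concrete, here is a minimal sketch using plain random search. This is not Vizier's algorithm or API. The `random_search` helper, the `cookie_loss` objective, and the parameter names and ranges are all hypothetical and chosen only to echo the cookie example on the slide. The optimizer only ever calls the objective and never looks inside it.

```python
import random


def random_search(objective, bounds, n_trials=200, seed=0):
    """Minimize `objective` by sampling parameters uniformly within `bounds`.

    The objective is treated as a black box: we only observe its output.
    """
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        # Draw one candidate configuration.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        value = objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def cookie_loss(params):
    # Hypothetical stand-in for "how far the cookie is from ideal".
    # In a real study this would be a measured rating, not a formula.
    return (params["sugar_g"] - 120.0) ** 2 + (params["bake_minutes"] - 11.0) ** 2


# Hypothetical search space: parameter name -> (low, high).
bounds = {"sugar_g": (50.0, 200.0), "bake_minutes": (8.0, 16.0)}

best_params, best_value = random_search(cookie_loss, bounds)
print(best_params, best_value)
```

A service in this space would typically replace the uniform sampling with a smarter suggestion strategy, but the loop of suggest, evaluate, and record stays the same.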
  7. TFX: TensorFlow Extended, a Production-Scale ML Platform
    Heng-Tze Cheng, Google Research
    • Productionizing an ML pipeline is hard
    • End to end (from data to serving)
    • Design principles:
      ◦ One ML platform for many products
      ◦ Continuous training and serving
      ◦ Human in the loop
      ◦ Reliable and scalable
    • Steps:
      ◦ Data analysis
      ◦ Data validation
      ◦ Data transformation
      ◦ Trainer
      ◦ Model validation & evaluation
      ◦ Serving
    • Documentation -> passive way
    • Education -> active way (teach, talk)
    • Automation -> key principle
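The pipeline steps listed on this slide can be sketched as a chain of plain Python functions. This is an illustration only and does not use the TFX API. Every function name, the toy one-weight model, and the validation range are assumptions made for the example.

```python
from statistics import mean


def analyze(rows):
    """Data analysis: summary statistics of the input feature."""
    xs = [r["x"] for r in rows]
    return {"count": len(xs), "min": min(xs), "max": max(xs), "mean": mean(xs)}


def validate(stats, expected_min, expected_max):
    """Data validation: reject data outside the expected range."""
    return (stats["count"] > 0
            and stats["min"] >= expected_min
            and stats["max"] <= expected_max)


def transform(rows, stats):
    """Data transformation: min-max scale the feature to [0, 1]."""
    span = (stats["max"] - stats["min"]) or 1.0
    return [{"x": (r["x"] - stats["min"]) / span, "y": r["y"]} for r in rows]


def train(rows):
    """Trainer: least-squares fit of y ~ w * x (toy one-weight model)."""
    num = sum(r["x"] * r["y"] for r in rows)
    den = sum(r["x"] ** 2 for r in rows) or 1.0
    return num / den


def evaluate(weight, rows):
    """Model evaluation: mean squared error."""
    return mean((weight * r["x"] - r["y"]) ** 2 for r in rows)


def run_pipeline(rows, current_error=None):
    """Run all steps; return a servable model dict, or None if a gate fails."""
    stats = analyze(rows)
    if not validate(stats, expected_min=0.0, expected_max=100.0):
        return None  # bad data never reaches the trainer
    data = transform(rows, stats)
    weight = train(data)
    error = evaluate(weight, data)
    # Model validation: only push a model that beats the one being served.
    if current_error is not None and error >= current_error:
        return None
    return {"weight": weight, "error": error, "stats": stats}


rows = [{"x": 0, "y": 0}, {"x": 50, "y": 1}, {"x": 100, "y": 2}]
model = run_pipeline(rows)
print(model)
```

The two `return None` gates are the point of the sketch: data validation and model validation each act as a checkpoint that stops a bad artifact from reaching serving.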
  8. Designing AI at Scale to Power Everyday Life
    Rajesh Parekh, Facebook
    • Data drives every product at Facebook
    • FBLearner Flow
      ◦ Helps non-ML people use ML
      ◦ 70% of users are not AI experts
      ◦ 25% of engineers are active users
    • Applied Machine Learning at Facebook
      ◦ Computer vision
      ◦ DeepText
      ◦ Speech and video
      ◦ AI-powered camera
    • What’s next
      ◦ Multimodal learning
      ◦ Transfer learning
      ◦ Multilingual modeling
      ◦ Weakly supervised / unsupervised learning
  9. Industrial Machine Learning
    Josh Bloom, General Electric
    • Data produced by machines will overtake data produced by people
    • Industrial machine learning
      ◦ Preventive maintenance
      ◦ Anomaly / failure detection
      ◦ Etc.
    • Industrial-level applications are more dangerous than social-level ones
    • Optimization metric
      ◦ Higher accuracy != higher value
      ◦ Drive towards higher precision / lower FPR
    • Models
      ◦ Physical model
      ◦ Data-driven model
    • Interpretability and accuracy trade-off
    • Small improvements help a lot
    • Hard to convince everyone to transition
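The "higher accuracy != higher value" point can be illustrated with a small worked example. The numbers below are made up for illustration (1,000 machines, 10 of which actually fail), and the `confusion_metrics` helper is a hypothetical name. Precision is reported as 0.0 when a model raises no alarms, which is a convention chosen here rather than a universal rule.

```python
def confusion_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and false positive rate (FPR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Made-up fleet: 10 failing machines (label 1), 990 healthy (label 0).
y_true = [1] * 10 + [0] * 990

# Model A never raises an alarm.
always_healthy = [0] * 1000

# Model B catches 8 of 10 failures but raises 30 false alarms.
noisy_detector = [1] * 8 + [0] * 2 + [1] * 30 + [0] * 960

model_a = confusion_metrics(y_true, always_healthy)
model_b = confusion_metrics(y_true, noisy_detector)
print("A:", model_a)
print("B:", model_b)
```

Model A has the higher accuracy yet detects no failures at all, so it has no practical value. Model B is useful, but its false alarms are what precision and FPR are meant to drive down.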
  10. KDD 2017
    • Deep Learning for Personalized Search and Recommender Systems
    • Recent Advances in Feature Selection: A Data Perspective
    • 2017 Edition of AdKDD and TargetAd
    • Three Principles of Data Science: Predictability, Stability, and Computability
    • Supervised Learning
    • Deep Learning
    • Intelligent Systems and Data Science
    • Hands-On Tutorial: Declarative, Large-Scale Machine Learning with Apache SystemML
    • Hands-On Tutorial: TensorFlow
  11. Ad Pricing
    • Popular models
      ◦ CPM -> cost per mille (per thousand impressions)
      ◦ CPC -> cost per click
      ◦ CPA -> cost per acquisition
    • Others
      ◦ CPF -> cost per follower
      ◦ CPV -> cost per view
      ◦ CPI -> cost per install
      ◦ CPD -> cost per download
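As a worked example of the three popular pricing models, the sketch below computes what one campaign would cost under each. The campaign counts and all rates are hypothetical numbers picked for easy arithmetic, and the function names are illustrative.

```python
def cpm_cost(impressions, rate_per_thousand):
    """CPM: pay a fixed rate for every 1,000 impressions."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks, rate_per_click):
    """CPC: pay only when someone clicks."""
    return clicks * rate_per_click


def cpa_cost(acquisitions, rate_per_acquisition):
    """CPA: pay only when a click turns into an acquisition."""
    return acquisitions * rate_per_acquisition


def effective_cpm(total_cost, impressions):
    """Normalize any cost to a per-1,000-impressions figure for comparison."""
    return total_cost * 1000 / impressions


# Hypothetical campaign: 200,000 impressions, 1,000 clicks, 20 acquisitions.
impressions, clicks, acquisitions = 200_000, 1_000, 20

costs = {
    "CPM": cpm_cost(impressions, rate_per_thousand=2.0),
    "CPC": cpc_cost(clicks, rate_per_click=0.5),
    "CPA": cpa_cost(acquisitions, rate_per_acquisition=15.0),
}

for model, cost in costs.items():
    print(model, cost, effective_cpm(cost, impressions))
```

Converting each cost to an effective CPM puts the three models on one scale, which makes it easier to compare offers priced in different ways.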