Machine Learning without the Hype

Machine learning is both a highly overloaded and hyped topic. This talk covers one specific area in this space — anomaly detection of time-series data. It sounds very narrow, but is widely applicable in IT security and operations.

In particular we take a look at:
* What do artificial intelligence, machine learning, and deep learning mean in general?
* When is a rule-based approach the right solution and when do you need machine learning?
* What does machine learning mean for time-series data?
* What is the difference between supervised and unsupervised learning in this area?
* What could an example with an actual dataset look like?

Philipp Krenn

June 26, 2018

Transcript

  1. ❝Using #DeepLearning when all you needed was a few if statements. #MachineLearning #DataScience❞ —https://twitter.com/randal_olson/status/927157485240311808
  2. ❝Alice: I love stateless protocols! Bob: There has to be something bad about them. Alice: Bad about what?❞ —https://twitter.com/znjp/status/933405548678021120
  3. Machine Learning: Algorithms parse data → learn from it → make a determination or prediction. "Trained" machine
  4. ❝Learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.❞
  5. ❝"Machine Learning is an emerging tech!" Logistic regression 1958, Hidden Markov Model 1960, Support Vector Machine 1963, k-nearest neighbors 1967, Artificial Neural Networks 1975, Expectation Maximization 1977, Decision tree 1986, Q-learning 1989, Random forest 1995❞ —https://twitter.com/farbodsaraf/status/977916871000412160
  6. ❝But saying "powered by AI" is like saying you’re "powered by the internet" or "powered by computer code". By itself, it means nothing.❞ —https://twitter.com/jensenharris/status/999119292086960128
  7. ❝"What's the difference between AI and ML?" "It's AI when you're raising money, it's ML when you're trying to hire people."❞ —https://twitter.com/WAWilsonIV/status/925599712849174528
  8. ❝OH: "Do you run any CPU intensive application on your laptop? Like, machine learning, or Slack?"❞ —https://twitter.com/jpetazzo/status/932464823530430464
  9. Multiple Time Series: multiple metrics or a single metric split up. Each series modeled independently. Example: Unusual activity by country?
  10. nginx access log: { "source": "/home/ec2-user/data/production-4/prod4elasticlog/_logs/access-logs541.log", "beat": { "hostname": "ip-172-31-5-206", "name": "ip-172-31-5-206", "version": "5.4.0" }, "@timestamp": "2017-03-08T11:44:51.562Z", "read_timestamp": "2017-06-20T08:49:58.538Z", "fileset": { "name": "access", "module": "nginx" },
  11. "nginx": { "access": { "body_sent": { "bytes": "3262" }, "url": "/assets/blt1afcb054f02e257c/logo-activision.svg", "geoip": { "continent_name": "Asia", "country_iso_code": "IN", "location": { "lat": 20, "lon": 77 } },
  12. "response_code": "200", "user_agent": { "device": "Other", "os_name": "Other", "os": "Other", "name": "Other" }, "http_version": "1.1", "method": "GET", "remote_ip": "192.19.197.26" } }, "prospector": { "type": "log" } }
  13. 43 rules. Rule #1: Don’t be afraid to launch a product without machine learning. Rule #14: Starting with an interpretable model makes debugging easier. Rule #16: Plan to launch and iterate.
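
Slides 9 to 12 describe splitting a single metric into several time series, each modeled independently, using nginx access-log records as the data. The following is a minimal Python sketch of that idea, not the method shown in the talk. It assumes records are already grouped into equal-length time buckets, it reads only the `nginx.access.geoip.country_iso_code` field seen on slide 11, and it uses a plain z-score against each country's own history as a stand-in for a real anomaly model. The names `country_of`, `count_by_country`, `find_anomalies`, `make_record`, and the parameters `threshold` and `min_sigma` are hypothetical.

```python
import json
from collections import defaultdict
from statistics import mean, stdev


def country_of(record_json):
    """Extract the ISO country code from one access-log record (JSON string)."""
    record = json.loads(record_json)
    return record["nginx"]["access"]["geoip"]["country_iso_code"]


def count_by_country(buckets):
    """Split the request-count metric into one series per country.

    `buckets` is a list of time buckets; each bucket is a list of JSON records.
    Returns {country: [count in bucket 0, count in bucket 1, ...]}.
    """
    per_bucket = []
    countries = set()
    for bucket in buckets:
        counts = defaultdict(int)
        for record_json in bucket:
            counts[country_of(record_json)] += 1
        per_bucket.append(counts)
        countries.update(counts)
    return {c: [counts.get(c, 0) for counts in per_bucket] for c in sorted(countries)}


def find_anomalies(series, threshold=3.0, min_sigma=1.0):
    """Flag indices whose value deviates from the mean of all earlier values
    by more than `threshold` standard deviations.

    Assumption: `min_sigma` is a floor on the standard deviation so that a
    perfectly flat history does not cause a division by zero.
    """
    flagged = []
    for i in range(3, len(series)):  # require at least three earlier points
        history = series[:i]
        sigma = max(stdev(history), min_sigma)
        if abs(series[i] - mean(history)) / sigma > threshold:
            flagged.append(i)
    return flagged


def make_record(country):
    """Build a synthetic record containing only the field this sketch reads."""
    return json.dumps({"nginx": {"access": {"geoip": {"country_iso_code": country}}}})


# Synthetic data: requests from "IN" jump in the last bucket, "US" stays flat.
buckets = [
    [make_record("IN")] * n + [make_record("US")] * 5
    for n in (10, 11, 9, 10, 12, 10, 60)
]
series = count_by_country(buckets)
# Each country's series is scored on its own, independently of the others.
anomalies = {country: find_anomalies(values) for country, values in series.items()}
print(anomalies)
```

Because each series is compared only with its own past, a spike in one country is flagged even when the total across all countries would look unremarkable. The fixed `threshold` here is itself a rule; a learned model would replace it with a baseline fitted to the data.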