
Anomaly Detection. Part 1 – Basics

Exactpro
May 20, 2022

Rostislav Yavorski
Head of Research, Exactpro

“In this lecture, we will review the definitions and practical examples of outliers and anomalies in different domains: financial fraud detection, medical diagnosis, fault identification, etc.”

AI Testing Talks – Anomaly Detection. 20 May 2022

https://exactpro.com/events/external/ai-testing-talks-anomaly-detection?utm_source=speakerdeck&utm_medium=Refferer&utm_campaign=basics

---


Transcript

  1. Lecture 1. Anomaly Detection Basics

    BUILD SOFTWARE TO TEST SOFTWARE | AI Testing Talks | exactpro.com
    ANOMALY DETECTION FOR AI TESTING
    20 MAY | 10.00 GET | 11.30 SLST
    Rostislav Yavorski, Head of Research, Exactpro
  2. Anomaly, also known as outlier or novelty

    • Data points, events, or observations that deviate from normal behaviour
    • Instances or collections of data that occur very rarely in the data set
    • Observations which appear to be inconsistent with the remainder of the data
  3. Challenges in Anomaly Detection

    • Definition of normal behaviour is extremely challenging
    • Noise in the data is not an anomaly
    • The definition of anomaly is domain-specific
    • Anomalies evolve over time
    • Getting a set of labeled anomalous instances is difficult
  4. Sometimes anomalies are discarded as waste

    • To compute the mean or standard deviation
    • To improve linear regression models for better predictions
    • To boost the performance of machine learning algorithms
  5. Sometimes anomalies are most desirable

    • fraud detection
    • fault detection
    • system health monitoring
    • event detection in sensor networks
    • detecting ecosystem disturbances
    • defect detection in images
    • medical diagnosis
    • law enforcement
    • cyber-security intrusion detection
  6. #1 Mean and Standard Deviation
  7. Computing Mean and Standard Deviation

    | Name     | Salary |
    |----------|--------|
    | Maria    | 41     |
    | Jose     | 43     |
    | Ahmed    | 44     |
    | Anna     | 45     |
    | Carlos   | 45     |
    | Patricia | 46     |

    $$M = \frac{41 + 43 + 44 + 45 + 45 + 46}{6} = 44$$

    $$\sigma = \sqrt{\frac{3^2 + 1^2 + 0^2 + 1^2 + 1^2 + 2^2}{6}} \approx 1.6$$
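The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library; it assumes the population form of the standard deviation (dividing by n), which is what the slide's formula shows. The variable names are illustrative.

```python
from statistics import mean, pstdev

# Salaries from the slide, in the same (unspecified) units.
salaries = {"Maria": 41, "Jose": 43, "Ahmed": 44,
            "Anna": 45, "Carlos": 45, "Patricia": 46}
values = list(salaries.values())

m = mean(values)        # arithmetic mean
sigma = pstdev(values)  # population standard deviation (divides by n)

print(f"M = {m}, sigma = {sigma:.2f}")
```

For these six values the sum of squared deviations is 16, so the population standard deviation is the square root of 16/6, about 1.63.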
  9. Computing Mean and Standard Deviation

    | Name           | Salary |
    |----------------|--------|
    | Joe the Intern | 9      |
    | Maria          | 41     |
    | Jose           | 43     |
    | Ahmed          | 44     |
    | Anna           | 45     |
    | Carlos         | 45     |
    | Patricia       | 46     |

    $$M = \frac{41 + 43 + 44 + 45 + 45 + 46 + 9}{7} = 39$$

    $$\sigma = \sqrt{\frac{2^2 + 4^2 + 5^2 + 6^2 + 6^2 + 7^2 + 30^2}{7}} \approx 12.3$$

    Joe the Intern's salary is an outlier, a very rare value. With it included, the results are meaningless and hardly interpretable.
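To see how much a single value moves both statistics, the two salary lists can be compared side by side. The sketch below is illustrative only: `summarize` is a hypothetical helper, and the population standard deviation is assumed, as in the slide's formula.

```python
from statistics import mean, pstdev

def summarize(values):
    """Return (mean, population standard deviation) of a list of numbers."""
    return mean(values), pstdev(values)

team = [41, 43, 44, 45, 45, 46]   # salaries from the slide
with_intern = team + [9]          # the same team plus Joe the Intern

for label, data in (("without intern", team), ("with intern", with_intern)):
    m, s = summarize(data)
    print(f"{label}: M = {m:.1f}, sigma = {s:.1f}")

# The value lying farthest from the mean of the full list.
m, s = summarize(with_intern)
most_extreme = max(with_intern, key=lambda v: abs(v - m))
print(f"most extreme value: {most_extreme} ({(most_extreme - m) / s:+.2f} sigma)")
```

One added observation lowers the mean from 44 to 39 and inflates the standard deviation from about 1.6 to about 12.3, so neither number describes the typical salary any more.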
  11. #2 Linear Regression
  12. Performance Prediction with Linear Regression

    | Units    | Time   |
    |----------|--------|
    | 2 units  | 14 min |
    | 4 units  | 19 min |
    | 6 units  | 30 min |
    | 8 units  | 38 min |
    | 10 units | 44 min |

    Time = 4.0 min × Units + 5.3 min (k = 4.0 ± 0.3, b = 5.3 ± 1.7)

    Prediction error is rather small.
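The fitted line can be reproduced with ordinary least squares. The sketch below is a plain-Python version under one assumption: that the ± values on the slide are the standard errors of the slope and intercept (they agree with those after rounding, but the slide does not say so). `fit_line` is a hypothetical helper name.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares fit of y = k*x + b; returns k, b and their standard errors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    b = my - k * mx
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((y - (k * x + b)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)
    se_k = sqrt(s2 / sxx)
    se_b = sqrt(s2 * (1 / n + mx ** 2 / sxx))
    return k, b, se_k, se_b

units = [2, 4, 6, 8, 10]         # data from the slide
minutes = [14, 19, 30, 38, 44]

k, b, se_k, se_b = fit_line(units, minutes)
print(f"Time = {k:.2f} * Units + {b:.2f}  (k ± {se_k:.2f}, b ± {se_b:.2f})")
```

On this data the slope comes out at 3.95 and the intercept at 5.3, which round to the slide's 4.0 and 5.3.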
  14. Performance Prediction with Linear Regression

    | Units    | Time   |
    |----------|--------|
    | 2 units  | 14 min |
    | 3 units  | 41 min |
    | 4 units  | 19 min |
    | 6 units  | 30 min |
    | 8 units  | 38 min |
    | 10 units | 44 min |

    Time = 2.7 min × Units + 16.2 min (k = 2.7 ± 1.5, b = 16.2 ± 9.0)

    The observation (3 units, 41 min) is inconsistent with the rest, and it leads to poor prediction quality.
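One simple way to locate the inconsistent observation is to fit the line and look for the largest residual. This is only a sketch of that idea, not a method taken from the slides; `fit_line` is a hypothetical helper, and picking the single largest residual is an assumption that suits this small example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = k*x + b; returns (k, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, my - k * mx

units = [2, 3, 4, 6, 8, 10]          # data from the slide, outlier included
minutes = [14, 41, 19, 30, 38, 44]

k, b = fit_line(units, minutes)
residuals = [y - (k * x + b) for x, y in zip(units, minutes)]

# Index of the observation that the fitted line explains worst.
worst = max(range(len(units)), key=lambda i: abs(residuals[i]))
print(f"fit with all points: Time = {k:.1f} * Units + {b:.1f}")
print(f"largest residual at {units[worst]} units: {residuals[worst]:+.1f} min")

# Refit without that observation.
xs = [x for i, x in enumerate(units) if i != worst]
ys = [y for i, y in enumerate(minutes) if i != worst]
k_clean, b_clean = fit_line(xs, ys)
print(f"fit without it:      Time = {k_clean:.1f} * Units + {b_clean:.1f}")
```

The largest residual belongs to the (3 units, 41 min) observation, and refitting without it gives a slope of about 4.0 and an intercept of about 5.3 again.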
  16. #3 Fraud Detection
  17. Fraud

    A deliberate act aimed at obtaining an unauthorised benefit:

    • Theft or misappropriation of funds placed in one's trust
    • Forgery or alteration of documents or computer files
    • Authorisation of payment for services not performed
    • Receipt of unearned wages or benefits
    • Identity theft
  18. Indicators

    • Excessive number of checking accounts
    • Frequent changes in banking accounts
    • Behavioural changes: drugs, alcohol, gambling
    • Lifestyle changes: expensive cars, jewelry, homes, clothes

    All of these deviate from normal behaviour.
  20. Rule Based vs. Machine Learning

    • Rule Based: the traditional approach to identifying fraudulent activities through known past behaviour
    • Machine Learning: the machine learning approach models a user's banking patterns and detects anomalous behaviour

    [Diagram labels, rule-based flow: Fraudster, Commits Fraud, Rules, Detection, Human Analysis. Machine learning flow: User, Performs Transaction, ML Model, Detection, Train, Improve]

    Source: https://sdk.finance/all-you-need-to-know-about-machine-learning-based-fraud-detection-systems/
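The contrast between the two approaches can be illustrated with a toy example. Everything in this sketch is an assumption made for illustration: the fixed limit, the transaction history, and above all the "model", which is just a per-user mean and standard deviation standing in for a real machine learning model of banking patterns.

```python
from statistics import mean, pstdev

FIXED_LIMIT = 1000.0  # hypothetical rule: flag any transaction above this amount

def rule_based_flag(amount):
    """Rule-based check: one threshold written by hand, the same for every user."""
    return amount > FIXED_LIMIT

def fit_user_profile(history):
    """Learn a very simple profile of one user's past amounts."""
    return mean(history), pstdev(history)

def profile_flag(amount, profile, z_threshold=3.0):
    """Flag amounts far from what is normal for this particular user."""
    mu, sigma = profile
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold

history = [20.0, 25.0, 22.0, 30.0, 18.0, 24.0]  # hypothetical past transactions
profile = fit_user_profile(history)

new_amount = 400.0
print("rule-based flag:", rule_based_flag(new_amount))
print("profile flag:   ", profile_flag(new_amount, profile))
```

A 400.0 transaction stays under the global limit, so the fixed rule lets it through, while the per-user profile flags it because it is far outside that user's usual range.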
  22. #4 Medical Diagnosis
  23. Medical Diagnosis

    Anomaly detection captures unique characteristics of the physiological data that could offer information about the patient.
  24. Loftus, Tyler J., et al. "Opportunities for machine learning to improve surgical ward safety." The American Journal of Surgery 220.4 (2020): 905-913.

    [Figure from the cited paper. Labels: Ward admission; Streaming electronic health records; Physiological data; Wearables; Machine learning; Clinical assessment; Early risk stratification guides initial triage to ward vs. intensive care unit; Efficient, automated, wireless data acquisition; Accurate phenotyping, augmented decision-making; Early recognition; Automated alerts, augmented decision-making, early intervention; Early recovery; Delayed recovery; Decompensation; Rapid response; Cardiac arrest; Rehabilitation; Discharge home]
  25. #5 Fault Detection
  26. Fault Detection

    Monitoring a system, identifying when a fault has occurred, and pinpointing the type of fault and its location.

    Source: https://camatsystem.com/
  27. Fault Management Systems

    A. DESCRIPTIVE ANALYTICS. Detect whether an item is functioning well by comparing the information received from it with historical data.
    B. DIAGNOSTIC ANALYTICS. Identify the causes of the fault, taking into account trends in health history and operational context.
    C. PREDICTIVE ANALYTICS. Predict the future state of the item to detect possible faults beforehand.
    D. PRESCRIPTIVE ANALYTICS. Draw up maintenance plans based on these predictions to reduce faults.

    [Diagram labels: Manager; System monitoring; Fault detection; Fault prediction; Root cause analysis; Fault prevention and recovery]

    Source: https://www.cloudmantra.net/blog/fault-detection-using-machine-learning-techniques/
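As a rough illustration of the predictive step (item C), a linear trend fitted to recent readings can be extrapolated to estimate when a limit will be reached. This is a toy sketch, not the method of any system cited on the slide; the sensor readings, the limit, and the function names are all hypothetical, and real degradation is rarely this linear.

```python
def linear_trend(times, values):
    """Least-squares slope and intercept of values over time."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    stt = sum((t - mt) ** 2 for t in times)
    stv = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    slope = stv / stt
    return slope, mv - slope * mt

def time_to_limit(times, values, limit):
    """Estimated time at which the trend reaches `limit`, or None if it is not rising."""
    slope, intercept = linear_trend(times, values)
    if slope <= 0:
        return None
    return (limit - intercept) / slope

hours = [0, 1, 2, 3, 4]                       # hypothetical sampling times
temperature = [60.0, 62.0, 64.0, 66.0, 68.0]  # hypothetical sensor readings

eta = time_to_limit(hours, temperature, limit=80.0)
print(f"trend reaches the limit at about t = {eta:.1f} h")
```

With these made-up readings rising by 2 degrees per hour from 60, the trend reaches the limit of 80 at t = 10 h, which would give a maintenance team six hours of warning after the last reading.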
  28. Terms

    • An outlier is a data point that differs significantly from other observations.
    • Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour.
  29. Challenges in Anomaly Detection

    • Definition of normal behaviour is extremely challenging
    • Noise in the data is not an anomaly
    • The definition of anomaly is domain-specific
    • Anomalies evolve over time
    • Getting a checklist of all possible anomalies is difficult