Anomaly Detection. Part 1 – Basics

BUILD SOFTWARE TO TEST SOFTWARE
AI Testing Talks
exactpro.com
Lecture 1. Anomaly Detection Basics
ANOMALY DETECTION FOR AI TESTING
20 MAY | 10.00 GET | 11.30 SLST
Rostislav Yavorski, Head of Research, Exactpro

Anomaly, also known as outlier or novelty
• Data points, events, or observations that deviate from normal behaviour
• Instances or collections of data that occur very rarely in the data set
• Observations which appear to be inconsistent with the remainder of the data

Challenges in Anomaly Detection
• Definition of normal behaviour is extremely challenging
• Noise data aren’t anomalies
• The definition of anomaly is domain-specific
• Anomalies evolve over time
• Getting a set of labeled anomalous instances is difficult

Sometimes anomalies are discarded as waste
• To compute the mean or standard deviation
• To improve linear regression models for better predictions
• To boost the performance of machine learning algorithms

Sometimes anomalies are most desirable
• fraud detection
• fault detection
• system health monitoring
• event detection in sensor networks
• detecting ecosystem disturbances
• defect detection in images
• medical diagnosis
• law enforcement
• cyber-security intrusion detection

#1 Mean and Standard Deviation

Computing Mean and Standard Deviation

Name      Salary
Maria     41
Jose      43
Ahmed     44
Anna      45
Carlos    45
Patricia  46

$M = \frac{41 + 43 + 44 + 45 + 45 + 46}{6} = 44$

$\sigma = \sqrt{\frac{3^2 + 1^2 + 0^2 + 1^2 + 1^2 + 2^2}{6}} \approx 1.6$

Computing Mean and Standard Deviation

Name            Salary
Joe the Intern  9
Maria           41
Jose            43
Ahmed           44
Anna            45
Carlos          45
Patricia        46

$M = \frac{41 + 43 + 44 + 45 + 45 + 46 + 9}{7} = 39$

$\sigma = \sqrt{\frac{2^2 + 4^2 + 5^2 + 6^2 + 6^2 + 7^2 + 30^2}{7}} \approx 12.3$

• Joe the Intern's salary of 9 is an outlier, a very rare value
• The resulting mean and standard deviation are meaningless, hardly interpretable
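The effect of the single extra salary can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics` module and the population standard deviation (dividing by the number of values, as the slide's formula does); the helper name `summarize` is made up for this illustration.

```python
from statistics import mean, pstdev

salaries = [41, 43, 44, 45, 45, 46]   # the six regular salaries from the table
with_intern = salaries + [9]          # the same list plus Joe the Intern

def summarize(values):
    """Return (mean, population standard deviation) of the values."""
    return mean(values), pstdev(values)

m_clean, sd_clean = summarize(salaries)
m_all, sd_all = summarize(with_intern)

print(f"without the intern: M = {m_clean:.1f}, sigma = {sd_clean:.1f}")
print(f"with the intern:    M = {m_all:.1f}, sigma = {sd_all:.1f}")
```

One value out of seven is enough to pull the mean down from 44 to 39 and to inflate the standard deviation to about 12.3, which no longer describes any of the actual salaries.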

Linear Regression

Image: https://www.product-pro.com/preparing-for-mass-production/

Performance Prediction with Linear Regression

Units     Time
2 units   14 min
4 units   19 min
6 units   30 min
8 units   38 min
10 units  44 min

Time = 4.0 min × Units + 5.3 min (k = 4.0 ± 0.3, b = 5.3 ± 1.7)

Prediction error is rather small

Performance Prediction with Linear Regression

Units     Time
2 units   14 min
3 units   41 min   (inconsistent observation)
4 units   19 min
6 units   30 min
8 units   38 min
10 units  44 min

Time = 2.7 min × Units + 16.2 min (k = 2.7 ± 1.5, b = 16.2 ± 9.0)

Poor prediction quality
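The two fitted lines can be reproduced with an ordinary least squares fit. The sketch below is plain Python with no external libraries; the function names `fit_line` and `largest_residual` are hypothetical helpers written for this illustration, and the uncertainty ranges (the ± values) are not computed.

```python
def fit_line(points):
    """Ordinary least squares fit of y = k * x + b to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b

def largest_residual(points, k, b):
    """Return the point that lies farthest (vertically) from the line."""
    return max(points, key=lambda p: abs(p[1] - (k * p[0] + b)))

# (units, minutes) pairs from the tables
clean = [(2, 14), (4, 19), (6, 30), (8, 38), (10, 44)]
with_outlier = clean + [(3, 41)]

k1, b1 = fit_line(clean)
k2, b2 = fit_line(with_outlier)

print(f"clean data:   Time = {k1:.2f} * Units + {b1:.2f}")
print(f"with outlier: Time = {k2:.2f} * Units + {b2:.2f}")
print("farthest from the second line:", largest_residual(with_outlier, k2, b2))
```

In the second fit, the observation that lies farthest from the line is the 3-unit, 41-minute measurement, which is one simple way to spot the inconsistent point before refitting.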

Fraud Detection

Image: https://www.zoho.com/books/articles/payment-fraud.html

Fraud
A deliberate act aimed at obtaining an unauthorised benefit:
• Theft or misappropriation of funds placed in one's trust
• Forgery or alteration of documents or computer files
• Authorisation of payment for services not performed
• Receipt of unearned wages or benefits
• Identity theft

Indicators
• Excessive number of checking accounts
• Frequent changes in banking accounts
• Behavioural changes: drugs, alcohol, gambling
• Lifestyle changes: expensive cars, jewelry, homes, clothes
These indicators deviate from normal behaviour

• Rule Based: the traditional approach to identifying fraudulent activities through known past behaviour
• Machine Learning: the machine learning approach models a user’s banking patterns and detects anomalous behaviour
[Diagram: rule-based flow (Fraudster, Commits Fraud, Rules, Detection, Human Analysis) and machine learning flow (User, Performs Transaction, ML Model, Detection, Train, Improve)]
https://sdk.finance/all-you-need-to-know-about-machine-learning-based-fraud-detection-systems/
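The contrast between the two approaches can be sketched in a few lines. This is an illustration only: the fixed limit, the transaction amounts, and the function names are all hypothetical, and a per-user mean and standard deviation stand in for the machine learning model, which in practice would be considerably richer.

```python
from statistics import mean, pstdev

# Hypothetical fixed rule: flag any transaction above one global limit.
GLOBAL_LIMIT = 1000.0

def rule_based_flag(amount, limit=GLOBAL_LIMIT):
    """Rule-based check: the same threshold for every user."""
    return amount > limit

def fit_user_profile(history):
    """'Train' a trivial per-user model: mean and std of past amounts."""
    return mean(history), pstdev(history)

def model_based_flag(amount, profile, threshold=3.0):
    """Flag amounts more than `threshold` standard deviations from the user's mean."""
    mu, sigma = profile
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical history of one user's transaction amounts.
history = [20.0, 25.0, 18.0, 22.0, 30.0, 24.0, 21.0]
profile = fit_user_profile(history)

new_amount = 400.0
print("rule based flags it: ", rule_based_flag(new_amount))
print("user model flags it: ", model_based_flag(new_amount, profile))
```

With these made-up numbers the 400.0 transaction stays under the global limit, so the fixed rule lets it through, while the per-user profile flags it because it is far outside what this user normally spends.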

Medical Diagnosis

Medical Diagnosis
Anomaly detection captures unique characteristics of the physiological data that could offer information about the patient

Loftus, Tyler J., et al. "Opportunities for machine learning to improve surgical ward safety." The American Journal of Surgery 220.4 (2020): 905-913.
[Figure labels: Ward admission; Physiological data; Streaming electronic health records; Wearables; Machine learning; Clinical assessment; Early risk stratification guides initial triage to ward vs. intensive care unit; Efficient, automated, wireless data acquisition; Accurate phenotyping, augmented decision-making; Early recognition; Automated alerts, augmented decision-making, early intervention; Early recovery; Delayed recovery; Decompensation; Rapid response; Cardiac arrest; Rehabilitation; Discharge home]

Fault Detection

Image: https://climatix-group.com/wp-content/uploads/2020/01/HVAC-cotractor-Leeds-1.jpg

Fault Detection
Monitoring a system, identifying when a fault has occurred, and pinpointing the type of fault and its location.
https://camatsystem.com/

Fault Management Systems
A. DESCRIPTIVE ANALYTICS. Detect whether an item is functioning well or not by comparing the information received from it with historical data.
B. DIAGNOSTIC ANALYTICS. Identify the causes of the fault. This process should consider trends in health history and operational context.
C. PREDICTIVE ANALYTICS. Predict the state of the item in the future to detect any possible fault beforehand.
D. PRESCRIPTIVE ANALYTICS. Elaborate maintenance plans that take the previous predictions into account in order to reduce faults.
[Diagram labels: MANAGER, SYSTEM MONITORING, FAULT DETECTION, FAULT PREDICTION, ROOT CAUSE ANALYSIS, FAULT PREVENTION AND RECOVERY]
https://www.cloudmantra.net/blog/fault-detection-using-machine-learning-techniques/
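The descriptive and predictive steps can be illustrated with a small sketch. Everything here is an assumption made for the example: the sensor readings, the upper limit of 80.0, the three-sigma threshold, and the function names `is_abnormal` and `steps_until_limit`. A real fault management system would use far more context than a single signal.

```python
from statistics import mean, pstdev

def is_abnormal(reading, history, threshold=3.0):
    """Descriptive step: is the reading far from what the history shows?"""
    mu, sigma = mean(history), pstdev(history)
    return sigma > 0 and abs(reading - mu) > threshold * sigma

def steps_until_limit(readings, limit):
    """Predictive step: extrapolate a least-squares trend to an upper limit.

    Returns the number of future steps until the fitted line reaches `limit`,
    or None if the trend is flat or falling.
    """
    n = len(readings)
    xs = range(n)
    mean_x, mean_y = mean(xs), mean(readings)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings)) / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (n - 1))

# Hypothetical temperature readings from one monitored item.
history = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2]
print("75.0 abnormal?", is_abnormal(75.0, history))
print("70.1 abnormal?", is_abnormal(70.1, history))

# Hypothetical steadily rising readings and an assumed upper limit of 80.0.
rising = [70.0, 71.0, 72.0, 73.0, 74.0]
print("steps until limit:", steps_until_limit(rising, 80.0))
```

The first function compares a new reading with historical data, as in step A; the second projects the current trend forward, as in step C, so that maintenance can be planned before the limit is reached.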

Conclusion

Terms
• An outlier is a data point that differs significantly from other observations.
• Anomalies are patterns in data that do not conform to a well-defined notion of normal behaviour.
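What "differs significantly" means has to be made concrete before it can be computed. One common convention, assumed here for illustration and not taken from the lecture, is the modified z-score based on the median and the median absolute deviation (MAD), with a cut-off of 3.5. Unlike the mean and standard deviation, the median and MAD are barely affected by the outlier itself. The function name `mad_outliers` is hypothetical.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. Both the score and the 3.5 cut-off are a
    common convention, assumed here for illustration.
    """
    med = median(values)
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        # At least half the values equal the median: flag anything that differs.
        return [x for x in values if x != med]
    return [x for x in values if 0.6745 * abs(x - med) / mad > threshold]

# Salaries from the earlier example, including Joe the Intern.
salaries = [9, 41, 43, 44, 45, 45, 46]
print(mad_outliers(salaries))  # prints [9]
```

On the salary data only the intern's value is flagged; the remaining six salaries all score well below the cut-off.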

Challenges in Anomaly Detection
• Definition of normal behaviour is extremely challenging
• Noise data aren’t anomalies
• The definition of anomaly is domain-specific
• Anomalies evolve over time
• Getting a checklist of all possible anomalies is difficult

Thank You!
