• 25 years
• $5.5 Billion Revenue
• 10,000+ employees
• 110+ offices
Fastest growing All-Flash Array vendor
Recognized as a Leader in multiple Gartner Magic Quadrants
• General-Purpose Disk Arrays
• Solid-State Arrays
• Integrated Systems
We help our customers to:
• Drive value with data
• Gain Insight, Access, and Control
NetApp Cloud Central And
and Servers) 10M+ Storage Systems 35K+
• The top 10 Fortune 500 companies
• 6 of the top 10 US Retailers
• 8 of the top 10 Banks
• 5 of the top 10 Insurance companies
• 7 of the top 10 Tech and Service Providers
Top Companies Rely on OnCommand Insight
NetApp OnCommand Insight
Hybrid Cloud Infrastructure Analytics
Manage growth and complexity
Troubleshoot issues
Identify and monitor cost
Switches
On-Premises
Public Cloud
Consistent insights across multi-vendor, hybrid infrastructure
Intelligent Operations
• Discover and monitor resources, their relationships and dependencies
• Proactive alerting and fast troubleshooting with advanced analytics
Business Insights
• Resource optimization
• Cost alignment and showback
• Performance forecasting and capacity planning
• Enables business workflows such as billing, cost, change management and automation
Ecosystem Integration
• Open APIs provide access to discovered and monitored data
Inventory - Resources
Performance - KPI
Topology and Relationships
Topology Business Context Data Expert View Analytics Section Related Resources
Resource Landing Page
• Customized view for each resource type
• A 360-degree view of the resource, including metrics, topology and business context
• Expert view with charts and an advanced analytics section
• Quick navigation to related resources' landing pages
Public Cloud!
[Diagram: service-path topologies for three applications: "App – running On-Prem", "App – running on AWS (or Azure)" and "App – running on OpenStack". Component labels: Switch, Storage, Volume, VM, Hypervisor, KVM, AWS Instance (EC2), EBS Volume, S3 Buckets.]
of operation, a predictable cyclical pattern.
• This pattern is not simple, and the cycles can span hours, days, weeks and months.
• A static threshold works when the user knows what is “bad”; otherwise it creates noise
• ML is good at detecting when the pattern has changed
• Prelert (now Elastic ML) and Elastic.
• OnCommand Insight implementation of Elastic ML for Anomaly Detection
Anomaly Detection
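The contrast between a static threshold and a learned pattern can be illustrated with a small sketch. It is not how Elastic ML models data. It assumes a single daily cycle, synthetic hourly latency values, and a simple per-hour baseline; the names `PERIOD`, `make_series`, `static_alerts` and `seasonal_alerts` are hypothetical.

```python
import math
import statistics

PERIOD = 24  # samples per cycle (hourly data, daily cycle); an assumption


def make_series(days=14):
    """Synthetic latency in ms with a clean daily cycle (hypothetical data)."""
    return [10.0 + 5.0 * math.sin(2 * math.pi * (h % PERIOD) / PERIOD)
            for h in range(days * PERIOD)]


def static_alerts(series, threshold):
    """Indexes where the value exceeds one fixed threshold."""
    return [i for i, v in enumerate(series) if v > threshold]


def seasonal_alerts(series, train_cycles=7, k=4.0, min_spread=0.5):
    """Indexes after the training window that deviate from the baseline
    learned for the same hour of the cycle."""
    train = series[:train_cycles * PERIOD]
    baseline = []
    for phase in range(PERIOD):
        values = train[phase::PERIOD]  # all training samples at this hour
        baseline.append((statistics.fmean(values), statistics.pstdev(values)))
    alerts = []
    for i in range(train_cycles * PERIOD, len(series)):
        mean, spread = baseline[i % PERIOD]
        # min_spread keeps the tolerance from collapsing on very regular data
        if abs(series[i] - mean) > k * max(spread, min_spread):
            alerts.append(i)
    return alerts


series = make_series()
series[186] = 12.0  # normally about 5 ms at this hour of the cycle

print(len(static_alerts(series, 14.0)), 186 in static_alerts(series, 14.0))
print(seasonal_alerts(series))
```

In this synthetic data the static threshold of 14 ms fires on every normal daily peak and never on the injected change at index 186, while the per-hour baseline flags only that index.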
• Discovering infrastructure resources
• Collecting key performance metrics (Latency, IOPS, Utilization)
• OnCommand Insight Server
• Computes the service path and relationships
• Identifies all the Application resources
• Packages and sends the data to the Elastic ML (job)
• Anomaly Detection Engine – Elastic ML
• Learns and models normal behavior and detects anomalies
• OnCommand Insight UI
• Presents the Application anomaly score, with anomalous resources
Anomaly Detection Engine – Elastic ML
Data Sources
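The packaging step above might look roughly like the following sketch: keep only the metric samples that belong to resources on an application's service path, then serialize them as newline-delimited JSON, one record per line. The `Sample` type, the field names and the `package_for_ml` helper are all hypothetical; the slides do not show the actual record layout or how OnCommand Insight submits data to the job.

```python
import json
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: int      # epoch seconds
    resource: str       # resource identifier (hypothetical naming)
    resource_type: str  # e.g. "VM", "Volume", "Switch"
    metric: str         # e.g. "latency_ms", "iops", "utilization"
    value: float


def package_for_ml(application, service_path, samples):
    """Serialize the samples of resources on the service path as NDJSON.

    Samples from resources outside the application's service path are
    dropped; the rest are ordered by time, oldest first.
    """
    on_path = set(service_path)
    lines = []
    for s in sorted(samples, key=lambda s: s.timestamp):
        if s.resource not in on_path:
            continue
        lines.append(json.dumps({
            "application": application,
            "timestamp": s.timestamp,
            "resource": s.resource,
            "resource_type": s.resource_type,
            "metric": s.metric,
            "value": s.value,
        }, sort_keys=True))
    return "\n".join(lines) + "\n" if lines else ""


demo = [
    Sample(120, "vm-1", "VM", "latency_ms", 4.0),
    Sample(60, "vol-9", "Volume", "iops", 900.0),
    Sample(60, "vm-x", "VM", "latency_ms", 1.0),  # not on the service path
]
print(package_for_ml("billing", ["vm-1", "vol-9"], demo), end="")
```

Tagging every record with the application name lets one anomaly score be reported per application while still identifying which resource contributed to it.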
Application infrastructure resource stack
• Overall Anomaly Score and Time
• Highlight anomalous resources: 1, 2 or 3 blue bars indicate the significance of the anomaly
• Call out resources for further investigation
• Application Anomaly Score chart
Application Anomaly Score and the Time of Anomaly
Add to Expert View
# of resources
Forensic View
Anomalous resource
Application Anomaly Score at this time
• The data richness and the data quality
• Understanding the path, the relationships, and enriching the data with business context
• The powerful search and flexible visualization of the data with the topologies
• Analytics and machine learning for proactive alerting
OCI technology helps our customers become aware of issues before they become problems, preventing an approaching outage in their environment.
Another leap forward on the path to Intelligent Operation
Summary – The Path to Intelligent Operation
with timeseries and heavy analysis
• Elasticsearch can do a lot, and it can be extended further with plugins
• Plugins are good, but documentation is scarce and plugin code is tied to a specific Elasticsearch version
• Elasticsearch handles a small number of large indexes better than a large number of small indexes
• The Rollover API can be your friend
Lessons Learned - Elastic
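As a concrete illustration of the rollover point, the sketch below builds the request for Elasticsearch's rollover endpoint (`POST /<alias>/_rollover` with `max_age`, `max_docs` and `max_size` conditions) without sending it. The alias name and the condition values are hypothetical, and the slides do not say which conditions OnCommand Insight uses.

```python
import json


def build_rollover_request(alias, max_age=None, max_docs=None, max_size=None):
    """Return (method, path, body) for a rollover request on a write alias.

    Only the conditions that are supplied end up in the body. Elasticsearch
    rolls the alias over to a new index once any supplied condition is met.
    """
    conditions = {}
    if max_age is not None:
        conditions["max_age"] = max_age    # e.g. "7d"
    if max_docs is not None:
        conditions["max_docs"] = max_docs  # e.g. 50_000_000
    if max_size is not None:
        conditions["max_size"] = max_size  # e.g. "50gb"
    body = {"conditions": conditions} if conditions else {}
    return "POST", f"/{alias}/_rollover", body


# Hypothetical alias and limits, chosen only for illustration.
method, path, body = build_rollover_request(
    "oci-metrics-write", max_age="7d", max_docs=50_000_000)
print(method, path)
print(json.dumps(body, indent=2))
```

Rolling over on size or age keeps each index large enough to avoid the many-small-indexes problem noted above, while still bounding how much data a single index holds.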
math might be right, but this is not always enough.
Excluding Anomalies Below the Thresholds
• A change in very small numbers (0.005 – 0.5) is mathematically significant.
• Yet whether it becomes an interesting anomaly is very case specific.
Dormant Resources
• Resources that do very little work and are mostly inactive
• A sudden, even subtle, change in performance can generate an anomaly
• In most cases this is not a critical resource to alert on
These resources are excluded only from the results, not from the learning.
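The two exclusion rules above can be sketched as a filter applied to the reported anomalies only, leaving the data that feeds the model untouched. Everything here is an assumption made for illustration: the `AnomalyResult` fields, the per-metric floors, and the use of peak IOPS as the measure of dormancy.

```python
from dataclasses import dataclass


@dataclass
class AnomalyResult:
    resource: str
    metric: str
    typical: float    # value the model expected
    actual: float     # value that was observed
    peak_iops: float  # busiest recent activity of the resource


def filter_results(results, value_floors, dormant_peak_iops):
    """Drop anomalies that are mathematically valid but not worth an alert.

    A result is excluded when both its typical and actual values stay below
    the floor for its metric, or when its resource has been nearly idle.
    """
    kept = []
    for r in results:
        floor = value_floors.get(r.metric, 0.0)
        below_floor = max(r.typical, r.actual) < floor
        dormant = r.peak_iops < dormant_peak_iops
        if not (below_floor or dormant):
            kept.append(r)
    return kept


results = [
    AnomalyResult("vol-1", "latency_ms", 0.005, 0.05, 5000.0),  # tiny values
    AnomalyResult("vol-2", "latency_ms", 2.0, 40.0, 3.0),       # dormant
    AnomalyResult("vol-3", "latency_ms", 2.0, 40.0, 8000.0),    # worth a look
]
kept = filter_results(results, {"latency_ms": 0.5}, dormant_peak_iops=10.0)
print([r.resource for r in kept])
```

Filtering after scoring, not before, matches the note that these resources still take part in the learning.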