When managing large, distributed networks, it can be challenging to know where to focus your attention, especially when issues have a limited impact or are short-lived. User Experience Insight AIOps can help you transform your operations by identifying the most critical issues that need attention before users complain. The first component of AIOps is Incident Detection.
Motivation for AIOps Incident Detection
User Experience Insight sensors run synthetic tests one at a time in a continuous round-robin sequence. There are two types of issues observed in the dashboard which generate notifications: threshold violations and test failures.
Incident Detection uses machine learning to examine all issues in real time and surface only the ones that you should focus on. An incident is a collection of issues.
Incident Detection Models
User Experience Insight Incident Detection begins with training a model using historical data.
When an issue is detected and the timing of its arrival in relation to other issues conforms to the model, the issue appears blue to indicate that it is informational. When an issue is detected and the timing of its arrival in relation to other issues does not conform to the model, the issue and other non-conforming issues appear red in the dashboard, letting you know that this set of issues is outside the observed baseline for your network and might need immediate attention. Emails, alerts and other notifications are sent only for issues that are classified as incidents (in red) on the dashboard.
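The display and notification rule described above can be sketched as follows. This is a minimal illustration only: the `Issue` class, the `conforms_to_model` flag, and both function names are hypothetical and do not correspond to any User Experience Insight API. How the trained model decides whether an issue conforms is not shown.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical issue record; not a real User Experience Insight type."""
    name: str
    conforms_to_model: bool  # assumed to be the verdict of the trained model


def dashboard_color(issue: Issue) -> str:
    """Blue for informational issues, red for issues that belong to an incident."""
    return "blue" if issue.conforms_to_model else "red"


def should_notify(issue: Issue) -> bool:
    """Notifications are sent only for issues classified as incidents (red)."""
    return dashboard_color(issue) == "red"


# Example data, invented for illustration.
issues = [
    Issue("DNS lookup slow", conforms_to_model=True),
    Issue("Captive portal failure", conforms_to_model=False),
]
to_notify = [issue.name for issue in issues if should_notify(issue)]
print(to_notify)
```

In this sketch only the non-conforming issue would trigger a notification; the conforming one is still shown on the dashboard, but in blue.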
The model requires at least 20 active sensors and sufficient issue data to build relevant models. The model is recalculated every week.
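As a rough sketch of these requirements, the check below encodes the 20-sensor minimum and the weekly recalculation. The function and constant names are hypothetical, and because the amount of issue data that counts as "sufficient" is not specified, it is passed in as a plain boolean rather than computed.

```python
from datetime import datetime, timedelta

MIN_ACTIVE_SENSORS = 20               # minimum stated above
RECALC_INTERVAL = timedelta(weeks=1)  # the model is recalculated every week


def can_build_model(active_sensors: int, has_sufficient_issue_data: bool) -> bool:
    """Return True when both stated preconditions for building a model hold."""
    return active_sensors >= MIN_ACTIVE_SENSORS and has_sufficient_issue_data


def next_recalculation(last_recalculated: datetime) -> datetime:
    """Return the time one week after the previous recalculation."""
    return last_recalculated + RECALC_INTERVAL


print(can_build_model(active_sensors=25, has_sufficient_issue_data=True))
print(next_recalculation(datetime(2024, 1, 1)))
```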
As this feature's capabilities expand, more models will be added.
How to Enable AIOps Incident Detection
Go to Settings → AIOps and follow the wizard to enable the feature. Once enabled, the transition to AI Incident mode may take up to 15 minutes. This feature can be toggled on or off only once every 4 hours, so after you enable it, you may need to wait up to 4 hours before you can disable it.
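The 4-hour toggle limit can be worked through with a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, and it assumes the waiting period is measured from the moment of the last toggle.

```python
from datetime import datetime, timedelta

TOGGLE_COOLDOWN = timedelta(hours=4)  # limit stated above


def earliest_next_toggle(last_toggled: datetime) -> datetime:
    """Return the first moment the feature could be toggled again."""
    return last_toggled + TOGGLE_COOLDOWN


def can_toggle(last_toggled: datetime, now: datetime) -> bool:
    """Return True once the cooldown since the last toggle has elapsed."""
    return now >= earliest_next_toggle(last_toggled)


enabled_at = datetime(2024, 1, 1, 9, 0)
print(earliest_next_toggle(enabled_at))
print(can_toggle(enabled_at, datetime(2024, 1, 1, 12, 59)))
```

For example, under this assumption a feature enabled at 09:00 could not be disabled at 12:59 but could be at 13:00.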
How to View Historical Incidents
To view incidents from the past 30 days, select the bell icon at the top right of the main dashboard. The naming convention for incidents is Month/Year-Incident Number.
Select an Incident to navigate to the Incident View. This view shows the specific time period of the incident and where the sensors are located. You can rename the incident and drill down into the triage to better understand the issue.
Mutes - Mutes only affect the visual representation of the dashboard. They do not affect whether an issue can be added to an Incident or any notifications.
Please note that read-only users will receive incident notification messages regardless of group assignment.
Weekly reports currently cover issues only, but will eventually evolve to provide full support for incidents.
We are considering additional ML models and enhanced AIOps capabilities as we continue to improve upon this feature.