Anomaly Policy
The Anomaly policy in Motadata AIOps detects and alerts on anomalous behavior in system metrics, log data, and flow data. It uses anomaly detection algorithms to identify deviations from expected patterns and triggers an alert when abnormal behavior is detected. The policy is evaluated every 15 minutes, providing near-real-time insight into potential issues within the IT environment.
Anomaly detection is particularly useful for metrics that exhibit strong trends and recurring patterns, which makes them hard to monitor effectively with traditional threshold-based alerting. By taking trends into account, the Anomaly policy can identify deviations from expected behavior even in complex and dynamic environments.
Use Case
Imagine a scenario where a company's e-commerce website experiences a sudden surge in page load times. Typically, the website's performance remains consistent during peak hours, with an average page load time of 2 seconds. However, due to a technical glitch or increased user traffic, the page load times start fluctuating significantly, sometimes exceeding 10 seconds.
With the Anomaly Policy enabled, the system monitors the page load time metric continuously. During the regular evaluation intervals, it compares the current page load times with historical data and expected patterns. In this case, the policy detects the sudden spikes and prolonged delays in page load times, which deviate significantly from the normal range.
Upon detecting the anomaly, the Anomaly Policy triggers an alert, notifying the IT operations team about the performance degradation. The team can promptly investigate the issue, identify the root cause, and take appropriate actions to optimize the website's performance, ensuring a seamless user experience.
By leveraging the Anomaly Policy, organizations can identify unexpected variations in performance metrics, such as response times, latency, or throughput. This empowers them to proactively address issues, maintain high service levels, and enhance customer satisfaction.
Anomaly Policy Mechanism
Minimum Polling data required for the policy to work
To ensure the effectiveness of the Anomaly policy, a minimum of 8 hours of polling data is required for each monitored metric. This duration allows the alert engine to establish a baseline of expected values and intelligently determine the acceptable range for that metric. Any values that fall outside this range are considered anomalous and may trigger an alert if other conditions are met.
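The 8-hour requirement can be pictured as a simple pre-check on the polled timestamps. The following is a minimal sketch only: the function name `has_enough_history` is hypothetical, and measuring history as the span between the first and last poll is an assumption, not a description of how the alert engine actually decides.

```python
from datetime import datetime, timedelta

# Minimum history stated in the text above.
MIN_HISTORY = timedelta(hours=8)

def has_enough_history(timestamps, min_history=MIN_HISTORY):
    """Return True when the polled timestamps span at least min_history.

    Assumption: "8 hours of polling data" is interpreted as the time
    between the oldest and newest poll for the metric.
    """
    if not timestamps:
        return False
    return max(timestamps) - min(timestamps) >= min_history

start = datetime(2024, 1, 1, 0, 0)
ready = has_enough_history([start, start + timedelta(hours=8)])
too_short = has_enough_history([start, start + timedelta(hours=7, minutes=59)])
```

Until such a check passes, no baseline exists, so no value can be classified as anomalous for that metric.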
Polling values aggregation
To aggregate the polling values effectively, the alert engine consolidates all the polling values into a single sample point every half hour. This aggregation provides a more comprehensive view of the metric's behavior and facilitates accurate anomaly detection.
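As an illustration of the half-hourly consolidation, the sketch below groups polled values into 30-minute windows and reduces each window to one sample point. The function name `aggregate_half_hourly` is hypothetical, and using the arithmetic mean as the aggregation is an assumption; the documentation does not state which aggregate the alert engine applies.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def aggregate_half_hourly(polls):
    """Collapse (timestamp, value) polls into one sample per 30-minute window.

    Assumption: each window is represented by the mean of its polled values.
    """
    buckets = defaultdict(list)
    for ts, value in polls:
        # Floor the timestamp to the start of its half-hour window.
        window_start = ts.replace(minute=(ts.minute // 30) * 30,
                                  second=0, microsecond=0)
        buckets[window_start].append(value)
    # One (window_start, aggregated_value) pair per window, oldest first.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

polls = [
    (datetime(2024, 1, 1, 20, 0), 2.0),
    (datetime(2024, 1, 1, 20, 10), 4.0),
    (datetime(2024, 1, 1, 20, 35), 10.0),
]
samples = aggregate_half_hourly(polls)
```

Here the two polls at 20:00 and 20:10 fall into the same window and become one sample, while the 20:35 poll forms a second sample.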
Sample Lookup
The Anomaly policy offers flexibility through the Sample Lookup field, which determines the number of samples used for evaluating the policy. For example, if you set Sample Lookup to '30', the policy considers the last 30 samples for evaluation. This is explained in more detail in the 'Assumption Based Scenarios' section below.
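Conceptually, Sample Lookup selects a trailing window over the aggregated samples. A minimal sketch, assuming samples are kept in an oldest-first list (the function name `lookup_window` is hypothetical, and returning a shorter window when fewer samples exist is an assumption):

```python
def lookup_window(samples, sample_lookup):
    """Return the most recent `sample_lookup` samples.

    Assumption: if fewer samples are available, all of them are returned.
    """
    if sample_lookup <= 0:
        raise ValueError("sample_lookup must be a positive integer")
    return samples[-sample_lookup:]

history = list(range(50))           # stand-in for 50 aggregated samples
window = lookup_window(history, 30)  # the last 30 of them
```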
The Anomaly policy in Motadata AIOps empowers IT teams to proactively detect and respond to abnormal behavior in their IT infrastructure. By leveraging advanced anomaly detection algorithms and real-time monitoring, organizations can swiftly identify and address potential issues, ensuring optimal performance, and minimizing disruptions.
Navigation
Go to Menu, select Settings, and then go to Policy Settings. Select Metric/Log/Flow policy based on the type of policy you want to create. The list of created policies is displayed.
Select to start creating a policy. Select Anomaly Policy.
Configuring Anomaly policy
Enter the following parameters to create Anomaly policy:
Field | Description |
---|---|
Policy Name | Enter a unique name of the policy you want to create. |
Tag | Enter a name to logically categorize the policy. You can quickly and easily identify a policy based on the tag assigned to it. This tag can be used later on to filter the policies as per your requirement. |
Set Conditions
Field | Description |
---|---|
Counter | Select the metric for which you want to create the policy. Click on the dropdown to view the available options. |
Source Filter | - Select Monitor if you want to create the policy for a single monitor. - Select Group if you want to create the policy for a group of monitors. In case you create the policy for a group, it is configured for all the monitors present in the group individually. - Select Everywhere if you want to create the policy for all the monitors created in the system. This option is selected by default. |
Source | Select the specific Monitor or Group for which you want to create the policy. This dropdown shows results based on the option selected in Source Filter. Leave this field blank if you selected 'Everywhere' in Source Filter. Note: If you select specific monitors, make sure they have the metric for which you are creating the policy. If you select Everywhere, the policy is created for all the monitors in the system that have the selected metric. |
Critical/Major/Warning | Use these fields to set the criteria under which the alert is triggered. You can also set the alert severity based on the conditions you define. |
Sample Lookup | This field determines the number of samples used for evaluating the policy. |
Auto Clear | Enter the time after which the alert is cleared, irrespective of any other conditions. |
Assumption Based Scenarios
To further understand the last two parameters, let us consider a few scenarios with the following assumptions in mind:
Let us assume that the Sample Lookup is configured as '10'. This means that the policy considers the last 10 samples for policy evaluation.
The policy is configured to trigger a critical alert when more than 40% of the samples are anomalous, a major alert when more than 30% of the samples are anomalous, and a warning alert when more than 20% of the samples are anomalous, as shown in the screenshot below.
Let us consider a policy evaluation that starts at 8:00 PM (as explained earlier, policy evaluation for AI/ML policies occurs every 15 minutes).
With 10 samples, this configuration triggers a critical alert when 5 or more samples are anomalous (more than 40%), a major alert when exactly 4 samples are anomalous (more than 30%), and a warning alert when exactly 3 samples are anomalous (more than 20%). No alert is triggered if fewer than 3 samples are anomalous.
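The threshold arithmetic above can be sketched as follows. This is an illustrative sketch under the stated assumptions (strict "more than" comparisons and the 40/30/20 percent thresholds from this example); the function name `evaluate_severity` is hypothetical and is not part of the product.

```python
def evaluate_severity(anomaly_flags, critical=40, major=30, warning=20):
    """Map the share of anomalous samples in the lookup window to a severity.

    anomaly_flags: list of booleans, one per sample (True = anomalous).
    Thresholds are percentages and are compared with strict "more than".
    Returns "Critical", "Major", "Warning", or None when no alert applies.
    """
    total = len(anomaly_flags)
    if total == 0:
        return None
    anomalous = sum(anomaly_flags)
    # Integer comparison avoids floating-point rounding at the boundaries.
    for name, threshold in (("Critical", critical),
                            ("Major", major),
                            ("Warning", warning)):
        if anomalous * 100 > threshold * total:
            return name
    return None

def window(anomalous, size=10):
    """Build a lookup window with the given number of anomalous samples."""
    return [True] * anomalous + [False] * (size - anomalous)

results = {n: evaluate_severity(window(n)) for n in (5, 4, 3, 2)}
```

With a Sample Lookup of 10, `results` maps 5 anomalous samples to Critical, 4 to Major, 3 to Warning, and 2 to no alert, matching the counts given above.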
Scenario 1
In this case, the alert will be triggered with Critical severity based on the policy configuration mentioned above.
Scenario 2
In this case, the alert will be triggered with Major severity based on the policy configuration mentioned above.
Scenario 3
In this case, the alert will be triggered with Warning severity based on the policy configuration mentioned above.
Scenario 4
In this case, no alert will be triggered based on the policy configuration mentioned above.
Now, let us get back to other parameters to start creating the policy.
Notify Team
Field | Description |
---|---|
Notify | There are two ways you can populate this field: |
If severity is | Select the severity level using the individual checkboxes in the dropdown. You can select one, several, or all options as per your requirement. You can also have different recipients notified at different severity levels. For instance, you can notify johndoe@motadata.com when the severity level reaches Critical and send an alert notification to janedoe@motadata.com when the severity level is Major. |
Play Sound | Activate this toggle to enable sound notifications when an alert is triggered. |
If Severity is | Choose the severity level at which the sound notification should be triggered. This option becomes visible only when the Play Sound toggle is switched ON. |
Renotification | Turning on the toggle resends the alert at a user-defined interval if the alert severity has not changed for the specified time. If turned off, Motadata AIOps will not renotify about the alert. |
Renotify | Similar to the Notify Team field, enter the username or email address of the recipient. Also choose a preset duration for renotification along with the severity level at which the system will renotify you if the alert severity has not changed. |
Do not renotify if acknowledged | If the toggle is turned on, Motadata AIOps will not send a renotification to the recipient if they mark the alert as acknowledged. |
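
The interaction between the Renotification toggle, the renotify interval, and the 'Do not renotify if acknowledged' toggle can be sketched as a single decision function. This is a hedged illustration only: `should_renotify` and all of its parameters are hypothetical names, and it assumes that `last_notified` is reset whenever the alert severity changes.

```python
from datetime import datetime, timedelta

def should_renotify(now, last_notified, interval,
                    acknowledged=False,
                    skip_if_acknowledged=True,
                    enabled=True):
    """Decide whether a renotification is due for an unchanged alert.

    Assumption: last_notified is the time of the most recent notification
    at the current severity, so a severity change restarts the interval.
    """
    if not enabled:
        return False  # Renotification toggle is off.
    if acknowledged and skip_if_acknowledged:
        return False  # 'Do not renotify if acknowledged' is on.
    return now - last_notified >= interval

t0 = datetime(2024, 1, 1, 20, 0)
half_hour = timedelta(minutes=30)
due = should_renotify(t0 + half_hour, t0, half_hour)
suppressed = should_renotify(t0 + half_hour, t0, half_hour, acknowledged=True)
```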
Take Action
Field | Description |
---|---|
Action to be taken | Select a runbook from the dropdown to be executed when the alert is triggered. |
When Severity is | Use this option to map the action you selected in the previous step to the status of the alert. This means that you can execute different runbooks based on whether the alert is in the 'Down' state or the 'Clear' state. |
Create New | Select this button to start creating a new runbook which you might want to assign to the policy you are creating. |
Select the Create Policy button to create the policy based on the details entered.
Select the Reset button to erase all the current field values, if required.