Macros in Alerts & Policies
You can customize your alert messages by including pre-defined macros. These macros serve as placeholders that are automatically replaced with actual values when an alert is triggered. By leveraging macros, you can tailor your alert messages to provide precise information about the event that triggered the alert, enabling quicker and more informed decision-making.
Navigation
While configuring a policy, click on the Set Alert Message drop-down to modify the default alert messages. Here, you can use the pre-defined macros to customize the alert message and the subject.
How to Use Macros
To use macros in your alert messages, simply include them in your customized alert message. When the alert is triggered, each macro is replaced with the actual value associated with the event.
Below is an example of how you can incorporate Macros in your alert message:
An alert $$$policy.name$$$ was triggered with $$$severity$$$ severity for the monitor $$$object.name$$$ (IP: $$$object.ip$$$) because the $$$counter$$$ breached the threshold with the value $$$value$$$.
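Conceptually, the substitution works like a template render: each `$$$name$$$` token is looked up in the event data and replaced with its value. The sketch below is a hypothetical illustration of that mechanism, not Motadata's actual rendering engine; the event field names and values are examples only.

```python
import re

def render_alert(template: str, values: dict) -> str:
    """Replace each $$$name$$$ placeholder with the matching event value.

    A macro with no matching value is left untouched so the gap is visible.
    """
    return re.sub(
        r"\$\$\$(.+?)\$\$\$",
        lambda m: str(values.get(m.group(1), m.group(0))),
        template,
    )

# Example event data; the keys mirror the macro names used in the template.
event = {
    "policy.name": "High CPU Usage",
    "severity": "CRITICAL",
    "object.name": "web-server-01",
    "object.ip": "10.0.0.5",
    "counter": "system.cpu.percent",
    "value": "96.0",
}

message = render_alert(
    "An alert $$$policy.name$$$ was triggered with $$$severity$$$ severity "
    "for the monitor $$$object.name$$$ (IP: $$$object.ip$$$) because the "
    "$$$counter$$$ breached the threshold with the value $$$value$$$.",
    event,
)
```

With the sample event above, `message` reads: "An alert High CPU Usage was triggered with CRITICAL severity for the monitor web-server-01 (IP: 10.0.0.5) because the system.cpu.percent breached the threshold with the value 96.0."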
Supported Macros
Here's a list of supported macros along with descriptions of what each displays in the actual alert message:
| Macro | Description |
|---|---|
| $$$policy.trigger.time$$$ | The exact time when the alert was triggered. |
| $$$object.name$$$ | The name of the monitor that triggered the policy. |
| $$$object.ip$$$ | The IP address of the monitor that triggered the policy. |
| $$$object.host$$$ | The host name of the monitor that triggered the policy. |
| $$$object.type$$$ | The type of monitor. |
| $$$counter$$$ | The counter for which the alert is triggered. |
| $$$value$$$ | The value of the counter at which the alert is triggered. |
| $$$severity$$$ | The severity level of the triggered alert. |
| $$$policy.name$$$ | The name of the policy that triggered the alert. |
| $$$policy.type$$$ | The type of policy that triggered the alert. |
| $$$object.groups$$$ | The monitor group for which the alert is triggered. |
| $$$instance$$$ | The specific instance for which the alert is triggered. |
| $$$active.since$$$ | The duration of the alert in its current severity state. |
| $$$trigger.condition$$$ | The policy evaluation criteria for the received alert. |
| $$$counter.description$$$ | A short description of the selected counter (what the metric represents). |
| $$$counter.interpretation.high$$$ | What a high value of the counter indicates. |
| $$$counter.interpretation.low$$$ | What a low value of the counter indicates. |
| $$$counter.rootcause$$$ | Common reasons why the counter may breach or behave abnormally. |
| $$$counter.recommended.action$$$ | Recommended next steps to investigate and resolve the issue. |
| $$$counter.threshold.guidance$$$ | Provides threshold guidance for major, warning, and critical limits when the counter breaches. |
| $$$counter.related.metrics$$$ | Related counters/metrics that can help with deeper analysis and correlation. |
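Because a typo in a macro name (for example, `$$$object.iq$$$` instead of `$$$object.ip$$$`) would pass through to the alert unreplaced, it can help to check a message against the supported list before saving it. The sketch below is a hypothetical checker built from the table above; note that a real checker would also need to accept the instance-level counters described in the next section.

```python
import re

# Macro names taken from the supported-macros table above.
SUPPORTED = {
    "policy.trigger.time", "object.name", "object.ip", "object.host",
    "object.type", "counter", "value", "severity", "policy.name",
    "policy.type", "object.groups", "instance", "active.since",
    "trigger.condition", "counter.description",
    "counter.interpretation.high", "counter.interpretation.low",
    "counter.rootcause", "counter.recommended.action",
    "counter.threshold.guidance", "counter.related.metrics",
}

def unknown_macros(template: str) -> set:
    """Return the macro names used in the template that are not supported."""
    used = set(re.findall(r"\$\$\$(.+?)\$\$\$", template))
    return used - SUPPORTED

# The typo "object.iq" is flagged; "policy.name" passes.
bad = unknown_macros("Alert $$$policy.name$$$ on $$$object.iq$$$")
```

Here `bad` contains only `object.iq`, pointing straight at the misspelled macro.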
Motadata AIOps also supports all instance-level counters as macros. Let's look at an example of how to use instance counters as macros:
For this scenario, assume you have created a policy for interface.in.traffic.utilization.percent. You can then use the instance counters below as macros in your alert message:
| Instance Counter | Description |
|---|---|
| interface.sent.discard.packets | The number of packets discarded during transmission on the interface. |
| interface.in.packets | Total number of packets received on the interface. |
| interface.packets | Combined count of all packets (sent and received) on the interface. |
| interface.error.packets | Total number of packets with errors encountered on the interface. |
| interface.sent.error.packets | The number of outgoing packets that encountered errors during transmission. |
| interface.received.discard.packets | The number of received packets that were discarded. |
| interface.received.octets | The number of bytes received on the interface. |
| interface.bit.type | The interface speed or bit rate (bps). |
| interface.status | Current status of the interface. |
| interface.out.packets | The number of packets sent out through the interface. |
| interface.operational.status | Operational state of the interface. |
| interface.admin.status | Administrative status set for the interface. |
| interface.sent.octets | The number of bytes sent out from the interface. |
| interface.last.change | Timestamp or system ticks indicating when the interface last changed status. |
| interface.received.error.packets | Number of incoming packets that contained errors. |
| interface.discard.packets | Total number of discarded packets. |
| interface.in.traffic.utilization.percent | Utilization percentage of inbound traffic relative to interface bandwidth. |
| interface.out.traffic.utilization.percent | Utilization percentage of outbound traffic relative to interface bandwidth. |
Below is an example of how you can incorporate instance counters as macros in your alert message:
$$$counter$$$ has entered a $$$severity$$$ state with value $$$value$$$ on $$$object.host$$$ ($$$object.ip$$$)
Here’s what this alert indicates:
$$$counter$$$ $$$counter.description$$$ $$$counter.interpretation.high$$$ $$$counter.interpretation.low$$$
This situation often arises due to: $$$counter.rootcause$$$.
To fix this: $$$counter.recommended.action$$$
For further diagnosis, analyze related metrics like $$$counter.related.metrics$$$; these will give you a broader picture of your system’s behavior and confirm recovery.
Below is the message that will be received when the alert is triggered:
system.cpu.percent has entered a critical state with value 96.0 on jay-patel-Latitude-5430 (10.246.56.226)
Here's what this alert indicates:
system.cpu.percent: Usage above 80% indicates CPU saturation and potential performance issues. Below 5% means the CPU is mostly idle, which is normal during off-peak times.
This situation often arises due to: When High: CPU-intensive processes, Runaway applications, Insufficient cores, Host oversubscription in virtual environments; When Low: Idle workload, Lack of incoming traffic, Misconfigured monitoring agent.
To fix this: Investigate top processes; consider optimization or scaling resources.
For further diagnosis, analyze related metrics like system.load.avg5.min, system.memory.used.percent, system.processor.queue.length; these will give you a broader picture of your system’s behavior and confirm recovery.
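The expansion from the multi-line template to a message like the one above can be sketched with simple string replacement. This is a hypothetical illustration; the field values below reuse the sample alert from this section and are not pulled from a live system.

```python
# A trimmed version of the multi-line template from earlier in this section.
template = (
    "$$$counter$$$ has entered a $$$severity$$$ state with value "
    "$$$value$$$ on $$$object.host$$$ ($$$object.ip$$$)\n"
    "This situation often arises due to: $$$counter.rootcause$$$."
)

# Illustrative event values matching the sample alert message above.
values = {
    "counter": "system.cpu.percent",
    "severity": "critical",
    "value": "96.0",
    "object.host": "jay-patel-Latitude-5430",
    "object.ip": "10.246.56.226",
    "counter.rootcause": "CPU-intensive processes",
}

# Substitute each macro token with its value, one at a time.
rendered = template
for name, val in values.items():
    rendered = rendered.replace(f"$$${name}$$$", val)
```

After the loop, `rendered` begins with "system.cpu.percent has entered a critical state with value 96.0 on jay-patel-Latitude-5430 (10.246.56.226)", matching the structure of the received message shown above.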
With these Macros, you can create customized alert messages tailored to your specific requirements, ensuring that you receive the most relevant information when alerts are triggered.