
Alert Correlation in ObserveOps — Investigate Metric Alerts in Context

Alert Correlation in ObserveOps (formerly known as AIOps) lets you investigate a triggered metric alert alongside its related metrics, logs, other simultaneous alerts, and recurring log patterns. You open it directly from the alert detail screen, so no context switching is needed.

How Alert Correlation Works

When you trigger Alert Correlation on a metric alert, ObserveOps queries the same monitor for correlated data within the alert's active time window. It surfaces related metric counters, ingested logs, simultaneous alerts, and detected log patterns in four tabs. Each tab narrows the root-cause investigation without leaving the alert context.
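The idea of correlating by monitor and time window can be illustrated with a short conceptual sketch. This is not ObserveOps code or its API: the `Record` type, its fields, the `kind` labels, and the sample data are all hypothetical, chosen only to show how records from one monitor could be grouped when their timestamps fall inside the alert's active window.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Record:
    monitor: str         # monitor name or IP (hypothetical field)
    timestamp: datetime
    kind: str            # "metric", "log", or "alert" (hypothetical labels)
    summary: str


def correlate(records, monitor, start, end):
    """Group records from one monitor that fall inside [start, end] by kind."""
    grouped = {}
    for record in records:
        if record.monitor == monitor and start <= record.timestamp <= end:
            grouped.setdefault(record.kind, []).append(record)
    return grouped


# Sample data, invented for illustration only.
records = [
    Record("debianv13", datetime(2024, 1, 1, 10, 5), "metric", "process cpu spike"),
    Record("debianv13", datetime(2024, 1, 1, 10, 7), "log", "worker restarted"),
    Record("debianv13", datetime(2024, 1, 1, 12, 0), "log", "outside the window"),
    Record("other-host", datetime(2024, 1, 1, 10, 6), "alert", "different monitor"),
]
window_start = datetime(2024, 1, 1, 10, 0)
window_end = datetime(2024, 1, 1, 10, 30)
result = correlate(records, "debianv13", window_start, window_end)
```

Here `result` holds one metric record and one log record. The 12:00 log entry and the record from the other host are excluded, which mirrors how each tab is scoped to the same monitor and the selected time range.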

Prerequisites

  • At least one metric alert must be active or recently fired.
  • The monitor must be instrumented and sending data to ObserveOps.
  • To see correlated logs and log patterns, logs must be forwarded from the source monitor.

Go to Menu and select Alerts. The Alerts screen opens showing a severity summary strip — Down, Unreachable, Critical, Major, and Warning — across the top.

Select the Metric tab to view metric alerts. The screen shows:

  • Metric Alert Overview — a donut chart breaking down alerts by Availability and Metric Threshold policy types.
  • Metric Alert By Policy Type — a bar chart comparing counts across policy types and severities.
  • Today Alert Trend — a time-series bar chart showing alert distribution across the current day.
  • Device category rings — per-category counts for Server, Database, Service Check, and other monitor types.

Alerts screen showing the Metric tab with overview donut chart, policy type bar chart, today alert trend, and device category rings

Click any chart segment or device ring to filter the alert list. You can also click the grid view icon in the top-right corner to switch to the full alert list directly.

Alert Correlation List

Metric alert grid view showing columns for Alert, Type, Monitor, Instance, Incident number, Last Seen, Value, Duration, and Acknowledged By

The alert list shows all active metric alerts. Each row includes:

| Column | Description |
| --- | --- |
| Alert | The policy name that triggered the alert. |
| Type | Alert type — Availability or Metric Threshold. |
| Monitor | The monitor on which the alert fired. |
| Instance | The specific instance, if applicable. |
| Incident # | The incident identifier. |
| Last Seen | Timestamp of the most recent occurrence. |
| Value | Metric value when the alert triggered. |
| Duration | Time the alert has been in its current state. |
| Acknowledged By | The user who acknowledged the alert, if any. |

Click any row in the alert list to open its Alert Detail screen.

Alert detail screen for CPU Utilization showing monitor metadata, metric details, alert flap history timeline, history log, and metric trend chart with threshold lines and a Correlation button in the top right

The detail screen shows monitor metadata and alert context:

| Field | Description |
| --- | --- |
| Monitor | Name of the monitor that triggered the alert. |
| IP | IP address of the monitor. |
| Host | Hostname of the monitor. |
| Instance | Instance name for instance-level alerts. |
| Group | Monitor group(s) the device belongs to. |
| Tag | Tags associated with the monitor. |
| Alert ID | Unique identifier for this alert. |
| Alert Type | Metric Threshold or Availability. |
| Metric | The counter that breached the threshold. |
| Trigger Condition | The condition that caused the alert to fire. |
| Metric Value | The value at the time of breach. |
| First Seen | When this alert first triggered. |
| Active Since | How long the alert has been continuously active. |
| Flap Count | Number of times the alert changed state. |
| Acknowledge By | User who acknowledged the alert, if any. |

Below the fields, you see:

  • Alert Flap History — a colour-coded bar showing severity state changes over time.
  • History — a chronological list of state transitions with timestamps and metric values.
  • Metric Trend — a live chart showing the metric value over the selected time window with threshold markers overlaid.
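To make the Flap Count field concrete, the following sketch counts state changes in an ordered list of severities, as they might appear in the History list. It is an illustration under assumptions: the function name and the sample states are hypothetical, and the source does not say whether ObserveOps counts the initial trigger as a state change.

```python
def flap_count(states):
    """Count how many times consecutive entries in `states` differ."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


# Hypothetical severity history, oldest first.
history = ["Clear", "Critical", "Critical", "Major", "Clear"]
changes = flap_count(history)
```

Repeated entries with the same severity (the two consecutive `Critical` values) do not add to the count. Only the transitions between different states do.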

The Correlation View

Click the Correlation button in the top-right corner of the Alert Detail screen.

The Correlation view opens for that alert. It shows:

  • The alert header with the policy name, IP address, and monitor name.
  • A time range selector in the top-right corner. It defaults to the alert's active window; adjust it using presets or a custom range.
  • A metric trend chart showing the breaching counter across the selected window with threshold markers.
  • Four correlation tabs below the chart.

Correlated Metrics

The Correlated Metrics tab opens by default. It displays related metric charts from the same monitor during the alert's time window.

Alert Correlation screen showing the Correlated Metrics tab with process statistics charts including Top Processes by Forecast Threads, Top Processes by CPU Utilization, Top Processes by Memory Utilization, and others

Charts appear under Process Statistics — covering process-level CPU, memory, virtual memory, I/O, and network throughput trends. If data is unavailable for a chart, it shows No data found. Use this tab to identify whether other counters on the same device behaved abnormally at the same time.

Correlated Logs

The Correlated Logs tab shows log entries from the same monitor ingested during the alert's time window.

Alert Correlation screen showing the Correlated Logs tab with an empty state message indicating no logs are available for correlation from the source IP

If logs are not being forwarded from the source monitor, the tab shows:

  • No logs available for correlation
  • Logs are not being ingested from this source, so correlated log insights cannot be shown for this alert. Start forwarding logs from <IP> to automatically correlate logs with alerts.

When logs are available, they display in a searchable list filtered to the alert's time range.

Correlated Alerts

The Correlated Alerts tab lists other metric alerts that were active on the same monitor during the alert's time window.

Alert Correlation screen showing the Correlated Alerts tab with a Metric Alerts table listing CPU Utilization as a Metric Threshold alert with metric system.cpu.percent at 96%

Alerts appear under Metric Alert(s) with the following columns:

| Column | Description |
| --- | --- |
| Alert | Name of the correlated alert. |
| Type | Alert type — Metric Threshold or Availability. |
| Metric | The counter for the correlated alert. |
| Instance | The instance, if applicable. |
| Incident Details | Incident reference, if linked. |
| Last Seen | Timestamp of the most recent occurrence. |
| Value | Metric value at the time of the alert. |

Use the Search bar to filter by alert name. Use the icons in the top-right to refresh, export, or filter the results.
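The selection rule for this tab, alerts active on the same monitor during the time window, amounts to an interval-overlap check. The sketch below shows that check under assumptions: the `Alert` type, its fields, and the sample alerts are hypothetical and do not reflect how ObserveOps stores or queries alerts.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    name: str
    monitor: str
    first_seen: datetime
    cleared: Optional[datetime]  # None means the alert is still active


def was_active_during(alert, start, end):
    """True if the alert's active interval overlaps the window [start, end]."""
    still_open_or_ended_late = alert.cleared is None or alert.cleared >= start
    return alert.first_seen <= end and still_open_or_ended_late


def alerts_on_monitor(alerts, monitor, start, end):
    """Return alerts on `monitor` that were active at any point in the window."""
    return [a for a in alerts
            if a.monitor == monitor and was_active_during(a, start, end)]


# Sample alerts, invented for illustration only.
alerts = [
    Alert("CPU Utilization", "debianv13", datetime(2024, 1, 1, 10, 0), None),
    Alert("Memory Utilization", "debianv13",
          datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)),
    Alert("Disk Utilization", "debianv13",
          datetime(2024, 1, 1, 9, 50), datetime(2024, 1, 1, 10, 10)),
    Alert("CPU Utilization", "other-host", datetime(2024, 1, 1, 10, 5), None),
]
window = (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30))
matched = [a.name for a in alerts_on_monitor(alerts, "debianv13", *window)]
```

In this sample, the memory alert cleared before the window opened and the last alert belongs to a different monitor, so neither is matched. The disk alert is matched because it was still active when the window began.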

Log Pattern

The Log Pattern tab analyses logs correlated with the alert and surfaces recurring patterns. It helps you identify repeated error messages or common signatures that may contribute to the alert condition.

Alert Correlation screen showing the Log Pattern tab with a table showing Count, Severity, and Pattern columns with No records available message and 0 patterns found from 0 logs summary

A summary line shows the total pattern and log counts — for example, 0 patterns found from 0 logs.

The table shows:

| Column | Description |
| --- | --- |
| Count | Number of log lines matching this pattern within the alert's time window. |
| Severity | Severity level of log entries matching the pattern. |
| Pattern | The detected recurring log pattern string. |

If no patterns are found, the table shows No records available!

Note: Log Pattern detection runs only on logs correlated with the selected alert and its time window. It uses the same pattern detection engine available in Log Search and APM Log Correlation.
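As a rough illustration of what a recurring log pattern is, the sketch below masks variable tokens (IP addresses and numbers) and counts lines that share the resulting template. This is a deliberately simple stand-in and not the ObserveOps pattern detection engine, whose algorithm is not described here. The function names, placeholders, and sample log lines are all hypothetical.

```python
import re
from collections import Counter

# Mask IPv4-looking tokens first, then any remaining standalone numbers.
_IP = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
_NUM = re.compile(r"\b\d+\b")


def to_pattern(line):
    """Replace variable tokens so similar lines collapse to one template."""
    return _NUM.sub("<NUM>", _IP.sub("<IP>", line))


def top_patterns(lines):
    """Return (pattern, count) pairs, most frequent first."""
    return Counter(to_pattern(line) for line in lines).most_common()


# Sample log lines, invented for illustration only.
logs = [
    "Connection from 10.0.0.5 failed after 3 retries",
    "Connection from 10.0.0.9 failed after 12 retries",
    "Disk check ok",
]
patterns = top_patterns(logs)
summary = f"{len(patterns)} patterns found from {len(logs)} logs"
```

The two connection failures collapse into a single template with a count of 2, while the remaining line forms its own pattern with a count of 1. A narrower time window supplies fewer lines, so fewer lines repeat and fewer patterns emerge.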

Example

Your team gets a CPU Utilization alert on debianv13 at 172.16.15.65. You click the alert row, open the detail screen, and see a flap count of 68 with the metric sitting at 96%. You click Correlation and check the Correlated Metrics tab — process CPU and memory charts reveal a runaway process spiking at the same time. You check Correlated Alerts to confirm no other threshold has fired simultaneously. The root cause is isolated in under two minutes.

Troubleshooting

Correlated Logs tab shows "No logs available for correlation"

Cause: Logs are not being forwarded from the source monitor to ObserveOps.

Fix: Set up log forwarding from the monitor's host. Go to Menu > Settings > Log Management and configure a log collector or forwarder for the source IP shown in the empty state message.

Log Pattern tab shows "No records available" even though logs are available

Cause: The correlated logs within the alert's time window do not contain enough repeated entries for a pattern to be detected.

Fix: Widen the time range using the selector in the top-right corner. A narrow time window means fewer logs are analysed, so pattern detection may return no results.

Correlated Metrics charts show "No data found"

Cause: The monitor did not collect process-level metrics during the selected time window, or those counters are not enabled in the monitor's configuration.

Fix: Confirm the monitor is actively polling and that process monitoring is enabled in Monitor Settings for the device.

Known Limitations

  • Alert Correlation is available only for metric alerts. Log, Flow, Trap, APM, and other alert types do not have a Correlation view.
  • Correlated Logs and Log Pattern tabs require active log forwarding from the source monitor. ObserveOps does not retroactively correlate logs forwarded after the alert fired.
  • The Correlation view uses the alert's active time window. Alerts that have already cleared may show limited data if retention settings are restrictive.