The flow of alert rules

lucabello · 25 February 2025 10:18

In the Juju world, charms bundle their own alert rules. With COS, you only need to configure Alertmanager to start receiving alerts for the charms you want to observe. But how does that work?

Alert rules can be defined for both metrics and logs. This document will only follow the flow for metrics alert rules, but the process for log alert rules works similarly.

[Figure: alert rules flow diagram (alertrulesflow.drawio)]

Aggregating the alert rules

Whether you’re observing your charms via Grafana Agent or directly with Prometheus, the alert rules are aggregated and saved to disk by the Prometheus charm.

When you run juju relate <your-charm> prometheus (or relate your charm to grafana-agent instead), your charm sends its alert rules over relation data, using either the metrics-endpoint or receive-remote-write interfaces. You can think of Grafana Agent as a pre-aggregation point: it gathers alert rules and forwards them to Prometheus.
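As a rough illustration of what travels over the relation, the sketch below JSON-encodes a rule group into a databag and merges the groups from several databags, the way a pre-aggregation point might. The databag key alert_rules, the helper names, and the sample rule are assumptions made for illustration; they are not taken from the charm libraries.

```python
import json

# A hypothetical rule group, in the standard Prometheus rule-file shape.
sample_group = {
    "name": "my_charm_alerts",
    "rules": [
        {
            "alert": "MyCharmTargetDown",
            "expr": "up == 0",
            "for": "5m",
            "labels": {"severity": "critical"},
        }
    ],
}


def to_relation_data(groups):
    """Serialize rule groups into a string-valued databag entry."""
    # Relation data values are strings, so the groups are JSON-encoded.
    # The key name "alert_rules" is an assumption of this sketch.
    return {"alert_rules": json.dumps({"groups": groups})}


def merge_rule_groups(databags):
    """Collect the rule groups found in several databags into one list."""
    merged = []
    for databag in databags:
        payload = json.loads(databag.get("alert_rules", "{}"))
        merged.extend(payload.get("groups", []))
    return merged
```

Databags without any alert rules simply contribute nothing to the merged list.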

After the charms have settled, and all the alert rules are in relation data, Prometheus saves them to disk by creating one YAML file per relation under /etc/prometheus/rules. As relations are processed one by one, each rule file is validated with cos-tool to make sure it’s well-formed.
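The "one file per relation, validated as it is written" step could look roughly like the sketch below. It is not the charm's actual code. The is_well_formed function is a simplified stand-in for the cos-tool check, and the file naming scheme and the choice to skip invalid files are assumptions. JSON is written instead of hand-formatted YAML only to keep the sketch dependency-free, since JSON documents are also valid YAML.

```python
import json
import tempfile
from pathlib import Path


def is_well_formed(rule_file):
    """Stand-in for the cos-tool check: verify the basic rule-file shape."""
    groups = rule_file.get("groups")
    if not isinstance(groups, list):
        return False
    for group in groups:
        if not isinstance(group, dict) or "name" not in group:
            return False
        for rule in group.get("rules", []):
            # An alerting rule needs at least a name and an expression.
            if not isinstance(rule, dict) or "alert" not in rule or "expr" not in rule:
                return False
    return True


def write_rule_files(rules_by_relation, rules_dir):
    """Write one file per relation, skipping any that fail validation."""
    rules_dir = Path(rules_dir)
    rules_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for relation_id, rule_file in sorted(rules_by_relation.items()):
        if not is_well_formed(rule_file):
            continue  # assumption: invalid rule files are dropped
        # Hypothetical naming scheme, one file per relation ID.
        path = rules_dir / f"relation_{relation_id}.rules"
        path.write_text(json.dumps(rule_file, indent=2))
        written.append(path)
    return written


# Usage: write into a throwaway directory instead of /etc/prometheus/rules.
good = {"groups": [{"name": "g", "rules": [{"alert": "A", "expr": "up == 0"}]}]}
bad = {"groups": [{"name": "g", "rules": [{"alert": "A"}]}]}  # missing expr
with tempfile.TemporaryDirectory() as tmp:
    written_names = [p.name for p in write_rule_files({1: good, 2: bad}, tmp)]
```

In this usage example only the rule file for relation 1 is written, because the one for relation 2 has a rule without an expression.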

Using the alert rules

The alert rules are now saved to disk by Prometheus: now what?

The Prometheus configuration points to the /etc/prometheus/rules folder, so the rule files are picked up. From then on, Prometheus continuously evaluates all alert rules (i.e., it checks whether they’ve been triggered).
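To make "constantly evaluating" a bit more concrete, here is a toy model of how a single alert rule moves between states across evaluation cycles. It relies on standard Prometheus behaviour that this post does not cover: a rule with a `for` duration is first pending and only starts firing once its condition has held for that long. The function name evaluate and the idea of counting whole cycles instead of measuring time are simplifications assumed for this sketch.

```python
def evaluate(condition_history, for_intervals):
    """Return the alert state after each evaluation cycle.

    condition_history: one bool per cycle, True when the rule's
        expression matched during that cycle.
    for_intervals: number of cycles the condition must already have held
        before the alert fires (a simplification of the `for` duration).
    """
    states = []
    active_cycles = 0
    for matched in condition_history:
        if not matched:
            # The condition cleared, so the alert resets.
            active_cycles = 0
            states.append("inactive")
            continue
        active_cycles += 1
        if active_cycles > for_intervals:
            states.append("firing")
        else:
            states.append("pending")
    return states


# The condition holds for three cycles, then clears.
history = [False, True, True, True, False]
states = evaluate(history, for_intervals=2)
# states == ["inactive", "pending", "pending", "firing", "inactive"]
```

With for_intervals set to 0, the alert fires on the first cycle in which the expression matches.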

When Prometheus detects that an alert rule has been triggered, it notifies Alertmanager over its HTTP API, reporting what triggered, when, and how. Assuming you’ve configured it with your preferred notification channels, Alertmanager then sends notifications to make you aware of the alert.
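For a feel of what that HTTP call carries, the sketch below builds a request body in the shape accepted by Alertmanager's v2 API (a POST to /api/v2/alerts takes a JSON array of alerts). Prometheus does this for you. The helper name build_alert_payload and the sample labels and annotations are made up for illustration, and nothing is sent over the network.

```python
import json
from datetime import datetime, timezone


def build_alert_payload(name, labels, annotations, started_at):
    """Build a JSON body for a POST to Alertmanager's /api/v2/alerts."""
    alert = {
        # "what": the alert name travels as the alertname label.
        "labels": {"alertname": name, **labels},
        # "how": free-form, human-readable details.
        "annotations": annotations,
        # "when": an RFC 3339 timestamp.
        "startsAt": started_at.isoformat(),
    }
    # The endpoint expects a list, even for a single alert.
    return json.dumps([alert])


body = build_alert_payload(
    name="MyCharmTargetDown",  # hypothetical alert name
    labels={"severity": "critical"},
    annotations={"summary": "Target has been down for 5 minutes"},
    started_at=datetime(2025, 2, 25, 10, 18, tzinfo=timezone.utc),
)
```

Alertmanager groups and routes incoming alerts by their labels, which is why the alert name is carried as a label and not as a separate field.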