Juju topology labels

Juju topology labels are telemetry labels that identify the origin of metrics and logs in Juju models. In other words, they are the fingerprint of a unit, in some Juju model, that is emitting telemetry. When you have hundreds or thousands of nodes, it is essential to be able to locate the one unit that has been emitting alerts. For this reason, Juju topology labels play a key role in the Canonical Observability Stack (COS).

See also: Model-driven observability: the magic of Juju topology for metrics

This is what the Juju topology labels look like:

        labels:
          model: "some-juju-model"
          model_uuid: "00000000-0000-0000-0000-000000000001"
          application: "fancy-juju-application"
          unit: "fancy-juju-application/0"
          charm_name: "fancy-juju-application-k8s"
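As an illustration of how such a fingerprint can be attached to a metric sample, here is a minimal sketch. The `Topology` class and the `label_sample` helper are hypothetical names made up for this example and are not part of the COS charm libraries; the label keys are copied verbatim from the listing above, and the names emitted by a real deployment may differ.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical container for the five labels listed above."""

    model: str
    model_uuid: str
    application: str
    unit: str
    charm_name: str

    def as_labels(self) -> dict:
        # One label per field, keyed by the field name.
        return asdict(self)


def label_sample(sample_labels: dict, topology: Topology) -> dict:
    """Merge a sample's own labels with the topology labels.

    Assumption for this sketch: topology labels win on a name clash.
    """
    return {**sample_labels, **topology.as_labels()}


topology = Topology(
    model="some-juju-model",
    model_uuid="00000000-0000-0000-0000-000000000001",
    application="fancy-juju-application",
    unit="fancy-juju-application/0",
    charm_name="fancy-juju-application-k8s",
)

# A sample as the workload might emit it, before any topology is attached.
labelled = label_sample({"__name__": "http_requests_total", "code": "500"}, topology)
```

With the labels merged in, a query can narrow thousands of series down to the single unit that emitted them.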

The COS charm libraries that wrap the observability relation endpoints inject these labels into every outgoing metric, log, trace and dashboard, so the charm using them does not have to be aware of this at all. Any charm can ship dashboards and (alert) rules, and integrate with COS, to monitor its lifecycle. These workload-specific dashboards and rules are the so-called “built-in” ones. When the charm is deployed and related to COS, the charm libraries mediating that integration automatically inject the Juju topology labels into all built-in dashboards and rules.

The following sections outline what this means in practice, and which juju-topology-related modifications are applied to the built-in rules and dashboards.

Dashboards

Depending on whether the charm where the dashboards reside is related directly to grafana-k8s, or whether the data flows through grafana-agent or cos-proxy, there are subtle differences in how the topology is injected.

Charms relating directly to grafana-k8s

Built-in dashboards are enriched with topology drop-downs, which allow filtering dashboard data by topology labels. You can opt out of this behaviour by calling the ._reinitialize_dashboard_data(inject_dropdowns=False) method on the GrafanaDashboardProvider relation wrapper object.
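To give a feel for what "enriched with topology drop-downs" can mean at the dashboard JSON level, the sketch below appends one Grafana template variable per topology label. This is not the charm library's implementation: the `inject_dropdowns` function, the choice of labels and the variable settings are all assumptions made for illustration, and a real variable would normally also name a datasource.

```python
import copy

# Assumption: which topology labels get a drop-down is chosen here for illustration.
DROPDOWN_LABELS = ("model", "model_uuid", "application", "unit")


def inject_dropdowns(dashboard: dict, labels=DROPDOWN_LABELS) -> dict:
    """Return a copy of the dashboard with one query variable per label."""
    result = copy.deepcopy(dashboard)
    variables = result.setdefault("templating", {}).setdefault("list", [])
    existing = {variable.get("name") for variable in variables}
    for label in labels:
        if label in existing:
            continue  # never duplicate a variable the dashboard already has
        variables.append(
            {
                "name": label,
                "label": label,
                "type": "query",
                # Populate the drop-down with every value seen for this label.
                "query": f"label_values({label})",
                "multi": True,
                "includeAll": True,
            }
        )
    return result


dashboard = {"title": "Fancy application", "panels": []}
enriched = inject_dropdowns(dashboard)
```

Panel queries can then reference the selected values, for example with a matcher such as `application=~"$application"`.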

Charms relating through cos-configuration

Incidental dashboards coming in from a git-repo via the cos-configuration charm are left intact.

Charms relating through grafana-agent (-k8s or not)

When dashboards are forwarded through a grafana-agent intermediary, the Juju topology labels of the charm of origin are injected (and not grafana-agent’s). Any subsequent chaining to additional grafana-agent charms leaves the labels intact.

Charms relating through cos-proxy

TODO: what happens to dashboards via cos-proxy

Alert rules

For built-in alert rules,

  • Alert exprs are qualified with topology labels. This way, built-in alerts fire only for telemetry coming from that particular application.
  • Alert labels are enriched with topology labels. This makes a rendered alert easier to read when it is presented to an on-caller. The labels are also visible in the alert’s rendered expr, but alert labels are more convenient to read.
  • Alert rules are NOT enriched with the unit label, because alert rules are forwarded to prometheus/loki per related app, not per unit. That is, having multiple units does not result in prometheus holding a duplicate of each alert per unit. If an alert were qualified with a unit (which one?), we would not get alerts from any of the other units.
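The three points above can be sketched as follows. This is a deliberately naive illustration, not how the charm libraries do it: rewriting arbitrary PromQL or LogQL needs a real parser, whereas the hypothetical `enrich_rule` below only understands an expr that starts with a bare metric name, optionally followed by `{...}` matchers. Label keys are taken verbatim from the listing at the top of this page.

```python
import re

TOPOLOGY = {
    "model": "some-juju-model",
    "model_uuid": "00000000-0000-0000-0000-000000000001",
    "application": "fancy-juju-application",
    "unit": "fancy-juju-application/0",
    "charm_name": "fancy-juju-application-k8s",
}

# Leading metric name, optional {matchers}, then the rest of the expression.
_LEADING_SELECTOR = re.compile(
    r"^\s*(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<matchers>[^}]*)\})?(?P<rest>.*)$"
)


def enrich_rule(rule: dict, topology: dict) -> dict:
    """Qualify the expr and enrich the labels, leaving out the unit label."""
    app_topology = {k: v for k, v in topology.items() if k != "unit"}

    match = _LEADING_SELECTOR.match(rule["expr"])
    # Function calls such as rate(...) are beyond this sketch.
    if match is None or match.group("rest").lstrip().startswith("("):
        raise ValueError("expr too complex for this sketch: " + rule["expr"])

    injected = ",".join(f'{k}="{v}"' for k, v in app_topology.items())
    existing = (match.group("matchers") or "").strip()
    matchers = ",".join(part for part in (existing, injected) if part)
    expr = f'{match.group("name")}{{{matchers}}}{match.group("rest")}'

    labels = {**rule.get("labels", {}), **app_topology}
    return {**rule, "expr": expr, "labels": labels}


rule = {"alert": "TargetDown", "expr": "up == 0", "labels": {"severity": "critical"}}
enriched = enrich_rule(rule, TOPOLOGY)
```

Because the unit label is left out, the single enriched rule matches series from every unit of the application.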

Charms relating through cos-configuration

Incidental rule files coming in from a git-repo via the cos-configuration charm are left intact.

Charms relating through grafana-agent (-k8s or not)

When rule files are forwarded via grafana-agent, they are enriched with the Juju topology labels of the relating charm (not grafana-agent’s topology). Any subsequent chaining to additional grafana-agent charms leaves the labels intact.

Charms relating through cos-proxy

TODO: what happens to rule files via cos-proxy

Logs

K8s charms can stream logs to loki using the charm lib. Behind the scenes this is accomplished using promtail, and log streams are enriched with juju topology labels.
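In the charm this work is done by promtail, so no code is needed. Purely to show where the labels end up, the sketch below builds the kind of JSON body that Loki's push endpoint (`/loki/api/v1/push`) accepts, with the topology labels as stream labels. The `build_push_payload` helper and the `filename` label are assumptions for this example, and the label keys are copied from the listing at the top of this page.

```python
import json
import time

TOPOLOGY = {
    "model": "some-juju-model",
    "model_uuid": "00000000-0000-0000-0000-000000000001",
    "application": "fancy-juju-application",
    "unit": "fancy-juju-application/0",
    "charm_name": "fancy-juju-application-k8s",
}


def build_push_payload(lines, topology, extra_labels=None, timestamp_ns=None):
    """Build a Loki push body whose single stream carries the topology labels."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Assumption for this sketch: topology labels win on a name clash.
    stream_labels = {**(extra_labels or {}), **topology}
    return {
        "streams": [
            {
                "stream": stream_labels,
                # Loki expects [<unix epoch in nanoseconds, as a string>, <log line>].
                "values": [[str(timestamp_ns), line] for line in lines],
            }
        ]
    }


payload = build_push_payload(
    ["service started", "listening on :8080"],
    TOPOLOGY,
    extra_labels={"filename": "/var/log/fancy.log"},
    timestamp_ns=1700000000000000000,
)
body = json.dumps(payload)  # what would be sent in the POST request
```

Every line in the stream can then be selected by unit, for example with a LogQL selector such as `{unit="fancy-juju-application/0"}`.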

Charms relating through grafana-agent (-k8s or not)

TODO: what happens to logs via grafana agent

Charms relating through cos-proxy

TODO: what happens to logs via cos-proxy

Additional notes

  • In the future, the grafana-agent charm may start exposing metrics and logs generated by its own workload, and those would be enriched with Juju topology labels.
  • In the future, the cos-configuration charm may start exposing metrics and logs generated by its own workload, git-sync, and those would be enriched with Juju topology labels.

Maybe make this its own section with a header as well? For instance:

Charms relating through cos-configuration-k8s

So initially I had the document structured with section per app, but then I restructured it to section per type of data (dashboards, alerts, …). Pietro refactored it to section per app.

I was thinking that readers would be more interested in section per type of data?

Wdyt @ppasotti @0x12b?

wasn’t me. But it feels more logical to have it split between who the ‘final recipient’ of a block of data is

Added! Thanks @0x12b.

@0x12b @rbarry I added this bullet after today’s discussion.

Interestingly, by 2020, the telegraf charm was using a concept very similar to juju topology. h/t @0x12b

@verterok would you be able to share some historical context?

Hi, apologies for the delay. I’m wondering what could be interesting here, so let me know if you want any specifics. If I remember correctly, we were already using telegraf tags on each model (some models have multiple applications of the telegraf charm). This was pretty involved, as we had to manually set the tags to the appropriate values on each bundle/deploy. So, the linked commit was making this ad-hoc usage of tags a built-in thing in the telegraf charm. Worth noting we have telegraf on every instance deployed in our cloud*

Cheers,

*at least in the team I’m currently part of and other teams I work closely with
