Aggregating data from principal units with a subordinate charm

Recently, the observability team was working on integrating COS Lite with Data Platform machine charms.

This involved developing a snap and a machine charm for grafana-agent, as well as a means to forward data from all related principal units (kafka, zookeeper, …) to apps in a different controller. Along the way, we discovered a few things that were completely new to us.

What’s special about subordinate charms?

Having little experience with machine charms in general, we discovered that subordinate charms are quite special:

  • They are specified with a subordinate: true key in metadata.yaml.
  • They can have both regular and peer relations, but also subordinate relations.
  • Subordinate relations are specified with a scope: container key in metadata.yaml.
  • A subordinate charm is provisioned not on “deploy” but only when a subordinate relation is formed with a regular charm (also named principal).
  • The subordinate charm is installed in the same VM as the principal. For this reason, we typically relate only one subordinate app to a given principal, and the principal charm should probably set limit: 1 on the subordinate relation. Imagine what would happen if two subordinate charms attempted to install and configure the same snap.
  • Subordinate charms only see one unit over the subordinate relation: the principal unit. That is, subordinate units cannot iterate over relation data from other units.
  • The subordinate leader unit does not necessarily coincide with the principal’s leader, especially if the same subordinate is related to several different principal apps.
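
To make the metadata keys above concrete, here is a minimal sketch of the two metadata.yaml files involved. The charm names, relation names and interface names are hypothetical and chosen only for illustration; only the subordinate, scope and limit keys come from the points above.

```yaml
# metadata.yaml of a hypothetical subordinate charm
name: my-subordinate
subordinate: true
requires:
  principal-info:            # hypothetical relation name
    interface: my_interface  # hypothetical interface name
    scope: container         # marks this as a subordinate relation
peers:
  replicas:                  # hypothetical peer relation name
    interface: my_subordinate_replicas
---
# metadata.yaml of a hypothetical principal charm
name: my-principal
provides:
  principal-info:
    interface: my_interface
    limit: 1                 # declare that only one subordinate app is expected here
```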

Typical deployment topology and its implications

Typically, admins deploy one subordinate app and relate it to multiple principal apps. This means that a unit of the subordinate charm is created inside the VM of every unit of the principal charm(s), but also:

  • a subordinate unit can only see its principal unit.
  • there is only one subordinate leader, which means we need a leader guard before any write to app data.

Given the two constraints above, the only juju-y way of forwarding data from a principal app out via a subordinate charm is to use peer unit data.
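
The visibility constraint can be illustrated with a toy model in plain Python, where dictionaries stand in for Juju databags. This is only a sketch: the unit names follow the grafana-agent example, and SUBORDINATE_TO_PRINCIPAL, visible_incoming and visible_via_peers are hypothetical names, not part of Juju or the ops library.

```python
# Toy model of the topology; plain dicts stand in for Juju databags.
SUBORDINATE_TO_PRINCIPAL = {
    "agent/0": "first/0",  # agent/0 is the (single) subordinate leader
    "agent/1": "first/1",
    "agent/2": "second/0",
    "agent/3": "second/1",
}

# Unit data that each principal unit wrote on the subordinate relation.
INCOMING = {p: f"info from {p}" for p in SUBORDINATE_TO_PRINCIPAL.values()}


def visible_incoming(subordinate_unit):
    """A subordinate unit reads only its own principal's unit data."""
    principal = SUBORDINATE_TO_PRINCIPAL[subordinate_unit]
    return {principal: INCOMING[principal]}


def visible_via_peers():
    """If every unit mirrors its view into peer unit data, any peer can read it all."""
    peer_unit_data = {u: visible_incoming(u) for u in SUBORDINATE_TO_PRINCIPAL}
    merged = {}
    for view in peer_unit_data.values():
        merged.update(view)
    return merged


print(visible_incoming("agent/0"))  # only first/0
print(sorted(visible_via_peers()))  # all four principal units
```

On its own, the leader (agent/0) can read data from first/0 only; once each unit copies what it sees into peer unit data, the leader can reach the data of every principal unit.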

Aggregate information from all principal units to the subordinate leader

Note that the only data that is forwarded to the leader is data that needs to go into the app databag of the outgoing relation(s).

For the grafana-agent charm, we need to forward information (alert rules, dashboards) from all principal units (the “incoming” relation) over to the regular relations to COS Lite (“outgoing” relations to prometheus, loki and grafana).

For grafana-agent, it looks a little bit like this:

| Principal charms | Subordinate relation unit data (“incoming”) | Subordinate unit | Peer unit data | App data (“outgoing”) |
|---|---|---|---|---|
| first/0* | Info from first/0 | agent/0* | From first/0 | Amalgamated by the leader |
| first/1 | Info from first/1 | agent/1 | From first/1 | N/A |
| second/0* | Info from second/0 | agent/2 | From second/0 | N/A |
| second/1 | Info from second/1 | agent/3 | From second/1 | N/A |

To accomplish the above:

  • The principal unit needs to update unit data on the subordinate relation.
  • The subordinate charm needs to observe relation-changed on the subordinate relation, and copy over the data to peer unit data.
  • The subordinate charm needs to observe relation-changed on the peer relation and copy over the data to the outgoing regular relation app data (with a leader guard).
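
The three steps above can be sketched in plain Python. This is not the grafana-agent charm's code and it does not use the ops library: SubordinateUnit, its handler names, and the "aggregated" key are all hypothetical, and shared dictionaries stand in for the peer and outgoing databags. The sketch also assumes that a unit receives no relation-changed event for its own write, so the incoming handler refreshes app data as well.

```python
import json


class SubordinateUnit:
    """Plain-Python stand-in for one subordinate unit (not the ops API)."""

    def __init__(self, name, principal, is_leader, peer_data, outgoing_app_data):
        self.name = name
        self.principal = principal
        self.is_leader = is_leader
        self.peer_data = peer_data                  # shared: unit name -> unit databag
        self.outgoing_app_data = outgoing_app_data  # shared outgoing app databag

    def on_incoming_changed(self, principal_unit_data):
        """Step 2: copy the principal's unit data into this unit's peer unit data."""
        self.peer_data[self.name] = {
            "principal": self.principal,
            "payload": json.dumps(principal_unit_data, sort_keys=True),
        }
        # Assumption: the writer gets no peer event for its own change.
        self._update_app_data()

    def on_peer_changed(self):
        """Step 3: react to a change in another unit's peer unit data."""
        self._update_app_data()

    def _update_app_data(self):
        if not self.is_leader:  # leader guard: only the leader writes app data
            return
        merged = {
            bag["principal"]: json.loads(bag["payload"])
            for bag in self.peer_data.values()
        }
        self.outgoing_app_data["aggregated"] = json.dumps(merged, sort_keys=True)


peer_data, app_data = {}, {}
units = [
    SubordinateUnit("agent/0", "first/0", True, peer_data, app_data),
    SubordinateUnit("agent/1", "first/1", False, peer_data, app_data),
    SubordinateUnit("agent/2", "second/0", False, peer_data, app_data),
    SubordinateUnit("agent/3", "second/1", False, peer_data, app_data),
]
for unit in units:
    # Step 1: the principal unit updates unit data on the subordinate relation.
    unit.on_incoming_changed({"alert_rules": f"rules from {unit.principal}"})
    # The other peers are then notified of the peer data change.
    for other in units:
        if other is not unit:
            other.on_peer_changed()

aggregated = json.loads(app_data["aggregated"])
print(sorted(aggregated))
```

Values are JSON-encoded here because relation data is stored as strings; the non-leader units run the same handlers, but the leader guard stops them from touching app data.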

Conclusions

  • Multiplicity considerations with subordinate charms are a bit tricky.
  • Subordinate charms can externalize aggregated information from their principal units by utilizing peer unit data.