How to interpret: app, unit, charm, version and workload

@heitor and I have been discussing a lot about how to work with versions, workloads and status in Juju, and we are both massively confused.

We have a few terms used: “app”, “charm”, “unit”, “version”, “workload”. It has taken us a while to form a consistent model on how to work with these and I’m moving this discussion here to get some eyes on this.

If we can’t figure this out and agree - new charmers will not either.

As it stands now, we believe that “application” = “charm” and that “workload” = “version”.

This is applied on the application level.

If this is true, this is massively confusing in the “ops” framework since the construct to set is not on the APP level, but on the UNIT level:

self.unit.set_workload_version(self._nextcloud_version())

This is really misleading. Wouldn’t it be a better way to set this value as:

self.unit.app.set_workload_version(self._nextcloud_version())

At the moment, we tend to believe that there is also a “unit_workload”, whereas there is clearly no such thing - or is there?

Nevertheless, mixing “version” with “workload” is a mix of terminologies and a developer might think that these (version and workload) are different things?

Now, this leads up to even more questions about how to present “the underlying service version/workload”

Today, we have “juju status” to play with here and the “update-status” hook.

The interpretation of this status differs massively between me and Heitor, which makes things problematic when trying to collaborate on charming.

Heitor thinks that status should not reflect the underlying service at all, but rather only represent juju related statuses.

I, on the other hand, prefer to present the underlying service status here.

Juju doesn’t say what is right or wrong, which leads to massively different kinds of charms. It would be better if we could figure out a clear good-vs-bad here, this being a core piece of Juju.

I think this is a bad situation that could partially be mitigated by restricting developers’ ability to alter a “charm-workload-version-status” and moving this to metadata.yaml or a version file, which is already becoming part of charmhub requirements. E.g. forcing a charmer to have a version file which Juju then sets as “workload/version” on the app level.

That would reduce the issue to only deal with:

  • What “juju status” should reflect in terms of underlying service, or, juju internals.
  • Having a consistent terminology about charm=app, version=workload

Or, have we missed out on key concepts here?

Possibly self.app.config and self.app.version are shorter, more accurate, and more descriptive than self.model.config and self.unit.set_workload_version()?

@tucker-beck any thoughts on this from when you were initially diving into Juju and the operator framework?

Yes! I found the self.model.config super confusing when I encountered it the other day. I’ve been working on building my mental model of the various entities and learning the nomenclature. As a new developer to juju, it adds to the difficulty when the terminology is muddied by the API.

I think self.app.config and self.app.version are much clearer.

About the charm version:

I don’t think the charm version, as reported in juju status and/or shown in CharmHub, should be a 1:1 to the service version provided by the charm:

  • suppose a charm for the service foo, where the current upstream version is 1.2.3. If our charm version is 1:1 to upstream, the charm version will also be 1.2.3. Now, if we release a new charm that fixes something in the charm code, the version will not change. So how do we inform the person deploying/managing Juju clusters?
  • many charms provide more than one component for the service to run (e.g. the juju controller also has a juju-db service inside). Does it make sense to have a 1:1 mapping for the versions in these cases?
  • the charm code is code, so it must be versioned. I don’t think using the version of its dependencies or installed services as the version of the charm code is instructive to the person operating it.

This is the workflow we’ve been using for our slurm-charms:

  • the version is controlled by Git, using Git tags
  • we use the latest git tag to build the charm version using git describe --tags --dirty. This populates a version file in the charm directory. This sets the version in CharmHub
  • the charm code reads this file and calls self.unit.set_workload_version to show it in juju status.

This automates the versioning of the charms, with unique version strings.
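The last step of that workflow could look roughly like the sketch below. It is only an illustration, not the actual slurm-charms code: the file name `version`, the fallback string `unknown`, and the helper names `read_charm_version` and `publish_charm_version` are assumptions. The `unit` argument is expected to be anything that offers `set_workload_version`, as the ops unit object does.

```python
from pathlib import Path


def read_charm_version(charm_dir: Path, default: str = "unknown") -> str:
    """Return the contents of the version file, or a default if it is missing.

    The file name "version" is an assumption; the build step is expected to
    have written the output of `git describe --tags --dirty` into it.
    """
    version_file = Path(charm_dir) / "version"
    try:
        text = version_file.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return default
    return text or default


def publish_charm_version(unit, charm_dir: Path) -> str:
    """Read the version file and hand the string to the unit for juju status."""
    version = read_charm_version(charm_dir)
    unit.set_workload_version(version)
    return version
```

Inside a charm, this would typically be called from an install or upgrade handler with `self.unit` and the charm directory.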

The downside of this approach is that juju status does not show the version of the services provided. And I don’t think that is an issue: juju status should provide the state of the Juju model/apps/units/machines, regardless of the stuff deployed in there.


About the ops API:

juju status only shows the column version in the Applications section, not in the Units. The workload column only appears for Units, not for applications, and does not show versions, it shows the states. It is awkward to use unit.set_workload_version: version does not apply to units and workload is not used for versions. self.app.version makes more sense to me, as it uses the appropriate words to describe the action of setting the application version.

Let’s take mysql as an example. Here’s a really basic mysql charm:

#!/usr/bin/env bash
# Minimal install hook: install MySQL non-interactively and start it.
set -e

apt-get install -y mysql-server
systemctl start mysql

Obviously, a good charm is capable of doing more than just installing something. But stick that code in hooks/install, and you’ll wind up with what is technically a “mysql” charm.

When you deploy that charm to a model, the controller downloads the charm code, and maps it to an “application”, which has the same name as the charm by default. That doesn’t mean much at first. It just means that the controller has the charm code, and it could hand that charm off to a unit agent if it wanted to do so.

When you add a unit to the mysql application, the unit agent downloads the charm code, and executes the install hook. As you can see, our hook will start an instance of mysql-server. This is the “workload”.

Juju does not know a whole lot about that workload, which is an intentional separation of concerns. The human writing the charm knows the details of actually running mysql. The Juju agent just executes hooks, and reports statuses and failures back to the controller.

I think that the problem comes when we try to talk about “version”. There are several versions:

  1. The version automatically generated by the charm store when you upload a charm.
  2. Possibly an independently tracked version of the charm, targeted at the humans writing it.
  3. The version of mysql that gets installed, which depends on the underlying Linux distro or docker container.

So you might have the following versions:

  1. Charm store version 2
  2. Charmer team version 0.2
  3. Mysql version 8.x (the default in focal, I think)

Which one is the “workload version”? I believe that the most appropriate is “mysql version 8.x”. The mysql charm would be responsible for checking that, and then calling the appropriate API on the unit agent. In the Operator framework, you call “set_workload_version”. (See the Operator Framework documentation.)

(Note that “workload version” is not something that Juju models explicitly. Again, there’s a separation of concerns here. Juju provides charm authors with a way of telling human operators about the version of the software running on a unit. But that’s not something that Juju tracks as part of its own model.)
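As a rough sketch of what “checking that” might involve, the charm could parse the output of `mysql --version` and pass the result to `set_workload_version`. The helper names below are hypothetical, and the regular expression assumes the version appears as three dot-separated numbers somewhere in the output, which may not hold for every build.

```python
import re
from typing import Optional

# Assumption: the workload version shows up as "major.minor.patch" in the output.
_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def parse_mysql_version(output: str) -> Optional[str]:
    """Return the first major.minor.patch string in the output, if any."""
    match = _VERSION_RE.search(output)
    return match.group(0) if match else None


def report_workload_version(unit, version_output: str) -> Optional[str]:
    """Parse the version and, if one was found, report it through the unit."""
    version = parse_mysql_version(version_output)
    if version is not None:
        unit.set_workload_version(version)
    return version
```

In a real charm, `version_output` would come from running the client binary on the unit, and `unit` would be `self.unit`.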

I like to think of it in this manner:

A charm is a piece of software that models an application. An application in the Juju sense is a server or collection of servers (be they metal, VMs, or containers) that provides a single addressable service, is deployed by Juju into a cloud, and is configured by the charm source code through the pre-defined hooks. Each charm has configuration variables such as installation source, ports, etc., as appropriate for configuring the software.

The deployment of a charm together with a set of configuration values for that charm is called an “application” (previous versions of Juju called this a “service”). If you wanted to run two different mysql clusters, you would deploy the mysql charm twice, perhaps named mysql-development and mysql-production. Each of these charm deployments is an “application” in the Juju nomenclature, using the same “charm”: mysql. Maybe you run the mysql-development application with 2 “units” on port=1234, while mysql-production is an application with 10 “units” on port=2345.

“Units” are instantiations of the application (charm code plus configuration values) on a metal, VM, or container. The “unit” is where the charm code is actually executed. Hooks are called on each individual unit of an application as the model state changes, either because of operator input (add unit, add relation, remove unit, change a configuration value) or in reaction to relations with peer units or units of other applications (remote relations, like a mysql client registering a new user/database, or a second unit of mysql peering with the first unit of the same application).
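The relationship between charm, application, and unit described here can be sketched as a small data model. This is purely illustrative and is not how Juju or the ops library represent these entities; the class and method names are made up for the example, and only the application names, unit counts, and ports come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Charm:
    """The operator code; one charm can back many applications."""
    name: str


@dataclass
class Unit:
    """One instantiation of an application on a machine, VM, or container."""
    name: str


@dataclass
class Application:
    """A named deployment of a charm plus its configuration values."""
    name: str
    charm: Charm
    config: Dict[str, object] = field(default_factory=dict)
    units: List[Unit] = field(default_factory=list)

    def add_units(self, count: int) -> None:
        """Append units named "<application>/<index>"."""
        start = len(self.units)
        for index in range(start, start + count):
            self.units.append(Unit(name=f"{self.name}/{index}"))


mysql = Charm(name="mysql")

development = Application("mysql-development", mysql, {"port": 1234})
development.add_units(2)

production = Application("mysql-production", mysql, {"port": 2345})
production.add_units(10)
```

Both applications share the same charm object but carry their own configuration and their own list of units, which is the distinction the terminology is trying to capture.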

There are five “versions” to be tracked within a juju deployment:

  • Controller Model Agent Version - The version of the juju API and database service running in the current cloud environment (whether MAAS, Openstack, K8s, AWS, GCP, Azure, or LXD/localhost). This can be queried with juju status -m controller | head -2 (e.g. 2.9.10).
  • Model Agent Version - The version of the model (which contains deployed applications) agent that will run the code on the various deployed units. This can be queried with juju status -m <model-name> | head -2, and each machine and unit’s agent version can be seen with juju status -m <model-name> --format yaml.
  • Charm version - The version of the charm code which is deployed to each of the juju application’s deployed units which controls how the software (workload) is installed and configured. This is the charmhub version of the uploaded charm code (charm store version 2 as described above by @pengale). This is available in the second-to-right-hand column of juju status's Applications section, or available from juju status --format yaml as the charm: cs:some-charm-<version> section of a given application stanza.
  • Workload version - The actual software that is running as configured by the charm based on the charm code version and application configuration. Most useful when you have the “source”, “openstack-origin”, “snap_channel” or other similar configuration set to “latest”. In the context of kubernetes operators, this is most useful when the charm developer queries the deployed OCI container image and posts the version of the software within the container via set_workload_version() to avoid having to query kubernetes directly. This is optional metadata provided by the charm developer that may be available in the second column of juju status in the Applications section, or in the version key of an application stanza within juju status --format yaml.
  • Machine version - This is the version of the operating system running on the machine (metal, VM, lxd, container), e.g. bionic or focal. These can be seen in the output of juju machines.
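To make the list above concrete, here is a hedged sketch that pulls these versions out of an already-parsed status document, such as the output of juju status --format json loaded into a dict. The `charm` and `version` keys of an application stanza are the ones mentioned above; the `model`, `machines`, and `series` key names, the sample values, and the function name `collect_versions` are assumptions for illustration and should be checked against real output.

```python
from typing import Any, Dict


def collect_versions(status: Dict[str, Any]) -> Dict[str, Any]:
    """Gather the different kinds of versions from a parsed status dict."""
    apps = status.get("applications", {})
    machines = status.get("machines", {})
    return {
        # Assumed location of the model agent version.
        "model_agent": status.get("model", {}).get("version"),
        # Charm identifier per application (includes the charm revision).
        "charms": {name: app.get("charm") for name, app in apps.items()},
        # Optional workload version reported by the charm; None if unset.
        "workloads": {name: app.get("version") for name, app in apps.items()},
        # Assumed key for the operating system series of each machine.
        "machines": {mid: m.get("series") for mid, m in machines.items()},
    }


# Hypothetical sample data shaped like a status document.
sample_status = {
    "model": {"name": "default", "version": "2.9.10"},
    "machines": {"0": {"series": "focal"}},
    "applications": {
        "mysql": {"charm": "cs:some-charm-2", "version": "8.0.23"},
        "other": {"charm": "cs:other-charm-7"},
    },
}

versions = collect_versions(sample_status)
```

An application whose charm never sets a workload version simply shows up with `None`, which matches the point that this is optional metadata.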

I think when we talk about the “workload” we should be thinking “snaps, container images, or package versions of the running software which provides the useful service”. All versions other than the workload version and machine version are related to the juju configuration management engine and should not have a major bearing on the functionality and security of the deployed service itself.

A given charm, such as nova-compute, may support installation of many different versions of that software (like the nova versions related to Mitaka, Queens, Ussuri, etc), hence the helpful hints of workload_version to make the info available in juju status, rather than checking the charm’s configuration or logging into the unit to get the data from the OS/packaging system.

As Pete noted, this workload_version is not something that juju reacts to, rather it is useful metadata able to be provided by a charm developer for operator visibility of the workload status.

In the case of something like nova-compute, the charm release version is something like 21.04 (this is the charm team’s “release” version and info related to the git repository commit for branch stable/21.04 will show up in the repo-info file - not a standard, but a per-charm team “charmhub -> git” translation helper provided by their CI/CD process). This is then published to charmhub with an auto-incrementing version along the lines of charm version #123. Then the operator can configure the openstack-origin setting of the nova-compute application to something like cloud:bionic-ussuri and the version of the workload becomes 21.0.x as the charm code upgrades/installs the packages from the associated repository (or snap channel or OCI image). There are certainly overlaps of which versions of charms support specific versions of workload software, and that must be communicated within release notes and README files of the charm itself.

I think there is a subtlety when performing application upgrades where a per-unit workload version may be useful. For instance, when upgrading ceph-osd from mimic to nautilus, each unit upgrades its software in turn, so reflecting which version each unit’s software is at during the upgrade may be more useful than setting the entire application’s “intended/configured” workload version wholesale before the upgrade has completed on a multi-unit deployment.

I did notice that the expression of the workload version in the juju status model is at the application level, and thus your confusion is warranted, but I do wonder if this is intentional within the operator framework for future expansion of the visibility of this metadata on a per-unit vs per-application basis.

In the operator framework, we see this in the ops/model.py:357:

    def set_workload_version(self, version: str) -> None:
        """Record the version of the software running as the workload.
        This shouldn't be confused with the revision of the charm. This is informative only;
        shown in the output of 'juju status'.
        """
        if not isinstance(version, str):
            raise TypeError("workload version must be a str, not {}: {!r}".format(
                type(version).__name__, version))
        self._backend.application_version_set(version)

And application_version_set is:

    def application_version_set(self, version):
        self._run('application-version-set', '--', version)

So, setting the version of the workload sets the version of the application…

If multiple units of the same application are updating, and they have different versions of the underlying system, they can mess up the version in juju status.

I agree: currently, the Juju data model considers it one version per application. However, I also agree with the ops framework providing future-proofing to allow for per-unit workload versioning. That being said, with the current implementation, I think it is best that charm developers gate application_version_set calls to run only from leader units to avoid this confusion.
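A minimal sketch of that gating, assuming the unit object offers `is_leader()` and `set_workload_version()` as the ops unit does. The helper name `set_version_if_leader` is hypothetical.

```python
def set_version_if_leader(unit, version: str) -> bool:
    """Set the workload version only when running on the leader unit.

    Returns True if the version was set and False if the call was skipped,
    so that non-leader units never overwrite the application-level value.
    """
    if not unit.is_leader():
        return False
    unit.set_workload_version(version)
    return True
```

The trade-off is that juju status then reflects only what the leader sees, so during a rolling upgrade it may not describe the other units.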

  • application_version_set
  • set_workload_version

Are these then essentially the same currently?

It appears that’s true. set_workload_version is the Operator framework call which invokes application_version_set (ultimately calling the juju agent binary “application-version-set”.)

Hmm, any chance this situation can be made less misleading until the functionality is actually implemented?

I also love the explanation you provided above. If the core developers could make it impossible to implement alternative interpretations of how to use “workload”, which cause conflicts between charm implementations, that would create a somewhat narrower path to writing charms, one that allows for collaboration.

There would be a single interpretation of this instead of the free-form situation we have today, which @heitor and I discovered yesterday.