1. A PR is merged; changes are released to `latest/edge`.
2. Wait a few weeks, promote to `beta`.
3. Wait a few weeks, promote to `candidate`.
4. Wait a few weeks, promote to `stable`.
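In practice, each promotion step releases an existing revision to the next risk channel with `charmcraft release`; the charm name and revision number below are made up for illustration:

```shell
# Hypothetical charm name and revision, for illustration only.
# The PR merge lands a new revision on latest/edge:
charmcraft release my-charm --revision=42 --channel=latest/edge
# Weeks later, the same revision is promoted down the risk levels:
charmcraft release my-charm --revision=42 --channel=latest/beta
charmcraft release my-charm --revision=42 --channel=latest/candidate
charmcraft release my-charm --revision=42 --channel=latest/stable
```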
This release process is straightforward, but it has several blind spots:
- releasing to `latest` means there isn’t a good way to introduce a breaking change: `juju refresh` might break you;
- releasing to `stable` is only gated by time, not by quality.
The Observability team thought about this for a long time, and after lots of discussions, we decided what we wanted:
- feature-frozen tracks: we’ll open new tracks on a periodic cadence, only backporting critical bug fixes and security fixes until the track’s end of life;
- worry-free in-track upgrades: when following a track, you should be able to run `juju refresh` without worrying about breaking changes;
- risk channels gated by quality, not by time: we’re introducing quality gates (i.e., various sets of tests) as the promotion mechanism between risk channels.
Are we there yet? No, but we’ve taken a lot of concrete steps towards it. Specifically, we had to alter our charm tooling and release infrastructure to support the goals above.
Here’s a moderately detailed description of how I think charms should be managed, tested, and released.
This will be a long post!
Managing and pinning Python dependencies
Charmers use `requirements.txt` to define Python dependencies for their charm, often without pinning them. This leads to some complications:
- given a charm revision, there is no easy way to tell which versions of your Python dependencies are included:
  - you have to inspect the `.charm` file or deploy it, check the `venv` folder, and look at the libraries inside;
  - there is no correlation between source code and dependency versions.
- your dependencies might release a new version between your tests and your release:
  - if you’re not careful and don’t re-use the same artifact, you might get CI breakages after merging (looking at you, `pyright`), or (a lot worse) undetected bugs in your released charm.
The solution is to use a lockfile to manage and pin your dependencies, with CI that updates it automatically. Dependencies are defined in a human-readable, (mostly) unpinned format in `pyproject.toml`, which is then compiled into a lockfile with precise version pinning.
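A minimal sketch of what that looks like, with an illustrative charm name and dependency list:

```toml
# pyproject.toml: human-readable, loosely-constrained dependencies.
# The lockfile compiled from this file carries the exact pinned versions.
[project]
name = "my-charm"            # illustrative name
version = "0.1"
requires-python = ">=3.10"
dependencies = [
    "ops>=2,<3",             # pinned precisely in the lockfile, not here
    "pydantic>=2",
]
```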
We started using `uv` and, while waiting for the uv task runner to be implemented, we use `tox` to run `uv` commands for updating the lockfile, linting, formatting, testing, and so on. It’s blazingly fast, to the point where it removes the need for re-usable virtual environments: ephemeral ones are created in a very short time.
In order for `charmcraft pack` to still work, we generate `requirements.txt` on the fly from the lockfile when packing a charm. Charmcraft has also recently introduced a `uv` plugin, which will likely simplify things further. This is a simple diagram showing some main `uv` commands we use and what they do.
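Until the uv task runner lands, the glue can be as simple as tox environments that shell out to uv; the environment names and flags below are one possible setup, not our exact configuration:

```ini
# tox.ini (sketch): tox only orchestrates, uv does the work.
[testenv:lock]
description = Update the lockfile to the latest allowed versions
allowlist_externals = uv
commands = uv lock --upgrade

[testenv:requirements]
description = Generate requirements.txt from the lockfile for charmcraft pack
allowlist_externals = uv
commands = uv export --frozen --no-hashes --format=requirements-txt -o requirements.txt
```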
TL;DR: Your charm should pin Python dependencies. Look at how we use `uv` in canonical/grafana-k8s-operator for more details.
Workflow quality and testing automation
Most of the quality goals we’re moving towards require solid, reliable automation. While our CI workflows have served us decently well, there were a few pain points we needed to address:
- our reusable workflows are not versioned: every charm points to `@main`, meaning it’s very easy (writing from experience) to break CI for all charms at once, including every other team that uses our CI;
- they’re hard to test: especially for workflows that involve releasing a charm, we don’t have a good way to make sure our changes work as expected;
- they rely heavily on GitHub-specific things: several actions we use (including `charming-actions`) are complex, inflexible, and opaque; what is being executed is hidden behind TypeScript, and in some cases they produce unexpected side effects;
- integration tests are slow: it’s not uncommon for us to wait for 1-hour-long test runs on PRs, which slows down our work a lot.
I redesigned our CI with some guiding principles[1] in mind: simplicity, stability, repeatability, and decoupling from GitHub Actions. I rewrote chunks of our CI to support our quality gates story and minimize our reliance on GitHub Actions, and took the chance to level up the quality of our automation on multiple fronts.
I introduced a series of processes to raise our confidence in the quality of our CI: most changes we make are tested against o11y-tester, a charm whose sole purpose is to let us fully exercise our CI changes, from testing to release. Additionally, our CI is now versioned, with the old workflows available at the `v0` tag. This makes the experience for charms a lot more stable overall, and prevents us from accidentally breaking other teams.
```yaml
jobs:
  pull-request:
    name: Pull Request
    uses: .github/workflows/charm-pull-request.yaml@v1
```
The documentation around our workflows has also been dramatically improved, with multiple diagrams[2] detailing their behavior.
I changed the way we run integration tests, parallelizing their execution by spinning up one GitHub runner per integration test file. After the fast quality checks run, the charm is packed and uploaded as an artifact, which is re-used in the parallel integration tests, saving extra time. Combined with some smart dependency caching, this reduced the CI execution time for our Grafana charm from 45 minutes to 20 minutes, a 125% speed improvement. For charms with even longer integration tests (we have some that easily take 90 minutes), I expect the improvement to be even larger.
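The shape of that pipeline, sketched as a GitHub workflow (job names, artifact name, and test file names are illustrative, not our exact configuration):

```yaml
jobs:
  pack:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: charmcraft pack
      - uses: actions/upload-artifact@v4
        with:
          name: packed-charm
          path: "*.charm"

  integration:
    needs: pack
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # one runner per integration test file
        test-file: [test_deploy, test_upgrade, test_alerts]
    steps:
      - uses: actions/checkout@v4
      # re-use the packed charm instead of packing on every runner
      - uses: actions/download-artifact@v4
        with:
          name: packed-charm
      - run: tox -e integration -- "tests/integration/${{ matrix.test-file }}.py"
```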
There are several other things that were added to our automation, with the purpose of improving quality:
- support for multi-track releases from different branches;
- support for automatic promotion up to `candidate` via quality gates;
- linting of Terraform modules and Prometheus alert rules in charms;
- linting of the GitHub workflows themselves;
- automatic and parallel release of all the specified bases: no more manual `noble` or `arm` releases.
If you want to use this new version of our CI, I’m in the process of writing a document that details the requirements for your charm repository. We’ve only rolled it out to one charm so far, so there’s still some progress to make!
TL;DR: Our CI has a lot of new features, is versioned, and tests take less than half the time they used to.
Final thoughts
I think these changes are significant steps towards improving the professionalism and quality of our work. This is the Observability team’s take on levelling up our release processes; I hope it was helpful!