Upgrade experience for K8s charms on base migration

In the Kubeflow team, we have been investigating upgrading the Ubuntu base version of our charms on supported tracks, as a number of our charms are on 20.04 and use a Python version (3.8) that has reached end of standard support. In particular, this surfaced with the nasty setuptools bug that has been preventing a clean CI across the board. However, to bump the base on a supported track, we need to ensure a reliable (hopefully seamless) upgrade experience for users when the base of a charm changes.

We’ve tested the base upgrade experience using the Juju CLI and the Juju Terraform Provider. Below is a detailed showcase of both.

Disclaimer: All our charms are currently on K8s, where a base upgrade should be possible given that an upgrade comes with a teardown of the old pod and the creation of a new one. On VMs, there are stronger constraints that will likely prevent the base upgrade discussed here.

Test environment

  • Juju 3.6.7
  • Juju Terraform Provider 0.14.0
  • MicroK8s v1.31.7

Charm under test

I used the jupyter-controller charm as the test subject. In the test we’ll be attempting an upgrade from 1.10/stable to latest/edge. The charm currently has the following bases in the given channels:

  • 20.04 in 1.10/stable (rev 1229)
  • 24.04 in latest/edge (rev 1286)

Keep in mind the revisions above, along with the available revisions per base. This is critical to understand the test results. From charmcraft status we see:

charmcraft status jupyter-controller
Track    Base                  Channel      Version    Revision    Resources          Expires at
latest   ubuntu 20.04 (amd64)  stable       -          -           -
                               candidate    1228       1228        oci-image (r1014)
                               beta         1186       1186        oci-image (r1013)
                               edge         1266       1266        oci-image (r1014)
         ubuntu 24.04 (amd64)  stable       -          -           -
                               candidate    -          -           -
                               beta         -          -           -
                               edge         1286       1286        oci-image (r1014)
1.10     ubuntu 20.04 (amd64)  stable       1229       1229        oci-image (r1014)
                               candidate    1229       1229        oci-image (r1014)
                               beta         ↑          ↑           ↑
                               edge         1260       1260        oci-image (r1014)
         ubuntu 24.04 (amd64)  stable       -          -           -
                               candidate    -          -           -
                               beta         -          -           -
                               edge         -          -           -

Note: Irrelevant channels are omitted from the output for simplicity.

The status of the App before testing is as follows:

juju status
Model     Controller  Cloud/Region        Version  SLA          Timestamp
kubeflow  myk8s       microk8s/localhost  3.6.7    unsupported  10:51:47+03:00

App                 Version  Status  Scale  Charm               Channel       Rev  Address         Exposed  Message
jupyter-controller           active      1  jupyter-controller  1.10/stable  1229  10.152.183.240  no       

Keep in mind that the deployed 1.10/stable charm is on 20.04 base.

UX using juju cli

The juju refresh command is the standard approach to upgrading a charm.

Let’s try to upgrade to latest/edge with juju refresh:

juju refresh jupyter-controller --channel=latest/edge

Output:

Added charm-hub charm "jupyter-controller", revision 1266 in channel latest/edge, to the model
no change to endpoints in space "alpha": grafana-dashboard, logging, metrics-endpoint

The charm did refresh to latest/edge, but to which revision? If you refer to the charmcraft status from earlier you can see that 1266 is the revision under latest/edge with the 20.04 base. We can confirm this by viewing the image of the charm container:

kubectl get pod jupyter-controller-0 -n kubeflow -o jsonpath="{.spec.containers[?(@.name=='charm')].image}"

docker.io/jujusolutions/charm-base:ubuntu-20.04

Meanwhile, the more recent revision 1286 under latest/edge with the 24.04 base was completely ignored. This issue was previously raised in this bug.

In order to get the most recent revision in latest/edge channel, users would need to specify the --base argument on refresh. The full command would be:

juju refresh jupyter-controller --channel=latest/edge --base=ubuntu@24.04

Output:

Added charm-hub charm "jupyter-controller", revision 1286 in channel latest/edge, to the model
no change to endpoints in space "alpha": grafana-dashboard, logging, metrics-endpoint

Now we see it refreshed to the most recent revision for the given channel.

Check the image of the charm container to confirm the new base:

kubectl get pod jupyter-controller-0 -n kubeflow -o jsonpath="{.spec.containers[?(@.name=='charm')].image}"

docker.io/jujusolutions/charm-base:ubuntu-24.04

The base is updated to 24.04 as expected :white_check_mark:
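
The image check above can be scripted so that the result is directly comparable with the value passed to --base. The following is a minimal sketch, assuming the charm container image always follows the `charm-base:<os>-<version>` tag pattern seen in the outputs above; the function name `image_to_base` is hypothetical.

```shell
# Convert a charm-base image reference into a Juju base string.
# Assumption: the tag has the form <os>-<version>, as in the outputs above.
image_to_base() (
  tag="${1##*:}"                              # keep what follows the last colon, e.g. ubuntu-24.04
  printf '%s@%s\n' "${tag%%-*}" "${tag#*-}"   # ubuntu-24.04 -> ubuntu@24.04
)

# Possible usage against a live model (not executed here):
#   image=$(kubectl get pod jupyter-controller-0 -n kubeflow \
#     -o jsonpath="{.spec.containers[?(@.name=='charm')].image}")
#   [ "$(image_to_base "$image")" = "ubuntu@24.04" ] || echo "unexpected base"
```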

What this means is that once the base of a charm changes, users will be oblivious to all newly released revisions unless they explicitly specify the base. This is a caveat in the upgrade experience that prevents it from being seamless.
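
One way to avoid forgetting the base is to wrap the refresh in a small helper that refuses to run without it. This is only a sketch: the function name `refresh_with_base` and the `JUJU` override variable (useful for a dry run with `JUJU=echo`) are hypothetical, while the `juju refresh` arguments are the same ones used above.

```shell
# Refresh a charm with an explicit base so that the newest revision
# published for that base is selected.
# JUJU is a hypothetical override, e.g. JUJU=echo prints the arguments instead.
refresh_with_base() {
  if [ "$#" -ne 3 ]; then
    echo "usage: refresh_with_base <app> <channel> <base>" >&2
    return 1
  fi
  "${JUJU:-juju}" refresh "$1" --channel="$2" --base="$3"
}

# Possible usage (not executed here):
#   refresh_with_base jupyter-controller latest/edge ubuntu@24.04
```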

UX using juju terraform provider

Now to test the upgrade with the juju terraform provider, we can do the following:

  1. Apply the terraform module from the track/1.10 branch of the jupyter-controller charm:
cd notebook-operators
git checkout track/1.10
cd charms/jupyter-controller/terraform
terraform init
terraform apply -var "model_name=kubeflow"

This applies the charm from 1.10/stable channel.

  1. Switch to the main branch (the one with the merged 24.04 base change):
git checkout main
  1. Edit the terraform module in main to specify the base, setting the default base to 24.04. The diff:
index becfc96..3282df2 100644
--- a/charms/jupyter-controller/terraform/main.tf
+++ b/charms/jupyter-controller/terraform/main.tf
@@ -1,6 +1,7 @@
 resource "juju_application" "jupyter_controller" {
   charm {
     name     = "jupyter-controller"
+    base     = var.base
     channel  = var.channel
     revision = var.revision
   }
diff --git a/charms/jupyter-controller/terraform/variables.tf b/charms/jupyter-controller/terraform/variables.tf
index 44e1532..8d9f2c3 100644
--- a/charms/jupyter-controller/terraform/variables.tf
+++ b/charms/jupyter-controller/terraform/variables.tf
@@ -4,6 +4,12 @@ variable "app_name" {
   default     = "jupyter-controller"
 }
 
+variable "base" {
+  description = "Application base"
+  type        = string
+  default     = "ubuntu@24.04"
+}
+
 variable "channel" {
   description = "Charm channel"
   type        = string

Since we have version control for the terraform modules, we can set the base variable in the track branch when the base change takes place.

  1. Run plan to review the changes:
terraform plan
var.model_name
  Model name

  Enter a value: kubeflow

juju_application.jupyter_controller: Refreshing state... [id=kubeflow:jupyter-controller]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # juju_application.jupyter_controller will be updated in-place
  ~ resource "juju_application" "jupyter_controller" {
        id          = "kubeflow:jupyter-controller"
        name        = "jupyter-controller"
      + principal   = (known after apply)
      + storage     = (known after apply)
        # (8 unchanged attributes hidden)

      ~ charm {
          ~ base     = "ubuntu@20.04" -> "ubuntu@24.04"
          ~ channel  = "1.10/stable" -> "latest/edge"
            name     = "jupyter-controller"
            # (2 unchanged attributes hidden)
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.

We can see that the base change is reflected in the plan above.

  1. Run apply:
terraform apply -var "model_name=kubeflow" --auto-approve
juju_application.jupyter_controller: Refreshing state... [id=kubeflow:jupyter-controller]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # juju_application.jupyter_controller will be updated in-place
  ~ resource "juju_application" "jupyter_controller" {
        id          = "kubeflow:jupyter-controller"
        name        = "jupyter-controller"
      + principal   = (known after apply)
      + storage     = (known after apply)
        # (8 unchanged attributes hidden)

      ~ charm {
          ~ base     = "ubuntu@20.04" -> "ubuntu@24.04"
          ~ channel  = "1.10/stable" -> "latest/edge"
            name     = "jupyter-controller"
            # (2 unchanged attributes hidden)
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
juju_application.jupyter_controller: Modifying... [id=kubeflow:jupyter-controller]
juju_application.jupyter_controller: Modifications complete after 1s [id=kubeflow:jupyter-controller]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

Outputs:

app_name = "jupyter-controller"
provides = {
  "grafana_dashboard" = "grafana-dashboard"
  "metrics_endpoint" = "metrics-endpoint"
}
requires = {
  "logging" = "logging"
}
  1. Check that the charm revision is the latest one:
juju status
Model     Controller  Cloud/Region        Version  SLA          Timestamp
kubeflow  myk8s       microk8s/localhost  3.6.7    unsupported  12:38:09+03:00

App                 Version  Status  Scale  Charm               Channel       Rev  Address        Exposed  Message
jupyter-controller           active      1  jupyter-controller  latest/edge  1286  10.152.183.88  no       

The upgrade worked as desired! The revision is indeed the most recent one, 1286.

  1. To confirm the base is updated, view the image of the charm container:
kubectl get pod jupyter-controller-0 -n kubeflow -o jsonpath="{.spec.containers[?(@.name=='charm')].image}"

docker.io/jujusolutions/charm-base:ubuntu-24.04

It’s updated to 24.04 :white_check_mark:

This looks promising for providing a seamless UX with the juju terraform provider, provided we implement the following changes to the terraform modules:

  • add a variable for base (with the default being the new base)
  • use this variable in the charm resource to set the base
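
With those two changes in place, the base could also be pinned explicitly instead of relying on the default. The sketch below writes a `terraform.tfvars` file, which Terraform loads automatically from the module directory. It assumes the module exposes the `model_name`, `channel` and `base` variables shown earlier; the function name `write_base_tfvars` is hypothetical.

```shell
# Write a terraform.tfvars file into the given module directory
# (defaults to the current directory).
# Assumption: the module declares model_name, channel and base variables.
write_base_tfvars() {
  cat > "${1:-.}/terraform.tfvars" <<'EOF'
model_name = "kubeflow"
channel    = "latest/edge"
base       = "ubuntu@24.04"
EOF
}

# Possible usage (not executed here):
#   write_base_tfvars charms/jupyter-controller/terraform
#   terraform plan
```

Keeping the values in a file rather than in `-var` flags means the intended base is visible in review and is not lost between runs.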

Great investigation. It does look like we fixed the old issues about requiring "--force-base" when switching on Kubernetes.

There is certainly still an argument about not needing to supply --base, but at least the upgrade goes cleanly when you do.