This guide describes how to upgrade Charmed Kubeflow (CKF) from version 1.7 to 1.8.
The upgrade is done one charm at a time, and new charms and relations must be deployed separately. Most charms can be upgraded with juju refresh; however, certain components require additional steps.
Requirements
- An active and idle Charmed Kubeflow 1.7 deployment. This requires all charms in the bundle to be in that state.
- Access to the dashboard of the existing Charmed Kubeflow 1.7 deployment.
- Admin access to the Kubernetes cluster where the existing Charmed Kubeflow 1.7 is deployed.
- Tools: kubectl, juju (version 3.x)
Before the upgrade
Before upgrading, it is recommended to do the following:
- Stop all Notebooks.
- Make sure all Workflows are completed and disable Recurring Runs.
- Review any important data that needs to be backed up and perform backup procedures according to the policies of your organisation.
- Record all charm versions in the existing Charmed Kubeflow deployment.
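One way to record the charm versions is to save the output of juju status to a file before changing anything. The following is a minimal sketch, not an official procedure: the function names and the file name pattern are assumptions made for illustration.

```shell
# Sketch: save the current charm names, channels and revisions before upgrading.
# The helper names and the file name pattern are assumptions; adjust as needed.

snapshot_name() {
  # $1: model name, $2: date stamp (defaults to today, YYYY-MM-DD)
  local model="$1" stamp="${2:-$(date +%F)}"
  printf 'ckf-1.7-%s-%s.yaml\n' "$model" "$stamp"
}

record_charm_versions() {
  # Writes the full status of the given model, including charm revisions, to a YAML file.
  local model="${1:-kubeflow}" out
  out="$(snapshot_name "$model")"
  juju status --model "$model" --format yaml > "$out" && echo "saved $out"
}

# Usage (requires a working juju client):
#   record_charm_versions kubeflow
```

Keeping the file next to your other backups makes it easier to compare charm revisions if you need to investigate a problem after the upgrade.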
All upgrade steps should be done in the kubeflow model. If you haven’t already, switch to the kubeflow model:
# switch to kubeflow model
juju switch kubeflow
Migrate DBs to MySQL
As of Charmed Kubeflow 1.8, MySQL is replacing MariaDB as the database for Katib and Pipelines. Charmed Kubeflow 1.8 does NOT support MariaDB, so you need to migrate to MySQL.
Upgrade Juju to 3.4
Charmed Kubeflow 1.8 is supported on Juju 3.4. Deployments that are still on Juju 2.9 should be migrated to Juju 3.4. See Upgrade your Juju deployment from 2.9 to 3.x for more details.
Modify CRD labels to keep user workloads
Because many charms in Charmed Kubeflow 1.8 moved from the Podspec to the Sidecar pattern, some charms cannot be upgraded with juju refresh. Instead, you need to remove and re-deploy them. The commands below prevent the loss of workloads created by Notebooks, Argo Workflows, and Scheduled Workflows when these charms are removed. Run the following:
# prevent loss of existing notebooks
kubectl annotate crd notebooks.kubeflow.org controller.juju.is/id-
kubectl annotate crd notebooks.kubeflow.org model.juju.is/id-
kubectl label crd notebooks.kubeflow.org app.juju.is/created-by-
kubectl label crd notebooks.kubeflow.org app.kubernetes.io/managed-by-
kubectl label crd notebooks.kubeflow.org app.kubernetes.io/name-
kubectl label crd notebooks.kubeflow.org model.juju.is/name-
# prevent loss of defined workflows
kubectl annotate crd workflows.argoproj.io controller.juju.is/id-
kubectl annotate crd workflows.argoproj.io model.juju.is/id-
kubectl label crd workflows.argoproj.io app.juju.is/created-by-
kubectl label crd workflows.argoproj.io app.kubernetes.io/managed-by-
kubectl label crd workflows.argoproj.io app.kubernetes.io/name-
kubectl label crd workflows.argoproj.io model.juju.is/name-
# prevent loss of defined scheduled workflows
kubectl annotate crd scheduledworkflows.kubeflow.org controller.juju.is/id-
kubectl annotate crd scheduledworkflows.kubeflow.org model.juju.is/id-
kubectl label crd scheduledworkflows.kubeflow.org app.juju.is/created-by-
kubectl label crd scheduledworkflows.kubeflow.org app.kubernetes.io/managed-by-
kubectl label crd scheduledworkflows.kubeflow.org app.kubernetes.io/name-
kubectl label crd scheduledworkflows.kubeflow.org model.juju.is/name-
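The eighteen commands above follow one pattern: for each CRD, remove two annotations and four labels (the trailing "-" after a key tells kubectl to remove it). As a sketch, the same commands can be generated with a loop. The helper names and the DRY_RUN switch are assumptions made for this example; by default it only prints the commands.

```shell
# Sketch: generate the same kubectl commands as above for each CRD.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute the commands.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

detach_crd_from_juju() {
  # $1: CRD name. Removes the Juju annotations and labels listed in this guide.
  local crd="$1" key
  for key in controller.juju.is/id model.juju.is/id; do
    run kubectl annotate crd "$crd" "${key}-"
  done
  for key in app.juju.is/created-by app.kubernetes.io/managed-by \
             app.kubernetes.io/name model.juju.is/name; do
    run kubectl label crd "$crd" "${key}-"
  done
}

for name in notebooks.kubeflow.org workflows.argoproj.io scheduledworkflows.kubeflow.org; do
  detach_crd_from_juju "$name"
done
```

To confirm the result, you can inspect the metadata of a CRD with kubectl get crd notebooks.kubeflow.org -o yaml and check that the keys are no longer present.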
Remove argo-server charm
The argo-server charm was deprecated in Charmed Kubeflow 1.8. This charm was not used in the bundle, so removing it will not affect your deployment and will save resources. Remove it by running:
juju remove-application argo-server
Upgrade charms
Upgrade Istio
It is assumed that the istio-pilot and istio-ingressgateway versions deployed alongside Charmed Kubeflow 1.7 are 1.16.
- Scale down the istio-ingressgateway application to 0
juju scale-application istio-ingressgateway 0
- Run the following command to ensure that the istio-ingressgateway deployment is properly removed. If removal is successful, the command should succeed (return 0):
kubectl -n kubeflow get deploy istio-ingressgateway-workload 2> >(grep -q "NotFound" && echo $?)
- Upgrade the istio-pilot charm.
If the ssl-key and ssl-crt configuration was in place, make sure you read the Migration from configuration to action guide for important considerations.
juju refresh istio-pilot --channel 1.17/stable
- Upgrade and scale up the istio-ingressgateway charm
juju refresh istio-ingressgateway --channel 1.17/stable
juju scale-application istio-ingressgateway 1
See Istio upgrade troubleshooting for more details.
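The removal check in the steps above relies on bash process substitution, which can be hard to read. The sketch below expresses the same idea as a small function that searches a command's error output for "NotFound". The function name is hypothetical and not part of kubectl or juju.

```shell
# Sketch: succeed only when a command's error output mentions "NotFound".
# The helper name is an assumption; it mirrors the grep-based check in this guide.

reports_not_found() {
  # Run the given command, discard its stdout, and search its stderr for "NotFound".
  "$@" 2>&1 >/dev/null | grep -q "NotFound"
}

# Usage (requires kubectl access to the cluster):
#   if reports_not_found kubectl -n kubeflow get deploy istio-ingressgateway-workload; then
#     echo "istio-ingressgateway-workload has been removed"
#   fi
```

Matching on the error text rather than only on the exit code avoids treating an unrelated failure, such as an unreachable cluster, as a successful removal.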
Re-deploy kubeflow-roles charm
Charms handle Roles and ClusterRoles differently in the 1.8 release. As a result, the kubeflow-roles charm needs to be re-deployed rather than refreshed:
# redeploy kubeflow-roles
juju remove-application kubeflow-roles
juju deploy kubeflow-roles --channel 1.8/stable --trust
Upgrade Podspec to Sidecar charms
Some charms were rewritten from Podspec to Sidecar between Charmed Kubeflow 1.7 and 1.8. For this kind of upgrade, Juju 3.4 requires that you scale down the application, refresh it, then scale it back up.
- Scale down the applications
juju scale-application admission-webhook 0
juju scale-application kfp-profile-controller 0
juju scale-application kfp-ui 0
juju scale-application kfp-viz 0
juju scale-application oidc-gatekeeper 0
juju scale-application tensorboard-controller 0
juju scale-application tensorboards-web-app 0
- Refresh to the new charms
juju refresh admission-webhook --channel 1.8/stable --trust
juju refresh kfp-profile-controller --channel 2.0/stable --trust
juju refresh kfp-ui --channel 2.0/stable --trust
juju refresh kfp-viz --channel 2.0/stable --trust
juju refresh oidc-gatekeeper --channel ckf-1.8/stable --trust
juju refresh tensorboard-controller --channel 1.8/stable --trust
juju refresh tensorboards-web-app --channel 1.8/stable --trust
- Scale up the applications
juju scale-application admission-webhook 1
juju scale-application kfp-profile-controller 1
juju scale-application kfp-ui 1
juju scale-application kfp-viz 1
juju scale-application oidc-gatekeeper 1
juju scale-application tensorboard-controller 1
juju scale-application tensorboards-web-app 1
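The three phases above can also be driven from a single table of applications and channels, which keeps the lists in sync. This is only a sketch: the APPS table format, the helper names and the DRY_RUN switch are assumptions made for this example, and by default the script prints the commands instead of running them.

```shell
# Sketch: run the scale-down, refresh and scale-up phases from one "application:channel" table.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to execute the commands.

APPS="admission-webhook:1.8/stable
kfp-profile-controller:2.0/stable
kfp-ui:2.0/stable
kfp-viz:2.0/stable
oidc-gatekeeper:ckf-1.8/stable
tensorboard-controller:1.8/stable
tensorboards-web-app:1.8/stable"

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

upgrade_podspec_charms() {
  local entry
  # Phase 1: scale every application down to 0 units.
  for entry in $APPS; do
    run juju scale-application "${entry%%:*}" 0
  done
  # Phase 2: refresh each application to its new channel.
  for entry in $APPS; do
    run juju refresh "${entry%%:*}" --channel "${entry#*:}" --trust
  done
  # Phase 3: scale every application back up to 1 unit.
  for entry in $APPS; do
    run juju scale-application "${entry%%:*}" 1
  done
}

upgrade_podspec_charms
```

The sketch does not wait between phases, so when running it with DRY_RUN=0 you may prefer to call the phases separately and check juju status in between.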
Other charms that moved to the Sidecar pattern are a special case: they need to be removed and re-deployed. For more information, see GH 732. Make sure to follow the pre-upgrade steps before doing this to prevent any loss of your user-created workloads.
- Remove the charms from 1.7
juju remove-application jupyter-controller
juju remove-application argo-controller
juju remove-application kfp-persistence
juju remove-application kfp-schedwf
juju remove-application kfp-viewer
- Wait for the charms to be removed. Make sure all related resources are properly removed. The following commands should succeed (return 0):
juju show-application jupyter-controller 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy jupyter-controller 2> >(grep -q "NotFound" && echo $?)
juju show-application argo-controller 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy argo-controller 2> >(grep -q "NotFound" && echo $?)
juju show-application kfp-persistence 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy kfp-persistence 2> >(grep -q "NotFound" && echo $?)
juju show-application kfp-schedwf 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy kfp-schedwf 2> >(grep -q "NotFound" && echo $?)
juju show-application kfp-viewer 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy kfp-viewer 2> >(grep -q "NotFound" && echo $?)
- Deploy the new charms and add the relations
juju deploy jupyter-controller --trust --channel=1.8/stable
juju deploy argo-controller --trust --channel=3.3/stable
juju deploy kfp-persistence --trust --channel=2.0/stable
juju deploy kfp-schedwf --trust --channel=2.0/stable
juju deploy kfp-viewer --trust --channel=2.0/stable
juju relate argo-controller minio
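Because removal can take some time, the wait step above may need to be repeated before the new charms are deployed. The following sketch polls until both Juju and Kubernetes report that an application is gone. The function names, the 5-second interval and the default of 60 attempts are assumptions chosen for illustration.

```shell
# Sketch: poll until both Juju and Kubernetes report that an application is gone.
# Note: a missing juju or kubectl binary also produces a "not found" message,
# so make sure both clients work before relying on this check.

stderr_has() {
  # $1: pattern; remaining arguments: command to run. True when its stderr matches.
  local pattern="$1"
  shift
  "$@" 2>&1 >/dev/null | grep -q "$pattern"
}

wait_removed() {
  # $1: application name, $2: maximum number of attempts (default 60)
  local app="$1" tries="${2:-60}"
  while [ "$tries" -gt 0 ]; do
    if stderr_has "not found" juju show-application "$app" &&
       stderr_has "NotFound" kubectl -n kubeflow get deploy "$app"; then
      echo "$app removed"
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for $app" >&2
  return 1
}

# Usage (requires juju and kubectl access):
#   for app in jupyter-controller argo-controller kfp-persistence kfp-schedwf kfp-viewer; do
#     wait_removed "$app" || break
#   done
```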
Upgrade charms with refresh
Now you can upgrade the rest of the CKF charms with juju refresh:
juju refresh dex-auth --channel 2.36/stable
juju refresh jupyter-ui --channel 1.8/stable
juju refresh katib-controller --channel 0.16/stable
juju refresh katib-db-manager --channel 0.16/stable --trust
juju refresh katib-ui --channel 0.16/stable
juju refresh kfp-api --channel 2.0/stable --trust
juju refresh knative-eventing --channel 1.10/stable
juju refresh knative-operator --channel 1.10/stable
juju refresh knative-serving --channel 1.10/stable
juju refresh kserve-controller --channel 0.11/stable
juju refresh kubeflow-dashboard --channel 1.8/stable
juju refresh kubeflow-profiles --channel 1.8/stable
juju refresh kubeflow-volumes --channel 1.8/stable
juju refresh metacontroller-operator --channel 3.0/stable
juju refresh minio --channel ckf-1.8/stable
juju refresh seldon-controller-manager --channel 1.17/stable
juju refresh training-operator --channel 1.7/stable
juju relate kfp-api:kfp-api kfp-persistence:kfp-api
Add new relations
Add Dashboard relations
Charmed Kubeflow 1.8 introduces dynamic sidebar configuration for the dashboard. Add the following relations to the Kubeflow components to be able to use the new dashboard:
juju relate kubeflow-dashboard:links jupyter-ui:dashboard-links
juju relate kubeflow-dashboard:links katib-ui:dashboard-links
juju relate kubeflow-dashboard:links kfp-ui:dashboard-links
juju relate kubeflow-dashboard:links kubeflow-volumes:dashboard-links
juju relate kubeflow-dashboard:links tensorboards-web-app:dashboard-links
Add KServe-KNative relation
Charmed Kubeflow 1.8 changes the default KServe deployment mode from RawDeployment to Serverless. For Serverless deployments to work correctly, you need to add the following relation:
juju relate knative-serving:local-gateway kserve-controller
Deploy KFP 2.0 dependencies
Deploy the charms that KFP 2.0 requires for MLMD functionality:
juju deploy envoy --channel=2.0/stable --trust
juju deploy kfp-metadata-writer --channel=2.0/stable --trust
juju deploy mlmd --channel=1.14/stable
juju relate istio-pilot:ingress envoy:ingress
juju relate mlmd:grpc envoy:grpc
juju relate mlmd:grpc kfp-metadata-writer:grpc
Deploy PVCViewer charm
Kubeflow 1.8 introduced the PVCViewer feature in Kubeflow Volumes. It is enabled in Charmed Kubeflow by deploying the pvcviewer-operator charm. Deploy it with:
juju deploy pvcviewer-operator --channel=1.8/stable --trust
Update Kubernetes
Kubeflow 1.8 is supported on Kubernetes versions 1.24, 1.25 and 1.26. Users should update their Kubernetes cluster to one of these versions.
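To check whether a cluster already runs a supported version, you can compare the server version reported by kubectl version against the range above. The sketch below only does the comparison. The function name is hypothetical, and the version string is passed in by hand.

```shell
# Sketch: check a Kubernetes version string against the supported minors (1.24 to 1.26).
# The function name is an assumption; pass it the server version shown by `kubectl version`.

k8s_version_supported() {
  # Accepts forms such as "v1.25.4" or "1.26".
  local version="${1#v}" major minor
  major="${version%%.*}"
  minor="${version#*.}"
  minor="${minor%%.*}"
  # Strip any non-digit suffix from the minor version.
  minor="${minor%%[!0-9]*}"
  [ "$major" = "1" ] && [ "${minor:-0}" -ge 24 ] && [ "${minor:-0}" -le 26 ]
}

# Usage:
#   if k8s_version_supported "v1.25.4"; then
#     echo "supported"
#   else
#     echo "update the cluster before upgrading"
#   fi
```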
Migrate Pipelines to v2
If you have Pipelines created with SDK v1, you need to migrate them to use SDK v2. This is because KFP SDK v2 is not backwards compatible with SDK v1.
See Migrate from KFP SDK v1 for more details.