Upgrading Charmed Kubeflow from 1.6 to 1.7 requires upgrading each charm individually, and new features must be deployed separately. Most charms can be upgraded simply with juju refresh; however, certain components require additional steps.
- An active and idle Charmed Kubeflow 1.6 deployment, meaning that every charm in the bundle is in that state.
- Access to the dashboard of the existing Charmed Kubeflow 1.6 deployment.
- Admin access to the Kubernetes cluster where the existing Charmed Kubeflow 1.6 is deployed.
- Before upgrade
- Upgrade Istio
- Before charms upgrade
- Upgrade charms
- Deploy KNative and KServe charms
- Verify upgrade
Before upgrading Charmed Kubeflow it is recommended to do the following:
- Stop all Notebooks.
- Review any important data that needs to be backed up and perform backup procedures according to the policies of your organisation.
- Record all charm versions in the existing Charmed Kubeflow deployment.
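One way to record the charm versions is to save the model status before changing anything. The sketch below assumes the deployment lives in the kubeflow model; the function name record_charm_versions and the default file name are hypothetical choices.

```shell
# Save a snapshot of the model status, which should list each application's
# charm, channel and revision, so pre-upgrade versions can be looked up later.
# record_charm_versions and the default file name are illustrative only.
record_charm_versions() {
    local out="${1:-ckf-1.6-status.yaml}"
    juju status --model kubeflow --format yaml > "$out" || return 1
    echo "Saved pre-upgrade status to $out"
}
```

Calling record_charm_versions with no argument writes ckf-1.6-status.yaml to the current directory; passing a path writes the snapshot there instead.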
All upgrade steps should be done in the kubeflow model. If you haven’t already, switch to it:
# switch to kubeflow model
juju switch kubeflow
The Istio components are upgraded according to Istio’s best practices, which require upgrading one minor version at a time, in sequence. For more details on upgrading and troubleshooting istio-ingressgateway charms, please refer to this document. It is assumed that the istio-ingressgateway version deployed alongside Charmed Kubeflow 1.6 is 1.11.
- Remove the istio-ingressgateway application and its relation with istio-pilot:
# remove relation and istio-ingressgateway application
juju remove-relation istio-pilot istio-ingressgateway
juju remove-application istio-ingressgateway
- Ensure that the istio-ingressgateway application and all related resources are properly removed. The following commands should succeed (return 0):
juju show-application istio-ingressgateway 2> >(grep -q "not found" && echo $?)
kubectl -n kubeflow get deploy istio-ingressgateway-workload 2> >(grep -q "NotFound" && echo $?)
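The same check can be written as small helper functions that are easier to reuse in a script. This is only a sketch: it assumes the error messages contain "not found" and "NotFound", as the commands above do, and the function names are hypothetical.

```shell
# Succeed only if `juju show-application` fails with a "not found" message.
app_is_gone() {
    local app="$1" err
    # Capture stderr only; if the command succeeds, the application still exists.
    err="$(juju show-application "$app" 2>&1 >/dev/null)" && return 1
    printf '%s\n' "$err" | grep -q "not found"
}

# Succeed only if looking up the workload Deployment fails with "NotFound".
workload_is_gone() {
    local err
    err="$(kubectl -n kubeflow get deploy istio-ingressgateway-workload 2>&1 >/dev/null)" && return 1
    printf '%s\n' "$err" | grep -q "NotFound"
}

# Succeed only when both the application and its workload are gone.
ingressgateway_removed() {
    app_is_gone istio-ingressgateway && workload_is_gone
}
```

Running ingressgateway_removed && echo "removed" prints the message only when both lookups report that the resource no longer exists. Any other failure, such as a connection error, is treated as "not yet removed".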
Troubleshooting of removal of `istio-ingressgateway` application
WARNING: Removing an application with the --force option should be a last resort. There could be stability issues if the application is not shut down cleanly.
If required, remove the istio-ingressgateway application with the --force option and remove the istio-ingressgateway-workload deployment:
juju remove-application --force istio-ingressgateway
kubectl -n kubeflow delete deploy istio-ingressgateway-workload
- Upgrade the istio-pilot charm in sequence. For intermediate versions, wait for each refresh command to finish and for the upgrade to complete, i.e. until istio-pilot is in waiting status with the message "Missing istio-ingressgateway-workload service, deferring this event".
# upgrade istio-pilot from 1.11 to 1.12
juju refresh istio-pilot --channel 1.12/stable
The initial upgrade from 1.11 to 1.12 might take some time. Ensure that the istio-pilot charm has completed its upgrade.
# upgrade istio-pilot from 1.12 to 1.13
juju refresh istio-pilot --channel 1.13/stable
# upgrade istio-pilot from 1.13 to 1.14
juju refresh istio-pilot --channel 1.14/stable
# upgrade istio-pilot from 1.14 to 1.15
juju refresh istio-pilot --channel 1.15/stable
# upgrade istio-pilot from 1.15 to 1.16
juju refresh istio-pilot --channel 1.16/stable
After refreshing to the final version in the sequence above, istio-pilot should reach active status within a few minutes. Otherwise, check out the troubleshooting tips below.
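The step-by-step refresh can also be scripted. The following is a sketch under several assumptions: jq is installed, juju status --format json reports the application status under the keys used here, and all function and variable names are hypothetical. The wait is approximate, because right after a refresh the status may briefly still show the previous settled state, so it remains worth checking the status message in juju status as well.

```shell
# Channels to step through, one minor version at a time (from the steps above).
ISTIO_CHANNELS="1.12/stable 1.13/stable 1.14/stable 1.15/stable 1.16/stable"

# Print the current application status of istio-pilot.
# Assumption: jq is installed and `juju status --format json` reports the
# status under .applications[<name>]["application-status"].current.
istio_pilot_status() {
    juju status istio-pilot --format json \
        | jq -r '.applications["istio-pilot"]["application-status"].current'
}

# Poll until istio-pilot is "waiting" or "active". The limit of 60 polls,
# 10 seconds apart, is an arbitrary illustrative timeout.
wait_for_istio_pilot() {
    local attempts=0 status
    while :; do
        status="$(istio_pilot_status)"
        case "$status" in
            waiting|active) return 0 ;;
        esac
        attempts=$((attempts + 1))
        if [ "$attempts" -ge 60 ]; then
            echo "istio-pilot still '$status' after $attempts checks" >&2
            return 1
        fi
        sleep 10
    done
}

# Refresh through every channel in order, stopping at the first failure.
upgrade_istio_pilot() {
    local channel
    for channel in $ISTIO_CHANNELS; do
        echo "Refreshing istio-pilot to $channel"
        juju refresh istio-pilot --channel "$channel" || return 1
        wait_for_istio_pilot || return 1
    done
}
```

Stopping at the first failed refresh or timed-out wait keeps the charm on a known version, so the troubleshooting steps can be applied before continuing with the next channel.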
Troubleshooting of Istio upgrade
Refer to this document for troubleshooting tips.
- Deploy istio-ingressgateway and add the relation between istio-pilot and istio-ingressgateway:
# deploy istio-ingressgateway
juju deploy istio-gateway --channel 1.16/stable --trust --config kind=ingress istio-ingressgateway
juju relate istio-pilot istio-ingressgateway
Before charms can be upgraded the following actions need to be taken:
- Enable trust on deployed charms (required).
- Update the default admin profile to prevent its deletion (optional).
Because of changes in the charm code, some charms in Charmed Kubeflow 1.6 have to be trusted by Juju before the upgrade.
WARNING: If you do not execute juju trust for these charms, you may encounter authorization errors. If that is the case, please refer to the Troubleshooting guide.
# enable trust on charms
juju trust jupyter-ui --scope=cluster
juju trust katib-db-manager --scope=cluster
juju trust katib-ui --scope=cluster
juju trust kfp-api --scope=cluster
juju trust kubeflow-dashboard --scope=cluster
juju trust kubeflow-profiles --scope=cluster
juju trust seldon-controller-manager --scope=cluster
In Charmed Kubeflow 1.6, a user profile named admin is created by default at deployment time. This profile has no additional privileges; it is just a default profile that was created for convenience, and it has been removed as of Charmed Kubeflow 1.7. When upgrading to 1.7, this default profile will be deleted. If you depend on this profile, you can do the following to prevent its deletion:
# update admin profile
kubectl annotate profile admin controller.juju.is/id-
kubectl annotate profile admin model.juju.is/id-
kubectl label profile admin app.juju.is/created-by-
kubectl label profile admin app.kubernetes.io/managed-by-
kubectl label profile admin app.kubernetes.io/name-
kubectl label profile admin model.juju.is/name-
Charms handle Roles and ClusterRoles differently in the 1.7 release. As a result, the kubeflow-roles charm needs to be redeployed rather than refreshed:
# redeploy kubeflow-roles
juju remove-application kubeflow-roles
juju deploy kubeflow-roles --channel 1.7/stable --trust
To upgrade Charmed Kubeflow each charm needs to be refreshed. It is recommended to wait for each charm to finish its upgrade before proceeding with the next.
Depending on the original deployment of Charmed Kubeflow 1.6, the refresh command may report that a charm is up-to-date, which indicates that there is no need to upgrade that particular charm.
During the upgrade, some charms can temporarily go into blocked state, but they should become active after a while.
# upgrade charms
juju refresh admission-webhook --channel 1.7/stable
juju refresh argo-controller --channel 3.3/stable
juju refresh argo-server --channel 3.3/stable
juju refresh dex-auth --channel 2.31/stable
juju refresh jupyter-controller --channel 1.7/stable
juju refresh jupyter-ui --channel 1.7/stable
juju refresh katib-controller --channel 0.15/stable
juju refresh katib-db --channel latest/stable
juju refresh katib-db-manager --channel 0.15/stable
juju refresh katib-ui --channel 0.15/stable
juju refresh kfp-api --channel 2.0/stable
juju refresh kfp-db --channel latest/stable
juju refresh kfp-persistence --channel 2.0/stable
juju refresh kfp-profile-controller --channel 2.0/stable
juju refresh kfp-schedwf --channel 2.0/stable
juju refresh kfp-ui --channel 2.0/stable
juju refresh kfp-viewer --channel 2.0/stable
juju refresh kfp-viz --channel 2.0/stable
juju refresh kubeflow-dashboard --channel 1.7/stable
juju refresh kubeflow-profiles --channel 1.7/stable
juju refresh kubeflow-volumes --channel 1.7/stable
juju refresh metacontroller-operator --channel 2.0/stable
juju refresh minio --channel ckf-1.7/stable
juju refresh oidc-gatekeeper --channel ckf-1.7/stable
juju refresh seldon-controller-manager --channel 1.15/stable
juju refresh tensorboard-controller --channel 1.7/stable
juju refresh tensorboards-web-app --channel 1.7/stable
juju refresh training-operator --channel 1.6/stable
Troubleshooting charm upgrade
If a charm fails to upgrade or is stuck in maintenance state for a long time, it is possible to recover by running the refresh command with the version that was in place prior to the upgrade, i.e. by downgrading the charm. After that, repeat the upgrade.
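The recovery described above amounts to two refreshes with a pause in between. A minimal sketch, assuming the pre-upgrade channel of the charm was recorded before the upgrade; retry_upgrade is a hypothetical helper, not part of Juju.

```shell
# Downgrade a charm to its previous channel, wait for the operator to confirm,
# then retry the upgrade.
# Arguments: application, previous channel, target channel.
retry_upgrade() {
    local app="$1" previous="$2" target="$3" reply
    if [ -z "$app" ] || [ -z "$previous" ] || [ -z "$target" ]; then
        echo "usage: retry_upgrade APP PREVIOUS_CHANNEL TARGET_CHANNEL" >&2
        return 2
    fi
    juju refresh "$app" --channel "$previous" || return 1
    # Pause so the operator can confirm in `juju status` that the charm settled.
    printf 'Press Enter once %s has settled on %s: ' "$app" "$previous"
    read -r reply || true
    juju refresh "$app" --channel "$target"
}
```

For example, retry_upgrade katib-ui PREVIOUS_CHANNEL 0.15/stable, where PREVIOUS_CHANNEL stands for whatever channel was recorded for katib-ui before the upgrade.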
KNative and KServe are new additions to Charmed Kubeflow 1.7 and need to be deployed separately as part of the upgrade:
# install knative and kserve
juju deploy knative-operator --channel 1.8/stable --trust
juju deploy knative-serving --config namespace="knative-serving" --config istio.gateway.namespace=kubeflow --config istio.gateway.name=kubeflow-gateway --channel 1.8/stable --trust
juju deploy knative-eventing --config namespace="knative-eventing" --channel 1.8/stable --trust
juju deploy kserve-controller --channel 0.10/stable --trust
juju relate istio-pilot:gateway-info kserve-controller:ingress-gateway
You can verify the progress of the upgrade by running:
watch -c juju status --color
Once all services are in the active/idle state, the upgrade should be finished.
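For scripted checks, the active/idle condition can be tested from the JSON status output rather than by watching the terminal. This sketch assumes jq is installed and that juju status --format json lists units with "workload-status" and "juju-status" keys; the function names are hypothetical.

```shell
# Print one line per unit: "<unit> <workload-status> <agent-status>".
# Assumption: jq is installed and `juju status --format json` lists units
# under .applications[].units with "workload-status" and "juju-status" keys.
unit_states() {
    juju status --model kubeflow --format json | jq -r '
        .applications[] | (.units // {}) | to_entries[]
        | "\(.key) \(.value["workload-status"].current) \(.value["juju-status"].current)"'
}

# Return 0 only when at least one unit exists and every unit is active/idle;
# print the units that are not ready.
all_active_idle() {
    local pending=0 seen=0 unit workload agent
    while read -r unit workload agent; do
        [ -n "$unit" ] || continue
        seen=$((seen + 1))
        if [ "$workload" != "active" ] || [ "$agent" != "idle" ]; then
            echo "not ready: $unit ($workload/$agent)"
            pending=1
        fi
    done <<EOF
$(unit_states)
EOF
    [ "$seen" -gt 0 ] && [ "$pending" -eq 0 ]
}
```

A loop such as until all_active_idle; do sleep 30; done would then block until the model settles. Treating an empty unit list as "not ready" avoids reporting success when the status query itself fails.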