Charmed Kubeflow Upgrade Error

Hi there,

I’ve just upgraded from Charmed Kubeflow 1.4 to 1.6 and there are no longer any ports exposed on istio-ingressgateway.
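For reference, this is roughly how I've been checking on the Kubernetes side as well as in juju status (treat the exact service name as an assumption on my part - it may differ between releases):

microk8s kubectl -n kubeflow get svc | grep istio
microk8s kubectl -n kubeflow get svc istio-ingressgateway-workload -o yaml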

I haven't yet upgraded the Microk8s cluster; it's still running 1.21.

Training-operator and tensorboard-controller are also stuck waiting for information:

Model     Controller          Cloud/Region        Version  SLA          Timestamp
kubeflow  microk8s-localhost  microk8s/localhost  2.9.33   unsupported  15:18:07Z

App                        Version                    Status   Scale  Charm                    Channel         Rev  Address         Exposed  Message
admission-webhook          res:oci-image@84a4d7d      active   1      admission-webhook        1.6/stable      50   10.152.183.11   no
argo-controller            res:oci-image@669ebd5      active   1      argo-controller          3.3/stable      99                   no
argo-server                res:oci-image@576d038      active   1      argo-server              3.3/stable      45                   no
dex-auth                                              active   1      dex-auth                 2.31/stable     129  10.152.183.27   no
envoy                      res:oci-image@b4adee5      active   1      envoy                    1.12/stable     11   10.152.183.133  no
istio-ingressgateway                                  active   1      istio-gateway            1.11/stable     114  10.152.183.248  no
istio-pilot                                           active   1      istio-pilot              1.11/stable     131  10.152.183.226  no
jupyter-controller         res:oci-image@8f4ec33      active   1      jupyter-controller       1.6/stable      138                  no
jupyter-ui                 res:oci-image@cde6632      active   1      jupyter-ui               1.6/stable      99   10.152.183.55   no
katib-controller           res:oci-image@03d47fb      active   1      katib-controller         0.14/stable     92   10.152.183.75   no
katib-db                   mariadb/server:10.3        active   1      charmed-osm-mariadb-k8s  stable          35   10.152.183.254  no       ready
katib-db-manager           res:oci-image@16b33a5      active   1      katib-db-manager         0.14/stable     66   10.152.183.43   no
katib-ui                   res:oci-image@c7dc04a      active   1      katib-ui                 0.14/stable     90   10.152.183.205  no
kfp-api                    res:oci-image@1b44753      active   1      kfp-api                  2.0/stable      81   10.152.183.45   no
kfp-db                     mariadb/server:10.3        active   1      charmed-osm-mariadb-k8s  stable          35   10.152.183.234  no       ready
kfp-persistence            res:oci-image@31f08ad      active   1      kfp-persistence          2.0/stable      76                   no
kfp-profile-controller     res:oci-image@d86ecff      active   1      kfp-profile-controller   2.0/stable      61   10.152.183.80   no
kfp-schedwf                res:oci-image@51ffc60      active   1      kfp-schedwf              2.0/stable      80                   no
kfp-ui                     res:oci-image@55148fd      active   1      kfp-ui                   2.0/stable      80   10.152.183.63   no
kfp-viewer                 res:oci-image@7190aa3      active   1      kfp-viewer               2.0/stable      79                   no
kfp-viz                    res:oci-image@67e8b09      active   1      kfp-viz                  2.0/stable      74   10.152.183.174  no
kubeflow-dashboard         res:oci-image@6fe6eec      active   1      kubeflow-dashboard       1.6/stable      154  10.152.183.175  no
kubeflow-profiles          res:profile-image@0a46ffc  active   1      kubeflow-profiles        1.6/stable      82   10.152.183.85   no
kubeflow-roles                                        active   1      kubeflow-roles           1.6/stable      31   10.152.183.249  no
kubeflow-volumes           res:oci-image@cc5177a      active   1      kubeflow-volumes         1.6/stable      64   10.152.183.78   no
metacontroller-operator                               active   1      metacontroller-operator  2.0/stable      48   10.152.183.6    no
minio                      res:oci-image@1755999      active   1      minio                    stable          57   10.152.183.47   no
mlmd                       res:oci-image@e2cb9ce      active   1      mlmd                     1.0/stable      14   10.152.183.8    no
oidc-gatekeeper            res:oci-image@32de216      active   1      oidc-gatekeeper          ckf-1.6/stable  76   10.152.183.160  no
seldon-controller-manager  res:oci-image@eb811b6      active   1      seldon-core              1.14/stable     92   10.152.183.79   no
tensorboard-controller     res:oci-image@0f8c7de      waiting  1      tensorboard-controller   1.6/stable      56   10.152.183.151  no       Waiting for gateway info relation
tensorboards-web-app       res:oci-image@914a8ab      active   1      tensorboards-web-app     1.6/stable      57   10.152.183.215  no
training-operator                                     waiting  1      training-operator        1.5/stable      65   10.152.183.99   no       waiting for units settled down

Unit                        Workload  Agent  Address       Ports              Message
admission-webhook/2*        active    idle   10.1.247.113  4443/TCP
argo-controller/3*          active    idle   10.1.33.185
argo-server/2*              active    idle   10.1.235.105  2746/TCP
dex-auth/0*                 active    idle   10.1.181.102
envoy/0*                    active    idle   10.1.181.85   9901/TCP,9090/TCP
istio-ingressgateway/0*     active    idle   10.1.247.116
istio-pilot/0*              active    idle   10.1.33.187
jupyter-controller/3*       active    idle   10.1.248.38
jupyter-ui/2*               active    idle   10.1.248.51   5000/TCP
katib-controller/3*         active    idle   10.1.248.52   443/TCP,8080/TCP
katib-db-manager/2*         active    idle   10.1.181.99   6789/TCP
katib-db/0*                 active    idle   10.1.248.14   3306/TCP           ready
katib-ui/2*                 active    idle   10.1.181.100  8080/TCP
kfp-api/3*                  active    idle   10.1.235.107  8888/TCP,8887/TCP
kfp-db/0*                   active    idle   10.1.248.16   3306/TCP           ready
kfp-persistence/3*          active    idle   10.1.235.108
kfp-profile-controller/3*   active    idle   10.1.248.50   80/TCP
kfp-schedwf/2*              active    idle   10.1.235.106
kfp-ui/3*                   active    idle   10.1.33.186   3000/TCP
kfp-viewer/2*               active    idle   10.1.247.112
kfp-viz/3*                  active    idle   10.1.248.49   8888/TCP
kubeflow-dashboard/3*       active    idle   10.1.235.111  8082/TCP
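If the relation list would help, I can paste that too; I've been looking at it with something along these lines (the grep is only there to narrow it down to the gateway info relation):

juju status --relations
juju status --relations | grep -i gateway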

Please could someone advise me on how to get the ports exposed again? I’ve tried redeploying the components multiple times but with no luck.

Thanks, Ollie

Update: My bad, I didn't check the load balancer IP. I am able to access dex now.

Glad you sorted it out! Thanks

Hi,

Just to say that I redeployed again and was still unable to access the dashboard.

It turns out the issue is that the gateway was not created.
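For anyone else hitting this, the quickest check I know of is to list the Istio Gateway resources directly. The gateway is normally called kubeflow-gateway, but treat that name as an assumption on my part:

microk8s kubectl -n kubeflow get gateways.networking.istio.io
microk8s kubectl -n kubeflow get gateways.networking.istio.io kubeflow-gateway -o yaml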

I've seen that the blog post says this is rare, but it has happened every time I've deployed on fresh VMs with Microk8s in cluster mode on Ubuntu 20.04.

Perhaps this might be helpful in identifying when the issue occurs, @ca-scribner.

Thanks,

Hi @ollienuk,

Sorry you're stuck on that. We thought we had fixed it here, but it sounds like we didn't. That pull request links to an issue with some more ideas. Just confirming: does that look like the problem you're hitting?

@ca-scribner No worries, I was able to fix it with the following workaround:

juju run --unit istio-pilot/0 -- "export JUJU_DISPATCH_PATH=hooks/config-changed; ./dispatch"
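For anyone finding this later, that command just re-runs the charm's config-changed hook inside the istio-pilot/0 unit, which is what recreated the missing gateway in my case. The two charms that were stuck waiting can then be watched with something like:

juju status tensorboard-controller training-operator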

The issue you’ve linked to is definitely the issue I’ve been having.