Upgrade argo to 3.3.9 to support enhanced depends

Hi,

I was wondering how I can upgrade the Argo installation bundled with Charmed Kubeflow. I want to use the enhanced depends logic, but the bundled version of Argo seems to be outdated. See https://github.com/argoproj/argo-workflows/issues/8654

The bundled version appears to be 3.3.8 (taken from microk8s kubectl describe pod argo-controller-54d4fb44df-hsknk -n kubeflow):

Args:
  --configmap
  argo-controller-configmap-config
  --executor-image
  argoproj/argoexec:v3.3.8
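
A quick way to confirm that an image tag predates the fix is to compare version strings in the shell. This is only a sketch: the function name image_needs_upgrade is hypothetical, and it assumes tags of the form vX.Y.Z and GNU sort (for the -V option).

```shell
# Hypothetical helper: report whether an image tag is older than a minimum version.
# Assumes tags look like "vX.Y.Z" and that GNU sort (with -V) is available.
image_needs_upgrade() {
  local image="$1" min="$2"
  local tag="${image##*:}"   # keep what follows the last ':' -> "v3.3.8"
  tag="${tag#v}"             # drop the leading 'v' -> "3.3.8"
  # sort -V orders version strings; the first line is the older one.
  local oldest
  oldest="$(printf '%s\n%s\n' "$tag" "$min" | sort -V | head -n1)"
  if [ "$tag" != "$min" ] && [ "$oldest" = "$tag" ]; then
    echo "needs upgrade"
  else
    echo "ok"
  fi
}

# Image taken from the Args output above; 3.3.9 is the release with the fix.
image_needs_upgrade "argoproj/argoexec:v3.3.8" "3.3.9"
```

Version-aware sorting is used instead of a plain string comparison so that a later tag such as v3.3.10 is not mistaken for an older one.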

As stated in the issue linked above, there is a problem in 3.3.8 that is fixed in 3.3.9, and I am getting the same error:

"error":"Internal error: templates.demo-train templates.demo-train.tasks.chmod task result 'Omitted' for task 'validate' is invalid\nFailed to create a new run.

So my question is: how can I upgrade to the right Argo version?

This is the bundle I am currently running:

Model     Controller          Cloud/Region        Version  SLA          Timestamp
kubeflow  microk8s-localhost  microk8s/localhost  2.9.42   unsupported  08:18:52Z

App                        Version                Status  Scale  Charm                    Channel         Rev  Address         Exposed  Message
admission-webhook          res:oci-image@2d74d1b  active      1  admission-webhook        1.7/stable      134  10.152.183.31   no
argo-controller            res:oci-image@669ebd5  active      1  argo-controller          3.3/stable      236                  no
argo-server                res:oci-image@576d038  active      1  argo-server              3.3/stable      185                  no
dex-auth                                          active      1  dex-auth                 2.31/stable     224  10.152.183.36   no
istio-ingressgateway                              active      1  istio-gateway            1.16/stable     411  10.152.183.55   no
istio-pilot                                       active      1  istio-pilot              1.16/stable     413  10.152.183.213  no
jupyter-controller         res:oci-image@1167186  active      1  jupyter-controller       1.7/stable      607                  no
jupyter-ui                                        active      1  jupyter-ui               1.7/stable      534  10.152.183.150  no
katib-controller           res:oci-image@111495a  active      1  katib-controller         0.15/stable     206  10.152.183.85   no
katib-db                   mariadb/server:10.3    active      1  charmed-osm-mariadb-k8s  latest/stable    35  10.152.183.127  no       ready
katib-db-manager           res:oci-image@2fd18aa  active      1  katib-db-manager         0.15/stable     180  10.152.183.68   no
katib-ui                                          active      1  katib-ui                 0.15/stable     194  10.152.183.30   no
kfp-api                    res:oci-image@e08e41d  active      1  kfp-api                  2.0/stable      298  10.152.183.60   no
kfp-db                     mariadb/server:10.3    active      1  charmed-osm-mariadb-k8s  latest/stable    35  10.152.183.196  no       ready
kfp-persistence            res:oci-image@516e6b8  active      1  kfp-persistence          2.0/stable      294                  no
kfp-profile-controller     res:oci-image@6278f3e  active      1  kfp-profile-controller   2.0/stable      274  10.152.183.249  no
kfp-schedwf                res:oci-image@1f6d4b5  active      1  kfp-schedwf              2.0/stable      312                  no
kfp-ui                     res:oci-image@ae72602  active      1  kfp-ui                   2.0/stable      297  10.152.183.144  no
kfp-viewer                 res:oci-image@c2f2ee1  active      1  kfp-viewer               2.0/stable      310                  no
kfp-viz                    res:oci-image@3de6f3c  active      1  kfp-viz                  2.0/stable      281  10.152.183.243  no
knative-eventing                                  active      1  knative-eventing         1.8/stable      165  10.152.183.53   no
knative-operator                                  active      1  knative-operator         1.8/stable      142  10.152.183.157  no
knative-serving                                   active      1  knative-serving          1.8/stable      164  10.152.183.41   no
kserve-controller                                 active      1  kserve-controller        0.10/stable      86  10.152.183.237  no
kubeflow-dashboard                                active      1  kubeflow-dashboard       1.7/stable      307  10.152.183.46   no
kubeflow-profiles                                 active      1  kubeflow-profiles        1.7/stable      269  10.152.183.103  no
kubeflow-roles                                    active      1  kubeflow-roles           1.7/stable      113  10.152.183.245  no
kubeflow-volumes           res:oci-image@d261609  active      1  kubeflow-volumes         1.7/stable      178  10.152.183.12   no
metacontroller-operator                           active      1  metacontroller-operator  2.0/stable      117  10.152.183.111  no
minio                      res:oci-image@1755999  active      1  minio                    ckf-1.7/stable  186  10.152.183.42   no
mlflow-db                  mariadb/server:10.3    active      1  charmed-osm-mariadb-k8s  stable           35  10.152.183.149  no       ready
mlflow-server              res:oci-image@bba33cd  active      1  mlflow-server            stable           77                  no
oidc-gatekeeper            res:oci-image@6b720b8  active      1  oidc-gatekeeper          ckf-1.7/stable  176  10.152.183.164  no
seldon-controller-manager                         active      1  seldon-core              1.15/stable     298  10.152.183.133  no
tensorboard-controller     res:oci-image@c52f7c2  active      1  tensorboard-controller   1.7/stable      156  10.152.183.183  no
tensorboards-web-app       res:oci-image@929f55b  active      1  tensorboards-web-app     1.7/stable      158  10.152.183.140  no
training-operator                                 active      1  training-operator        1.6/stable      190  10.152.183.158  no

Hi @lukas,

Sorry for the trouble. If not for this bug, this hack should work:

juju download argo-controller --channel 3.3/stable

# open downloaded .charm file as an archive
# edit metadata.yaml to: 
# * use argoproj/workflow-controller:v3.3.9
#   NOTE: This is just bookkeeping, it doesn't actually control the image used.  See also the --resource call next

# Deploy your modified .charm file and use the updated image
juju deploy ./argo-controller_XXXXXXX.charm --resource oci-image=argoproj/workflow-controller:v3.3.9
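
The "open as an archive" and "edit metadata.yaml" steps above could be scripted roughly as follows, since a .charm file is a zip archive. Treat this as a sketch under assumptions: both function names and the output file name are hypothetical, it assumes the image is recorded on an upstream-source line in metadata.yaml, and it needs unzip, zip, and GNU sed.

```shell
# Rewrite the upstream-source line in an unpacked charm's metadata.yaml.
# Assumption: the image is recorded as "upstream-source: <image>".
set_upstream_source() {
  local dir="$1" image="$2"
  sed -i -E "s|^([[:space:]]*upstream-source:[[:space:]]*).*|\1${image}|" \
    "$dir/metadata.yaml"
}

# Unpack a charm, patch metadata.yaml, and repack it under a new name.
# Hypothetical helper; it is defined here but not invoked.
repack_charm() {
  local src="$1" out="$2" image="$3"
  local workdir out_abs
  workdir="$(mktemp -d)"
  # Resolve the output path before changing directory.
  out_abs="$(cd "$(dirname "$out")" && pwd)/$(basename "$out")"
  unzip -q "$src" -d "$workdir"
  set_upstream_source "$workdir" "$image"
  (cd "$workdir" && zip -q -r "$out_abs" .)
  rm -rf "$workdir"
}

# Example call (file names are placeholders):
# repack_charm ./argo-controller_XXXXXXX.charm ./argo-controller_modified.charm \
#   argoproj/workflow-controller:v3.3.9
```

As noted in the comments above, the metadata.yaml edit is only bookkeeping; the --resource flag on juju deploy is what selects the image that actually runs.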

But I think that series bug breaks this attempt. The fix for the series bug should land very soon (it might already be in the most recent juju 2.9 CLI versions).

Apart from that, I can’t think of a good way to work around this.

Hi, thanks for your help, and sorry for the late reply. On a fresh, from-scratch install your suggestion seems to work. But when I tried upgrading my existing setup, it failed.

I did the following steps to upgrade:

  1. juju remove-application argo-controller
  2. juju deploy ./argo-controller_3e788c8.charm --resource oci-image=argoproj/workflow-controller:v3.3.9

But now I am stuck with argo-controller waiting for object-storage relation data:

argo-controller blocked 1 argo-controller 4 no Waiting for object-storage relation data

Any idea how I can fix this and save my installation?

Hey, I managed to solve the upgrade issue myself.

I have now updated both argo-controller and argo-server, but the error nevertheless persists:

"error":"Internal error: templates.demo-train templates.demo-train.tasks.chmod task result 'Omitted' for task 'validate' is invalid\nFailed to create a new run

Are there any checks enforced on Kubeflow's side?

Hey Lukas. Let's continue the conversation on Mattermost, where more engineers will see it: Mattermost.