[KUBEFLOW] Cannot get MLMD objects from Metadata store.

Hi All,

I am trying out the latest Charmed Kubeflow version, v1.10, and am seeing an issue with one of my runs, as shown in the snapshots below.

As shown below, run_192613 is stuck and shows the error message “Cannot get MLMD objects from Metadata store.” When I clone the same run, the clone works fine.

Is there a known resolution to this issue? I am not sure exactly which logs need to be extracted from the system. Please suggest which logs would help in troubleshooting this issue so that I can capture them.

Hi there, thank you for taking the time to post your issue. Would it be possible to move this discussion to GitHub by opening an issue in the canonical/kfp-operators (Kubeflow Pipelines Operators) repo? To help troubleshoot the issue, we would need logs from the pods created by the affected pipeline run. Each pipeline run usually creates multiple pods. If you are unsure which pod to provide logs for, please provide logs for all run_192613_23042025_mnist_kfpv2_mf_exp-* pods.

I see that this is a v2 pipeline, which means that the run itself contacts the MLMD store for data (you can see this spec about the kfp-v2 design if you're interested in learning more). Logs from the following charms could also be helpful here:

  • argo-controller charm workload container with kubectl logs -n kubeflow argo-controller-0 -c argo-controller
  • kfp-api charm workload container with kubectl logs -n kubeflow kfp-api-0 -c apiserver
  • mlmd charm workload container with kubectl logs -n kubeflow mlmd-0 -c mlmd
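If it helps, the three commands above can be wrapped in a small script that saves each log to its own file. This is only a rough sketch: the `collect_log` helper, the `OUT_DIR` directory name, and the `DRY_RUN` switch are made up for illustration, and the pod and container names are simply the ones listed above. It does not cover the pipeline run pods, which you would need to collect separately.

```shell
#!/usr/bin/env bash
# Sketch: collect the charm workload logs listed above into one directory.
# NAMESPACE and OUT_DIR are assumed defaults; adjust them for your deployment.
NAMESPACE="${NAMESPACE:-kubeflow}"
OUT_DIR="${OUT_DIR:-kfp-logs}"
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# collect_log POD CONTAINER
# Saves one container's logs to OUT_DIR/POD_CONTAINER.log (hypothetical helper).
collect_log() {
  local pod="$1" container="$2"
  local out="${OUT_DIR}/${pod}_${container}.log"
  if [ "$DRY_RUN" = "1" ]; then
    echo "kubectl logs -n ${NAMESPACE} ${pod} -c ${container} > ${out}"
  else
    mkdir -p "$OUT_DIR"
    kubectl logs -n "$NAMESPACE" "$pod" -c "$container" > "$out"
  fi
}

collect_log argo-controller-0 argo-controller
collect_log kfp-api-0 apiserver
collect_log mlmd-0 mlmd
```

Running it as-is just prints the commands so you can check them first; running it with DRY_RUN=0 set in the environment should write the log files into the kfp-logs directory, which can then be attached to the GitHub issue.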

Thanks for the reply

I have raised the following issue on GitHub and attached the logs to it.
