Kubeflow is a complex stack of applications running on Kubernetes, usually on a remote cloud. While most of the time it does “just work”, there may be occasions when you encounter issues. This page outlines some general methods to find out the cause of the issue, as well as some common issues and their solutions.
Troubleshooting with Juju
Juju tracks the state of all the applications it deploys. Any issues detected by the applications will be picked up by Juju, so running:
juju status
…will show the current status of deployed software and any actions which need to be taken.
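On a large deployment the status listing can be long, so it may help to filter it down to the units that need attention. The following is a minimal sketch, assuming the status command is `juju status` and that its default tabular output lists each unit as `name/number` in the first column with the workload status in the second. The `inactive_units` helper is a hypothetical name introduced here for illustration; it is not part of Juju.

```shell
# Print every unit whose workload status is not "active".
# Expects the default tabular `juju status` output on stdin; unit rows are
# recognised by the "/" in their first column (for example seldon-core/0).
inactive_units() {
  awk '$1 ~ /\// && $2 != "active" { print $1, $2 }'
}

# Only query the model when the juju client is actually installed.
if command -v juju >/dev/null 2>&1; then
  juju status | inactive_units
fi
```

An empty result suggests that every unit reports an active workload. Any unit that is listed is a reasonable starting point for further investigation.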
To troubleshoot applications further, you can use Juju to get a shell in the running container. This just requires the deployed application name and unit number (which you can see from the status command above). For example:
juju ssh seldon-core/0
You can then run whatever commands are required to examine the state of the application and its container.
For more information on debugging and troubleshooting with Juju, see the Juju documentation.
Troubleshooting with kubectl
The kubectl command can give you lots of information about the state of pods and services running on the cluster. To restrict the output to your Kubeflow deployment, run it against the relevant namespace only (the namespace has the same name as the Juju model, which in this documentation is “kubeflow”). For example:
kubectl get pods -n kubeflow
A lot of information can be gleaned just using kubectl. Check out the kubectl documentation for more help.
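As one illustration of narrowing the pod listing, the sketch below prints only the pods that are neither Running nor Completed. It assumes the default `kubectl get pods` columns (NAME, READY, STATUS, RESTARTS, AGE) and the “kubeflow” namespace used in this documentation. The `unhealthy_pods` helper is a hypothetical name, not a kubectl feature.

```shell
# Print the name and status of every pod that is neither Running nor Completed.
# Expects the default `kubectl get pods` table on stdin (header line first).
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Only query the cluster when kubectl is actually installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n kubeflow | unhealthy_pods
fi
```

For any pod this reports, `kubectl describe pod` followed by the pod name and `-n kubeflow` shows its recent events, which usually indicate why it has not started.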
Pods stuck in pending
If some pods are not progressing past the ‘pending’ stage after a long time, the most common cause is that they have been unable to allocate storage. Check that enough storage is available to the cluster and examine the persistent volume claims made by the pods.
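The storage checks described above can be sketched with kubectl as follows. This assumes the “kubeflow” namespace and the default `kubectl get pvc` columns, where the second column holds the claim status. The `unbound_claims` helper is a hypothetical name used only for this example.

```shell
# Print the name and status of every persistent volume claim that is not Bound.
# Expects the default `kubectl get pvc` table on stdin (header line first).
unbound_claims() {
  awk 'NR > 1 && $2 != "Bound" { print $1, $2 }'
}

# Only query the cluster when kubectl is actually installed.
if command -v kubectl >/dev/null 2>&1; then
  # Pods that are still in the Pending phase.
  kubectl get pods -n kubeflow --field-selector=status.phase=Pending
  # Claims that have not yet been bound to a volume.
  kubectl get pvc -n kubeflow | unbound_claims
  # Storage classes available to satisfy those claims.
  kubectl get storageclass
fi
```

If a claim is reported as unbound, `kubectl describe pvc` followed by the claim name and `-n kubeflow` lists the events for that claim, which typically show whether a volume could not be provisioned.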
The dex-auth username and password can be viewed with the juju config command, provided you have access to the Juju client that manages the model. Running:
juju config dex-auth static-username
juju config dex-auth static-password
… will reveal the current settings. You can also set a new username/password:
juju config dex-auth static-username=admin
juju config dex-auth static-password=AxWiJjk2hu4fFga7