Kubernetes-worker and GPU

When I add a kubernetes-worker to a Charmed Kubernetes cluster and it is allocated a machine with a GPU, the driver fails because the following step is missing:

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub

This command has to be run manually on each node. How can this command be added to the kubernetes-worker charm? Right now it does not work on GPU nodes.

@bjornrun thanks for raising this issue. Looks like nvidia changed their ppa key a week ago. You had the unfortunate timing to hit this shortly after they changed it!

I’ve opened a bug to get this fixed in the next charmed kubernetes bugfix release. Until then, you can work around this as you’ve done (apt-key on each worker), or change the default config to update all workers:

juju config containerd nvidia_apt_key_urls='https://nvidia.github.io/nvidia-container-runtime/gpgkey https://developer.download.nvidia.com/compute/cuda/repos/{id}{version_id_no_dot}/x86_64/3bf863cc.pub'
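To see what that second URL should resolve to on a given worker, here is a rough sketch of the placeholder expansion. It assumes {id} and {version_id_no_dot} are filled from the ID and VERSION_ID fields of /etc/os-release (an assumption based on the ubuntu2004 URL in the original post, not checked against the charm source). expand_key_url is a made-up helper name, not part of juju or the charm.

```shell
#!/bin/sh
# Hypothetical helper: expands {id} and {version_id_no_dot} in a key URL
# template. Assumes the values correspond to ID and VERSION_ID from
# /etc/os-release (e.g. "ubuntu" and "20.04").
expand_key_url() {
    template=$1
    id=$2
    version_id=$3
    # Drop the dot from the version: 20.04 -> 2004
    version_no_dot=$(printf '%s' "$version_id" | tr -d '.')
    printf '%s\n' "$template" |
        sed -e "s/{id}/$id/g" -e "s/{version_id_no_dot}/$version_no_dot/g"
}

# Example for an Ubuntu 20.04 worker
expand_key_url 'https://developer.download.nvidia.com/compute/cuda/repos/{id}{version_id_no_dot}/x86_64/3bf863cc.pub' ubuntu 20.04
```

On a 20.04 worker this should print the same URL as the apt-key command in the original post, which is a quick sanity check that the config value points at the right key.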

Fwiw, the nvidia repo/key is managed by containerd charm config, not kubernetes-worker.
