Install on AKS

This guide describes how to install Charmed Kubeflow (CKF) on Azure Kubernetes Service (AKS).

You will spin up an AKS cluster using the Azure CLI (Command Line Interface) on your local machine. Then, you will interact with the cluster and deploy CKF using kubectl and Juju.

Requirements

Make sure the Azure CLI has the following configuration:

Deploy AKS cluster

Note that the deployment incurs charges for every hour the cluster is running.

First, create a resource group to deploy the cluster:

az group create --name myResourceGroup --location westeurope

For the location, choose whichever best suits your needs. You can list all available locations using az account list-locations -o table.
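Before creating the resource group, it can help to confirm that the location name is valid for your subscription. The following is a minimal sketch rather than part of the official procedure: the helper name location_is_valid is hypothetical, and it assumes the Azure CLI is installed and logged in.

```shell
# Hypothetical helper: succeeds if the given name is an Azure location
# available to the current subscription.
location_is_valid() {
  # --query "[].name" keeps only the short location names (e.g. westeurope);
  # -o tsv prints one name per line so grep can match a whole line exactly.
  az account list-locations --query "[].name" -o tsv | grep -Fqx -- "$1"
}
```

For example, `location_is_valid westeurope && az group create --name myResourceGroup --location westeurope` creates the group only when the name is recognized.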

Now, spin up the cluster. The configuration below provides the minimum requirements for deploying CKF. For further customization, see the full list of available parameters:

az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --kubernetes-version 1.29 \
  --node-count 2 \
  --node-vm-size Standard_D8s_v3 \
  --node-osdisk-size 100 \
  --node-osdisk-type Managed \
  --os-sku Ubuntu \
  --ssh-key-value <path-to-public-key>

  • kubernetes-version: this example uses Kubernetes (K8s) 1.29. To use other versions, see Supported versions for compatibility with K8s and Juju.
  • node-count: this cluster has two worker nodes because the cluster autoscaler option is disabled by default. Alternatively, you can enable the autoscaler and define max-count and min-count instead.
  • node-vm-size: the worker nodes are deployed as Azure VM instances of size Standard_D8s_v3. See VM sizes and Sizes for cloud services for more details.
  • node-osdisk-type: Managed node disks are used because Ephemeral disks are better suited to applications that tolerate individual VM failures, which is not the case for CKF.
  • ssh-key-value: public key path or key contents used to access individual nodes over SSH. Its default value is ~/.ssh/id_rsa.pub.
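The node-count note mentions enabling the cluster autoscaler as an alternative. A sketch of what that variant might look like is shown here, using the --enable-cluster-autoscaler, --min-count and --max-count options of az aks create. The wrapper name create_autoscaled_cluster and the bounds of 2 to 4 nodes are assumptions for illustration, not values recommended by this guide.

```shell
# Hypothetical wrapper around the create command with the cluster
# autoscaler enabled. The bounds (2 to 4 nodes) are illustrative only;
# the minimum is kept at 2 to match the node count used in this guide.
create_autoscaled_cluster() {
  # $1: path to the SSH public key, defaulting to the usual OpenSSH location
  ssh_key="${1:-$HOME/.ssh/id_rsa.pub}"
  az aks create \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --kubernetes-version 1.29 \
    --node-count 2 \
    --enable-cluster-autoscaler \
    --min-count 2 \
    --max-count 4 \
    --node-vm-size Standard_D8s_v3 \
    --node-osdisk-size 100 \
    --node-osdisk-type Managed \
    --os-sku Ubuntu \
    --ssh-key-value "$ssh_key"
}
```

Calling `create_autoscaled_cluster` with no argument uses the default key path. The initial node count has to lie between the minimum and maximum counts.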

Spinning up the cluster may take some time to complete.
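If you want to check on progress from another terminal, or you started the creation without waiting for it to finish, one option is to poll the cluster's provisioning state. This is a sketch under assumptions: the helper name wait_for_cluster, the POLL_INTERVAL variable and the default of 60 attempts are all hypothetical choices.

```shell
# Hypothetical helper: polls the cluster's provisioning state until it
# reports Succeeded. Returns 1 on Failed or when the attempts run out.
wait_for_cluster() {
  group="$1"
  name="$2"
  attempts="${3:-60}"              # maximum number of polls (assumed default)
  interval="${POLL_INTERVAL:-10}"  # seconds between polls (assumed default)
  while [ "$attempts" -gt 0 ]; do
    state="$(az aks show --resource-group "$group" --name "$name" \
      --query provisioningState -o tsv)"
    case "$state" in
      Succeeded) return 0 ;;
      Failed) echo "provisioning failed" >&2; return 1 ;;
    esac
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "timed out waiting for $name" >&2
  return 1
}
```

For example: `wait_for_cluster myResourceGroup myAKSCluster && echo "cluster is ready"`.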

Verify access to the cluster

Check if the AKS cluster has been added to kubeconfig as follows:

kubectl config get-clusters

If you don’t see it there, use the following command to add it:

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --admin

You may need to remove --admin from the command above, depending on the type of kubeconfig that you have access to.

Now check your access to the cluster as follows:

kubectl get nodes

You should see output like the following:

NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-18441560-vmss000000   Ready    agent   1m20s   v1.29.4
aks-nodepool1-40664177-vmss000001   Ready    agent   1m20s   v1.29.4
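To check the node status from a script rather than by eye, you could count the nodes whose STATUS column reads Ready. The helper name count_ready_nodes is hypothetical, and the sketch assumes the default column layout shown in the sample output.

```shell
# Hypothetical helper: prints how many nodes report STATUS exactly "Ready".
count_ready_nodes() {
  # --no-headers drops the column titles; STATUS is the second column.
  # "n + 0" makes awk print 0 instead of an empty string when no node matches.
  kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

With the two-node cluster created in this guide, `[ "$(count_ready_nodes)" -eq 2 ]` should succeed once both nodes are up. Nodes in a combined state such as Ready,SchedulingDisabled are not counted by this sketch.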

Set up Juju

  1. Install Juju:
sudo snap install juju --channel=3.4/stable
  2. Add your AKS cluster as a cloud to Juju:
juju add-k8s aks --client
  3. Bootstrap a Juju controller:
juju bootstrap aks aks-controller

See Get started with Juju for more details.
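If you rerun these steps, bootstrapping a controller whose name is already known to your Juju client fails. A minimal sketch of guarding against that is shown here. The helper name bootstrap_if_missing is hypothetical, and it relies on the assumption that juju show-controller exits with a non-zero status for an unknown controller.

```shell
# Hypothetical helper: bootstraps the controller only if the local Juju
# client does not already know a controller with that name.
bootstrap_if_missing() {
  cloud="$1"
  controller="$2"
  # Assumption: juju show-controller exits non-zero for an unknown controller.
  if juju show-controller "$controller" >/dev/null 2>&1; then
    echo "controller $controller already exists, skipping bootstrap"
  else
    juju bootstrap "$cloud" "$controller"
  fi
}
```

For example: `bootstrap_if_missing aks aks-controller`.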

Deploy CKF

To deploy CKF and access its dashboard, follow the steps in the general installation guide, starting from the section on creating the kubeflow model.

Clean up resources

You can delete the AKS cluster and related resources as follows:

az aks delete --resource-group myResourceGroup --name myAKSCluster --yes
az group delete --name myResourceGroup --yes

You can check if a resource group exists using:

az group exists --name <resource-group-name>

See Azure resource manager for further information.
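Because az group exists prints the literal string true or false, its result can be turned into a shell condition. The helper name resource_group_exists is a hypothetical choice for this sketch.

```shell
# Hypothetical helper: succeeds while the resource group still exists.
# az group exists prints the literal string "true" or "false".
resource_group_exists() {
  [ "$(az group exists --name "$1")" = "true" ]
}
```

For example: `if resource_group_exists myResourceGroup; then echo "resource group is still present"; fi`.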

To clean up Juju resources, run the following commands:

juju unregister aks-controller
juju remove-cloud aks --client

Wow, this is such a nice, clean experience for a very sophisticated solution! Well done :slight_smile:
