Perform inference on ISVCs using access tokens

This guide describes how to configure Charmed Kubeflow to perform inference on user-owned KServe Inference Services (ISVCs) using programmatic access tokens, for example from a Jupyter Notebook.

To do so, follow these steps:

  1. Create a ServiceAccount token from inside a Notebook server with the following parameters:
TOKEN=$(kubectl create token \
  default-editor \
  --duration=<duration in s> \
  --audience=istio-ingressgateway.kubeflow.svc.cluster.local)

The --audience parameter must be set to istio-ingressgateway.kubeflow.svc.cluster.local. Requests that carry a token with any other audience can be rejected.
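To confirm that a token carries the expected audience before using it, its payload can be inspected locally. ServiceAccount tokens are JWTs, so the second dot-separated segment is base64url-encoded JSON. The following is a minimal sketch: the helper name decode_jwt_payload is hypothetical, and it assumes a base64 command that accepts -d (as in GNU coreutils).

```shell
# Hypothetical helper: print the decoded payload (second segment) of a JWT.
decode_jwt_payload() {
  # JWTs use URL-safe base64: map '_' and '-' back to '/' and '+'.
  _payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWT encoding strips.
  while [ $(( ${#_payload} % 4 )) -ne 0 ]; do
    _payload="${_payload}="
  done
  # Assumes a base64 implementation that supports -d for decoding.
  printf '%s' "$_payload" | base64 -d
}

# Usage, once TOKEN has been created:
#   decode_jwt_payload "$TOKEN"
```

In the printed JSON, the aud claim should list istio-ingressgateway.kubeflow.svc.cluster.local, and the exp claim holds the expiry time as a Unix timestamp.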

  2. Pass the token in the Authorization header of the request, for example:
curl "$ISVC_URL/v1/models/model-endpoint" \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: <content-type>' \
  -d '<data>'
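When a request fails, it helps to see the HTTP status rather than only the response body. The sketch below wraps the same request in a hypothetical helper, isvc_request, that prints just the status code. It assumes a JSON payload (the application/json content type is an assumption, not something this guide prescribes) and reuses the model-endpoint path from the example above.

```shell
# Hypothetical helper: send a payload to the ISVC and print the HTTP status code.
isvc_request() {
  # Fail early with a clear message if the URL or token is missing.
  if [ -z "${ISVC_URL:-}" ] || [ -z "${TOKEN:-}" ]; then
    echo "ISVC_URL and TOKEN must be set" >&2
    return 2
  fi
  # -s hides progress output, -o /dev/null discards the response body,
  # -w prints the HTTP status code of the response.
  curl -s -o /dev/null -w '%{http_code}\n' \
    "$ISVC_URL/v1/models/model-endpoint" \
    -H "Authorization: Bearer $TOKEN" \
    -H 'Content-Type: application/json' \
    -d "$1"
}

# Usage, with a payload of your own:
#   isvc_request '<data>'
```

A 2xx status suggests the token was accepted. For any other status, the token's audience and expiry are the first things to check.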

The ServiceAccount token is bound to the default-editor ServiceAccount, which is created for every Charmed Kubeflow user. As a result:

  • The token stops being valid if the default-editor ServiceAccount is deleted.
  • Access granted by the token cannot be revoked before the duration set at creation time has expired.
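Because access cannot be revoked early, one way to limit exposure is to request short-lived tokens and create a new one when the old one expires. The following is a sketch only: the helper name new_isvc_token and its default of 600 seconds are illustrative assumptions, and it uses the same kubectl create token parameters shown in step 1.

```shell
# Hypothetical helper: create a short-lived token for the default-editor ServiceAccount.
# The first argument is the lifetime in seconds; 600 is an illustrative default.
new_isvc_token() {
  kubectl create token default-editor \
    --duration="${1:-600}s" \
    --audience=istio-ingressgateway.kubeflow.svc.cluster.local
}

# Usage: request a token valid for 15 minutes.
#   TOKEN=$(new_isvc_token 900)
```

Choosing the lifetime is a trade-off: shorter durations narrow the window in which a leaked token is usable, at the cost of creating tokens more often.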