Kubernetes API warnings

Hi folks,

Testing some Kubernetes stuff. I don’t see it in juju debug-log, so maybe it’s nothing, but in kubectl logs pytorch-operator-blah I see a lot of:

E1206 01:00:22.299362       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *unstructured.Unstructured: pytorchjobs.kubeflow.org is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "pytorchjobs" in API group "kubeflow.org" at the cluster scope
E1206 01:00:22.311128       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "pods" in API group "" at the cluster scope
E1206 01:00:22.311128       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "services" in API group "" at the cluster scope
E1206 01:00:23.307654       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *unstructured.Unstructured: pytorchjobs.kubeflow.org is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "pytorchjobs" in API group "kubeflow.org" at the cluster scope
E1206 01:00:23.316298       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "pods" in API group "" at the cluster scope
E1206 01:00:23.318757       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.15.9/tools/cache/reflector.go:98: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:test3:pytorch-operator" cannot list resource "services" in API group "" at the cluster scope

This is Kubernetes 1.19 on OVH. Everything says it’s up and running, but judging by the errors I don’t really believe it. The same happens with any charm, it seems.

This is indeed the same with all the k8s charms. Some, like my random test charm, deploy the container alright, so it’s not terminal. On Postgres it does seem terminal though:

2020-12-06 16:54:54,048     INFO: Updating PostgreSQL configuration in /srv/pgconf/12/main/conf.d/juju_charm.conf
Traceback (most recent call last):
  File "/usr/local/bin/docker_entrypoint.py", line 23, in <module>
    pgcharm.docker_entrypoint()
  File "/usr/local/lib/python3.8/dist-packages/pgcharm.py", line 503, in docker_entrypoint
    if is_master():
  File "/usr/local/lib/python3.8/dist-packages/pgcharm.py", line 412, in is_master
    return get_master() == JUJU_POD_NAME
  File "/usr/local/lib/python3.8/dist-packages/pgcharm.py", line 421, in get_master
    masters = [i.metadata.name for i in api.list_namespaced_pod(NAMESPACE, label_selector=master_selector).items]
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api/core_v1_api.py", line 15302, in list_namespaced_pod
    return self.list_namespaced_pod_with_http_info(namespace, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api/core_v1_api.py", line 15413, in list_namespaced_pod_with_http_info
    return self.api_client.call_api(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py", line 373, in request
    return self.rest_client.GET(url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 239, in GET
    return self.request("GET", url,
  File "/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py", line 233, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '82eaba53-a093-40f8-b0a6-bd7c4446bbf5', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Sun, 06 Dec 2020 16:54:54 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"system:serviceaccount:postgresql2:default\" cannot list resource \"pods\" in API group \"\" in the namespace \"postgresql2\"","reason":"Forbidden","details":{"kind":"pods"},"code":403}

As pointed out in another thread, Juju doesn’t really show any serious signs of trying to inform me that it’s in a crash loop, which is odd, and juju debug-log doesn’t capture any of the security API log output, so I have to go poking around in pods to find out.

Can anyone tell me if this is a) expected, b) a straight bug I should stick in GitHub, or c) something we should see if we can fix without raising a bug?

Happy to dig deeper, but as it’s making Postgres explode I assume it’s not an expected response.

I’ve created Bug #1907161 “Kubernetes API security errors” against Juju, as I have no idea and the lack of replies leads me to assume it’s not something obvious.

Currently, the operator is set up with a service account with these roles:

		Rules: []rbacv1.PolicyRule{
			{
				APIGroups: []string{""},
				Resources: []string{"pods"},
				Verbs: []string{
					"get",
					"list",
				},
			},
			{
				APIGroups: []string{""},
				Resources: []string{"pods/exec"},
				Verbs: []string{
					"create",
				},
			},
		},

So the charm operator pod can manage and exec into its own workload pods in the namespace, but it is not configured to access cluster-level resources, which is what the errors seem to imply.

It’s a bit of a security risk to arbitrarily allow operators to just access any cluster resources. We have plans to allow operators to be trusted (juju trust); the operators would declare the level of access they would want and the user can then choose to grant those additional capabilities as an informed decision. That work hasn’t been started just yet.

In the interim, you would have to consider something like using kubectl to apply changes to the operator’s service account outside of Juju.
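
For example (untested, and purely a sketch: the role and binding names below are made up, while the service account and namespace are taken from the error messages above), applying something along these lines with kubectl would grant the pytorch operator’s service account the cluster-scope read access it is asking for:

# Sketch only: grants cluster-wide read access to the resources named in the reflector errors.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pytorch-operator-extra
rules:
- apiGroups: ["kubeflow.org"]
  resources: ["pytorchjobs"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods", "services"]
  verbs: ["get", "list", "watch"]
---
# Bind the cluster role to the operator's service account (namespace/name taken from the errors above).
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pytorch-operator-extra
subjects:
- kind: ServiceAccount
  name: pytorch-operator
  namespace: test3
roleRef:
  kind: ClusterRole
  name: pytorch-operator-extra
  apiGroup: rbac.authorization.k8s.io

Whether you want to hand out that kind of cluster-wide access is exactly the security trade-off mentioned above.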

Thanks @wallyworld

Okay still a little confused.

If I launched an EKS cluster, Charmed Kubernetes or MicroK8s, I assume I wouldn’t see these warnings, right?

In which case, what does OVH do differently from the main services you folks have tested on?

To the point above:

 kubectl -n controller-ovh-test get clusterroles

returns

juju-credential-24f0a8d1 
juju-credential-6633aa6a

Is it one of these I’m supposed to be fiddling around with?

Or am I supposed to be applying a policy to the default service account created inside a Juju namespace?

Tom

Whether you get those errors depends on whether the cluster has RBAC enabled, I believe.
Without RBAC turned on, there’s no restriction on what APIs a pod can access, if I’m not mistaken.
It’s not so much related to the credential used to access the cluster as to the roles (assigned via the service account) allocated to the operator pod. Juju currently does not allow the charm operator’s service account rules to be configured, but you may be able to add additional rules directly via kubectl. I haven’t tried it to confirm it would work, though.
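
If you want to poke at it, kubectl api-versions should list rbac.authorization.k8s.io when RBAC is enabled, and you can check what a particular service account is allowed to do with kubectl auth can-i, for example (the namespace and account here just match the errors above):

kubectl auth can-i list pods --as=system:serviceaccount:test3:default -n test3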

@wallyworld can you just check this for me to ensure I’m not going crazy:

This is the error:

"message":"pods is forbidden: User "system:serviceaccount:test3:default" cannot list resource "pods" in API group "" in the namespace "test3"",

From what I read in the snippet you posted the other day, the PolicyRule you set the account up with specifically grants the ability to list pods. Is that not correct, or are we talking about two different things? Because I assumed the default service account in the test3 model to be the operator you speak of.

== More ==

Oh I’ve just done a

kubectl -n test3 get serviceaccounts

and I see postgresql-operator and default. So I assume you meant the postgresql-operator account, not the default service account that’s being leveraged here by the charm?

Okay I think I’ve got somewhere:

I cheated and created this:

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: role-test-account-binding
  namespace: test3
subjects:
- kind: ServiceAccount
  name: default
  namespace: test3
roleRef:
  kind: Role
  name: postgresql-operator
  apiGroup: rbac.authorization.k8s.io

and bound the

system:serviceaccount:test3:default

account to the postgresql-operator role, and it picked up those rules and deployed the charm.

I’m still confused though: why is it using the default service account when it creates a postgresql-operator account?

Is that a bug or a feature?

A summary of what gets created…

There are 2 pods for an application foo:

pod/foo (the workload pod)
pod/foo-operator (runs the charm)

The operator pod always has a service account created for it by Juju. The operator pod’s service account rules are set by Juju and are not currently configurable. These rules are used to create serviceaccount/foo-operator.

The pod spec can contain a serviceAccount block. If set, this service account is created as serviceaccount/foo for the workload pod and will have the specified rules.

There is a default service account also. If the pod spec does not contain a service account definition for the workload pod, this will be used.

Workloads that need extra privileges will need to have a bespoke service account configured.

Allowing the operator to gain privileges over and above what Juju needs goes back to the juju trust thing in an earlier post.

If your charm needs to do things not currently allowed, you will need to create the extra role binding(s) manually, as you have done.
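
For what it’s worth, the binding you created should also be achievable with a single kubectl command (I haven’t tried it on your cluster; the names are the ones from your YAML above):

kubectl -n test3 create rolebinding role-test-account-binding --role=postgresql-operator --serviceaccount=test3:default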