502 Bad Gateway

We have hit the following issue with k8s deployed with juju:

$ juju status
Model    Controller      Cloud/Region         Version  SLA          Timestamp
default  my1-controller  localhost/localhost  2.9.32   unsupported  09:27:34Z

App                    Version   Status  Scale  Charm                  Channel   Rev  Exposed  Message
containerd             go1.13.8  active      2  containerd             stable    178  no       Container runtime available
easyrsa                3.0.1     active      1  easyrsa                stable    420  no       Certificate Authority connected.
etcd                   3.4.5     active      3  etcd                   stable    634  no       Healthy with 3 known peers
flannel                0.11.0    active      2  flannel                stable    597  no       Flannel subnet 10.1.38.1/24
kubeapi-load-balancer  1.18.0    active      2  kubeapi-load-balancer  stable    844  yes      NGINX is ready
kubernetes-master      1.22.15   active      1  kubernetes-master      stable   1078  no       Kubernetes master running.
kubernetes-worker      1.22.15   active      1  kubernetes-worker      stable    816  yes      Kubernetes worker running.

Unit                      Workload  Agent  Machine  Public address  Ports             Message
easyrsa/0*                active    idle   0        10.50.30.221                      Certificate Authority connected.
etcd/0                    active    idle   1        10.50.30.201    2379/tcp          Healthy with 3 known peers
etcd/1                    active    idle   2        10.50.30.248    2379/tcp          Healthy with 3 known peers
etcd/2*                   active    idle   3        10.50.30.219    2379/tcp          Healthy with 3 known peers
kubeapi-load-balancer/1   active    idle   13       10.50.30.223                      NGINX is ready
kubeapi-load-balancer/2*  active    idle   14       10.50.30.222    443/tcp,6443/tcp  Loadbalancer ready.
kubernetes-master/0*      active    idle   5        10.50.30.243    6443/tcp          Kubernetes master running.
  containerd/4            active    idle            10.50.30.243                      Container runtime available
  flannel/4               active    idle            10.50.30.243                      Flannel subnet 10.1.34.1/24
kubernetes-worker/0*      active    idle   7        10.50.30.204    80/tcp,443/tcp    Kubernetes worker running.
  containerd/1*           active    idle            10.50.30.204                      Container runtime available
  flannel/1*              active    idle            10.50.30.204                      Flannel subnet 10.1.38.1/24

Machine  State    Address       Inst id         Series  AZ  Message
0        started  10.50.30.221  juju-e1704d-0   focal       Running
1        started  10.50.30.201  juju-e1704d-1   focal       Running
2        started  10.50.30.248  juju-e1704d-2   focal       Running
3        started  10.50.30.219  juju-e1704d-3   focal       Running
5        started  10.50.30.243  juju-e1704d-5   focal       Running
7        started  10.50.30.204  juju-e1704d-7   focal       Running
13       started  10.50.30.223  juju-e1704d-13  focal       Running
14       started  10.50.30.222  juju-e1704d-14  focal       Running

$ k get pods -n test
Error from server (InternalError): an error on the server ("<html>\r\n<head><title>502 Bad Gateway</title></head>\r\n<body>\r\n<center><h1>502 Bad Gateway</h1></center>\r\n<hr><center>nginx/1.18.0 (Ubuntu)</center>\r\n</body>\r\n</html>") has prevented the request from succeeding

But if I run exactly the same command (k get pods -n test) a few seconds later, it works correctly.
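
A rough way to see how often it happens (just a sketch; k here is simply my kubectl alias, and the 5-second interval and /tmp/err path are arbitrary):

$ # run the same call every 5 seconds and print a timestamp plus the error only when it fails
$ while true; do k get pods -n test >/dev/null 2>/tmp/err || echo "$(date -Is): $(cat /tmp/err)"; sleep 5; done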

Can anybody tell me what is wrong?

Hi! To clarify: does the command fail every now and then, or only once, the first time you call it right after charmed k8s is done deploying? Is the issue encountered in an automated script, and what exactly are the repro steps for this behaviour?

It could be that you’re trying to hit the k8s API ‘too soon’, before it’s ready, but I’m no expert on this (I can find someone who is, though :slight_smile: )
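
In the meantime, it might be worth checking whether NGINX on the load balancer units logs an upstream error at the moment the 502 happens, something along these lines (assuming the stock Ubuntu nginx log path; adjust if the charm logs elsewhere):

$ # check the NGINX error log on each kubeapi-load-balancer unit right after a 502
$ juju ssh kubeapi-load-balancer/1 sudo tail -n 100 /var/log/nginx/error.log
$ juju ssh kubeapi-load-balancer/2 sudo tail -n 100 /var/log/nginx/error.log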

The command fails every now and then.
The k8s cluster and the API load balancer were deployed a long time ago.
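
If it helps narrow things down, next time it fails I can compare going through the load balancer with hitting the apiserver directly (a sketch using the addresses from the juju status above; /healthz may return an auth error rather than "ok" if anonymous access is disabled, but a 502 on the load balancer path only would still point at NGINX or its backend config):

$ # via the kubeapi-load-balancer unit
$ curl -ks https://10.50.30.222:443/healthz
$ # directly against kube-apiserver on the master
$ curl -ks https://10.50.30.243:6443/healthz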

No idea, but I have asked some colleagues. Someone will get back to you :slight_smile: