Cannot bootstrap juju on GKE

flimzy · 29 January 2021 13:37

I’ve created a new k8s cluster on GKE, and now, when attempting to bootstrap juju, I get the following:

$ juju bootstrap gke
ERROR unable to determine legacy status for namespace controller: Get "https://35.202.226.149/api/v1/namespaces/controller": dial tcp 35.202.226.149:443: i/o timeout

Observed both with juju 2.9-rc6, and 2.8.7.

Full debug output (2.9-rc6):

$ juju --debug bootstrap gke
14:36:13 INFO juju.cmd supercommand.go:56 running juju [2.9-rc6 0 5a68ca1d03ec88d7345d3dd9f8518ac1294b7ef8 gc go1.14.14]
14:36:13 DEBUG juju.cmd supercommand.go:57 args: []string{"/snap/juju/15219/bin/juju", "--debug", "bootstrap", "gke"}
14:36:13 DEBUG juju.cmd.juju.commands bootstrap.go:1304 authenticating with region "" and credential "gke" ()
14:36:13 DEBUG juju.cmd.juju.commands bootstrap.go:1452 provider attrs: map[operator-storage: workload-storage:]
14:36:13 INFO cmd authkeys.go:114 Adding contents of "/home/jonhall/.local/share/juju/ssh/juju_id_rsa.pub" to authorized-keys
14:36:13 INFO cmd authkeys.go:114 Adding contents of "/home/jonhall/.ssh/id_rsa.pub" to authorized-keys
14:36:13 DEBUG juju.cmd.juju.commands bootstrap.go:1522 preparing controller with config: map[agent-metadata-url: agent-stream:released apt-ftp-proxy: apt-h
ttp-proxy: apt-https-proxy: apt-mirror: apt-no-proxy: authorized-keys:ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQPSW0TfsTeZvzukn8z/au0V7V0ZK0pr+vW+E3W1lE6tKg5v
UbXs8ggPrmWXeniDdNfszIY1qqCGCqQ0GyMCvIIwE9O8Xqac7Q8NC98VFlPTrCYU8Vt7tD5jbCT6vDgxIVcpZOYiyJ2+MqgPjq/1x2e91Kk7h3UuxDWvl+XcXUeaT//8sBJ0iDr4p2+c4AVP9jAnyG8AbLpg
Yyj6DeYAJEH9vwvWV9sHIysCRhkaI2SZn3PLHWODmpKq+s/TOfDMqhsG1AYMnHF08cfSF2tajH0dySShvWr73SVoGHPOYSa/pxJz9QZm5I58k6rJkXbQHpLQ+Gu95jyXClIjBBEAYf juju-client-key
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDF+9mrjG7OYlsbxQ5KIc8adpPGhrzg2MrbsBeXWe2plP0DBSdnDMrFY5tYv/CT6yG3iL3nWQKA4BfBBy9tTZ/NBOm8xSoa48O+rBk4ywJ6b17uRDUsdiTG
hfT3dgD9d37eic1niWYx1+yLSEO8mls4jUk6LnUiHoHY6oyMIKm6cM+LP2dtpDPJzX082LitB/2BYUgfC2081J5/ULDTII2NlUfTuVy3F6QrsOJSzC5sDvypYsKHydjt6WzxLRHomuFT9ikX6DlXrznw0T01
vhcxljalwiDOeutnYgAWAgYp9H+gG3Zi8zZor9kcwvp9orkjigEo9EtDFJmmV7wdhOh3 jonhall@ivy
automatically-retry-hooks:true backup-dir: charmhub-url:https://api.charmhub.io cloudinit-userdata: container-image-metadata-url: container-image-stream:re
leased container-inherit-properties: container-networking-method: default-series:focal default-space: development:false disable-network-management:false egr
ess-subnets: enable-os-refresh-update:true enable-os-upgrade:true fan-config: firewall-mode:instance ftp-proxy: http-proxy: https-proxy: ignore-machine-addr
esses:false image-metadata-url: image-stream:released juju-ftp-proxy: juju-http-proxy: juju-https-proxy: juju-no-proxy:127.0.0.1,localhost,::1 logforward-en
abled:false logging-config: lxd-snap-channel:latest/stable max-action-results-age:336h max-action-results-size:5G max-status-history-age:336h max-status-his
tory-size:5G name:controller net-bond-reconfigure-delay:17 no-proxy:127.0.0.1,localhost,::1 operator-storage:standard provisioner-harvest-mode:destroyed pro
xy-ssh:false resource-tags: snap-http-proxy: snap-https-proxy: snap-store-assertions: snap-store-proxy: snap-store-proxy-url: ssl-hostname-verification:true
test-mode:false transmit-vendor-metrics:true type:kubernetes update-status-hook-interval:5m uuid:db630b05-1123-4422-8999-2c0d15ad2636 workload-storage:stan
dard]
14:36:13 DEBUG juju.kubernetes.provider provider.go:133 opening model "controller".
ERROR unable to determine legacy status for namespace controller: Get "https://35.202.226.149/api/v1/namespaces/controller": dial tcp 35.202.226.149:443: i/o timeout
14:36:43 DEBUG cmd supercommand.go:537 error stack:
Get "https://35.202.226.149/api/v1/namespaces/controller": dial tcp 35.202.226.149:443: i/o timeout
/build/snapcraft-juju-4ccf60/parts/juju/src/caas/kubernetes/provider/utils/labels.go:49: unable to determine legacy status for namespace controller
/build/snapcraft-juju-4ccf60/parts/juju/src/caas/kubernetes/provider/k8s.go:189:
/build/snapcraft-juju-4ccf60/parts/juju/src/environs/bootstrap/prepare.go:131:
/build/snapcraft-juju-4ccf60/parts/juju/src/cmd/juju/commands/bootstrap.go:798:

wallyworld · 31 January 2021 23:00

It looks like the default node sizing on a GKE cluster might be smaller than Juju’s default memory requirements for the controller.

I did this

gcloud container clusters create ...
juju add-k8s --gke ...
juju bootstrap ...

and bootstrap failed with

ERROR failed to bootstrap model: creating controller stack for controller: creating statefulset for controller: timed out waiting for controller pod: unschedulable: 0/9 nodes are available: 9 Insufficient memory.

But doing this

juju bootstrap --constraints mem=1G ...

worked fine. We may need to look at adjusting Juju’s minimum requirements on k8s.

You seem to be getting a different type of failure, though. Can you use kubectl to inspect the cluster and check the state of the controller pod and the juju/mongo containers, and what’s being logged?
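As a rough sketch of that inspection: the namespace `controller-gke`, the pod name `controller-0`, and the container names `api-server` and `mongodb` are assumptions about a typical Juju k8s controller, not something confirmed in this thread, so check them against `kubectl get namespaces` and `kubectl get pods` first. The `inspect_controller` helper and the `KUBECTL`, `NS`, and `POD` variables are made up for illustration.

```shell
# All names below are assumptions; override them via the environment.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-controller-gke}"      # assumed controller namespace
POD="${POD:-controller-0}"      # assumed controller pod name

inspect_controller() {
  # Pod status and the node it was scheduled on (if any).
  "$KUBECTL" get pods -n "$NS" -o wide
  # Events such as "unschedulable" or "Insufficient memory" show up here.
  "$KUBECTL" describe pod "$POD" -n "$NS"
  # Recent logs from the assumed juju and mongo containers.
  "$KUBECTL" logs "$POD" -n "$NS" -c api-server --tail=100
  "$KUBECTL" logs "$POD" -n "$NS" -c mongodb --tail=100
}
```

Run `inspect_controller` with kubectl pointed at the cluster in question. If the API server itself is unreachable, the first command should already fail with a similar timeout.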

flimzy · 1 February 2021 12:39

I’m pretty sure this is an unrelated error condition. I also received the error you quoted, but only after (seemingly randomly) getting past the error reported at the top of this thread. When I create a cluster with larger nodes, I don’t get the “Insufficient memory” error either (but still only after “randomly” getting past the error pasted above).

To clarify:

I created a default GKE cluster.
I tried to bootstrap, and got the “unable to determine legacy status for namespace controller” error. I repeated the bootstrap process a few times, and eventually it succeeded.
Then I got the “Insufficient memory” error you quoted.

Later I created a new GKE cluster, with nodes with 8 GB of RAM (this configuration worked for me in the past), and I continue to get the “unable to determine legacy status for namespace controller” error, even though I now have enough memory.

flimzy · 1 February 2021 13:34

It appears I failed to re-run juju add-k8s after re-initializing the GKE cluster, so it was apparently trying to connect to the old (no longer existing) cluster.

So user error, for sure, but could probably use some improved error reporting.
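For anyone hitting the same thing, here is a minimal sketch of how the mismatch could be spotted: compare the host in Juju’s error message with the API server that kubectl currently points at. The `url_host` and `same_host` helpers are hypothetical names made up for this example, and the port stripping assumes an IPv4 address or hostname (not a bracketed IPv6 literal).

```shell
# Strip scheme, path, and port from a URL, leaving just the host.
url_host() {
  h="${1#*://}"             # drop the scheme, e.g. "https://"
  h="${h%%/*}"              # drop any path
  printf '%s\n' "${h%%:*}"  # drop any port
}

# Succeeds if both URLs point at the same host.
same_host() {
  [ "$(url_host "$1")" = "$(url_host "$2")" ]
}

# Usage (needs kubectl configured for the newly created cluster):
#   current="$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"
#   same_host "$current" "https://35.202.226.149/api/v1/namespaces/controller" \
#     || echo "Juju is dialing a different cluster; re-run juju add-k8s"
```

If the hosts differ, Juju is probably still using the definition of the old cluster, which is what happened to me.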