Issues with Juju controller

Hello all,

We are seeing strange behavior in one of our projects. We run Juju 2.9.45 (the same issue also occurs with Juju 3.1.6) on Ubuntu 22.04.3, using a manual cloud. After simply adding a new cloud and bootstrapping a single controller with all default values, the Juju controller is intermittently inaccessible: requests to port 17070 get “connection refused”, even when calling localhost on the controller machine itself. The machine-0.log looks like this:

2023-11-24 08:33:41 INFO juju.api apiclient.go:687 connection established to "wss://localhost:17070/model/3f510045-0b45-41b0-817e-dc64bc3f30f9/api"
2023-11-24 08:33:41 INFO juju.apiserver.connection request_notifier.go:96 agent login: machine-0 for 3f510045-0b45-41b0-817e-dc64bc3f30f9
2023-11-24 08:33:41 INFO juju.worker.apicaller connect.go:163 [3f5100] "machine-0" successfully connected to "localhost:17070"
2023-11-24 08:33:45 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:33:46 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:33:50 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:33:50 WARNING juju.worker.lease.raft manager.go:317 [2158c1] retrying timed out while handling claim {"singular-controller" "3f510045-0b45-41b0-817e-dc64bc3f30f9" "2158c17a-550b-402a-85e9-960beeb9bc3d"} for "machine-0"
2023-11-24 08:33:50 ERROR juju.worker.dependency engine.go:695 "is-primary-controller-flag" manifold worker returned unexpected error: lease operation timed out
2023-11-24 08:33:51 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "raft" manifold worker returned unexpected error: timed out waiting for worker loop
2023-11-24 08:33:55 ERROR juju.worker.raft.rafttransport streamlayer.go:122 streamLayer.Addr timed out waiting for API address
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "raft-transport" manifold worker returned unexpected error: timed out waiting for API address
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "migration-minion" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "machiner" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "proxy-config-updater" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "upgrade-series" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "log-sender" manifold worker returned unexpected error: sending log message: websocket: close 1006 (abnormal closure): unexpected EOF: use of closed network connection
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "fan-configurer" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 INFO juju.worker.logger logger.go:136 logger worker stopped
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "reboot-executor" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "logging-config-updater" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "api-address-updater" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "deployer" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "migration-inactive-flag" manifold worker returned unexpected error: watcher has been stopped (stopped)
2023-11-24 08:33:56 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:33:58 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:33:58 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: [13aec6] "machine-0" cannot open api: unable to connect to API: apiserver shutdown in progress (Service Unavailable)
2023-11-24 08:33:58 ERROR juju.worker.dependency engine.go:695 "migration-inactive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:33:58 ERROR juju.worker.dependency engine.go:695 "migration-minion" manifold worker returned unexpected error: setting up watcher: connection is shut down
2023-11-24 08:33:59 INFO juju.worker.deployer nested.go:159 new context: units "", stopped ""
2023-11-24 08:33:59 ERROR juju.worker.dependency engine.go:695 "deployer" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:33:59 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:33:59 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:33:59 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:33:59 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:00 ERROR juju.worker.dependency engine.go:695 "is-primary-controller-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:00 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
2023-11-24 08:34:01 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:03 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:03 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:03 ERROR juju.worker.dependency engine.go:695 "peer-grouper" manifold worker returned unexpected error: computing desired peer group: updating member addresses: juju-ha-space is not set and these nodes have more than one usable address: 0
run "juju controller-config juju-ha-space=<name>" to set a space for Mongo peer communication
2023-11-24 08:34:03 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:04 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:06 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:08 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:08 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:09 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:09 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:11 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:14 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:14 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:15 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:15 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:16 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:19 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:21 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:22 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:22 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:22 ERROR juju.worker.dependency engine.go:695 "not-alive-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:24 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:26 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:29 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:31 ERROR juju.worker.dependency engine.go:695 "valid-credential-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:31 ERROR juju.worker.dependency engine.go:695 "not-dead-flag" manifold worker returned unexpected error: connection is shut down
2023-11-24 08:34:31 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 3f510045-0b45-41b0-817e-dc64bc3f30f9, holder: machine-0) to be processed
2023-11-24 08:34:31 WARNING juju.worker.lease.raft manager.go:317 [2158c1] retrying timed out while handling claim {"singular-controller" "3f510045-0b45-41b0-817e-dc64bc3f30f9" "3f510045-0b45-41b0-817e-dc64bc3f30f9"} for "machine-0"
2023-11-24 08:34:31 ERROR juju.worker.dependency engine.go:695 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
2023-11-24 08:34:31 INFO juju.apiserver.connection request_notifier.go:125 agent disconnected: machine-0 for 3f510045-0b45-41b0-817e-dc64bc3f30f9
2023-11-24 08:34:34 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:39 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:44 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: claim, ns: singular-controller, model: 3f5100, lease: 2158c17a-550b-402a-85e9-960beeb9bc3d, holder: machine-0) to be processed
2023-11-24 08:34:44 WARNING juju.worker.lease.raft manager.go:317 [2158c1] retrying timed out while handling claim {"singular-controller" "3f510045-0b45-41b0-817e-dc64bc3f30f9" "2158c17a-550b-402a-85e9-960beeb9bc3d"} for "machine-0"
2023-11-24 08:34:44 INFO juju.apiserver.connection request_notifier.go:125 agent disconnected: machine-0 for 3f510045-0b45-41b0-817e-dc64bc3f30f9
2023-11-24 08:34:44 INFO juju.worker.httpserver worker.go:315 listening on "[::]:17070"
2023-11-24 08:34:58 ERROR juju.worker.dependency engine.go:695 "raft" manifold worker returned unexpected error: timed out waiting for worker loop
2023-11-24 08:34:58 ERROR juju.worker.raft.rafttransport streamlayer.go:122 streamLayer.Addr timed out waiting for API address
2023-11-24 08:34:58 ERROR juju.worker.dependency engine.go:695 "raft-transport" manifold worker returned unexpected error: timed out waiting for API address
2023-11-24 08:35:01 INFO juju.worker.httpserver worker.go:315 listening on "[::]:17070"

As a result, the machines, including the Juju controller machine, first show the state “started” and then switch to “down” whenever the Juju controller is unavailable. I could not find the cause of this behavior in syslog, the MongoDB log (/var/snap/juju-db/common/logs/mongodb.log), or any other log.
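For anyone trying to pin down when the API port drops, a small probe along these lines could be run on the controller machine. It is only a sketch: the function name `probe` and the 2-second timeout are illustrative assumptions, and only port 17070 comes from the report above.

```python
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connect and classify the outcome as a short string."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on the port (the symptom described above).
        return "refused"
    except OSError as exc:
        # Covers timeouts, unreachable hosts, and name resolution failures.
        return f"error: {exc}"


if __name__ == "__main__":
    # 17070 is the Juju API port mentioned in the report.
    print(probe("localhost", 17070))
```

Running it in a loop with timestamps would show whether the "refused" windows line up with the ERROR bursts in machine-0.log.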

Could anyone help me? Thank you in advance.

This seems to be your first ERROR, which might give a hint. @hmlanigan might know?

That’s a symptom. The lease system can’t process requests, indicated by:

ERROR juju.worker.dependency engine.go:695 "raft-transport" manifold worker returned unexpected error: timed out waiting for API address
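To separate root errors from follow-on noise in a log like the one above, it can help to list only the first ERROR reported by each manifold worker. The following is a rough sketch, not a Juju tool: the helper name `first_errors` and the regular expression are assumptions based solely on the line format visible in the pasted log.

```python
import re

# Assumed line shape, taken from the pasted machine-0.log:
# <date> <time> ERROR <logger> <file:line> "<manifold>" manifold worker returned unexpected error: <msg>
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ERROR \S+ \S+ '
    r'"(?P<manifold>[^"]+)" manifold worker returned unexpected error: (?P<msg>.*)$'
)


def first_errors(lines):
    """Map each manifold name to (timestamp, message) of its first ERROR, in log order."""
    seen = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("manifold") not in seen:
            seen[match.group("manifold")] = (match.group("ts"), match.group("msg"))
    return seen


# A few lines copied from the log in the original post.
SAMPLE = [
    '2023-11-24 08:33:50 ERROR juju.worker.dependency engine.go:695 "is-primary-controller-flag" manifold worker returned unexpected error: lease operation timed out',
    '2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "raft" manifold worker returned unexpected error: timed out waiting for worker loop',
    '2023-11-24 08:33:55 ERROR juju.worker.raft.rafttransport streamlayer.go:122 streamLayer.Addr timed out waiting for API address',
    '2023-11-24 08:33:55 ERROR juju.worker.dependency engine.go:695 "raft-transport" manifold worker returned unexpected error: timed out waiting for API address',
    '2023-11-24 08:34:58 ERROR juju.worker.dependency engine.go:695 "raft" manifold worker returned unexpected error: timed out waiting for worker loop',
]

if __name__ == "__main__":
    for manifold, (ts, msg) in first_errors(SAMPLE).items():
        print(ts, manifold, "->", msg)
```

Lines that do not come from a manifold worker, such as the streamLayer.Addr entry, are skipped by design, so the output stays focused on which workers failed and in what order.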

How many NICs does the bootstrap target machine have?
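One quick way to answer that on the bootstrap target is to list the network interfaces the kernel reports. This is a minimal sketch using only the Python standard library; the helper name `non_loopback_interfaces` is made up for illustration, and it assumes the loopback interface is named `lo`, as is conventional on Linux. The count includes virtual interfaces such as bridges, so it is an upper bound on physical NICs.

```python
import socket


def non_loopback_interfaces(pairs=None):
    """Return interface names other than loopback.

    `pairs` is a list of (index, name) tuples; by default it is read
    from the running system via socket.if_nameindex().
    """
    if pairs is None:
        pairs = socket.if_nameindex()
    # Assumption: the loopback device is called "lo" (Linux convention).
    return [name for _index, name in pairs if name != "lo"]


if __name__ == "__main__":
    names = non_loopback_interfaces()
    print(f"{len(names)} non-loopback interface(s): {', '.join(names)}")
```

More than one result would be consistent with the peer-grouper message in the log above, which reports that juju-ha-space is not set and the node has more than one usable address.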

For those playing along at home: https://chat.charmhub.io/charmhub/pl/ienian8eubb8umk7bm1ecj9ahc

There were 4 network adapters (NICs) present on the controller machine. I removed them, and after re-bootstrapping it now works as expected. Thank you very much.
