Hello Everyone,
We’ve been using Juju for the past year and a half, ever since our OpenStack deployment, with absolutely no problems, and it has been working great.
Recently we had a storage issue on the system where we host our Juju controller. The MongoDB database would not start, and I had to do some work to get it running again.
After that I was able to run juju status and it was working.
Today I tried to add a new machine to our deployment, and the new machines are stuck in a pending state. We are using MAAS as the cloud, and usually that command starts the deployment of the physical machine.
Digging through the logs of machine-0 on our Juju controller, I was able to see this:
2021-10-13 11:32:29 ERROR juju.worker.dependency engine.go:663 "raft" manifold worker returned unexpected error: timed out waiting for worker loop
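In case it helps anyone look for the same error in their own controller log, here is a minimal sketch of how such lines can be filtered out. The function name manifold_errors and the regular expression are only illustrative and not part of Juju; the sketch assumes the log line layout shown above (timestamp, level, module, source location, message).

```python
import re

# Assumed layout, based on the log line quoted above:
# <date> <time> <LEVEL> <module> <file:line> <message>
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) '
    r'(?P<level>[A-Z]+) (?P<module>\S+) (?P<src>\S+) (?P<msg>.*)$'
)


def manifold_errors(lines, manifold="raft"):
    """Return (timestamp, message) pairs for ERROR lines naming a manifold.

    Hypothetical helper for illustration only.
    """
    needle = f'"{manifold}" manifold worker'
    hits = []
    for line in lines:
        # Normalise curly quotes in case the log was pasted through a forum.
        line = line.strip().replace("\u201c", '"').replace("\u201d", '"')
        m = LINE_RE.match(line)
        if m and m["level"] == "ERROR" and needle in m["msg"]:
            hits.append((m["ts"], m["msg"]))
    return hits


# The single line from the post, used as sample input.
sample = [
    '2021-10-13 11:32:29 ERROR juju.worker.dependency engine.go:663 '
    '"raft" manifold worker returned unexpected error: '
    'timed out waiting for worker loop',
]
for ts, msg in manifold_errors(sample):
    print(ts, msg)
```

To run it against a real log, pass an open file object instead of the sample list. On the controller I would expect the file to be /var/log/juju/machine-0.log, but treat that path as an assumption.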
This happens straight after I run the command. If I add --debug to it, I get this output:
12:32:07 INFO juju.cmd supercommand.go:56 running juju [2.9.0 0 gc go1.16.3]
12:32:07 DEBUG juju.cmd supercommand.go:57 args: []string{"juju", "add-machine", "--debug"}
12:32:07 INFO juju.juju api.go:78 connecting to API addresses: [x.x.x.x:17070]
12:32:07 DEBUG juju.api apiclient.go:1132 successfully dialed "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:07 INFO juju.api apiclient.go:664 connection established to "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:08 INFO juju.juju api.go:78 connecting to API addresses: [x.x.x.x:17070]
12:32:08 DEBUG juju.api apiclient.go:1132 successfully dialed "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:08 INFO juju.api apiclient.go:664 connection established to "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:08 INFO juju.cmd.juju.machine add.go:291 load config
12:32:08 INFO juju.juju api.go:78 connecting to API addresses: [x.x.x.x:17070]
12:32:08 DEBUG juju.api apiclient.go:1132 successfully dialed "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:08 INFO juju.api apiclient.go:664 connection established to "wss://x.x.x.x:17070/model/2989a1f5-c738-4e03-86ac-0f586a5d989a/api"
12:32:08 INFO juju.cmd.juju.machine add.go:316 model provisioning
12:32:08 INFO cmd add.go:363 created machine 10
12:32:08 DEBUG juju.api monitor.go:35 RPC connection died
12:32:08 DEBUG juju.api monitor.go:35 RPC connection died
12:32:08 DEBUG juju.api monitor.go:35 RPC connection died
12:32:08 INFO cmd supercommand.go:544 command finished
I am not sure exactly what the problem could be. It really feels like I no longer have control over my cloud. In the audit logs I get this:
{"errors":{"conversation-id":"c8169ed29b0b74ba","connection-id":"58","request-id":2,"when":"2021-10-13T11:32:08Z","errors":[null]}}
I’m not sure if it’s connected, but when I run the command
juju run --unit mysql/0 leader-get
I usually get output, but since the storage issue it hasn’t been working; it just hangs. That makes me think I might have lost control of my cloud, but juju status seems to be reporting properly, so I am not sure.
Could anyone point me in the right direction for more troubleshooting?
Kind Regards,
Ejike