Here are some tips to diagnose issues in Juju 4.0+ with Dqlite. These steps can be useful if, for example, bootstrap fails because of a timeout while making the initial connection to the API.
- When bootstrapping, use the `--keep-broken` flag. If the bootstrap fails, instead of cleaning up, Juju will keep the broken controller around, so you can access it for debugging purposes.
- To get a shell inside the broken controller:
  - Run `lxc list` to find the name of the container.
  - Get a console with `lxc exec <container name> bash`.
- Inside the controller, run:

  ```
  # Install some needed tools
  sudo apt install -y go-dqlite
  sudo snap install yq

  # Create the .cert and .key files needed to connect to dqlite
  sudo cat /var/lib/juju/agents/machine-0/agent.conf | yq '.controllercert' | xargs -I% echo % > dqlite.cert
  sudo cat /var/lib/juju/agents/machine-0/agent.conf | yq '.controllerkey' | xargs -I% echo % > dqlite.key

  # Connect to dqlite
  sudo dqlite -s file:///var/lib/juju/dqlite/cluster.yaml -c ./dqlite.cert -k ./dqlite.key controller
  ```

- Root about in the controller DB as you please. You can for example list the tables that have been created:

  ```
  dqlite> SELECT name FROM sqlite_schema WHERE type ='table' AND name NOT LIKE 'sqlite_%';
  ```

- You can also use introspection tools to see the state of the worker tree. Inside the `lxc exec` session, run:

  ```
  source /etc/profile.d/juju-introspection.sh
  juju_engine_report | less
  ```
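If the connection to dqlite fails, it can help to first confirm that the tools and files used in the steps above are actually present inside the controller. The following is a minimal sketch and not part of Juju: the `dqlite_preflight` function name and the `AGENT_CONF` and `CLUSTER_YAML` variables are hypothetical, and the default paths are simply the ones used above. It assumes you run it as root, which is the case in an `lxc exec` session.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which prerequisites for the dqlite connection are missing.
# dqlite_preflight, AGENT_CONF and CLUSTER_YAML are hypothetical names for illustration;
# the default paths are the ones used in the steps above.

AGENT_CONF="${AGENT_CONF:-/var/lib/juju/agents/machine-0/agent.conf}"
CLUSTER_YAML="${CLUSTER_YAML:-/var/lib/juju/dqlite/cluster.yaml}"

dqlite_preflight() {
  local missing=0
  local tool path

  # The steps above rely on the dqlite shell and on yq being installed.
  for tool in dqlite yq; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool"
      missing=1
    fi
  done

  # The agent config holds the cert and key; cluster.yaml lists the dqlite nodes.
  for path in "$AGENT_CONF" "$CLUSTER_YAML"; do
    if [ ! -e "$path" ]; then
      echo "missing file: $path"
      missing=1
    fi
  done

  # Returns 0 when everything was found, 1 otherwise.
  return "$missing"
}
```

After sourcing the snippet, `dqlite_preflight && echo "ready to connect"` prints one line per missing prerequisite, or the ready message when nothing is missing. A missing `agent.conf` may suggest that bootstrap failed before the machine agent was set up, in which case there is likely no database to inspect yet.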