Juju loses communication with agents

I think I know the root cause here. The internal model cache in the controller appears to be out of date, and the logging-config-updater gets its information from that cache.

The first thing to do is determine which workers in the controller are having issues. The Juju-created machines include some tools to help look inside running systems. The command I'd like you to run is:

juju run -m controller --machine 0 juju_engine_report

This reports on the internal workers running in that controller. Run it on each of the machines in the controller model, changing the machine number each time.

That should give us enough information to start working out where things stand.
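If the controller has more than one machine, a small loop saves retyping the command. This is only a sketch: the machine IDs 0, 1 and 2 are an assumption (a typical HA controller), so check the real IDs with `juju machines -m controller` first. The function names, the DRY_RUN variable and the output file names are made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: collect the engine report from each controller machine.
# Machine IDs are passed as arguments; 0 1 2 below is an assumption.

# Print the command for a single machine ID.
engine_report_cmd() {
    printf 'juju run -m controller --machine %s juju_engine_report\n' "$1"
}

# With DRY_RUN=1 (the default) only print the commands.
# With DRY_RUN=0 run them and save each report to its own file.
collect_reports() {
    for m in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            engine_report_cmd "$m"
        else
            juju run -m controller --machine "$m" juju_engine_report \
                > "engine-report-$m.txt"
        fi
    done
}

collect_reports 0 1 2
```

Once the printed commands look right, set DRY_RUN=0 and run it again, then attach the resulting engine-report-*.txt files here.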

There is a known bug in 2.6.8 where unit agents may get stuck restarting if particular workers have had issues. We have this fixed in the upcoming 2.6.9 release.