Disaster Recovery of a Juju Controller

It just came to my attention that a “backup” of a Juju controller is not enough to safeguard against a lost controller.

This is mentioned in an issue on the “juju-restore” tool, which is in turn referenced from the docs page “How to upgrade your Juju deployment from 2.9 to 3.x”.

This “caveat” with backing up Juju is a scary situation, and having a “disaster recovery” path for lost Juju controllers is hugely important for Juju.

What is the status of this, and what would be the recommended way to actually back up a Juju controller to safeguard against a disaster scenario?

There are no immediate plans to implement an automated disaster recovery process. It is important for sure, but other higher-priority items have hard deadlines, like the transition away from mongo to dqlite. We do want to be able to deliver the feature, though.

There is a manual procedure, made up of a set of well-defined steps, that makes it possible to get a working system again. I haven’t got the link handy but will post it here when I dig it up.

Would it be possible to add this to the docs while it’s still WIP? @tmihoc

@wallyworld is this post relevant for people losing their controller?

Here’s the link to the manual disaster recovery steps:

https://discourse.charmhub.io/t/manual-steps-to-restore-a-backup/1330
