Hello everyone, I have a question about recovering from a power outage. We simulated a power outage on our standard 4-node test cluster, and every time the cluster is brought back up, the mysql-innodb-cluster charms go into a blocked state with the message "Cluster is inaccessible from this instance. Please check logs for details." Is there any way to avoid this? Additionally, the reboot-cluster-from-complete-outage action either does nothing or, from time to time, restarts the cluster in a split-brain situation in which each node considers itself a separate cluster. Is there any way to fix this? I have yet to find anything.
Hi Aleksandar, please include both the Juju and MySQL logs and maybe someone here can spot something.
Hi Peter, after some more research I found that we did not follow a proper shutdown procedure, which would involve pausing the applications in the proper order before gracefully shutting down the servers. I also found, from https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-managing-power-events.html#mysql-innodb-cluster (which you linked in a reply to a different post on the forum), that running reboot-cluster-from-complete-outage on all the charms at once is not a good idea. That leads me to my next question: how feasible would it be to create a script that pauses multiple charms at once? With juju run-action charm_name pause I can only pause a single charm at a time. Thanks in advance.
I’m not 100% sure about this, but pausing a database node will probably trigger an exchange of information with the other nodes in the cluster. Therefore, I don’t think it is wise to pause all nodes simultaneously. A sequential approach should work, however.
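As a rough illustration of the sequential approach, here is a minimal Python sketch that pauses units one at a time. It is untested against a real cluster and rests on several assumptions: the unit names are hypothetical placeholders, the `pause` action is the one mentioned above, the `--wait` flag assumes the Juju 2.x `juju run-action` syntax, and the 30-second delay between units is an arbitrary guess. The helper names `build_pause_commands` and `run_sequentially` are made up for this example. It defaults to a dry run that only prints the commands.

```python
import subprocess
import time


def build_pause_commands(units, action="pause"):
    """Return one `juju run-action` command (as an argv list) per unit."""
    # Assumption: Juju 2.x syntax, where --wait blocks until the action ends.
    return [["juju", "run-action", "--wait", unit, action] for unit in units]


def run_sequentially(commands, delay_seconds=30, dry_run=True):
    """Run the commands one at a time, stopping at the first failure."""
    for index, argv in enumerate(commands):
        print(" ".join(argv))
        if dry_run:
            # Only show what would be executed; do not touch the cluster.
            continue
        # check=True raises CalledProcessError if juju exits non-zero,
        # so the remaining units are left alone when something goes wrong.
        subprocess.run(argv, check=True)
        if index < len(commands) - 1:
            # Arbitrary settling time before moving on to the next unit.
            time.sleep(delay_seconds)


if __name__ == "__main__":
    # Hypothetical unit names; replace with the ones shown by `juju status`.
    units = [
        "mysql-innodb-cluster/0",
        "mysql-innodb-cluster/1",
        "mysql-innodb-cluster/2",
    ]
    run_sequentially(build_pause_commands(units), dry_run=True)
```

Units are processed in the order they appear in the list, so the list is where the shutdown order would be encoded. Passing `dry_run=False` actually runs the commands; it is probably worth reviewing the dry-run output first.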