Hello, I hope someone can help me with this. We had a power failure in our server room, after which only one mysql-innodb-cluster node came back after using reboot-cluster-from-complete-outage. For now it seems that the OpenStack cloud is working with a single node, but I need to restore redundancy and bring the two remaining nodes back. The cluster status is as follows:
geoint@MAAS-01:~$ juju run-action --wait mysql-innodb-cluster/1 cluster-status
unit-mysql-innodb-cluster-1:
  UnitId: mysql-innodb-cluster/1
  id: "2228"
  results:
    cluster-status: '{"clusterName": "jujuCluster", "defaultReplicaSet": {"name": "default", "primary": "10.2.101.149:3306", "ssl": "REQUIRED", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.", "topology": {"10.2.101.149:3306": {"address": "10.2.101.149:3306", "mode": "R/W", "readReplicas": {}, "replicationLag": null, "role": "HA", "status": "ONLINE", "version": "8.0.42"}, "10.2.101.153:3306": {"address": "10.2.101.153:3306", "instanceErrors": ["ERROR: GR Recovery channel applier stopped with an error: Worker 1 failed executing transaction ''7d734394-9b3d-11ec-8fec-00163e96d3c5:1197807237'' at source log mysql-bin.002717, end_log_pos 396093; Could not execute Update_rows event on table neutron.ovn_hash_ring; Can''t find record in ''ovn_hash_ring'', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event''s source log mysql-bin.002717, end_log_pos 396093 (1032) at 2025-08-06 18:57:24.092347", "ERROR: group_replication has stopped with an error."], "memberState": "ERROR", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)", "version": "8.0.42"}, "10.2.101.181:3306": {"address": "10.2.101.181:3306", "instanceErrors": ["ERROR: GR Recovery channel applier stopped with an error: Worker 1 failed executing transaction ''7d734394-9b3d-11ec-8fec-00163e96d3c5:1197808000'' at source log mysql-bin.002719, end_log_pos 197065; Could not execute Update_rows event on table neutron.ovn_hash_ring; Can''t find record in ''ovn_hash_ring'', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event''s source log mysql-bin.002719, end_log_pos 197065 (1032) at 2025-08-06 22:40:37.011884"], "mode": "R/O", "readReplicas": {}, "recovery": {"applierError": "Worker 1 failed executing transaction ''7d734394-9b3d-11ec-8fec-00163e96d3c5:1197808000'' at source log mysql-bin.002719, end_log_pos 197065; Could not execute Update_rows event on table neutron.ovn_hash_ring; Can''t find record in ''ovn_hash_ring'', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event''s source log mysql-bin.002719, end_log_pos 197065", "applierErrorNumber": 1032, "state": "APPLIER_ERROR"}, "recoveryStatusText": "Distributed recovery in progress", "role": "HA", "status": "RECOVERING", "version": "8.0.42"}}, "topologyMode": "Single-Primary"}, "groupInformationSourceMember": "10.2.101.149:3306"}'
  status: completed
  timing:
    completed: 2025-08-06 22:41:05 +0000 UTC
    enqueued: 2025-08-06 22:41:01 +0000 UTC
    started: 2025-08-06 22:41:01 +0000 UTC
geoint@MAAS-01:~$
I am attaching the juju status and the logs of the two nodes that are not joining the cluster:
juju status: Ubuntu Pastebin
mysql-innodb-cluster/2: Ubuntu Pastebin
mysql-innodb-cluster/0: Ubuntu Pastebin
I already tried
STOP GROUP_REPLICATION; RESET SLAVE;
on the other two nodes, but it did not work.
Please help!