WARNING: This is an internal article. Do NOT use it in production! Contact the Canonical Data Platform team if you are interested in this topic.
Creating a MySQL Cluster Set
1 - Deploy two MySQL clusters, named Rome and Lisbon:
juju add-model az1 # db cluster 1, location: Rome
juju add-model az2 # db cluster 2, location: Lisbon
juju add-model app # db client application, location: somewhere
Rome:
juju switch az1
juju deploy mysql-k8s db1 --trust --channel=8.0/edge/arepl --config profile=testing --config cluster-name=rome --base ubuntu@22.04
Lisbon:
juju switch az2
juju deploy mysql-k8s db2 --trust --channel=8.0/edge/arepl --config profile=testing --config cluster-name=lisbon --base ubuntu@22.04
Client:
juju switch app
juju deploy mysql-test-app
juju deploy mysql-router-k8s --trust --channel 8.0/edge
2 - Create and consume offers
It’s required to define the roles that the clusters will play in the cluster set, i.e. which will be the primary and which the standby. For this setup, the application db1 will be set up as the primary cluster and db2 will be the replica/standby cluster.
NOTE: The side of the relation is used for the setup phase only. In the event of a planned switchover or a failover, a standby cluster can be promoted to active independently of the relation side.
To create a cluster set from these two clusters, we need to create a relation that uses the async_replication interface, through the async-primary and async-replica relation names. But first, it’s necessary to create and consume the offers:
juju switch az1
juju offer db1:async-primary async-primary
juju offer db1:database db1database
juju switch az2
juju offer db2:async-primary async-primary
juju offer db2:database db2database
juju switch app
juju consume az1.db1database
juju consume az2.db2database
juju consume az1.async-primary -m az2
juju consume az2.async-primary -m az1
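Before relating, it’s worth sanity-checking that the offers are in place; a minimal verification sketch (consumed offers show up under the SAAS section of juju status):
juju offers -m az1 # the offers exported from Rome
juju offers -m az2 # the offers exported from Lisbon
juju status -m app # consumed offers appear as SAAS entries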
3 - Relate the applications
juju relate -m app mysql-test-app mysql-router-k8s
juju relate -m app mysql-router-k8s db1database
juju relate -m az2 async-primary db2:async-replica
Then wait until the process is finished. Behind the scenes, the replica cluster will be set up as a replica, cloning data from the primary cluster and rejoining all of its units to it. The mysql-test-app will automatically write to db1 on the az1 side, and the data will be automatically propagated to db2 on the az2 side.
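One simple way to follow the progress, assuming the standard watch utility is available, is to poll the replica model until all units settle on active/idle:
watch -n 2 juju status -m az2 # re-runs juju status every 2 seconds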
4 - Checking cluster-set status
Run the get-cluster-status action with the cluster-set=True flag:
juju run -m az1 db1/0 get-cluster-status cluster-set=True
Results:
status:
  clusters:
    lisbon:
      clusterrole: replica
      clustersetreplicationstatus: ok
      globalstatus: ok
    rome:
      clusterrole: primary
      globalstatus: ok
      primary: db1-0.db1-endpoints.az1.svc.cluster.local:3306
  domainname: cluster-set-119185404c15ba547eb5f0750a5c34b5
  globalprimaryinstance: db1-0.db1-endpoints.az1.svc.cluster.local:3306
  primarycluster: rome
  status: healthy
  statustext: all clusters available.
success: "True"
5 - Scaling clusters
The two clusters work independently, which means it’s possible to scale each cluster in/out without much hassle, e.g.:
juju scale-application -m az1 db1 3
juju scale-application -m az2 db2 3
NOTE: resource usage configurations are also independent.
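After scaling, re-running the status action from step 4 is a quick way to confirm that the cluster set is still healthy:
juju run -m az1 db1/0 get-cluster-status cluster-set=True # both clusters should still report globalstatus: ok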
Safely removing a cluster from the cluster set
To remove a given cluster from the cluster set, you need to be sure that it is not the primary cluster. In case it is, there’s a provided action that can execute a safe switchover:
1 - (optional) Safely promote to active
juju run -m az2 db2/leader promote-standby-cluster cluster-set-name=cluster-set-119185404c15ba547eb5f0750a5c34b5
It’s required to provide the cluster-set-name option as a safeguard.
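To verify the switchover, re-run the status action from step 4; lisbon should now report clusterrole: primary and primarycluster: lisbon:
juju run -m az2 db2/0 get-cluster-status cluster-set=True # run from either cluster; the roles should be swapped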
2 - Remove the relation
When removing an async_replication relation, the primary cluster will keep working as is, while the replica cluster will be dissolved, with all of its units left in standalone read-only mode. The side of the relation (primary/replica) does not matter, since only the current role of the cluster will be considered.
juju remove-relation -m az2 async-primary db2
3 - Recovering a blocked cluster
To recover a replica cluster after the relation is removed, there’s a provided action:
juju run -m az2 db2/leader recreate-cluster
The action will recover the cluster as a standalone cluster, keeping the data from the cluster set.
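The recovered standalone state can presumably be inspected with the same get-cluster-status action, this time without the cluster-set flag (an assumption based on the flag’s use in step 4):
juju run -m az2 db2/0 get-cluster-status # query the recreated cluster’s own status only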
Failover
When a failover of the primary is required, the cluster roles must be set manually.
1 - Promote to active
To promote the standby cluster to active/primary, it’s necessary to run the action with the force flag set.
juju run -m az2 db2/leader promote-standby-cluster cluster-set-name=<my-cluster-set> force=True
The force flag will cause the old primary to be invalidated. It’s required to provide the cluster-set-name option as a safeguard.
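Afterwards, re-running the status action from step 4 should confirm the role change; presumably rome will no longer report globalstatus: ok once it has been invalidated:
juju run -m az2 db2/0 get-cluster-status cluster-set=True # run from the new primary side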
2 - Fence writes from old primary
To avoid a split-brain scenario, where more than one cluster is set as primary, it’s important to fence all write traffic from the failed primary cluster. For doing so, there’s an action:
juju run -m az1 db1/leader fence-writes cluster-set-name=<my-cluster-set>
The action can be run against any of the cluster units.
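For example, if the leader unit is unreachable, a non-leader unit works just as well:
juju run -m az1 db1/1 fence-writes cluster-set-name=<my-cluster-set> # any db1 unit can run the action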
In case the old primary is reestablished and/or has all transactions reconciled, one can resume write traffic to it by using the unfence-writes action, e.g.:
juju run -m az1 db1/leader unfence-writes cluster-set-name=<my-cluster-set>
Switching the app/clients between AZs
It is only necessary to switch between AZs if the previous AZ is not reachable:
juju remove-relation -m app mysql-router-k8s db1database
# wait for the relation to be removed
juju relate -m app mysql-router-k8s db2database
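To confirm the new wiring, the relations in the app model can be checked (a minimal verification sketch):
juju status -m app --relations # mysql-router-k8s should now be related to db2database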