We strongly recommend that you do NOT perform any other extraordinary operations on the Charmed OpenSearch cluster while upgrading. Examples include:
- Adding or removing units
- Creating or destroying relations
- Changing the workload configuration
- Upgrading other connected/related/integrated applications simultaneously
- Backup / restore of snapshots
Concurrency with other operations is not supported and can lead the cluster into an inconsistent state.
Make sure you have a backup of your Charmed OpenSearch data before running any type of upgrade.
Minor upgrade steps
- Collect all necessary pre-upgrade information. It will be needed for the rollback (if requested). Do NOT skip this step.
- (optional) Scale up: The new sacrificial unit will be the first one to be upgraded, and it will simplify the rollback procedure if the upgrade fails.
- Prepare the “Charmed OpenSearch” Juju application for the in-place upgrade. See the step description below for the technical details of what the charm executes.
- Upgrade: Once started, only one unit of the application will be upgraded. In case of failure, roll back with juju refresh.
- Resume upgrade: If the new unit is OK after the refresh, the upgrade can be resumed. All remaining units in the application will be upgraded sequentially, from highest to lowest unit number.
- (optional) Consider rolling back in case of disaster. Please inform us and include us in troubleshooting your case, so that we can trace the source of the issue and prevent it in the future.
- (optional) Scale back: Remove the units created in step 2 that are no longer necessary (if any).
- Post-upgrade check: Make sure all units are in the proper state and the cluster is healthy.
Step 1: Collect
The first step is to record the revision of the running application, as a safety measure for a rollback action. To accomplish this, simply run the juju status command and look for the deployed Charmed OpenSearch revision in the command output, e.g.:
Model  Controller           Cloud/Region         Version  SLA          Timestamp
test   localhost-localhost  localhost/localhost  3.3.4    unsupported  13:02:15Z

App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
opensearch                         active      3  opensearch                                 86  no
self-signed-certificates           active      1  self-signed-certificates  latest/stable   72  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
opensearch/0*                active    idle   1        10.229.18.7     9200/tcp
opensearch/1                 active    idle   2        10.229.18.182   9200/tcp
opensearch/2                 active    idle   3        10.229.18.34    9200/tcp
self-signed-certificates/0*  active    idle   0        10.229.18.118

Machine  State    Address        Inst id        Base          AZ  Message
0        started  10.229.18.118  juju-d69356-0  ubuntu@22.04      Running
1        started  10.229.18.7    juju-d69356-1  ubuntu@22.04      Running
2        started  10.229.18.182  juju-d69356-2  ubuntu@22.04      Running
3        started  10.229.18.34   juju-d69356-3  ubuntu@22.04      Running
If the deployment is of a local charm, make sure you save a copy of the current .charm file BEFORE going further. You might need it for rollback.
In this example, the current OpenSearch revision is 86.
Store the revision or the .charm file safely to use in case of rollback.
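Recording the revision can be scripted. The following is a minimal sketch and not part of the charm: app_revision is a hypothetical helper that reads the tabular juju status output on stdin and assumes that the Rev value is the field immediately before the Exposed value (yes or no). If a JSON tool is available, parsing the output of juju status --format json is likely more robust.

```shell
# Print the charm revision of one application from `juju status` text on stdin.
# Heuristic (an assumption about the tabular layout): the Rev value is the
# field immediately before the Exposed value, which is "yes" or "no".
app_revision() {
    awk -v app="$1" '
        $1 == app {
            for (i = 2; i <= NF; i++) {
                if ($i == "yes" || $i == "no") { print $(i - 1); exit }
            }
        }
    '
}
```

For example, juju status | app_revision opensearch > opensearch-revision.txt would store the revision in a file (the file name is arbitrary).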
Step 2: Scale-up (optional)
We recommend scaling the application up by one unit before starting the upgrade process.
The new unit will be the first one to be upgraded, and it will confirm whether the upgrade is possible. In case of failure, having the extra unit will ease the rollback procedure without disrupting service. See the Minor rollback how-to for more details.
juju add-unit opensearch
Wait for the new unit to be up and ready.
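One way to wait is to poll juju status until every unit reports active/idle. This is a sketch under assumptions: units_settled and wait_until_settled are hypothetical helper names, unit rows are assumed to start with the application name followed by a slash, and the 10-second polling interval is arbitrary.

```shell
# Succeed only if every unit of the given application is "active" and "idle".
# Reads `juju status` text on stdin; unit rows start with "<app>/<number>".
units_settled() {
    awk -v app="$1" '
        index($1, app "/") == 1 {
            seen = 1
            if ($2 != "active" || $3 != "idle") bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}

# Poll until the application settles. There is no timeout in this sketch,
# so interrupt it manually if a unit ends up in an error state.
wait_until_settled() {
    until juju status "$1" | units_settled "$1"; do
        sleep 10
    done
}
```

Calling wait_until_settled opensearch would then block until all opensearch units are active and idle.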
Step 3: Prepare
- IMPORTANT: Create a backup of your cluster
Ensure you create a backup of your cluster. Please refer to the backup section for instructions.
- pre-upgrade-check
After the application has settled, it’s necessary to run the pre-upgrade-check action against the leader unit:
juju run opensearch/leader pre-upgrade-check
The action checks the health of OpenSearch and verifies that the charm is ready to start the upgrade procedure.
Step 4: Upgrade
Use the juju refresh command to trigger the charm upgrade process.
juju refresh opensearch --channel 2/edge
The upgrade will initially execute only on the unit with the highest unit number. For the running example, the juju status output will look as follows:
Model  Controller           Cloud/Region         Version  SLA          Timestamp
test   localhost-localhost  localhost/localhost  3.3.4    unsupported  13:02:15Z

App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
opensearch                         active      3  opensearch                                 87  no       Upgrading. Verify highest unit is healthy & run `resume-upgrade` action. To rollback, `juju refresh` to last revision
self-signed-certificates           active      1  self-signed-certificates  latest/stable   72  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
opensearch/0                 active    idle   1        10.229.18.7     9200/tcp  OpenSearch 2.12.0 running; Snap rev 40 (outdated); Charmed operator 1+631f817-dirty+71f8619-dirty
opensearch/1                 active    idle   2        10.229.18.182   9200/tcp  OpenSearch 2.12.0 running; Snap rev 40 (outdated); Charmed operator 1+631f817-dirty+71f8619-dirty
opensearch/2*                active    idle   3        10.229.18.34    9200/tcp  OpenSearch 2.12.0 running; Snap rev 44; Charmed operator 1+631f817-dirty+71f8619-dirty
self-signed-certificates/0*  active    idle   0        10.229.18.118
The unit should recover shortly after, but the time can vary depending on the amount of data written to the cluster while the unit was not part of it. Please be patient with large installations.
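To see at a glance which units have not been upgraded yet, the status messages can be filtered. This is a sketch only: outdated_units is a hypothetical helper, and it relies on the "(outdated)" marker shown in the example output above, whose exact wording may differ between charm revisions.

```shell
# List units whose status message still contains "(outdated)".
# Reads `juju status` text on stdin; the trailing "*" leader marker is removed.
outdated_units() {
    awk -v app="$1" '
        index($1, app "/") == 1 && /\(outdated\)/ {
            sub(/\*$/, "", $1)
            print $1
        }
    '
}
```

Running juju status opensearch | outdated_units opensearch on the example above would list opensearch/0 and opensearch/1, one per line.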
Step 5: Resume
After the unit is upgraded, the charm will set the unit upgrade state as completed. If necessary, you can further verify that the upgrade succeeded. If the unit is healthy within the cluster, the next step is to resume the upgrade process by running:
juju run opensearch/leader resume-upgrade
The resume-upgrade action rolls out the OpenSearch upgrade to the next unit, always from the highest to the lowest unit number. After each successfully upgraded unit beyond the first, the process rolls out the next one automatically.
Step 6: Rollback (optional)
If the upgrade turns out to be incompatible, it’s important to roll back the charm to the previous revision, so that the upgrade can be attempted again after further inspection of the failure. See the Minor rollback how-to for more details.
Step 7: Scale-back (optional)
If the application scale was changed for the upgrade procedure, it is now safe to scale it back to the desired unit count:
juju remove-unit opensearch/<highest unit number>
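The highest unit number can be read from juju status. The following sketch uses a hypothetical helper, highest_unit, which compares unit numbers numerically so that, for example, unit 10 ranks above unit 9.

```shell
# Print the unit of the given application with the highest unit number.
# Reads `juju status` text on stdin.
highest_unit() {
    awk -v app="$1" '
        index($1, app "/") == 1 {
            n = $1
            sub(/^.*\//, "", n)   # drop the "<app>/" prefix
            sub(/\*$/, "", n)     # drop the leader marker, if any
            if (max == "" || n + 0 > max + 0) max = n
        }
        END { if (max != "") print app "/" max }
    '
}
```

It could be combined with the command above as juju remove-unit "$(juju status opensearch | highest_unit opensearch)". Double-check that the printed unit is the one added in Step 2 before removing it.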
Step 8: Check
First, check that the units have settled in the “active/idle” state in juju status, with the newer revision number:
Model  Controller           Cloud/Region         Version  SLA          Timestamp
test   localhost-localhost  localhost/localhost  3.3.4    unsupported  13:02:15Z

App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
opensearch                         active      3  opensearch                                 87  no
self-signed-certificates           active      1  self-signed-certificates  latest/stable   72  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
opensearch/0*                active    idle   1        10.229.18.7     9200/tcp
opensearch/1                 active    idle   2        10.229.18.182   9200/tcp
opensearch/2                 active    idle   3        10.229.18.34    9200/tcp
self-signed-certificates/0*  active    idle   0        10.229.18.118

Machine  State    Address        Inst id        Base          AZ  Message
0        started  10.229.18.118  juju-d69356-0  ubuntu@22.04      Running
1        started  10.229.18.7    juju-d69356-1  ubuntu@22.04      Running
2        started  10.229.18.182  juju-d69356-2  ubuntu@22.04      Running
3        started  10.229.18.34   juju-d69356-3  ubuntu@22.04      Running
Then, check that the cluster is healthy. OpenSearch’s upstream documentation suggests the following check:
GET "/_cluster/health?pretty"
The response should look similar to the following example:
{
  "cluster_name" : "test-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "discovered_master" : true,
  "active_primary_shards" : 1,
  ...
  "active_shards_percent_as_number" : 100.0
}
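From a shell, the same request can be sent with curl and the status field extracted from the response. This is a sketch under stated assumptions: OPENSEARCH_HOST, OPENSEARCH_USER and OPENSEARCH_PASSWORD are hypothetical variables that you must set yourself, the endpoint is assumed to be served over HTTPS on port 9200 (the port shown in juju status), and --insecure skips certificate verification, which is only reasonable on a test deployment.

```shell
# Extract the "status" value from a /_cluster/health JSON response on stdin.
# A line-oriented sed match keeps the sketch dependency-free; a JSON-aware
# tool would be more robust.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Query the health endpoint. The three OPENSEARCH_* variables are
# placeholders; prefer --cacert <file> over --insecure where possible.
fetch_cluster_health() {
    curl --silent --insecure \
        --user "${OPENSEARCH_USER}:${OPENSEARCH_PASSWORD}" \
        "https://${OPENSEARCH_HOST}:9200/_cluster/health?pretty"
}
```

With the variables set, fetch_cluster_health | cluster_status should print green for a healthy cluster such as the one in the example response.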