
How to launch a large deployment

The Charmed OpenSearch operator can run a single OpenSearch cluster that spans several Juju applications. This guide explains how to launch such a large deployment of OpenSearch using Juju.

OpenSearch node roles

When deploying OpenSearch at scale, it is important to understand the roles that nodes can assume on a cluster.

Of the roles supported by OpenSearch, two are essential for successful cluster formation:

  • cluster_manager: assigned to nodes responsible for handling cluster-wide operations such as creating and deleting indices, managing shards, and rebalancing data across the cluster. Every cluster elects a single active cluster manager from among its cluster_manager-eligible nodes.
  • data: assigned to nodes that store data and perform data-related operations such as indexing and searching. Data nodes hold the shards that contain the indexed data, and can also be configured to perform ingest and transform operations. In Charmed OpenSearch, data nodes can optionally be further classified into tiers, which makes it possible to define index lifecycle management policies:
    • data.hot
    • data.warm
    • data.cold

Nodes in an OpenSearch cluster can also take on other roles, such as ingest and coordinating.

Roles in Charmed OpenSearch are applied at the application level. In other words, all nodes of an application are assigned the same set of roles.

Set roles

Roles can either be set by the user or automatically generated by the charm.

Auto-generated roles

When the roles config option of the opensearch application is not set, the charm automatically assigns the following roles to all nodes:

["data", "ingest", "ml", "cluster_manager"]

User set roles

There are currently two ways for users to set roles in an application: at deploy time, or via a config change. Note that changing roles triggers a rolling restart of the OpenSearch application.

To set roles at deploy time, run

juju deploy opensearch -n 3 --config roles="cluster_manager,data,ml"

To set roles later on through a config change, run

juju config opensearch roles="cluster_manager,data,ml"

Note: Removing either the cluster_manager or the data role is currently not allowed.
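The restriction in the note above can be checked locally before a config change is submitted. The following is a minimal sketch, not part of the charm: the function name check_roles_change is hypothetical, and it assumes both role lists are plain comma-separated strings without spaces.

```shell
# Hypothetical helper (not part of the charm): refuse a roles change that
# would drop cluster_manager or data from an application that has them.
# Usage: check_roles_change "<current roles>" "<new roles>"
check_roles_change() {
    # Wrap both lists in commas so that "data" does not match "data.hot".
    current=",$1,"
    new=",$2,"
    for role in cluster_manager data; do
        case "$current" in
            *",$role,"*)
                case "$new" in
                    *",$role,"*) ;;   # role is kept, nothing to do
                    *)
                        echo "refusing: role '$role' would be removed" >&2
                        return 1
                        ;;
                esac
                ;;
        esac
    done
    return 0
}

# Example (assumes the roles option was set explicitly beforehand):
#   check_roles_change "$(juju config opensearch roles)" "cluster_manager,data" \
#       && juju config opensearch roles="cluster_manager,data"
```

The check only compares strings. The charm remains the authority on which role changes it accepts.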

Deploy a large OpenSearch cluster

The OpenSearch charm manages large deployments, and the diversity in the topology of their nodes, through Juju integrations.

The cluster consists of multiple integrated Juju applications (clusters), each configured with a mix of cluster_manager and data roles for its nodes.

Deploy the clusters

  1. First, deploy the orchestrator app.

    juju deploy -n 3 \
        opensearch main \
        --config cluster_name="app" \
        --channel 2/edge
    

    As a reminder, because no roles are set for this application, the operator will assign each node the cluster_manager,coordinating_only,data,ingest,ml roles.

  2. (Optional, but recommended) Next, deploy a failover application with cluster_manager nodes to ensure high availability and fault tolerance. The failover app takes over the orchestration of the fleet if the main app fails or is removed. Its roles must therefore include cluster_manager, so that the cluster can continue to exist.

    juju deploy -n 3 \
        opensearch failover \
        --config cluster_name="app" \
        --config init_hold="true" \
        --config roles="cluster_manager" \
        --channel 2/edge
    

    Failover nodes are not required for a basic deployment of OpenSearch. They are, however, highly recommended for production deployments.

    Note 1: The cluster_name config values must match between all applications in a large deployment. A cluster_name mismatch prevents two applications from forming a cluster.

    Note 2: Only the main orchestrator app may leave the init_hold config option at false (the default). Every other app must set it to true, so that it does not start before being integrated with the main app.

  3. After deploying the nodes of the main app and the additional cluster_manager nodes of the failover app, deploy a third app whose nodes have the data.hot role.

    juju deploy -n 3 \
        opensearch data-hot \
        --config cluster_name="app" \
        --config roles="data.hot" \
        --config init_hold="true" \
        --channel 2/edge
    
  4. We also need to deploy a TLS operator to enable TLS encryption for the cluster. We will deploy the self-signed-certificates charm to provide self-signed certificates for the cluster.

    juju deploy self-signed-certificates
    
  5. We can now track the progress of the deployment by running:

    juju status --watch 1s
    

    Once the deployment is complete, juju status will show something similar to the sample output below:

    Model  Controller   Cloud/Region         Version  SLA          Timestamp
    dev    development  localhost/localhost  3.5.3    unsupported  06:01:06Z
    
    App                       Version  Status   Scale  Charm                     Channel        Rev  Exposed  Message
    data-hot                           blocked      3  opensearch                2/edge         159  no       Cannot start. Waiting for peer cluster relation...
    failover                           blocked      3  opensearch                2/edge         159  no       Cannot start. Waiting for peer cluster relation...
    main                               blocked      3  opensearch                2/edge         159  no       Missing TLS relation with this cluster.
    self-signed-certificates           active       1  self-signed-certificates  latest/stable  155  no
    
    Unit                         Workload  Agent  Machine  Public address  Ports  Message
    data-hot/0                   active    idle   6        10.214.176.165
    data-hot/1*                  active    idle   7        10.214.176.7
    data-hot/2                   active    idle   8        10.214.176.161
    failover/0*                  active    idle   3        10.214.176.194
    failover/1                   active    idle   4        10.214.176.152
    failover/2                   active    idle   5        10.214.176.221
    main/0                       blocked   idle   0        10.214.176.231         Missing TLS relation with this cluster.
    main/1                       blocked   idle   1        10.214.176.57          Missing TLS relation with this cluster.
    main/2*                      blocked   idle   2        10.214.176.140         Missing TLS relation with this cluster.
    self-signed-certificates/0*  active    idle   9        10.214.176.201
    
    Machine  State    Address         Inst id        Base          AZ  Message
    0        started  10.214.176.231  juju-d6b263-0  ubuntu@22.04      Running
    1        started  10.214.176.57   juju-d6b263-1  ubuntu@22.04      Running
    2        started  10.214.176.140  juju-d6b263-2  ubuntu@22.04      Running
    3        started  10.214.176.194  juju-d6b263-3  ubuntu@22.04      Running
    4        started  10.214.176.152  juju-d6b263-4  ubuntu@22.04      Running
    5        started  10.214.176.221  juju-d6b263-5  ubuntu@22.04      Running
    6        started  10.214.176.165  juju-d6b263-6  ubuntu@22.04      Running
    7        started  10.214.176.7    juju-d6b263-7  ubuntu@22.04      Running
    8        started  10.214.176.161  juju-d6b263-8  ubuntu@22.04      Running
    9        started  10.214.176.201  juju-d6b263-9  ubuntu@22.04      Running
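The deploy commands in steps 1 to 3 differ only in the application name, the roles, and whether init_hold is set. The sketch below prints those commands (it does not run them) from a single cluster name, which makes it harder to introduce a cluster_name mismatch by hand. The function name deploy_cmd and the convention that the orchestrator app is called main are assumptions made for this illustration.

```shell
# Hypothetical helper: print (not run) the deploy command for one application
# of a large deployment. Assumes the orchestrator app is named "main".
# Usage: deploy_cmd <app> <cluster_name> [roles]
CHANNEL="2/edge"

deploy_cmd() {
    app="$1"
    cluster="$2"
    roles="${3:-}"
    cmd="juju deploy -n 3 opensearch $app --config cluster_name=\"$cluster\""
    # Every app except the main orchestrator must hold until it is integrated.
    if [ "$app" != "main" ]; then
        cmd="$cmd --config init_hold=\"true\""
    fi
    # Leaving roles empty lets the charm auto-generate them.
    if [ -n "$roles" ]; then
        cmd="$cmd --config roles=\"$roles\""
    fi
    echo "$cmd --channel $CHANNEL"
}

deploy_cmd main app
deploy_cmd failover app cluster_manager
deploy_cmd data-hot app data.hot
```

Review the printed commands before running them.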
    

Add the required relations

Configure TLS encryption

The Charmed OpenSearch operator does not function without TLS enabled. To enable TLS, integrate the self-signed-certificates application with every opensearch application.

juju integrate self-signed-certificates main
juju integrate self-signed-certificates failover
juju integrate self-signed-certificates data-hot

Once the integrations are established, the self-signed-certificates charm will provide the required certificates for the OpenSearch clusters.

Once TLS is fully configured in the main app, it starts immediately. The other apps do not start yet, because they are still waiting for the main orchestrator to share the admin certificates with them.

When the main app is ready, juju status will show something similar to the sample output below:

Model  Controller   Cloud/Region         Version  SLA          Timestamp
dev    development  localhost/localhost  3.5.3    unsupported  06:03:49Z

App                       Version  Status   Scale  Charm                     Channel        Rev  Exposed  Message
data-hot                           blocked      3  opensearch                2/edge         159  no       Cannot start. Waiting for peer cluster relation...
failover                           blocked      3  opensearch                2/edge         159  no       Cannot start. Waiting for peer cluster relation...
main                               active       3  opensearch                2/edge         159  no
self-signed-certificates           active       1  self-signed-certificates  latest/stable  155  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
data-hot/0                   active    idle   6        10.214.176.165
data-hot/1*                  active    idle   7        10.214.176.7
data-hot/2                   active    idle   8        10.214.176.161
failover/0*                  active    idle   3        10.214.176.194
failover/1                   active    idle   4        10.214.176.152
failover/2                   active    idle   5        10.214.176.221
main/0                       active    idle   0        10.214.176.231  9200/tcp
main/1                       active    idle   1        10.214.176.57   9200/tcp
main/2*                      active    idle   2        10.214.176.140  9200/tcp
self-signed-certificates/0*  active    idle   9        10.214.176.201

Machine  State    Address         Inst id        Base          AZ  Message
0        started  10.214.176.231  juju-d6b263-0  ubuntu@22.04      Running
1        started  10.214.176.57   juju-d6b263-1  ubuntu@22.04      Running
2        started  10.214.176.140  juju-d6b263-2  ubuntu@22.04      Running
3        started  10.214.176.194  juju-d6b263-3  ubuntu@22.04      Running
4        started  10.214.176.152  juju-d6b263-4  ubuntu@22.04      Running
5        started  10.214.176.221  juju-d6b263-5  ubuntu@22.04      Running
6        started  10.214.176.165  juju-d6b263-6  ubuntu@22.04      Running
7        started  10.214.176.7    juju-d6b263-7  ubuntu@22.04      Running
8        started  10.214.176.161  juju-d6b263-8  ubuntu@22.04      Running
9        started  10.214.176.201  juju-d6b263-9  ubuntu@22.04      Running

Form the OpenSearch cluster

To form the large OpenSearch cluster, made up of the three opensearch apps deployed above, integrate the main app with the failover and data-hot apps, and the failover app with the data-hot app.

juju integrate main:peer-cluster-orchestrator failover:peer-cluster 
juju integrate main:peer-cluster-orchestrator data-hot:peer-cluster 
juju integrate failover:peer-cluster-orchestrator data-hot:peer-cluster
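The three integrations above follow a pattern: the main app orchestrates every other app, and the failover app orchestrates every non-orchestrator app. The sketch below prints the same commands for any number of additional apps. Extending the pattern beyond data-hot is an extrapolation from this guide, and the function name peer_cluster_cmds is hypothetical.

```shell
# Hypothetical helper: print (not run) the peer-cluster integrations for a
# main orchestrator, a failover orchestrator, and any number of other apps.
# Usage: peer_cluster_cmds <main> <failover> [other apps...]
peer_cluster_cmds() {
    main="$1"
    failover="$2"
    shift 2
    echo "juju integrate $main:peer-cluster-orchestrator $failover:peer-cluster"
    for app in "$@"; do
        echo "juju integrate $main:peer-cluster-orchestrator $app:peer-cluster"
    done
    for app in "$@"; do
        echo "juju integrate $failover:peer-cluster-orchestrator $app:peer-cluster"
    done
}

peer_cluster_cmds main failover data-hot
```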

Once the relations are added, the main application will orchestrate the formation of the OpenSearch cluster. This will start the rest of the nodes in the cluster. You can track the progress of the cluster formation by running:

juju status --watch 1s

Once the cluster is formed and all nodes are up and ready, juju status will show something similar to the sample output below:

Model  Controller   Cloud/Region         Version  SLA          Timestamp
dev    development  localhost/localhost  3.5.3    unsupported  06:11:18Z

App                       Version  Status  Scale  Charm                     Channel        Rev  Exposed  Message
data-hot                           active      3  opensearch                2/edge         159  no
failover                           active      3  opensearch                2/edge         159  no
main                               active      3  opensearch                2/edge         159  no
self-signed-certificates           active      1  self-signed-certificates  latest/stable  155  no

Unit                         Workload  Agent  Machine  Public address  Ports     Message
data-hot/0                   active    idle   6        10.214.176.165  9200/tcp
data-hot/1*                  active    idle   7        10.214.176.7    9200/tcp
data-hot/2                   active    idle   8        10.214.176.161  9200/tcp
failover/0*                  active    idle   3        10.214.176.194  9200/tcp
failover/1                   active    idle   4        10.214.176.152  9200/tcp
failover/2                   active    idle   5        10.214.176.221  9200/tcp
main/0                       active    idle   0        10.214.176.231  9200/tcp
main/1                       active    idle   1        10.214.176.57   9200/tcp
main/2*                      active    idle   2        10.214.176.140  9200/tcp
self-signed-certificates/0*  active    idle   9        10.214.176.201

Machine  State    Address         Inst id        Base          AZ  Message
0        started  10.214.176.231  juju-d6b263-0  ubuntu@22.04      Running
1        started  10.214.176.57   juju-d6b263-1  ubuntu@22.04      Running
2        started  10.214.176.140  juju-d6b263-2  ubuntu@22.04      Running
3        started  10.214.176.194  juju-d6b263-3  ubuntu@22.04      Running
4        started  10.214.176.152  juju-d6b263-4  ubuntu@22.04      Running
5        started  10.214.176.221  juju-d6b263-5  ubuntu@22.04      Running
6        started  10.214.176.165  juju-d6b263-6  ubuntu@22.04      Running
7        started  10.214.176.7    juju-d6b263-7  ubuntu@22.04      Running
8        started  10.214.176.161  juju-d6b263-8  ubuntu@22.04      Running
9        started  10.214.176.201  juju-d6b263-9  ubuntu@22.04      Running
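Instead of watching the output by eye, the App table can be checked in a script. The sketch below reads the tabular juju status output on standard input and succeeds only if every application row reports active. It assumes the columns are aligned as in the sample output above, so that each status starts directly below the Status header. The function name all_apps_active is hypothetical.

```shell
# Hypothetical helper: exit 0 only if every row of the App table in the
# tabular `juju status` output (read from stdin) has the status "active".
# Assumes each status value starts in the same column as the "Status" header.
all_apps_active() {
    awk '
        # Header row of the App table: remember where the Status column starts.
        $1 == "App" { col = index($0, "Status"); in_apps = 1; next }
        # A blank line ends the App table.
        in_apps && NF == 0 { exit }
        in_apps {
            seen = 1
            split(substr($0, col), f, " ")
            if (f[1] != "active") bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example:
#   juju status | all_apps_active && echo "all applications are active"
```

Reading the position of the Status header, rather than a fixed field number, keeps the check working whether or not the Version column is populated.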

Caution: The cluster will not come online if no data nodes are available. Ensure the data nodes are deployed and ready before forming the cluster.

Reminder 1: To form a large deployment out of multiple Juju apps, all applications must either share the same cluster_name config value or leave it unset. If it is unset, the name is auto-generated in the main orchestrator and inherited by the other members.
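A quick way to catch a cluster_name mismatch is to compare the configured values before adding the relations. The sketch below is a plain string comparison. The function name same_cluster_name is hypothetical, and the commented usage assumes the three application names used in this guide.

```shell
# Hypothetical helper: succeed only if all given values are identical.
# Empty values (option not set) count as equal to each other.
# Usage: same_cluster_name <value> [<value>...]
same_cluster_name() {
    first="$1"
    for name in "$@"; do
        [ "$name" = "$first" ] || return 1
    done
    return 0
}

# Example:
#   same_cluster_name "$(juju config main cluster_name)" \
#                     "$(juju config failover cluster_name)" \
#                     "$(juju config data-hot cluster_name)" \
#       || echo "cluster_name mismatch" >&2
```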

Reminder 2: init_hold must be set to true for every application other than the main orchestrator. Otherwise the application may start on its own and never be able to join the rest of the deployment.
