Eliminating duplication across charms by using a reusable reconcile function

Summary

This post describes how Charmed Kubeflow has reduced duplication across our charms by:

  • defining the Component, which represents any piece of logic in a charm
  • implementing a reusable reconcile function (CharmReconciler) that executes one or more Components

You can find the code in our charmed-kubeflow-chisme repository.

Charming with the reconcile pattern

Most charms produced for Charmed Kubeflow follow the reconcile pattern (also mentioned here as a holistic pattern): on most events (config-changed, *-pebble-ready, …) we:

  • observe the current input state, such as config values, relation data, etc.
  • apply the desired output state for the things we manage, such as updating Pebble services

This pattern is typical for controllers in the Kubernetes world.

While it sounds wasteful to recompute everything when you receive a very specific event such as X-pebble-ready, the savings are in cognitive load at development time. Rather than think ahead about every possible state transition (if we get X-pebble-ready when we’ve previously had Y-relation-joined but haven’t seen RelationZ yet …) we can simply observe the world and act accordingly.
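As a rough illustration of the idea (not code from any of our charms), the sketch below computes the desired state purely from what can be observed right now, so the same function gives the right answer whichever event triggered it. The names ObservedState, desired_services and reconcile, and the config and relation keys, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObservedState:
    """Hypothetical snapshot of everything the charm can observe right now."""
    config: dict = field(default_factory=dict)
    relation_data: dict = field(default_factory=dict)
    container_ready: bool = False


def desired_services(state: ObservedState) -> dict:
    """Compute the services we want from current state only (no event history)."""
    if not state.container_ready or "db-url" not in state.relation_data:
        # Inputs are incomplete: the desired state is "nothing running yet".
        return {}
    return {
        "app": {
            "command": f"serve --port {state.config.get('port', 8080)}",
            "environment": {"DB_URL": state.relation_data["db-url"]},
        }
    }


def reconcile(state: ObservedState, running: dict) -> dict:
    """Return the new running state; calling it twice in a row changes nothing."""
    wanted = desired_services(state)
    if wanted != running:
        running = wanted  # a real charm would update Pebble here
    return running
```

Because reconcile() never asks which event fired, there are no state transitions to enumerate: any event simply triggers another observe-and-apply pass.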

This pattern led to a lot of charms looking like this:

from ops.charm import CharmBase

class MyCharm(CharmBase):
  def __init__(self, *args):
    super().__init__(*args)
    # Route every relevant event to the same handler
    for event in [
      self.on.install,
      self.on.config_changed,
      self.on.containerA_pebble_ready,
      self.on["relation-x"].relation_changed,
      self.on["relation-y"].relation_changed,
      ...
    ]:
      self.framework.observe(event, self.reconcile)

  def reconcile(self, event):
    self._get_data_from_relation_X()
    self._send_data_to_relation_Y()
    self._deploy_kubernetes_resource_using_relation_X_data()
    self._update_container_a()
    ...

where install, config-changed, etc. are handled by a single reconcile() function that executes a series of handlers.

This does the job, but has weaknesses like:

  • catching errors and setting descriptive statuses gets verbose and repetitive
  • this naive implementation executes helpers sequentially, but that’s not always ideal. We have charms that serve two or more independent things and a breakdown in one shouldn’t affect the other

Over time, the reconcile function started looking more like (taken from istio-pilot):

def reconcile(self, event):
    # If we are not the leader, the charm should do nothing and exit
    try:
        self._check_leader()
    except Exception as err:
        self._log_and_set_status(str(err))
        return

    # Record non-fatal errors so that we can report them at the end.
    handled_errors = []

    # Process and action authentication settings
    ingress_auth_reconcile_successful = False
    try:
        ingress_auth_data = self._get_ingress_auth_data(event)
        self._reconcile_ingress_auth(ingress_auth_data)
        ingress_auth_reconcile_successful = True
    except Exception as err:
        handled_errors.append(err)

    try:
        # If previous step was unsuccessful, always remove the Gateway
        # to prevent unauthenticated traffic
        if ingress_auth_reconcile_successful:
            self._reconcile_gateway()
        else:
            self.log.info(
                "Removing gateway due to errors in processing the ingress-auth relation."
            )
            self._remove_gateway()
    except Exception as err:
        handled_errors.append(err)

    # Report any handled errors, or set ActiveStatus if there were none
    self._report_handled_errors(errors=handled_errors)

A variant of this lives in every charm using this pattern. This duplication led us to think about a better way.

A Sunbeam of Inspiration

We needed a reusable reconcile function that could run arbitrary bits of charm logic in a given order. These bits of logic might be something we wrote, but also could be something inherited like a charm library from another charm. To do this, we needed to define a common interface that all charm logic satisfied so that we could run these bits interchangeably. Thankfully, the Sunbeam base charm was nearly what we needed.

Sunbeam, from the Canonical OpenStack team, is a similar reconcile-style charm implementation. It defines two abstractions, RelationHandler and ContainerHandler, for defining charm logic to manage relations and containers. Sunbeam wasn’t quite as generic as we wanted (we also need to manage Kubernetes resources, and did not want Sunbeam’s rigid execution order), but it highlighted something really helpful.

Looking at the Sunbeam RelationHandler and ContainerHandler abstractions, we realised they were very similar and could be generalised further. They could be represented by a single abstraction that implements:

  • a method to do what it is meant to do (to configure a Pebble container, deploy a resource, etc)
  • a method to report its current state (is your Pebble container running the services you want, is your resource correctly deployed, etc)

This was the key insight that helped us implement our own reconciler.

The Component abstraction

With this simple interface in mind, we designed Component: an abstraction for representing any single piece of logic in a charm (wrapping a relation, a container, or anything else). The helpers in the simple reconcile() example at the top of this post would each be good candidates for a Component. Each Component implements:

  • .configure_charm(): does the work of this Component (configures a Pebble container, deploys a resource, etc.)
  • .get_status(): computes the Status of this Component given the current state, returning an ops.model.StatusBase (like ActiveStatus, BlockedStatus, etc.)
  • .remove(): does any work that should be done to remove this Component during a Charm’s remove event
    • while not strictly necessary, including .remove() makes managing a Component's whole lifecycle more reusable.
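To make the shape of this interface concrete, here is a minimal sketch of what such an abstraction could look like. It is an illustration only: the real Component class in charmed-kubeflow-chisme may differ, Status is a stand-in for ops.model.StatusBase so that the example runs without ops installed, and ConfigValueComponent is a made-up toy.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Status:
    """Stand-in for ops.model.StatusBase (assumption, keeps the sketch standalone)."""
    name: str  # e.g. "active", "blocked", "waiting"
    message: str = ""


class Component(ABC):
    """Hypothetical minimal interface: do the work, report status, clean up."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def configure_charm(self, event=None) -> None:
        """Do this Component's work."""

    @abstractmethod
    def get_status(self) -> Status:
        """Report this Component's status from the current state."""

    def remove(self, event=None) -> None:
        """Clean up on removal; a no-op unless overridden."""


class ConfigValueComponent(Component):
    """Toy Component that 'applies' a value by writing it to an in-memory store."""

    def __init__(self, name: str, store: dict, value: str):
        super().__init__(name)
        self.store = store
        self.value = value

    def configure_charm(self, event=None) -> None:
        self.store[self.name] = self.value

    def get_status(self) -> Status:
        if self.store.get(self.name) == self.value:
            return Status("active")
        return Status("waiting", f"{self.name} not configured yet")

    def remove(self, event=None) -> None:
        self.store.pop(self.name, None)
```

Anything that satisfies these three methods can be driven by the same calling code, whether it wraps a relation, a container, or a Kubernetes resource.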

Component lets us treat any arbitrary logic in a charm interchangeably, and promotes reuse by helping us write encapsulated pieces of logic. Many of our charms’ common tasks have been extracted from charm code into generic components that we import in multiple charms:

CharmReconciler: a Reusable Reconcile Function

CharmReconciler is a reusable reconcile function for executing one or more Components. It handles:

  • charm reconcile events (typically install, config-changed, *-pebble-ready, some relation events): .execute_components(event) executes all Component.configure_charm() in a user-defined order and updates the Charm’s status based on their results
  • remove: .remove_components(event) runs Component.remove() for all Components
  • update-status: .update_status(event) computes the status of each Component and updates the Charm’s status

Typically, these handlers can replace existing ones for these events, but they could be used in combination with other custom code within the Charm.
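The update-status path can be pictured as folding the Components’ statuses into a single charm status. The sketch below uses one possible policy: report the first Component that is not active, in the order the Components were added. Both the policy and the function name summarise are assumptions made for illustration; the actual CharmReconciler may aggregate statuses differently.

```python
def summarise(component_statuses):
    """Return (status, message) for the charm.

    component_statuses is an ordered list of
    (component_name, status, message) tuples.
    """
    for name, status, message in component_statuses:
        if status != "active":
            # Prefix the component name so the operator can see what is unhappy.
            return status, f"[{name}] {message}"
    return "active", ""


statuses = [
    ("relation-x", "active", ""),
    ("container-a", "waiting", "Pebble not ready"),
    ("k8s", "blocked", "missing relation data"),
]
charm_state = summarise(statuses)
```

Keeping this policy in one shared function is what lets every charm report statuses consistently.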

Components are added to the reconciler with CharmReconciler.add(), optionally defining dependency/ordering between Components using the depends_on argument. Rewriting the reconcile-style charm example above using CharmReconciler gives:

class MyCharm(CharmBase):
  def __init__(self, *args):
    super().__init__(*args)
    self.charm_reconciler = CharmReconciler()

    self.relation_x_component = self.charm_reconciler.add(GetDataFromRelationXComponent)
    self.relation_y_component = self.charm_reconciler.add(SendDataToRelationYComponent)
    self.k8s_component = self.charm_reconciler.add(
      DeployKubernetesResourceComponentUsingRelationXData, 
      depends_on=self.relation_x_component
    )
    self.container_a_component = self.charm_reconciler.add(UpdateContainerAComponent)

    # Replaces all self.framework.observe statements above
    self.charm_reconciler.install_default_event_handlers()

Here k8s_component is added with depends_on=self.relation_x_component, meaning that k8s_component is only executed after relation_x_component has succeeded (gone to ActiveStatus). This dependency management lets our charms declare ordering only where one task really depends on another, rather than inheriting the strictly sequential execution of the naive reconcile function.
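The effect of depends_on can be sketched with a toy reconciler. This is not the chisme implementation: ToyReconciler, its add() signature (name, configure callable, is_active callable) and the report strings are hypothetical, chosen only to show that a failure blocks dependents while independent work still runs.

```python
class ToyReconciler:
    """Hypothetical stand-in for CharmReconciler, for illustration only."""

    def __init__(self):
        # (name, configure, is_active, depends_on), kept in add() order.
        # Prerequisites must be added before the things that depend on them.
        self._items = []

    def add(self, name, configure, is_active, depends_on=()):
        self._items.append((name, configure, is_active, tuple(depends_on)))
        return name

    def execute_components(self):
        """Run each component whose prerequisites are active; return a report."""
        active, report = set(), {}
        for name, configure, is_active, depends_on in self._items:
            missing = [d for d in depends_on if d not in active]
            if missing:
                report[name] = f"skipped: waiting on {', '.join(missing)}"
                continue
            try:
                configure()
            except Exception as err:  # a failure only blocks dependents
                report[name] = f"failed: {err}"
                continue
            if is_active():
                active.add(name)
                report[name] = "active"
            else:
                report[name] = "not active"
        return report


def broken():
    raise RuntimeError("relation X has no data")


reconciler = ToyReconciler()
x = reconciler.add("relation-x", broken, lambda: False)
reconciler.add("k8s", lambda: None, lambda: True, depends_on=[x])
reconciler.add("container-a", lambda: None, lambda: True)
report = reconciler.execute_components()
```

In this run relation-x fails, k8s is skipped because it depends on relation-x, and container-a is still configured because it depends on nothing.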

With this reusable CharmReconciler, we can implement one good reconcile() function in shared code and import it into all our charms rather than writing a bespoke reconciler for each charm. This keeps our charms easier to read and more to the point: the code in our charm.py files directly defines what the charm’s real function is, not the infrastructure that helps implement it.

What is the state of this effort? Where can I see real examples?

We’ve just started using this approach in our charms. So far, we have written:

While we’re still learning how we feel about this new style, it has at least helped us reduce the duplication across our charms. As a side benefit, it has also helped us make our logging and status handling more robust and descriptive because we implemented it in one place (CharmReconciler). We hope that, as we inevitably hit bugs or desire new features, our edits will mainly be in the shared code and reduce the maintenance burden.

If you’d like to try this out for yourself, you can find it as part of the charmed-kubeflow-chisme repository. We use this Chisme repo (Spanish for gossip) to hold the code we share across our charms. Take a look for yourself and tell us what you think!

I love it that in the simple cases the charm is reduced to only have an __init__.