Running Workloads

There are several ways your charm might start a workload, depending on the type of charm you’re authoring. In the case of a Kubernetes charm, your workload is likely a container, but that may not be the case for a machine charm. Before writing the code to start your workload, recall the Lifecycle events section, and note that when the start event is emitted, charm authors should ensure their workloads are configured to “persist in a started state without further intervention from Juju or an administrator”.

Machine charms

For a machine charm, it is likely that packages will need to be fetched, installed and started to provide the desired charm functionality. This can be achieved by interacting with the system’s package manager, ensuring that package and service status is maintained by reacting to events accordingly.


It is important to consider which events to respond to in the context of your charm. A simple example might be:

# ...
import logging
from subprocess import check_call, CalledProcessError

from ops.charm import CharmBase, InstallEvent, StartEvent
from ops.model import ActiveStatus, BlockedStatus

logger = logging.getLogger(__name__)
# ...

class MachineCharm(CharmBase):

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.install, self._on_install)
        self.framework.observe(self.on.start, self._on_start)
        # ...

    def _on_install(self, event: InstallEvent) -> None:
        """Handle the install event."""
        try:
            # Install the openssh-server package using apt-get
            check_call(["apt-get", "install", "-y", "openssh-server"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Package install failed with return code %d", e.returncode)
            self.unit.status = BlockedStatus("Failed to install packages")

    def _on_start(self, event: StartEvent) -> None:
        """Handle the start event."""
        try:
            # Enable the ssh systemd unit, and start it
            check_call(["systemctl", "enable", "--now", "openssh-server"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Starting systemd unit failed with return code %d", e.returncode)
            self.unit.status = BlockedStatus("Failed to start/enable ssh service")
            return

        # Everything is awesome
        self.unit.status = ActiveStatus()

If the machine is likely to be long-running and endure multiple upgrades throughout its life, it may be prudent to ensure the package is installed more regularly, and handle the case where it needs upgrading or reinstalling. Consider this excerpt from the ubuntu-advantage charm code (with some additional comments):

class UbuntuAdvantageCharm(CharmBase):
    """Charm to handle ubuntu-advantage installation and configuration"""

    _state = StoredState()

    def __init__(self, *args):
        super().__init__(*args)
        self._state.set_default(hashed_token=None, package_needs_installing=True, ppa=None)
        self.framework.observe(self.on.config_changed, self.config_changed)

    def config_changed(self, event):
        """Install and configure ubuntu-advantage tools and attachment"""
        logger.info("Beginning config_changed")
        self.unit.status = MaintenanceStatus("Configuring")
        # Helper method to ensure a custom PPA from charm config is present on the system
        self._handle_ppa_state()
        # Helper method to ensure latest package is installed
        self._handle_package_state()
        # Handle some ubuntu-advantage specific configuration
        self._handle_token_state()
        # If a previous step blocked the charm, stop here
        if isinstance(self.unit.status, BlockedStatus):
            return
        # Set the unit status using a helper _handle_status_state
        self._handle_status_state()
        logger.info("Finished config_changed")

In the example above, the package install status is ensured each time the charm’s config-changed event fires, which should ensure correct state throughout the charm’s deployed lifecycle.
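The decision the charm makes on each event can be sketched as a small pure function. This is an illustration of the pattern only, not the ubuntu-advantage charm's actual code, and the parameter names are hypothetical:

```python
def package_needs_installing(previously_installed, stored_ppa, configured_ppa):
    """Decide whether to (re)install the package on a config-changed event.

    Reinstall when the package has never been installed, or when the PPA
    configured by the administrator differs from the one recorded in the
    charm's stored state. Illustrative only; parameter names are hypothetical.
    """
    return (not previously_installed) or stored_ppa != configured_ppa
```

Because the check is cheap and idempotent, it is safe to run on every config-changed event, which is what keeps the unit converged across upgrades and config updates.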

Kubernetes charms

As described in the introduction, the preferred way to run workloads on Kubernetes with charms is to start your workload with Pebble. You do not need to modify upstream container images to make use of Pebble for managing your workload. The Juju controller automatically injects Pebble into workload containers using an Init Container and Volume Mount. The entrypoint of the container is overridden so that Pebble starts first and is able to manage running services. Charms communicate with the Pebble API using a UNIX socket, which is mounted into both the charm and workload containers.

By default, you’ll find the Pebble socket at /var/lib/pebble/default/pebble.sock in the workload container, and /charm/<container>/pebble.sock in the charm container.

All Kubernetes charms must define a non-empty containers map in their metadata.yaml:

# ...
containers:
  myapp:
    resource: myapp-image
  redis:
    resource: redis-image

resources:
  myapp-image:
    type: oci-image
    description: OCI image for my application
  redis-image:
    type: oci-image
    description: OCI image for Redis
# ...

For each container, a resource of type oci-image must also be specified. The resource is used to inform the Juju controller how to find the correct OCI-compliant container image for your workload on Charmhub.

If multiple containers are specified in metadata.yaml (as above), each Pod will contain an instance of every specified container. Using the example above, each Pod would be created with a total of 3 running containers:

  • a container running the myapp-image
  • a container running the redis-image
  • a container running the charm code

The Juju controller emits PebbleReadyEvents to charms when Pebble has initialised its API in a container. These events are named <container_name>_pebble_ready. Using the example above, the charm would receive two Pebble-related events (assuming the Pebble API starts correctly in each workload):

  • myapp_pebble_ready
  • redis_pebble_ready
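The mapping from container name to event name can be illustrated with a small helper. This function is not part of the ops library; it simply demonstrates the naming convention (hyphens in container names become underscores, since Python identifiers cannot contain hyphens):

```python
def pebble_ready_event_name(container_name: str) -> str:
    """Derive the charm event name for a container's pebble-ready event.

    Illustrative helper, not part of the ops library: hyphens in the
    container name are replaced with underscores, and the
    '_pebble_ready' suffix is appended.
    """
    return container_name.replace("-", "_") + "_pebble_ready"
```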


Consider the following example snippet from a metadata.yaml:

# ...
containers:
  pause:
    resource: pause-image

resources:
  pause-image:
    type: oci-image
    description: Docker image for google/pause
# ...

Once the containers are initialised, the charm needs to tell Pebble how to start the workload. Pebble uses a series of “layers” for its configuration. Layers contain a description of the processes to run, along with the path and arguments to the executable, any environment variables to be specified for the running process and any relevant process ordering (more information available in the Pebble README).
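To give a feel for how layers stack, here is a deliberately simplified sketch of the combine semantics: a service definition with `override: replace` supersedes the base definition entirely, while `override: merge` overlays onto it. Pebble's real implementation is richer (see the Pebble README); this is an illustration only:

```python
def combine_layers(base: dict, overlay: dict) -> dict:
    """Simplified sketch of Pebble layer combination (illustrative only).

    Services marked 'override: merge' are shallow-merged onto the base
    definition; anything else replaces the base definition entirely.
    Pebble's actual semantics are richer; see the Pebble README.
    """
    result = {name: dict(svc) for name, svc in base.get("services", {}).items()}
    for name, svc in overlay.get("services", {}).items():
        if svc.get("override") == "merge" and name in result:
            merged = dict(result[name])
            merged.update({k: v for k, v in svc.items() if k != "override"})
            result[name] = merged
        else:
            result[name] = dict(svc)
    return {"services": result}
```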

When using an OCI image that is not built specifically for use with Pebble, layers are defined at runtime using Pebble’s API. Recall that when Pebble has initialised in a container (and the API is ready), the Juju controller emits a PebbleReadyEvent to the charm. Often it is in the callback bound to this event that layers are defined, and services started:

# ...
from ops.charm import CharmBase, PebbleReadyEvent
from ops.model import ActiveStatus
# ...

class PauseCharm(CharmBase):
    # ...
    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.pause_pebble_ready, self._on_pause_pebble_ready)
        # ...

    def _on_pause_pebble_ready(self, event: PebbleReadyEvent) -> None:
        """Handle the pebble_ready event"""
        # Get a reference to the container from the PebbleReadyEvent
        container = event.workload
        # Add our initial config layer, combining with any existing layer
        container.add_layer("pause", self._pause_layer(), combine=True)
        # Start the services that specify 'startup: enabled'
        container.autostart()
        self.unit.status = ActiveStatus()

    def _pause_layer(self) -> dict:
        """Returns Pebble configuration layer for google/pause"""
        return {
            "summary": "pause layer",
            "description": "pebble config layer for google/pause",
            "services": {
                "pause": {
                    "override": "replace",
                    "summary": "pause service",
                    "command": "/pause",
                    "startup": "enabled",
                }
            },
        }
# ...

A common method for configuring container workloads is by manipulating environment variables. The layering in Pebble makes this easy. Consider the following extract from a config-changed callback which combines a new overlay layer (containing some environment configuration) with the current Pebble layer and restarts the workload:

# ...
from ops.pebble import ServiceStatus
# ...

def _on_config_changed(self, event: ConfigChangedEvent) -> None:
    """Handle the config changed event."""
    # Get a reference to the container so we can manipulate it
    container = self.unit.get_container("pause")

    # Get the 'pause' service from within the container
    service = container.get_service("pause")

    # Create a new config layer - specify 'override: merge' in the 'pause'
    # service definition to overlay with existing layer
    layer = {
        "services": {
            "pause": {
                "override": "merge",
                "environment": {
                    "IMPORTANT_CONFIG": self.model.config["important-config"],
                },
            },
        },
    }

    # Get the current plan from the container
    plan = container.get_plan()
    # Check if there are any changes to the config
    # So we can avoid unnecessarily restarting the service
    if plan.services["pause"].environment != layer["services"]["pause"]["environment"]:
        # Add the layer to Pebble
        container.add_layer("pause", layer, combine=True)
        logging.debug("Added config layer to Pebble plan")

        # If the 'pause' service is currently running in the container, stop it
        if service.current == ServiceStatus.ACTIVE:
            container.stop("pause")
        # Start/restart the 'pause' service in the container
        container.start("pause")
        logging.info("Restarted pause service")
    # All is well, set an ActiveStatus
    self.unit.status = ActiveStatus()
# ...

In this example, each time a config-changed event is fired, a new overlay layer is created that only includes the environment config, populated using the charm’s config. The application is only restarted if the configuration has changed.

Regarding communication between the charm and workload containers, it may be useful to add:

Charms communicate with the Pebble API using a UNIX socket, which is mounted into both the charm and workload containers:

  • /var/lib/pebble/default/pebble.sock inside the workload container.
  • /charm/<container>/pebble.sock inside the charm container.


Nice catch, I’ll update this soon. Thanks :blush:

I infer that the command in the Pebble layer specification must be a daemon process. Should this be made explicit? This may cause some confusion otherwise. For instance, a charm author may pass a shell script that launches a daemon process and exits. Would this work? It is also not clear how Pebble interacts with the command when trying to stop it. For example, does it send a SIGKILL, a SIGTERM or a SIGINT? How does a charm author ensure that the charm’s application terminates gracefully when Pebble invokes stop() on the service? Finally, is it possible to configure the startup process of a service in bespoke ways? For example, if starting a process requires multiple steps, does one pass step1 && step2 && step3 as the command?

Hey @bthomas!

So I’ll try and answer these, and bring some other people along who might be able to add some context too…!

Firstly, yes, it seems at the moment that Pebble expects your service will run indefinitely, and will try to maintain it as such. There has been some discussion about Pebble evolving to include the ability to run arbitrary one-time commands in the container, which would be useful both for one-time setup activity and for actions (see bug). I expect @niemeyer may have an opinion here.

On stopping processes, I believe the process first gets a SIGTERM, then after a (currently fixed) timeout, gets a SIGKILL - assuming I’ve parsed this correctly!

To start a process with multiple steps you have a couple of options:

  • Use command: "bash -c 'command1 && command2 && command3'" in your layer
  • Write a Pebble layer with multiple service entries and make use of the ordering facilities like before and after. See the Pebble README.
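The second option might look like the following layer sketch. The service names and commands are hypothetical; `after` and `requires` are the ordering fields described in the Pebble README:

```python
# Hypothetical layer demonstrating service ordering: 'serve' is started
# after 'setup', and requires it to be present in the plan. The service
# names and commands here are illustrative only.
ordered_layer = {
    "summary": "ordered startup",
    "services": {
        "setup": {
            "override": "replace",
            "command": "/bin/step1",
            "startup": "enabled",
        },
        "serve": {
            "override": "replace",
            "command": "/bin/step2",
            "startup": "enabled",
            "after": ["setup"],
            "requires": ["setup"],
        },
    },
}
```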

@jnsgruk is correct on all points.

The general position is that we are still focusing on the foundations of Pebble. That is, we’re still working on critical features that need to land very soon, such as proper logging, some details of service termination, etc. But you can expect the typical features one would wish for running a daemon to come soon after, such as starting/stopping commands, configuration of timings, possibly the support for one-off jobs, etc.

Hey @jnsgruk, we don’t seem to import ModelError in the code snippet. I’m assuming it’s coming from ops.model, but just wanted to make sure.

possibly a typo? ./pause?

Thanks for checking @leon-mintz; in this case that’s actually correct. The container in question is a super simple container with a single binary at /pause.

Thanks @joeborg - I’m going to simplify that example a little actually, a couple of changes have happened since I wrote this :rocket:
