How to run workloads with a charm - machines

jnsgruk · 15 April 2021 11:35

There are several ways your charm might start a workload, depending on the type of charm you’re authoring.

For a machine charm, it is likely that packages will need to be fetched, installed and started to provide the desired charm functionality. This can be achieved by interacting with the system’s package manager, ensuring that package and service status is maintained by reacting to events accordingly.

It is important to consider which events to respond to in the context of your charm. A simple example might be:

# ...
import logging
from subprocess import CalledProcessError, check_call

import ops

logger = logging.getLogger(__name__)


class MachineCharm(ops.CharmBase):
    # ...

    def __init__(self, *args):
        super().__init__(*args)
        self.framework.observe(self.on.install, self._on_install)
        self.framework.observe(self.on.start, self._on_start)
        # ...

    def _on_install(self, event: ops.InstallEvent) -> None:
        """Handle the install event"""
        try:
            # Install the openssh-server package using apt-get
            check_call(["apt-get", "install", "-y", "openssh-server"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Package install failed with return code %d", e.returncode)
            self.unit.status = ops.BlockedStatus("Failed to install packages")

    def _on_start(self, event: ops.StartEvent) -> None:
        """Handle the start event"""
        try:
            # Enable the ssh systemd unit, and start it
            # (on Ubuntu the openssh-server package ships its unit as "ssh")
            check_call(["systemctl", "enable", "--now", "ssh"])
        except CalledProcessError as e:
            # If the command returns a non-zero return code, put the charm in blocked state
            logger.debug("Starting systemd unit failed with return code %d", e.returncode)
            self.unit.status = ops.BlockedStatus("Failed to start/enable ssh service")
            return

        # Everything is awesome
        self.unit.status = ops.ActiveStatus()

If the machine is likely to be long-running and endure multiple upgrades throughout its life, it may be prudent to check that the package is installed more often than just at install time, and to handle the case where it needs upgrading or reinstalling. Consider this excerpt from the ubuntu-advantage charm code (with some additional comments):

class UbuntuAdvantageCharm(ops.CharmBase):
    """Charm to handle ubuntu-advantage installation and configuration"""
    _state = ops.StoredState()

    def __init__(self, *args):
        super().__init__(*args)
        self._state.set_default(hashed_token=None, package_needs_installing=True, ppa=None)
        self.framework.observe(self.on.config_changed, self.config_changed)

    def config_changed(self, event):
        """Install and configure ubuntu-advantage tools and attachment"""
        logger.info("Beginning config_changed")
        self.unit.status = ops.MaintenanceStatus("Configuring")
        # Helper method to ensure a custom PPA from charm config is present on the system
        self._handle_ppa_state()
        # Helper method to ensure latest package is installed
        self._handle_package_state()
        # Handle some ubuntu-advantage specific configuration
        self._handle_token_state()
        # Set the unit status using a helper _handle_status_state
        if isinstance(self.unit.status, ops.BlockedStatus):
            return
        self._handle_status_state()
        logger.info("Finished config_changed")

In the example above, the package’s install state is checked each time the charm’s config-changed event fires, which should keep it correct throughout the charm’s deployed lifecycle.
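The helper methods called from config_changed are not shown in the excerpt. As a rough sketch of the "check, then install if needed" idea only, and not the ubuntu-advantage charm’s actual code, something like the following could work on a Debian-based machine. The function name ensure_package and its injectable run parameter are hypothetical, and the sketch does not cover upgrades.

```python
import subprocess


def ensure_package(name, run=subprocess.run):
    """Install `name` with apt-get unless dpkg already reports it installed.

    Returns True if an install was attempted, False if nothing was needed.
    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without touching the real system.
    """
    # dpkg-query exits non-zero when the package is unknown to dpkg.
    query = run(
        ["dpkg-query", "--show", "--showformat=${Status}", name],
        capture_output=True,
        text=True,
    )
    if query.returncode == 0 and "install ok installed" in query.stdout:
        return False
    # check=True raises CalledProcessError on failure, so the caller can
    # catch it and set a BlockedStatus, as in the first example.
    run(["apt-get", "install", "-y", name], check=True)
    return True
```

Calling a helper like this from a config-changed handler keeps the operation idempotent: repeated events cost one dpkg-query call when the package is already present.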

jose · 30 April 2021 22:01

Regarding communication between the charm and workload containers, it may be useful to add:

Charms communicate with the Pebble API using a UNIX socket, which is mounted into both the charm and workload containers:

/var/lib/pebble/default/pebble.sock inside the workload container.
/charm/<container>/pebble.sock inside the charm container.

jnsgruk · 1 May 2021 19:04

Nice catch, I’ll update this soon. Thanks

bthomas · 4 May 2021 16:29

I infer that command in the Pebble layer specification must be a daemon process. Should this be made explicit? It may cause some confusion otherwise. For instance, a charm writer may pass a shell script that launches a daemon process and exits. Would this work? Also, it is not clear how Pebble interacts with command when trying to stop it. For example, does it send a SIGKILL, a SIGTERM or a SIGINT? How does a charm writer ensure that the charm’s application terminates gracefully when Pebble invokes stop() on the service? Finally, is it possible to configure the startup process of a service in bespoke ways? For example, if starting a process requires multiple steps, does one pass step1 && step2 && step3 as the command?

jnsgruk · 5 May 2021 09:23

Hey @bthomas!

So I’ll try and answer these, and bring some other people along who might be able to add some context too…!

Firstly, yes, it seems at the moment that Pebble expects your service will run indefinitely, and will try to maintain it as such. There has been some discussion about Pebble evolving to include the ability to run arbitrary one-time commands in the container, which would be useful both for one-time setup activity and for actions (see bug). I expect @niemeyer may have an opinion here.

On stopping processes, I believe the process first gets a SIGTERM, then after a (currently fixed) timeout, gets a SIGKILL - assuming I’ve parsed this correctly!
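To make the "SIGTERM, then SIGKILL after a timeout" pattern concrete, here is a small Python illustration of that sequence on a POSIX system. It is not Pebble’s implementation (Pebble is written in Go), and the function name stop_gracefully and the 5 second default are assumptions chosen for the example.

```python
import subprocess


def stop_gracefully(proc, timeout=5.0):
    """Ask `proc` to exit with SIGTERM; escalate to SIGKILL after `timeout` seconds.

    `proc` is a subprocess.Popen instance. Returns the process exit code.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught or ignored
        return proc.wait()
```

A workload that wants a clean shutdown under this scheme would handle SIGTERM itself and finish its cleanup before the timeout expires.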

To start a process with multiple steps you have a couple of options:

Use command: "bash -c 'command1 && command2 && command3'" in your layer
Write a Pebble layer with multiple service entries and make use of the ordering facilities like before and after. See the Pebble README.
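Both options can be sketched as a layer expressed as a Python dict. The service names, commands and the build_layer function below are hypothetical placeholders; the first option also assumes that bash exists in the workload image.

```python
def build_layer():
    """Return an example Pebble layer as a plain dict (all names are made up)."""
    return {
        "summary": "example layer",
        "description": "illustrates multi-step start-up and service ordering",
        "services": {
            # Option 1: chain several steps in one shell command.
            # The last step should be the long-running process.
            "multi-step": {
                "override": "replace",
                "summary": "steps chained with &&",
                "command": "bash -c 'command1 && command2 && command3'",
                "startup": "enabled",
            },
            # Option 2: separate services, ordered with `after`.
            "database": {
                "override": "replace",
                "summary": "hypothetical dependency",
                "command": "/usr/bin/database-daemon",
                "startup": "enabled",
            },
            "app": {
                "override": "replace",
                "summary": "starts once database has been started",
                "command": "/usr/bin/app-daemon",
                "startup": "enabled",
                "after": ["database"],
            },
        },
    }
```

Since Pebble currently expects each service to keep running, the second option suits steps that are themselves long-running daemons rather than one-off setup commands.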

niemeyer · 5 May 2021 10:57

@jnsgruk is correct on all points.

The general position is that we are still focusing on the foundations of Pebble. That is, we’re still working on critical features that need to land very soon, such as proper logging, some details of service termination, etc. But you can expect the typical features one would wish for running a daemon to come soon after, such as starting/stopping commands, configuration of timings, possibly the support for one-off jobs, etc.

joeborg · 5 May 2021 15:33

Hey @jnsgruk, we don’t seem to import ModelError in the code snippet. I’m assuming it’s coming from ops.model, but just wanted to make sure.

sed-i · 6 May 2021 02:52

possibly a typo? ./pause?

jnsgruk · 6 May 2021 07:32

Thanks for checking @sed-i; in this case that’s actually correct. The container in question is a super simple container with a single binary at /pause.

jnsgruk · 6 May 2021 07:33

Thanks @joeborg - I’m going to simplify that example a little actually, a couple of changes have happened since I wrote this

rgildein · 10 May 2021 14:56

Shouldn’t it have been plan.services["pause"]["environment"]? In any case, you forgot to select the pause service from the layer["services"] variable (layer["services"]["pause"]["environment"]).

jose · 14 May 2021 20:39

Hello @jnsgruk

I believe I made a mistake with the sockets paths:

In the workload container I have: /charm/container/pebble.socket

root@mysql-0:/# ls -l /charm/container/pebble.socket
srw-rw-rw- 1 root root 0 May 14 20:22 /charm/container/pebble.socket

And in the charm container I have: /charm/containers/<CONTAINER>/pebble.socket

root@mysql-0:/# ls -l /charm/containers/mysql/pebble.socket
srw-rw-rw- 1 root root 0 May 14 20:22 /charm/containers/mysql/pebble.socket

@jameinel is this correct?

jameinel · 15 May 2021 21:04

The default pebble dir in pebble is: const defaultPebbleDir = "/var/lib/pebble/default", but you are correct that Juju configures it to be /charm/container/pebble.socket in the workload container, to line up with /charm/containers/<container>/pebble.socket from inside the charm container.
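The two locations reported in this thread can be captured in a small helper. This is only a sketch of the paths as described above; the function name pebble_socket_path and its parameter are hypothetical.

```python
from pathlib import PurePosixPath


def pebble_socket_path(container, in_charm_container=True):
    """Return the Pebble socket path as described in this thread.

    From the charm container the path includes the workload container's name;
    from inside the workload container it is a fixed location.
    """
    if in_charm_container:
        return str(PurePosixPath("/charm/containers") / container / "pebble.socket")
    return "/charm/container/pebble.socket"
```

For the mysql container shown earlier, the charm-side result is /charm/containers/mysql/pebble.socket.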

pedroleaoc · 8 June 2021 18:06

kos.tsakalozos · 23 June 2021 18:12

Hi, here are some thoughts on how we can improve this page.

The images were called “myapp-image” and “redis-image”, so why do the events skip the “-image” part? The “-” will not produce a valid function name, so what happens in this case?
"IMPORTANT_CONFIG": self.model.config["important-config"] It would have been nice to have a real-life example showing this functionality.

jnsgruk · 23 June 2021 18:20

In this case, the events are about the containers, not the resources. The event signifies that pebble is ready in the container, irrespective of the OCI image.
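On the question about “-”: as far as I understand the ops convention, the event attribute is derived from the container name with hyphens replaced by underscores. The helper below only illustrates that naming rule; the function name is hypothetical and it is not part of ops.

```python
def pebble_ready_event_name(container_name):
    """Return the attribute name expected on `self.on` for a container.

    Hyphens are not valid in Python identifiers, so they become underscores.
    """
    return container_name.replace("-", "_") + "_pebble_ready"
```

Under that rule, a container named my-app would be observed as self.on.my_app_pebble_ready.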

As for a better example, I agree. I’ll get on it

Thanks!

rbarry · 23 June 2021 20:40

Out of curiosity: is the existing example for on_config_changed missing something for showing this functionality? We can have an overt example which has it in the base layer also, but the clarity would be great.

kos.tsakalozos · 25 June 2021 13:23

This is not accurate. If you inspect the openhab/openhab image, the command you see is not the one that is called. You have to find the Dockerfile to find the right entrypoint + command combination. Maybe it would be safer/easier to have Pebble not override the startup command by default?

EDIT: I just saw the “In many cases” so I stand corrected. It is accurate.

jnsgruk · 25 June 2021 15:28

Right, you can actually fetch all the data you need with docker inspect; it’s just that you’ll need to take into account both the Entrypoint and Cmd parts.
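As a sketch of combining the two parts, the function below takes the JSON text printed by docker inspect for an image and joins Entrypoint and Cmd into a single command string. The function name startup_command is hypothetical, and the result is only a starting point for a Pebble command, since images may rely on environment or working-directory settings as well.

```python
import json
import shlex


def startup_command(inspect_json):
    """Join Entrypoint and Cmd from `docker inspect <image>` JSON output."""
    # docker inspect prints a JSON array with one object per inspected item.
    config = json.loads(inspect_json)[0]["Config"]
    # Either field may be null or absent, so fall back to an empty list.
    entrypoint = config.get("Entrypoint") or []
    cmd = config.get("Cmd") or []
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(entrypoint + cmd)
```

With made-up sample data such as Entrypoint ["/entrypoint.sh"] and Cmd ["--serve", "my app"], the function returns the entrypoint followed by the quoted arguments.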