TL;DR: declare the Pebble layer in rockcraft.yaml; on pebble-ready, call replan() without pushing any layer data.
The rockcraft tool copies specific top-level keys from rockcraft.yaml into the Pebble layer.
This allowed me to declare the service and its health checks in the file like so:
# rockcraft.yaml
services:
  gubernator:
    override: replace
    startup: enabled
    command: /bin/gubernator  # the app
    ...
    on-success: shutdown  # Lovely!
    on-failure: shutdown
    on-check-failure:
      online: shutdown

checks:
  online:
    override: replace
    exec:
      command: /bin/healthcheck  # created when building the app
    period: 3s
This leaves the Python code trivial, though still necessary, since Juju starts Pebble with the --hold flag, and Pebble needs to be kicked off to start the declared services:
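A minimal sketch of that handler, assuming an ops-based charm with a workload container named gubernator (the class and handler names here are illustrative, not taken from the original charm):

```python
# charm.py -- sketch only: the layer already lives in the rock, so the charm just replans.
import ops


class GubernatorCharm(ops.CharmBase):
    def __init__(self, framework: ops.Framework):
        super().__init__(framework)
        self.framework.observe(
            self.on["gubernator"].pebble_ready, self._on_pebble_ready
        )

    def _on_pebble_ready(self, event: ops.PebbleReadyEvent) -> None:
        # No add_layer() call: the services and checks were baked into the rock
        # by rockcraft, so replan() simply starts what is already declared.
        event.workload.replan()
        self.unit.status = ops.ActiveStatus()


if __name__ == "__main__":
    ops.main(GubernatorCharm)
```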
Yeah, interesting idea to declare all of this, including the health checks, in the rock. But why do you use shutdown rather than letting Pebble auto-restart the service?
My thinking was that letting k8s restart the container is cleaner, in the sense that the newly started process begins from a known-good, blank state.
For example, if the service filled up the container with logs or spawned off daemon subprocesses, a Pebble service restart would not suffice.
The other benefit is that the charm code gets notified when the container starts again. Arguably that could be exposed using the new check-failed/check-recovered events.
When I was building the rock, I was perhaps thinking of deployments like AWS, where a failed container is automatically removed from the load balancer’s list of upstreams. That’s not the case in a Juju deployment, though.
In AWS-style deployments, or cloud deployments in general, a service may become unresponsive because the underlying VM is hogged… In those cases it’s better to quit and have the orchestration system reschedule the container. The k8s equivalent would be rescheduling the pod. Again, this is not the case in Juju on MicroK8s, afaik.
In terms of monitoring, perhaps the container restart is exposed automatically, as it’s visible to the k8s control plane, that is, the Juju controller. I’m unclear whether Juju actually exports container restarts within a healthy pod to telemetry. A service restart within Pebble needs to be exported to telemetry manually.
- this only works when the container doesn’t need input (for example, command-line args)
- when possible, I want my rocks to be drop-in replacements for upstream images, so that I can switch between them if needed (for example, if upstream releases something that breaks my rock, I have a backup plan). This proposal might disrupt that, although I think in most cases where that’s true we’ve already failed the previous “doesn’t need input” bullet anyway
For (2), I might be colored by coming to Juju from a k8s background, but I’m with you in thinking it’s a virtue of the system to throw away a misbehaving container and get a new one. I’d rather Pebble not intervene in that, because I don’t see what value Pebble adds that I don’t get from k8s. And now that you mention it, letting k8s restart things for us might simplify the work a charm does to confirm its workload is up. In the “normal” configuration where Pebble restarts things, a charm checks both that the container is up / Pebble is ready and that the checks pass. I think(?) we could collapse that to one check if Pebble wasn’t restarting the container on failures, and we’d also get pebble-ready events whenever the workload stopped or failed its checks (since the entire container would restart).
But I say all this without having tried it. Maybe there are other sharp edges in here.
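To make the “two checks” point above concrete, here is a rough sketch of the usual verification an ops-based charm does today (the helper name is made up):

```python
# Sketch of the usual two-step workload verification in a charm.
import ops


def workload_is_up(container: ops.Container) -> bool:
    # Step 1: is the container running and the Pebble API reachable?
    if not container.can_connect():
        return False
    # Step 2: are all declared Pebble checks currently passing?
    checks = container.get_checks()
    return all(c.status == ops.pebble.CheckStatus.UP for c in checks.values())
```

If a failed check shut the container down instead, step 1 alone would arguably cover both cases, since the charm would see a fresh pebble-ready once the container came back.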
- if the container dies, I don’t think the charm is notified until the new container is started and Pebble is ready
- if Pebble restarts a crashed service, I imagine that restart may be faster than three failed health checks, so the charm is not notified either
So, if the charm is responsible for e.g. deregistering this pod’s IP from the load balancer, I suspect that neither path is bullet-proof.
Regarding rocks: I believe we’re supposed to build images ourselves, from source, on corporate trusted hardware, so swapping in an upstream image goes against the security paradigm.
Regarding config: that really depends on the workload. Something like Apache allows the config file to be modified and a signal to be issued to reload the config. If an env var or a command-line argument needs to change at runtime, then obviously a replan is needed; then again, in that case there’s a service interruption, so maybe that’s not ideal?
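For the “modify the config and signal a reload” case, something along these lines is possible without a replan (a sketch; the service name and config path are made up for illustration):

```python
# Sketch: reload config in place via a signal instead of restarting the service.
import ops


def reload_apache_config(container: ops.Container, rendered_config: str) -> None:
    # Push the freshly rendered config into the workload container...
    container.push("/etc/apache2/apache2.conf", rendered_config, make_dirs=True)
    # ...then ask the running service to re-read it, avoiding a full restart.
    container.send_signal("SIGHUP", "apache")
```

If an env var or command-line argument has to change instead, the charm would push an updated layer and replan(), with the brief service interruption that implies.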
Yeah, tbh I do like Pebble’s alerts a bit better when combined with some of the newer Pebble features, so the charm actually gets woken up when the workload is down.
re rocks: you’re right, the goal is that everything is a rock. But I’m a pessimistic person who likes to have backup plans
re config: agreed, all this is a bit workload-dependent. Some cases will make this easier or harder for you.