Health checks in pebble

Hi,

I’ve been looking at the new docs on health checks in Pebble and thought I’d try them out.

I took the sidecar version of the mattermost-k8s charm and applied the following diff:

diff --git a/src/charm.py b/src/charm.py
index 6eb523b..0d402a9 100755
--- a/src/charm.py
+++ b/src/charm.py
@@ -135,6 +135,14 @@ class MattermostK8sCharm(CharmBase):
                     "environment": env_config,
                 }
             },
+            "checks": {
+                "ready": {
+                    "override": "replace",
+                    "http": {
+                        "url": "http://localhost:{}/api/v4/system/ping".format(CONTAINER_PORT),
+                    }
+                }
+            },
         }
         return pebble_config
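
For anyone who wants to try this outside the charm, the layer after the diff looks roughly like the sketch below. This is a minimal standalone version, not the charm's actual method: build_layer is a hypothetical helper name, the service fields are copied from the plan output in this thread, and the explicit threshold of 3 simply mirrors what the plan reports rather than anything the diff sets.

```python
def build_layer(container_port: int, env_config: dict) -> dict:
    """Return a Pebble layer dict with one service and one HTTP check.

    Hypothetical helper for illustration; the real charm builds its
    layer inside MattermostK8sCharm.
    """
    url = "http://localhost:{}/api/v4/system/ping".format(container_port)
    return {
        "services": {
            "mattermost": {
                "override": "replace",
                "summary": "Mattermost service",
                "command": "/mattermost/bin/mattermost",
                "startup": "enabled",
                "environment": env_config,
            }
        },
        "checks": {
            "ready": {
                "override": "replace",
                # Assumption: written out explicitly here only because the
                # plan output reports threshold: 3; the diff does not set it.
                "threshold": 3,
                "http": {"url": url},
            }
        },
    }


layer = build_layer(8065, {})
```

With port 8065 the check URL comes out as http://localhost:8065/api/v4/system/ping, which matches the plan output.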

I then built the charm and deployed it in MicroK8s. The Pebble plan looks okay:

root@mattermost-k8s-0:/mattermost# /charm/bin/pebble plan
services:
    mattermost:
        summary: Mattermost service
        startup: enabled
        override: replace
        command: /mattermost/bin/mattermost
        environment:
            [...]
checks:
    ready:
        override: replace
        threshold: 3
        http:
            url: http://localhost:8065/api/v4/system/ping

By inspecting the deployed pod in k8s I can see a readiness check is configured as:

http-get http://:38813/v1/health%3Flevel=ready delay=30s timeout=1s period=5s #success=1 #failure=1

If I query that endpoint I see the following:

{"type":"sync","status-code":200,"status":"OK","result":{"healthy":true}}
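
A small sketch of how that query could be scripted from inside the pod, using only the Python standard library. The port 38813 is the one from the probe above and will likely differ elsewhere, the URL assumes the level is passed as an ordinary query parameter, and the function names are made up for illustration.

```python
import json
import urllib.error
import urllib.request

# Assumption: port taken from the probe in this thread; adjust as needed.
HEALTH_URL = "http://localhost:38813/v1/health?level=ready"


def parse_health(body: str) -> bool:
    """Extract the 'healthy' flag from a health response body."""
    payload = json.loads(body)
    return bool(payload.get("result", {}).get("healthy", False))


def fetch_health(url: str = HEALTH_URL, timeout: float = 1.0) -> bool:
    """Query the health endpoint and return its 'healthy' flag."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx replies raise HTTPError; the body may still be JSON.
        body = err.read().decode("utf-8")
    return parse_health(body)
```

Feeding parse_health the response shown above returns True. The only thing it can extract is that one boolean.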

Is this the expected result? I was expecting to see something in the endpoint output that corresponded to the check I’d defined.

Hey @mthaddon

Yes, I’m pretty sure this is correct. I just looked back over the design for this, and that is indeed the expected result for the outward-facing API that Pebble presents.

You can get more detail by getting into the charm container and running pebble checks at the command line, or by calling ops.model.Container.get_checks() from the Operator Framework.
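
As a rough sketch of the second option: the helper below takes anything with a get_checks() method that returns a mapping of check name to an object with level, status, failures and threshold attributes, which is my understanding of what the ops Container returns. format_checks and FakeContainer are hypothetical names; the fake exists only so the snippet runs without a deployed charm.

```python
from types import SimpleNamespace


def format_checks(container) -> str:
    """Render the result of container.get_checks() as a small text table."""
    lines = ["Check  Level  Status  Failures"]
    for name, info in sorted(container.get_checks().items()):
        # level and status may be enums or plain strings; accept both.
        level = getattr(info.level, "value", info.level) or "-"
        status = getattr(info.status, "value", info.status)
        lines.append(f"{name}  {level}  {status}  {info.failures}/{info.threshold}")
    return "\n".join(lines)


class FakeContainer:
    """Stand-in for a real container object, for trying this locally."""

    def get_checks(self):
        return {
            "ready": SimpleNamespace(level=None, status="up", failures=0, threshold=3)
        }


table = format_checks(FakeContainer())
```

In a charm you would pass the real container object (for example from self.unit.get_container(...)) instead of the fake.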

Ah nice:

# /charm/bin/pebble checks
Check  Level  Status  Failures
ready  -      up      0/3

Because you configure the Pebble checks dynamically, Juju has to give Kubernetes a fixed probe that points at Pebble. (That is, we tell K8s that it can just ask Pebble whether it is happy, and Pebble runs your health checks based on whatever the charm has configured for the services that have launched.) That is why pebble checks is the place to look for the details.

If the checks your charm defines were instead represented directly as K8s probes, the pod would have to be restarted before it could start using them.
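
To make the split concrete, here is a deliberately simplified model of the idea, not Pebble's actual code: the charm can change the set of checks at any time, while the probe only ever sees one aggregate boolean. Check and aggregate_health are hypothetical names, and treating "all checks up" as healthy is my assumption about the aggregation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Check:
    """Simplified stand-in for one configured check."""

    name: str
    status: str  # "up" or "down"
    failures: int
    threshold: int


def aggregate_health(checks: Iterable[Check]) -> dict:
    """Collapse any number of checks into the single flag a probe sees.

    Assumption: healthy means every check currently reports "up".
    """
    return {"healthy": all(check.status == "up" for check in checks)}


one_up = aggregate_health([Check("ready", "up", 0, 3)])
one_down = aggregate_health(
    [Check("ready", "up", 0, 3), Check("db", "down", 3, 3)]
)
```

Adding or removing entries changes what is evaluated without touching the probe definition, which is why the pod does not need to restart.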

That feels like content that should be added to the discussion around Health checks, at least from a ‘how does it work’ perspective. @tmihoc @benhoyt


But currently when you run the k8s checks you get:

{"type":"sync","status-code":200,"status":"OK","result":{"healthy":true}}

I don’t know if it would be possible for this to return something similar to the output of pebble checks, but it feels like it would be useful to have some info in the response that describes the checks being run. I’m not suggesting you need to reconfigure the checks in k8s, just that the endpoint return slightly different info. But I may be misunderstanding what’s possible in terms of the returned data here.