Compound status tree representation: a deep dive into a little `jhack` utility

ppasotti · 10 June 2024 09:53

As you know, ops ‘recently’ introduced the collect-unit-status and collect-app-status events, which you can hook onto to gather all statuses relevant to your unit or application and let the operator framework figure out which one is the most relevant to report to the user.
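To make the "figure out which one is the most relevant" part concrete, here is a minimal pure-Python sketch of a priority-based pick. It does not use ops itself: the `PRIORITY` table and the `most_relevant` helper are illustrative names, and the ordering (blocked, maintenance, waiting, active, from highest to lowest) is the one listed in the ops documentation for collect-status.

```python
# Priority order as listed in the ops documentation for collect-status,
# highest first: blocked, maintenance, waiting, active.
# PRIORITY and most_relevant are illustrative names, not part of ops.
PRIORITY = {"blocked": 4, "maintenance": 3, "waiting": 2, "active": 1}


def most_relevant(statuses):
    """Pick the highest-priority (name, message) pair.

    Names missing from PRIORITY rank lowest; in this sketch, ties go to
    the status that was added first.
    """
    return max(statuses, key=lambda status: PRIORITY.get(status[0], 0))


collected = [
    ("waiting", "database not ready yet"),
    ("blocked", "cannot curl tempo server"),
    ("active", "tls ready"),
]
winner = most_relevant(collected)
print(winner)
```

Everything that loses this comparison is normally never shown to anyone, which is the information the rest of this post tries to recover.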

However, as a developer, it would be useful to have access to the full list of statuses that the charm collects, instead of only seeing the ‘toplevel’ one. We could go one step further and think: what if you could add to your charm a set of status checks that are not per se interesting for the cloud admin, but could be interesting for you when debugging?

What if we tagged our statuses using some labeling convention like:

        e.add_status(WaitingStatus("[database] database not ready yet"))
        e.add_status(BlockedStatus("[tempo_relation] cannot curl tempo server"))
        e.add_status(BlockedStatus("[tls.cert_on_disk] cert not on disk"))
        e.add_status(ActiveStatus("[tls] tls ready"))
        e.add_status(ActiveStatus("[relations.ingress] ingress ready"))
        e.add_status(WaitingStatus("[relations.tracing] some tracing relation is waiting for data"))

and found a way to read them off of a live charm?
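As a rough sketch of how such a label could be read back, the snippet below splits a message of the form `[dotted.path] text` into its path components and the remaining text. The `split_label` helper and the regular expression are assumptions made for illustration; they are not taken from jhack.

```python
import re

# Hypothetical pattern for the "[dotted.path] message" convention shown above.
_LABEL = re.compile(r"^\[(?P<path>[\w.\-]+)\]\s*(?P<message>.*)$")


def split_label(raw: str) -> tuple[list[str], str]:
    """Return (path components, message); untagged messages get an empty path."""
    match = _LABEL.match(raw)
    if match is None:
        return [], raw
    return match.group("path").split("."), match.group("message")


print(split_label("[tls.cert_on_disk] cert not on disk"))
print(split_label("a message without a tag"))
```

Treating the dots as path separators is what turns a flat list of statuses into a tree later on.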

Some time ago I introduced jhack eval (and its generic brother jhack script), so I thought: why not use jhack eval to force the unit to emit a collect-status event, then gather whatever statuses are collected and print them out for the user to see? That’s easily done (use self.app instead of self.unit to get the application statuses):

        jhack eval my-app/0 "ops.charm._evaluate_status(self) or print([{'name': s.name, 'message': s.message} for s in self.unit._collected_statuses])"

However, nobody likes reading __repr__s, so I started working on jhack sitrep, a tool to consume that repr and output a pretty-printed status report that would look something like:

<root status>: Active (message)
  - database: Blocked (message)
  - relations:
    - ingress: Active (message)
    - tracing: Waiting (message)
...
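A report like the one above can be produced by folding the tagged statuses into a nested dict and walking it. The following is only a sketch of that idea, not sitrep's actual code: `build_tree` and `render` are hypothetical helpers, the input is assumed to be already split into `(path, status name, message)` tuples, and an entry with an empty path is assumed to stand for the root status.

```python
from typing import Any


def build_tree(entries):
    """Fold (path, status_name, message) tuples into a nested dict.

    If two entries share the same path, the later one wins in this sketch.
    """
    root: dict[str, Any] = {"status": None, "children": {}}
    for path, name, message in entries:
        node = root
        for part in path:
            node = node["children"].setdefault(part, {"status": None, "children": {}})
        node["status"] = (name, message)
    return root


def render(node, label="<root status>", depth=0):
    """Return the report as a list of lines, indenting two spaces per level."""
    status = node["status"]
    if status is None:
        text = f"{label}:"  # pure grouping node, e.g. "relations"
    else:
        text = f"{label}: {status[0].capitalize()} ({status[1]})"
    lines = ["  " * depth + ("- " if depth else "") + text]
    for child_label, child in node["children"].items():
        lines.extend(render(child, child_label, depth + 1))
    return lines


entries = [
    ([], "active", "all good"),
    (["database"], "blocked", "db down"),
    (["relations", "ingress"], "active", "ingress ready"),
    (["relations", "tracing"], "waiting", "waiting for data"),
]
report = render(build_tree(entries))
print("\n".join(report))
```

Nodes such as `relations` that never received a status of their own are printed as bare group headers, matching the layout above.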

So initially I thought one could run eval and pipe the output to sitrep to obtain a pretty-printed output.

But that’s too much typing and you know I don’t like that, unless I’m telling a story.

Extending `jhack eval` with output functionality

So the next step was modifying some of eval’s internal plumbing to allow sending data back from the unit.

Eval works like this:

1. b64-encode the Python expression to evaluate
2. scp a magic-sauce script onto the unit
3. juju-exec the magic script with a bunch of environment variables that tell it, among other things:
   - the Python expression to evaluate
   - the name of the charm to execute, and the path to the module containing it
4. the magic script will run and:
   - import the charm module
   - find the charm type
   - set up ops much the same way as ops.main does, but without emitting any event on the charm
   - eval() the expression you passed, with a few globals such as self (the charm instance) and ops (the ops module)
5. print whatever the output was
6. cleanup: delete the magic script
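The encode, pass through the environment, decode and eval round trip from the steps above can be sketched locally, without Juju or ops. Everything named here is an assumption for illustration: the `EXAMPLE_EXPR_B64` variable, the two helper functions and the `FakeCharm` stand-in are not jhack's real names, and the scp and juju-exec steps are replaced by a plain dict.

```python
import base64

# Hypothetical variable name; the real script's environment variables
# are not shown in this post.
ENV_VAR = "EXAMPLE_EXPR_B64"


def encode_expression(expr: str) -> str:
    """Client side: base64-encode the expression so it survives transport as-is."""
    return base64.b64encode(expr.encode("utf-8")).decode("ascii")


def run_expression(env: dict, charm: object) -> object:
    """Unit side: decode the expression and eval() it with `self` as a global."""
    expr = base64.b64decode(env[ENV_VAR]).decode("utf-8")
    return eval(expr, {"self": charm})


class FakeCharm:
    """Stand-in for the charm instance that the magic script would build."""

    name = "my-app"


env = {ENV_VAR: encode_expression("self.name.upper()")}
result = run_expression(env, FakeCharm())
print(result)
```

A plausible reason for the base64 step is that it spares the expression from shell quoting on its way to the unit, though the post does not spell that out.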

What was missing, then, was a mechanism for the expression being evaluated to send data back to the jhack process.

So I added a new global to the eval call: output. Calling output now json-encodes and dumps whatever you pass to it to a file on the unit. When the juju-exec call returns, jhack will scp that file out of the unit and delete it, then json-decode the data and give it back to the caller or print it out if you’re using jhack eval directly.
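Here is a local sketch of that round trip, assuming a temporary directory in place of the unit's filesystem and a plain file read in place of scp. The `make_output` and `collect` helpers and the `eval-output.json` file name are made up for the example.

```python
import json
import tempfile
from pathlib import Path


def make_output(path: Path):
    """Build the `output` global: json-encode the data and dump it to `path`."""

    def output(data):
        path.write_text(json.dumps(data))

    return output


def collect(path: Path):
    """Client side: read the dumped file, delete it, and json-decode the data."""
    data = json.loads(path.read_text())
    path.unlink()
    return data


with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "eval-output.json"  # hypothetical file name
    # The evaluated expression only sees `output` as a global and calls it.
    eval(
        "output([{'name': 'waiting', 'message': '[database] database not ready yet'}])",
        {"output": make_output(out_file)},
    )
    received = collect(out_file)
    file_removed = not out_file.exists()

print(received)
```

Because the data travels as JSON, only JSON-serialisable values (dicts, lists, strings, numbers, booleans, None) survive the trip in this sketch.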

So now you can, for example:

and it becomes a lot easier to write scripts that wrap jhack eval to do fancy stuff such as sitrep.

The result looks like this:

This is now available on edge.

Fun fact: you can add UnknownStatus instances to the collect-status events, and you can even add a message to them with a little bit of hackery:

        @StatusBase.register
        class MyUnknownStatus(StatusBase):
            name = "unknown"
            def __init__(self, msg):
                super().__init__(msg)

        e.add_status(MyUnknownStatus("[relations.tracing] unclear whether tracing backend is online"))

tony-meyer · 10 June 2024 10:11

sed-i · 10 June 2024 13:20

Nice!

Would it be easy to pack all the logic into a charm’s collect-status action?

Maybe even from jhack.utils import collect_status?

ppasotti · 10 June 2024 13:41

you could pack all the ‘record-and-collect’ logic in collect-status, but you still need someone on the juju client end to pick up the data, parse it and present it in some way to the user

definitely doable though
