As you know, ops ‘recently’ introduced a collect-unit|app-status event that you can hook onto to gather all statuses relevant to your unit or application, and let the operator framework figure out for you which one is the most relevant and should be reported to the user.
However, as a developer, it would be useful to have access to the full list of statuses that the charm collects, instead of only seeing the ‘toplevel’ one. We could go one step further and think: what if you could add to your charm a set of status checks that are not per se interesting for the cloud admin, but could be interesting for you when debugging?
What if we tagged our statuses using some labeling convention like:
e.add_status(WaitingStatus("[database] database not ready yet"))
e.add_status(BlockedStatus("[tempo_relation] cannot curl tempo server"))
e.add_status(BlockedStatus("[tls.cert_on_disk] cert not on disk"))
e.add_status(ActiveStatus("[tls] tls ready"))
e.add_status(ActiveStatus("[relations.ingress] ingress ready"))
e.add_status(WaitingStatus("[relations.tracing] some tracing relation is waiting for data"))
and found a way to read them off of a live charm?
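To make the labeling convention concrete, here is a minimal sketch of how a tool could split such a message into its label and its text. The helper name `split_label`, the `TaggedMessage` type and the exact regex are assumptions made for illustration; they are not part of ops or jhack.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: an optional "[dotted.label]" prefix, then the message text.
_LABEL_RE = re.compile(r"^\[(?P<label>[\w.\-]+)\]\s*(?P<text>.*)$", re.DOTALL)


class TaggedMessage(NamedTuple):
    label: Optional[str]  # e.g. "tls.cert_on_disk", or None if untagged
    text: str             # the human-readable part of the message


def split_label(message: str) -> TaggedMessage:
    """Split '[label] text' into its parts; untagged messages keep label=None."""
    match = _LABEL_RE.match(message)
    if match is None:
        return TaggedMessage(None, message)
    return TaggedMessage(match.group("label"), match.group("text"))


tagged = split_label("[tls.cert_on_disk] cert not on disk")
print(tagged.label, "->", tagged.text)
```

Dots in the label are left untouched here, so a consumer can decide for itself whether to treat them as nesting levels.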
Some time ago I introduced jhack eval (and its generic brother jhack script), so I thought: why not use jhack eval to force the unit to emit a collect-status event, then gather whatever statuses are collected and print them out for the user to see?
That’s easily done (shown here for unit statuses; use self.app instead of self.unit for the application ones):
jhack eval my-app/0 "ops.charm._evaluate_status(self) or print([{'name': s.name, 'message': s.message} for s in self.unit._collected_statuses])"
However, nobody likes reading __repr__s, so I started working on jhack sitrep, a tool to consume that repr and output a pretty-printed status report that would look something like:
<root status>: Active (message)
- database: Blocked(message)
- relations:
  - ingress: Active(message)
  - tracing: Waiting(message)
...
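The report above can be sketched as a small tree-building exercise: split each label on dots, nest the parts, and print one line per node. This is only an illustration of the idea, not the actual sitrep code; `build_tree`, `render` and the `(label, status name, message)` tuple layout are assumptions.

```python
from typing import Dict, List, Tuple


def build_tree(statuses: List[Tuple[str, str, str]]) -> Dict[str, dict]:
    """Nest (dotted label, status name, message) tuples into a tree of nodes."""
    root: Dict[str, dict] = {}
    for label, name, message in statuses:
        children = root
        node = None
        for part in label.split("."):
            # Create intermediate nodes on demand; they have no status of their own.
            node = children.setdefault(part, {"status": None, "children": {}})
            children = node["children"]
        node["status"] = f"{name.capitalize()}({message})"
    return root


def render(children: Dict[str, dict], indent: int = 0) -> List[str]:
    """Return one '- key: Status(message)' line per node, indented by depth."""
    lines = []
    for key, node in children.items():
        line = f"{'  ' * indent}- {key}:"
        if node["status"]:
            line += f" {node['status']}"
        lines.append(line)
        lines.extend(render(node["children"], indent + 1))
    return lines


example = [
    ("database", "blocked", "database not ready yet"),
    ("relations.ingress", "active", "ingress ready"),
    ("relations.tracing", "waiting", "some tracing relation is waiting for data"),
]
print("\n".join(render(build_tree(example))))
```

The sketch leaves out the root status line, since that one comes from whatever ops picks as the most relevant status.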
So initially I thought one could run eval and pipe the output to sitrep to obtain a pretty-printed output.
But that’s too much typing and you know I don’t like that, unless I’m telling a story.
Extending jhack eval with output functionality
So the next step was modifying some of eval’s internal plumbing to allow sending data back from the unit.
Eval works like this:
- b64-encode the python expression to evaluate
- scp a magic-sauce script onto the unit
- juju-exec the magic script with a bunch of environment variables to tell the script, among other things:
  - the python expression to evaluate
  - name of the charm to execute, and path to the module containing it
- the magic script will run and:
  - import the charm module
  - find the charm type
  - set up ops much the same way as ops.main does, but without emitting any event on the charm
  - eval() the expression you passed, with a few globals such as self (the charm instance) and ops (the ops module)
  - print whatever the output was
- cleanup: delete the magic script
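The encode, pass-by-environment and eval steps above can be sketched roughly as follows. This is a simplified, local-only illustration: the environment variable name `EXAMPLE_EVAL_EXPR`, the function names and the `FakeCharm` stand-in are all made up, and the scp/juju-exec transport and the ops setup are left out entirely.

```python
import base64

# Hypothetical variable name; the real script's variables may differ.
ENV_VAR = "EXAMPLE_EVAL_EXPR"


def encode_expression(expr: str) -> str:
    """Caller side: b64-encode the expression so it survives shell quoting."""
    return base64.b64encode(expr.encode("utf-8")).decode("ascii")


def run_on_unit(env: dict, charm: object):
    """Unit side: decode the expression and eval() it with `self` as a global."""
    expr = base64.b64decode(env[ENV_VAR]).decode("utf-8")
    # The real tool also exposes the `ops` module; omitted to stay stdlib-only.
    return eval(expr, {"self": charm})


class FakeCharm:
    """Stand-in for a charm instance, for illustration only."""
    name = "my-app"


env = {ENV_VAR: encode_expression("self.name.upper()")}
print(run_on_unit(env, FakeCharm()))
```

Base64 is handy here because the expression can contain quotes, braces and newlines that would otherwise need careful escaping on the way through the shell.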
What was missing then was adding a mechanism for the expression being evaluated to send data back to the jhack process.
So I added a new global to the eval call: output. Calling output now json-encodes and dumps whatever you pass to it to a file on the unit.
When the juju-exec call returns, jhack will scp that file out of the unit and delete it, then json-decode the data and give it back to the caller or print it out if you’re using jhack eval directly.
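A rough, local-only sketch of that round trip is below. The names `make_output` and `collect_output` and the file name are assumptions, and the scp step is replaced by reading the file from a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def make_output(path: Path):
    """Build the `output` global: json-encode the data and dump it to `path`."""
    def output(data):
        path.write_text(json.dumps(data))
    return output


def collect_output(path: Path):
    """Caller side: read the file back, delete it, and return the decoded data."""
    if not path.exists():
        return None  # the expression never called output()
    data = json.loads(path.read_text())
    path.unlink()
    return data


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "eval-output.json"  # hypothetical file name
        # Evaluate an expression that calls output(), as the magic script would.
        eval("output({'statuses': ['active', 'waiting']})", {"output": make_output(path)})
        return collect_output(path), path.exists()


print(demo())
```

Going through JSON means only JSON-serializable data can be sent back, which is why the statuses are reduced to plain names and messages rather than passed as objects.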
So now you can, for example:
and it becomes a lot easier to write scripts that wrap jhack eval to do fancy stuff such as sitrep.
The result looks like this:
This is now available on edge.
Fun fact: you can add UnknownStatus instances to the collect-status events, and you can even add a message to them with a little bit of hackery:
from ops.model import StatusBase

@StatusBase.register
class MyUnknownStatus(StatusBase):
    name = "unknown"  # reported as "unknown", but with a message attached

    def __init__(self, msg):
        super().__init__(msg)

e.add_status(MyUnknownStatus("[relations.tracing] unclear whether tracing backend is online"))