Instrumenting command execution

Several months ago we started a journey to make the tool performant. The first steps were manual: measuring pack times using the produced logs, which led to several improvements in Charmcraft and some of the libraries that Charmcraft uses.

The natural evolution of that manual work is to instrument the code so we can gather measurements automatically.

Just passing a --measure option to the pack command will allow the developer to gather timings for that run. This will let us collect the command timings in any situation (custom CIs, developer machines, strange environments, etc.), and we’re aiming to use this information in two ways:

  • manual analysis of one timing collection: using a custom tool to visualize the results will allow us to understand where the time is spent in that particular situation (I already showed in this forum post some graphs produced by an initial version of that tool)

  • timing comparison over time: we will periodically run the command on a stable collection of charms and plot the resulting measures (overall time and internal steps) as a time series, letting us see how the tool’s performance improves and detect regressions in a specific Charmcraft version

To achieve those objectives we need to measure times in a very specific way.

On the one hand, we need to probe specific parts of the code. We cannot just log timings at different points in the code, because different runs take different code paths, which makes it very difficult (if not impossible) to understand in retrospect what happened between one time mark and the next; we need a way to measure around code blocks that conceptually represent specific actions or steps, annotating what we’re measuring.

On the other hand, we cannot just measure “everything” (like a code profiler would do): even though that information would allow us to analyze where time is spent, it would never give us an understanding of how the different parts of the execution evolve over time (just too much information to be useful).

Something else to consider is that Charmcraft (and the other craft tools), particularly when packing, presents a specific challenge: the run times need to be gathered not only in the main application but also in the sub-run of the application inside the instance provided by LXD or Multipass, and then again in another process run by the corresponding plugin inside the Craft Parts machinery.

Requirements

The following are the requirements for the implementation of an “instrumentator”.

It shall work as a context manager that generates a measurement of the block of code inside its context, or as a decorator to get the timing of the whole decorated function/method. In both cases it shall receive a text message (mandatory) to annotate the measure and also (optionally) any other specific information that the developer wants to record (saving this extra information separately from the text message will allow reading tools to use it in a simple way). It shall be possible for the captured times to be nested.

Usage as a decorator:

@timer("some message", foo="bar")
def foo(...):
    ...

Usage as a context manager:

with timer("some message", more="stuff", answer=42):
    ...

It shall provide a method to divide a measurement into parts, allowing the developer to mark intermediate measurements inside a contextual one. This is especially useful to measure around other context managers. E.g.:

with timer("Launching subsystem") as timer_cm:
    with subsystem.launch() as stuff:
        timer_cm.mark("Start subsystem usage")
        ...
        timer_cm.mark("Finish subsystem usage")

That specific example shall create three regular measurements:

  • time spent for the other context manager to set up and run its __enter__ method (subsystem.launch()), annotated as “Launching subsystem”

  • time spent in the code block itself, annotated as “Start subsystem usage”

  • time spent in the __exit__ method of the other context manager, annotated as “Finish subsystem usage”

All the recorded times shall be relative: relative times are easier to read and simpler to understand. The baseline for those relative times needs to be taken as soon as possible during process start-up (e.g. when the module is imported), not on the first measure, because that reveals the time spent between the process starting and the first measure. It’s a good idea to record this baseline in the saved information.

The instrumentator module shall provide a way to dump all the recorded information to a file, in JSON format (so it’s easily consumable by subsequent reading tools).
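The dump step could look like the following sketch; the function name and the JSON layout (a baseline plus a flat list of measurements) are assumptions for illustration, not Charmcraft’s actual format:

```python
import json
import time

# Wall-clock baseline, saved in the dump so other layers can be aligned
# against it later (an assumption about the merge mechanism, see below).
_BASELINE = time.time()

# Example content; real measurements would be recorded by the timers,
# with "start"/"end" relative to the baseline.
_measurements = [
    {"msg": "example", "extra": {}, "start": 0.0, "end": 0.1},
]


def dump(filename):
    """Write the baseline and all recorded measurements to filename as JSON."""
    payload = {"baseline": _BASELINE, "measurements": _measurements}
    with open(filename, "w", encoding="utf8") as fh:
        json.dump(payload, fh, indent=2)
```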

To support measurements in the different layers (inside a provided instance, in other processes) the instrumentator module shall provide a way to merge information from a previously dumped file into the current in-memory structure, re-linking the “root nodes” in the source info to the current measure and adapting all the relative times (it is fine to assume that the clocks in the different layers are synchronized, as is the case for LXD and Multipass instances, and of course for other processes on the same machine; note, however, that the timezones may not be the same).
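Under the assumption that each dump records its own wall-clock baseline and that clocks are synchronized across layers, the merge reduces to shifting the sub-run’s relative times by the difference between baselines and hanging its root nodes under the current measurement. A sketch (function signature and node layout are hypothetical):

```python
import json


def merge(current, dumped_path, current_baseline):
    """Merge measurements from a dumped file under the current measurement.

    current: a measurement dict to which the sub-run's root nodes are
    re-linked (here via a "children" list, an assumed structure).
    current_baseline: this process's wall-clock baseline.
    """
    with open(dumped_path, encoding="utf8") as fh:
        data = json.load(fh)
    # Clocks are assumed synchronized across layers, so the difference of
    # wall-clock baselines converts the sub-run's relative times to ours.
    offset = data["baseline"] - current_baseline
    for node in data["measurements"]:
        node["start"] += offset
        node["end"] += offset
        current.setdefault("children", []).append(node)  # re-link root nodes
```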
