The Observability team maintains a lot of rocks. Since the very beginning of oci-factory, I’ve been quite involved in managing them (and trying to make it easier for others), and that includes adding new versions whenever an upstream project releases something.
Being mildly interested in automation, I set up our workflows to do so periodically, by creating a new `rockcraft.yaml` for the newly-released version. And that’s great! We can then auto-merge the pull request as soon as the tests pass — wait, TESTS?!
How do I even test an OCI image? 
To have some quality guarantees (and to be able to increase the scope of our CI), we need to make sure the new image we’re building is actually working. What if `rockcraft pack` doesn’t fail, but the binary can’t run correctly?
I gathered a set of tools to help us through this task:

- `just`, our task runner;
- `goss`, to run the checks and verify the rock is working;
- `microk8s`, to run the rocks (any Kubernetes cluster would work, really); make sure to `microk8s enable registry` to enable the registry plugin!
Let’s go through how we test our rocks, step-by-step! I’ll be referencing our opentelemetry-collector-rock repository throughout.
Structuring your repository 
.
├── 0.110.0
│   └── rockcraft.yaml
├── 0.117.0
│   └── rockcraft.yaml
├── 0.118.0
│   └── rockcraft.yaml
└── ...
Our rock repositories contain a folder for each rock version. This allows us to integrate nicely with both OCI Factory and `just`, as you’ll see below.
TL;DR: use one folder per rock version.
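With this layout, adding a new version boils down to copying the latest folder and bumping the `version` field. A minimal sketch (the version numbers here are illustrative, and assume you run it from the repository root):

```shell
# Sketch: start a new rock version from the latest folder (versions are hypothetical)
cp -r 0.118.0 0.119.0
# Bump the version field in the new rockcraft.yaml
sed -i 's/^version: .*/version: "0.119.0"/' 0.119.0/rockcraft.yaml
```

Our periodic workflows do essentially this whenever upstream releases a new version.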
Running an OCI image 
We need to be able to locally run a freshly-packed rock: we can push it to the local image registry provided by `microk8s`. The `rockcraft` snap is bundled with `skopeo` (accessible via `rockcraft.skopeo`), which is exactly what we need.
Let’s add that logic in our `justfile`:
### Snippet from /justfile ###

set quiet  # Recipes are silent by default
set export # Just variables are exported to environment variables

rock_name := `echo ${PWD##*/} | sed 's/-rock//'`
# To find the latest version, get the "last" folder that starts with a number
latest_version := `find . -maxdepth 1 -type d -name '[0-9]*' | sort -V | tail -n1 | sed 's@./@@'`

[private]
default:
    just --list

# Pack a rock of a specific version
pack version:
    echo "Packing opentelemetry-collector: $version"
    cd "$version" && rockcraft pack

# Push an OCI image to a local registry
[private]
push-to-registry version:
    echo "Pushing $rock_name $version to local registry"
    rockcraft.skopeo --insecure-policy copy --dest-tls-verify=false \
        "oci-archive:${version}/${rock_name}_${version}_amd64.rock" \
        "docker://localhost:32000/${rock_name}-dev:${version}" >/dev/null

# Run a rock and open a shell into it with `kgoss`
run version=latest_version: (push-to-registry version)
    kubectl run otel-collector --image localhost:32000/${rock_name}-dev:${version}
Other than some `just` configuration at the start and the default recipe, plus some simple parsing of the rock name (from the repository name) and the latest local version, we have three recipes:

- `just pack`, which allows you to pack a specific version of a rock from the repository root (e.g., `just pack 0.117.0`);
- `just push-to-registry`, which uses `skopeo` to push a `.rock` image to your local registry; it’s set to `[private]` because you shouldn’t need to call it directly (although you can);
- `just run`, to push the `.rock` to a local registry (notice the recipe dependency) and spin up a pod so you can do manual testing and exploration (e.g., `just run 0.117.0`).
Wherever it makes sense, the recipes conveniently default to the latest local rock version as their argument.
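The `latest_version` assignment deserves a note: plain lexical sorting would rank 0.9.0 above 0.110.0, which is why the pipeline uses `sort -V` (version sort). Here’s a self-contained sketch with made-up version folders:

```shell
# Build a throwaway folder structure mimicking the repository layout
demo="$(mktemp -d)"
mkdir -p "$demo/0.9.0" "$demo/0.110.0" "$demo/0.118.0"
cd "$demo"
# `sort -V` compares dotted versions numerically, so 0.110.0 outranks 0.9.0
find . -maxdepth 1 -type d -name '[0-9]*' | sort -V | tail -n1 | sed 's@./@@'
# → 0.118.0
```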
The default recipe allows us to simply run `just` without arguments to list the available recipes:

Available recipes:
    pack version               # Pack a rock for the specified version
    run version=latest_version # Run a rock
TL;DR: use `skopeo` to push rocks to a local registry, and `kubectl run` to create pods running them.
Testing in isolation 
To test the rock in isolation, we’re using `kgoss`, a community-maintained `goss`-related utility that does the following:

- run a pod with the provided image;
- execute the checks defined in `goss.yaml` from inside the pod; `kgoss` handles this by:
  - copying the `goss.yaml` file inside the pod;
  - running `goss` via `kubectl exec`.
Goss is extremely powerful, allowing for easy configuration of timeouts and retry-intervals, so your pod has enough time to settle. Some example checks you could write are:
- checking if the process is running;
- making sure a configuration file is there;
- checking whether something is listening on some ports.
Here’s a real (shortened) example:
### Snippet from /goss.yaml ###

process:
  otelcol:
    running: true
...
port:
  tcp6:8888: # self-monitoring metrics
    listening: true
    port: 'tcp6:8888'
    skip: false
Let’s add a recipe to our `justfile` so we can easily run these tests with `kgoss`:
# Test the rock with `kgoss`
[group("test")]
test-isolation version=latest_version: (push-to-registry version)
    GOSS_OPTS="--retry-timeout 60s" kgoss run -i localhost:32000/${rock_name}-dev:${version}
Running `just test-isolation` will first push the image to your local registry, so then `kgoss` can do the heavy lifting by running the checks we just wrote in `/goss.yaml`. It’s very simple to do, and extremely useful!
TL;DR: use `kgoss` to spin up a pod with your rock, and run some Goss checks on it.
Integration testing 
If you want to go a step further, you might want to check whether your rock correctly integrates with other workloads. This is especially useful if your build process doesn’t exactly follow the upstream, and you want to make sure you didn’t break anything.
I wanted to keep this `goss`-driven approach, while still dodging the `docker` requirement. The general idea is to write some Kubernetes manifests to deploy the necessary workloads, `kubectl apply` them, and run some Goss checks for validation.
I introduced a `tests/` folder and structured it as such:

.
└── tests
    └── prometheus_integration
        ├── goss.yaml # external `goss` checks
        ├── otel-collector.yaml
        └── prometheus.yaml
Each YAML file is a Kubernetes manifest, declaring a set of Deployments, Services, and ConfigMaps that form the actual deployment. I won’t paste the files here because they’re lengthy, but you can take a look at the repository. Note that `otel-collector.yaml` uses the image from `localhost:32000`, the local image registry.
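To give a sense of the shape (this is an illustrative sketch, not the actual manifest from the repository; names, labels, and the tag are made up), a stripped-down `otel-collector.yaml` could look roughly like this, with the Deployment pointing at the dev image in the local registry:

```yaml
# Illustrative sketch, not the real manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  labels:
    app: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: otel-collector
          # The freshly-packed rock, pushed by `just push-to-registry`
          image: localhost:32000/opentelemetry-collector-dev:0.118.0
```

Because the image comes from the local registry, the integration tests always exercise the rock you just packed, not a published one.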
For reference, here’s how we check that our OpenTelemetry Collector rock can remote-write to Prometheus:
command:
  remote-write:
    exit-status: 0
    exec: |
      echo "Namespace: {{.Env.NAMESPACE}}"
      # Get Prometheus pod
      PROMETHEUS_IP="$(kubectl get pod -n {{.Env.NAMESPACE}} -l app="prometheus" \
        -o jsonpath='{.items[*].status.podIP}')"
      if [ -z "$PROMETHEUS_IP" ]; then
        echo "Prometheus pod IP not found, maybe the pod isn't ready yet"
        exit 1
      fi
      echo "Prometheus IP: $PROMETHEUS_IP"
      # Check there is a `job` label with value `otel-collector`
      LABELS="$(curl -s "${PROMETHEUS_IP}:9090/api/v1/label/job/values")"
      echo "Prometheus 'job' label values: $LABELS"
      if ! echo "$LABELS" | grep -q "otel-collector"; then
        echo "'job=otel-collector' label not found"
        exit 2
      fi
Other than some parsing to get the namespace and Prometheus’ pod IP, the core part of the check is simply a `curl` command, checking whether the self-monitoring metrics from the Collector are present in Prometheus.
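A simple `grep` is enough here because Prometheus’ `/api/v1/label/job/values` endpoint returns a small JSON document listing every value of the `job` label. A quick offline sketch with a canned response (the JSON below is made up for illustration):

```shell
# Hypothetical response from Prometheus' label-values endpoint
LABELS='{"status":"success","data":["otel-collector","prometheus"]}'
if echo "$LABELS" | grep -q "otel-collector"; then
  echo "remote-write verified"
fi
# → remote-write verified
```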
We can put it all together in the `justfile`, by adding a generic `test` recipe that runs all tests, and a `test-integration` recipe for what you just read:
# Run the rock tests
test version=latest_version: (push-to-registry version) \
    (test-isolation version) \
    (test-integration version)

# Test the rock integration with other workloads
[group("test")]
test-integration version=latest_version: (push-to-registry version)
    #!/usr/bin/env bash
    # For all the subfolders in tests/
    for test_folder in $(find tests -mindepth 1 -maxdepth 1 -type d | sed 's@tests/@@'); do
        # Create a namespace for the tests to run in
        namespace="test-${rock_name}-rock-${test_folder//_/-}"
        echo "+ Preparing the testing environment"
        kubectl delete all --all -n "$namespace" >/dev/null
        kubectl delete namespace "$namespace" >/dev/null
        kubectl create namespace "$namespace"
        # For each '.yaml' file (excluding 'goss.yaml')
        for manifest in $(find tests/${test_folder} -type f -name '*.yaml' | grep -v 'goss.yaml'); do
            kubectl apply -f "$manifest" -n "$namespace" # deploy it in the test namespace
        done
        sleep 15 # Wait for the pods to settle and otel-collector to remote-write
        NAMESPACE="$namespace" goss \
            --gossfile "tests/${test_folder}/goss.yaml" \
            --log-level debug \
            validate \
            --retry-timeout=120s \
            --sleep=5s
        # Cleanup
        echo "+ Cleaning up the testing environment"
        kubectl delete all --all -n "$namespace"
        kubectl delete namespace "$namespace"
    done
You can now run `just test-integration [rock-version]` and let the magic happen!
TL;DR: write Kubernetes manifests using your dev rock image, apply them, and use `goss` to validate the whole thing.
Conclusions 
If you look at your `justfile` now, you’ll see you effectively built some developer tools for your rock, and `just` makes this simple and easily accessible.
$ just
Available recipes:
    clean version                           # `rockcraft clean` for a specific version
    pack version                            # Pack a rock
    run version=latest_version              # Run a rock
    test version=latest_version             # Run all the tests

    [test]
    test-integration version=latest_version # Test the rock integration with other workloads
    test-isolation version=latest_version   # Test the rock with `kgoss`
Testing our rocks is extremely important in order to guarantee a higher level of quality in our work — and to allow you to auto-merge pull requests without manual review.
Hope this can be useful!