Canonical Observability Stack

A highly integrated, low-operations observability stack powered by Juju and MicroK8s.

The Canonical Observability Stack (COS Lite) gathers, processes, visualizes, and alerts on telemetry signals generated by workloads running both within, and outside of, Juju.

By leveraging the topology model of Juju to contextualize the data, and charm relations to automate configuration and integration, it provides a low-ops observability suite based on best-in-class, open-source observability tools.
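As a rough illustration of what "contextualizing" telemetry with the Juju topology can look like, the sketch below attaches topology labels to a metric's label set. It is a minimal sketch, not the charm library code: the `JujuTopology` class, the `contextualize` helper and the sample values are hypothetical, and the `juju_*` label names are assumed to follow the convention used for COS topology labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JujuTopology:
    """Identity of a unit within a Juju model (hypothetical helper type)."""

    model: str
    model_uuid: str
    application: str
    unit: str

    def as_labels(self) -> dict[str, str]:
        # Label names are an assumption based on the juju_* naming convention.
        return {
            "juju_model": self.model,
            "juju_model_uuid": self.model_uuid,
            "juju_application": self.application,
            "juju_unit": self.unit,
        }


def contextualize(labels: dict[str, str], topology: JujuTopology) -> dict[str, str]:
    """Return a copy of `labels` with topology labels added.

    Topology labels take precedence if a key already exists.
    """
    return {**labels, **topology.as_labels()}


# Sample values are made up for illustration.
topology = JujuTopology("prod", "1234-abcd", "postgresql", "postgresql/0")
sample = contextualize({"job": "postgres", "instance": "10.0.0.5:9187"}, topology)
print(sample)
```

With labels like these on every signal, a dashboard or alert can be filtered by model, application or unit without any per-workload configuration.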

For site reliability engineers, the Canonical Observability Stack provides a turnkey, out-of-the-box solution for improved day-2 operational insight.

In this documentation

Get started - a hands-on introduction for new users deploying COS
How-to guides - step-by-step guides covering key operations and common tasks
Concepts - discussion and clarification of key topics
Technical information - specifications, APIs, architecture

Project and community

The Canonical Observability Stack is a member of the Ubuntu family. It’s an open source project that warmly welcomes community projects, contributions, suggestions, fixes and constructive feedback.

Thinking about using the Canonical Observability Stack for your next project? Get in touch!


Level Path Navlink
1 overview Home
1 tutorials Tutorial
2 tutorials/install-microk8s Getting started on MicroK8s
2 tutorials/sync-alert-rules-from-git Sync alert rules from Git
2 tutorials/instrumenting-machine-charms Instrumenting machine charms
1 how-to How-to
2 how-to/configure-scrape-jobs Configure Prometheus scrape jobs
2 how-to/metrics-endpoint Expose a metrics endpoint
2 how-to/add-alert-rules Add Alert Rules
2 how-to/troubleshoot-gateway-address-unavailable Troubleshoot Traefik “Gateway address unavailable”
2 how-to/migrate-from-lma Migrate from LMA to COS
2 how-to/integrate-cos-lite-with-uncharmed-applications Integrate COS Lite with uncharmed applications
2 how-to/fix-socket-too-many-open-files Fix socket: too many open files
1 explanation Explanation
2 explanation/the-stack The Stack
3 editions/lite COS Lite
3 design-goals Design Goals
3 juju-topology Juju topology
2 explanation/the-practice The Practice
3 what-is-observability What is observability?
3 model-driven-observability-tag Model-Driven Observability
2 explanation/tls COS and TLS
2 explanation/ingress Charmed Ingress
2 explanation/telemetry-labels Telemetry labels
2 explanation/topology-labels Topology labels
1 reference Reference
2 reference/solution-matrix Solution matrix
2 reference/best-practices Deployment Best Practices
2 reference/performance Performance
3 reference/performance/on-4cpu-8gb-ssd on 4cpu-8gb-ssd
3 reference/performance/on-8cpu-16gb-ssd on 8cpu-16gb-ssd
2 reference/kubernetes-charms Kubernetes Charms
3 reference/traefik-k8s Traefik K8s
3 reference/alertmanager-k8s Alertmanager K8s
3 reference/prometheus-k8s Prometheus K8s
3 reference/prometheus-scrape-target-k8s Scrape Target K8s
3 reference/prometheus-scrape-config-k8s Scrape Config K8s
3 reference/loki-k8s Loki K8s
3 reference/grafana-k8s Grafana K8s
3 reference/grafana-agent-k8s Grafana Agent K8s
3 reference/catalogue-k8s Catalogue K8s
3 reference/mimir-k8s Mimir K8s
3 reference/cos-configuration-k8s COS Config K8s
3 reference/karma-k8s Karma K8s
3 reference/karma-alertmanager-proxy-k8s Karma Alertmanager Proxy K8s
2 reference/machine-charms Machine Charms
3 reference/grafana-agent Grafana Agent
3 reference/cos-proxy COS Proxy


Mapping table
| Path | Location |
| --- | --- |
| /topics/canonical-observability-stack/editions/ha | /topics/canonical-observability-stack/editions/standard |
| /topics/canonical-observability-stack/on MicroK8s | /topics/canonical-observability-stack/install/microk8s |
| /topics/canonical-observability-stack/on%20MicroK8s | /topics/canonical-observability-stack/install/microk8s |
| /topics/canonical-observability-stack/install/microk8s | /topics/canonical-observability-stack/tutorials/install-microk8s |
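Some entries in the mapping table chain: the old "on MicroK8s" paths point to `install/microk8s`, which in turn points to `tutorials/install-microk8s`. The sketch below shows one way such a table could be resolved to a final location. It is a minimal sketch that assumes redirects are followed until no further mapping applies; the `resolve` function is hypothetical and is not how the documentation site is known to work.

```python
# Redirect entries copied from the mapping table (old path -> new location).
REDIRECTS = {
    "/topics/canonical-observability-stack/editions/ha":
        "/topics/canonical-observability-stack/editions/standard",
    "/topics/canonical-observability-stack/on MicroK8s":
        "/topics/canonical-observability-stack/install/microk8s",
    "/topics/canonical-observability-stack/on%20MicroK8s":
        "/topics/canonical-observability-stack/install/microk8s",
    "/topics/canonical-observability-stack/install/microk8s":
        "/topics/canonical-observability-stack/tutorials/install-microk8s",
}


def resolve(path: str, redirects: dict[str, str] = REDIRECTS) -> str:
    """Follow redirects until the path has no further mapping.

    Raises ValueError if the table contains a loop.
    """
    seen = set()
    while path in redirects:
        if path in seen:
            raise ValueError(f"redirect loop at {path!r}")
        seen.add(path)
        path = redirects[path]
    return path


print(resolve("/topics/canonical-observability-stack/on%20MicroK8s"))
```

Paths without an entry are returned unchanged, so the function is safe to call on any request path.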

Updated article to reflect the renaming of LMA Light to COS Lite

This is very interesting. Would you consider running a demo in the community workshop on this? @tmihoc @hallback @anvial @mmrezaie @emcp would perhaps also be interested.

One question is how Nagios fits in here, or whether it does at all. I’d love to know what I should include in my own charms to quickly get support from your stack, for example which interfaces or relations I should provide.


I definitely think we can arrange a community demo :slight_smile:

About Nagios: Nagios itself is not in the picture. Alerts are generated by Prometheus or Loki. The NRPE checks embedded in current charms are supported through an adaptor charm, the cos-proxy operator, that implements the nrpe-external-master and similar LMA relations, including many of those supported by the Prometheus 2 charm. It runs a Prometheus nrpe-exporter and we will then define rules in Prometheus to raise alerts when checks go bad. The alerts are routed to Alertmanager and, from there, pretty much wherever you want :slight_smile:
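To make the "raise alerts when checks go bad" step more concrete, here is a minimal sketch in Python of the decision such a rule encodes. It is not the cos-proxy or Prometheus rule code: in the stack the logic would live in Prometheus alert rules, and the `alert_for` function, the `NrpeCheckFailed` alert name and the severity mapping are all hypothetical. Only the status codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
from enum import IntEnum
from typing import Optional


class NrpeStatus(IntEnum):
    """Standard Nagios plugin return codes."""

    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


# Hypothetical mapping from check status to alert severity.
SEVERITY = {
    NrpeStatus.WARNING: "warning",
    NrpeStatus.CRITICAL: "critical",
    NrpeStatus.UNKNOWN: "warning",
}


def alert_for(check: str, status_code: int) -> Optional[dict]:
    """Return an alert payload for a failing check, or None if the check is OK."""
    status = NrpeStatus(status_code)  # raises ValueError for codes outside 0..3
    if status is NrpeStatus.OK:
        return None
    return {
        "alertname": "NrpeCheckFailed",  # hypothetical alert name
        "check": check,
        "status": status.name,
        "severity": SEVERITY[status],
    }


print(alert_for("check_disk", 2))
```

An alert produced this way is what would then be routed to Alertmanager and on to whichever receiver is configured.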

In general our goal is very much to provide a smooth transition for charms supporting previous relations, as well as provide more consistent relation interfaces going forward.

This is super interesting!

How about you run the show in two weeks?

I would love to learn how to work with this as I can see many scenarios where it will work.

Does it require me to know K8s?

We’ll work with the folks that organise the community hours to see which slot works.

In terms of having to know K8s: not really. We are developing COS as an appliance that you should not have to fiddle with much, and the abstractions we expose to the Juju admin do not seem leaky. But it would actually be an excellent data point to get an opinion on that from someone who is not as deeply marinated in Kubernetes as I am :slight_smile:


Why don’t you let us try it in the community with your guidance?

If we succeed = :grinning:

If we fail = :grinning_face_with_smiling_eyes:

We have not yet been sufficiently diligent in documenting the various moving parts :slight_smile:

But we are developing in the open and are looking forward to feedback, so please by all means do: Deploy Canonical Observability Stack Lite using Charmhub - The Open Operator Collection

Added link to how-to article on resolving ulimit issues.