Canonical Observability Stack

michele-mancioppi · 23 September 2021 10:43

Highly integrated, low-operations observability stack powered by Juju and MicroK8s.

The Canonical Observability Stack (COS Lite) gathers, processes, visualizes, and alerts on telemetry signals generated by workloads running both within, and outside of, Juju.

By leveraging the topology model of Juju to contextualize the data, and charm relations to automate configuration and integration, it provides a low-ops observability suite based on best-in-class, open-source observability tools.

For Site-Reliability Engineers, Canonical Observability Stack provides a turn-key, out-of-the-box solution for improved day 2 operational insight.

In this documentation


Tutorial Get started - a hands-on introduction for new users deploying COS.	How-to guides Step-by-step guides covering key operations and common tasks
Explanation Concepts - discussion and clarification of key topics	Reference Technical information - specifications, APIs, architecture

Project and community

The Canonical Observability Stack is a member of the Ubuntu family. It’s an open source project that warmly welcomes community projects, contributions, suggestions, fixes and constructive feedback.

Thinking about using the Canonical Observability Stack for your next project? Get in touch!

Navigation

Level	Path	Navlink
1	overview	Home
1	tutorials	Tutorial
2	tutorials/install-microk8s	Getting started on MicroK8s
2	tutorials/sync-alert-rules-from-git	Sync alert rules from Git
2	tutorials/instrumenting-machine-charms	Instrumenting machine charms
2	tutorials/distributed-storage	Set up distributed storage
1	how-to	How-to
2	how-to/configure-scrape-jobs	Configure Prometheus scrape jobs
2	how-to/metrics-endpoint	Expose a metrics endpoint
2	how-to/add-alert-rules	Add Alert Rules
2	how-to/troubleshoot-gateway-address-unavailable	Troubleshoot Traefik “Gateway address unavailable”
2	how-to/migrate-from-lma	Migrate from LMA to COS
2	how-to/integrate-cos-lite-with-uncharmed-applications	Integrate COS Lite with uncharmed applications
2	how-to/fix-socket-too-many-open-files	Fix `socket: too many open files`
1	explanation	Explanation
2	editions/lite	COS Lite
2	design-goals	Design Goals
2	juju-topology	Juju topology
2	what-is-observability	What is observability?
2	model-driven-observability-tag	Model-Driven Observability
2	explanation/tls	COS and TLS
2	explanation/ingress	Charmed Ingress
2	explanation/telemetry-labels	Telemetry labels
2	explanation/topology-labels	Topology labels
2	explanation/logging	Logging architecture
1	reference	Reference
2	reference/solution-matrix	Solution matrix
2	reference/bundle-topology	Bundle topology
2	reference/best-practices	Deployment Best Practices
2	reference/performance	Performance
3	reference/performance/on-4cpu-8gb-ssd	on `4cpu-8gb-ssd`
3	reference/performance/on-8cpu-16gb-ssd	on `8cpu-16gb-ssd`
2	reference/kubernetes-charms	Kubernetes Charms
3	reference/traefik-k8s	Traefik K8s
3	reference/alertmanager-k8s	Alertmanager K8s
3	reference/prometheus-k8s	Prometheus K8s
3	reference/prometheus-scrape-target-k8s	Scrape Target K8s
3	reference/prometheus-scrape-config-k8s	Scrape Config K8s
3	reference/loki-k8s	Loki K8s
3	reference/grafana-k8s	Grafana K8s
3	reference/grafana-agent-k8s	Grafana Agent K8s
3	reference/catalogue-k8s	Catalogue K8s
3	reference/mimir-k8s	Mimir K8s
3	reference/cos-configuration-k8s	COS Config K8s
3	reference/karma-k8s	Karma K8s
3	reference/karma-alertmanager-proxy-k8s	Karma Alertmanager Proxy K8s
2	reference/machine-charms	Machine Charms
3	reference/grafana-agent	Grafana Agent
3	reference/cos-proxy	COS Proxy

Redirects

Mapping table

Path	Location
/topics/canonical-observability-stack/editions/ha	/topics/canonical-observability-stack/editions/standard
/topics/canonical-observability-stack/on MicroK8s	/topics/canonical-observability-stack/install/microk8s
/topics/canonical-observability-stack/on%20MicroK8s	/topics/canonical-observability-stack/install/microk8s
/topics/canonical-observability-stack/install/microk8s	/topics/canonical-observability-stack/tutorials/install-microk8s
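
Note that the mapping table contains a chain: the two `on MicroK8s` paths point at `install/microk8s`, which is itself redirected to `tutorials/install-microk8s`. As a minimal sketch of how such a table resolves to a final location, assuming redirects are followed repeatedly (whether the documentation site actually chains them this way is an assumption, and `resolve` is a hypothetical helper, not part of any Canonical tooling):

```python
# Redirect table copied from the mapping table above.
PREFIX = "/topics/canonical-observability-stack"
REDIRECTS = {
    f"{PREFIX}/editions/ha": f"{PREFIX}/editions/standard",
    f"{PREFIX}/on MicroK8s": f"{PREFIX}/install/microk8s",
    f"{PREFIX}/on%20MicroK8s": f"{PREFIX}/install/microk8s",
    f"{PREFIX}/install/microk8s": f"{PREFIX}/tutorials/install-microk8s",
}


def resolve(path, redirects=REDIRECTS):
    """Follow redirects until a path with no further mapping is reached.

    Hypothetical helper for illustration; raises ValueError on a loop.
    """
    seen = set()
    while path in redirects:
        if path in seen:
            raise ValueError(f"redirect loop detected at {path!r}")
        seen.add(path)
        path = redirects[path]
    return path


if __name__ == "__main__":
    for source in REDIRECTS:
        print(f"{source} -> {resolve(source)}")
```

Under that assumption, both spellings of the old `on MicroK8s` path end up at the current tutorial, while a path that is not in the table is returned unchanged.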

0x12b · 26 January 2022 13:08

Updated article to reflect the renaming of LMA Light to COS Lite

erik-lonroth · 1 February 2022 08:44

This is very interesting. Would you consider running a demo in the community workshop on this? @tmihoc @hallback @anvial @mmrezaie @emcp would perhaps also be interested.

One question is how Nagios fits in here, if at all. I’d love to know what I should include in my own charms to quickly get support from your stack, for example which interfaces and relations I should provide.

michele-mancioppi · 1 February 2022 20:18

I definitely think we can arrange a community demo.

About Nagios: Nagios itself is not in the picture. Alerts are generated by Prometheus or Loki. The NRPE checks embedded in current charms are supported through an adaptor charm, the cos-proxy operator, which implements the nrpe-external-master and similar LMA relations, including many of those supported by the Prometheus 2 charm. It runs a Prometheus nrpe-exporter, and we will then define rules in Prometheus to raise alerts when checks go bad. The alerts are routed to Alertmanager and, from there, pretty much wherever you want.

In general our goal is very much to provide a smooth transition for charms supporting previous relations, as well as provide more consistent relation interfaces going forward.
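
To make "raise alerts when checks go bad" a bit more concrete, here is a rough sketch of the idea in Python. It relies only on the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The `alert_severity` function, the severity names, and the choice to treat UNKNOWN as a warning are all assumptions made for illustration. This is not how cos-proxy, the nrpe-exporter, or the eventual Prometheus rules are implemented.

```python
from enum import IntEnum
from typing import Optional


class CheckStatus(IntEnum):
    """Standard Nagios plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def alert_severity(exit_code: int) -> Optional[str]:
    """Map an NRPE check exit code to a hypothetical alert severity.

    Returns None when the check is healthy and no alert should fire.
    Assumption: codes outside 0..3 are treated like UNKNOWN.
    """
    try:
        status = CheckStatus(exit_code)
    except ValueError:
        status = CheckStatus.UNKNOWN

    if status is CheckStatus.OK:
        return None
    if status is CheckStatus.CRITICAL:
        return "critical"
    # WARNING and UNKNOWN both map to "warning" in this sketch.
    return "warning"


if __name__ == "__main__":
    for code in (0, 1, 2, 3):
        print(CheckStatus(code).name, "->", alert_severity(code))
```

In the real stack this decision would live in Prometheus alert rules evaluated against the exporter's metrics, with Alertmanager handling the routing.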

erik-lonroth · 2 February 2022 00:53

This is super interesting!

How about you run the show in two weeks?

I would love to learn how to work with this as I can see many scenarios where it will work.

Does it require me to know K8s?

michele-mancioppi · 2 February 2022 04:57

We’ll work with the folks that organise the community hours to see which slot works.

In terms of having to know K8s: not really. We are developing COS as an appliance that you should not have to fiddle with much, and the abstractions we are exposing to the Juju admin do not seem leaky. But it would actually be an excellent datapoint to get an opinion on that from someone who is not as deeply marinated in Kubernetes as I am.

erik-lonroth · 3 February 2022 21:33

Why don’t you let us try it in the community with your guidance?

If we succeed =

If we fail =

michele-mancioppi · 4 February 2022 13:10

We have not yet been sufficiently diligent in documenting the various moving parts.

But we are developing in the open and are looking forward to feedback, so please by all means do: Deploy Canonical Observability Stack Lite using Charmhub - The Open Operator Collection

pedroleaoc · 7 April 2022 08:33

0x12b · 17 November 2023 12:26

Added link to how-to article on resolving ulimit issues.