How to integrate `cos-lite` with uncharmed applications

The cos-lite bundle is meant to be run by Juju. However, not all workloads that you may want to monitor are. The good news is that you can use cos-lite to monitor workloads that are not charmed (that is, not managed by Juju). The bad news is that it’s relatively straightforward to do so. Not bad at all.

Deploy cos-lite

The first step is to get hold of a machine and follow this guide on how to get started with COS Lite on MicroK8s.

And be sure to follow the best practices!

Unless you’re also planning to monitor some charmed applications with this cos-lite deployment, you will not need to use the offers overlay.

Deploy grafana-agent

The Grafana agent will act as an intermediary between the applications you want to monitor and the cos-lite stack. It will gather telemetry from your applications and send it to cos-lite, where you will be able to inspect it through the Grafana dashboards.

We recommend hosting the Grafana agent as close as possible to the workloads you intend to monitor, to minimise the risk of network faults and the resulting gaps in telemetry collection. We also recommend installing the Grafana agent via a handy snap we maintain:

Get it from the Snap Store

However, Grafana agent is also available as a single Go binary, and you are free to install it and run it the way you like. See the official documentation for the publisher’s recommendations and guides.

Last but not least, we also have it containerized and petrified.

Now that you have Grafana Agent up and running, you will need to configure it.

Get the API endpoints from traefik

cos-lite includes a traefik instance that takes care of load balancing and ingressing the various observability components of the stack. Since cos-lite runs on Kubernetes, this allows you to talk to them via traefik over a stable URL.

Before you can use Traefik from an external service such as Grafana agent, you will need to ensure that the Traefik URL is routable from the service host, and that the address is stable (for example, not a dynamic IP). In other words, Traefik’s own URL also needs to be stable.

In the Juju model where cos-lite is installed, you can run:

juju run traefik/0 show-proxied-endpoints

Assuming you have configured the traefik charm to use an external hostname, for example "traefik.url", you will see something like:

proxied-endpoints: '{
    "prometheus/0": {"url": "https://traefik.url/mymodel-prometheus-0"},
    "loki/0": {"url": "https://traefik.url/mymodel-loki-0"},
    "alertmanager": {"url": "https://traefik.url/mymodel-alertmanager"},
    "catalogue": {"url": "https://traefik.url/mymodel-catalogue"},
}'

You can also open https://traefik.url/mymodel-catalogue in a browser to see a page with links to all cos-lite components’ user interfaces.

At this point you will need to follow the documentation on how to configure the Grafana agent. Use the URLs you obtained from traefik to tell the agent where to send its telemetry.
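As a rough illustration of how the proxied endpoints map onto an agent configuration, the sketch below writes a minimal static-mode `grafana-agent.yaml`. It assumes the example URLs shown above, that Prometheus accepts remote-write at `/api/v1/write`, and that Loki accepts pushes at `/loki/api/v1/push` under their respective proxied URLs. The output path, the `/tmp` locations, and the `varlog` job name are placeholders chosen for this example, not values mandated by cos-lite. Check the result against the official Grafana agent documentation before using it.

```shell
# Base URLs as reported by show-proxied-endpoints (example values from this guide).
PROMETHEUS_URL="${PROMETHEUS_URL:-https://traefik.url/mymodel-prometheus-0}"
LOKI_URL="${LOKI_URL:-https://traefik.url/mymodel-loki-0}"

# Assumption: write to the current directory; copy the file to wherever
# your Grafana agent installation reads its configuration from.
CONFIG_FILE="${CONFIG_FILE:-./grafana-agent.yaml}"

cat > "$CONFIG_FILE" <<EOF
metrics:
  # Placeholder path; pick a directory the agent can write to.
  wal_directory: /tmp/grafana-agent-wal
  global:
    scrape_interval: 60s
    remote_write:
      - url: ${PROMETHEUS_URL}/api/v1/write
integrations:
  # Host metrics such as CPU, memory and disk usage.
  node_exporter:
    enabled: true
logs:
  configs:
    - name: default
      positions:
        filename: /tmp/grafana-agent-positions.yaml
      clients:
        - url: ${LOKI_URL}/loki/api/v1/push
      scrape_configs:
        - job_name: varlog
          static_configs:
            - targets: [localhost]
              labels:
                job: varlog
                __path__: /var/log/*.log
EOF

echo "Wrote $CONFIG_FILE"
```

In this sketch the Prometheus endpoint receives metrics and the Loki endpoint receives logs; the alertmanager and catalogue endpoints are not needed by the agent. Override `PROMETHEUS_URL` and `LOKI_URL` with the values from your own deployment before running it.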

Add custom dashboards and alerts

In order to add your own dashboards and alerts to cos-lite you will need to deploy the cos-config charm on top of cos-lite.

Follow this guide to set up cos-config in the same Juju model in which cos-lite is deployed.

TLS

You can deploy cos-lite with the tls overlay to enable secure communications with and within COS Lite.

You can follow this guide to enable TLS in Traefik and COS Lite.

Known limitations and upcoming features

Identity

We are “working towards”[citation needed] an integration with Canonical’s IAM bundle to provide a charmed identity solution that lets you lock down your observability stack behind an identity provider. Stay tuned for updates!

Tracing

We are “working towards”[citation needed] a tracing overlay to add distributed tracing capabilities to cos-lite. Once that work is done, you will be able to add Grafana Tempo to the stack.

Only export metrics with prometheus-scrape-target

In some rare circumstances, you might prefer to use prometheus-scrape-target instead of grafana-agent. Namely:

  • when you only need metrics (no logs, traces, etc…)
  • when you’d rather make the necessary firewall changes on the workload you want to monitor than expose cos-lite through ingress
  • when you’re not able to install anything (or at least not grafana-agent) on the workload you want to monitor

If this is your situation, we’ve got you covered. You can deploy prometheus-scrape-target and configure it to scrape your workload.
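A minimal sketch of what that could look like, run in the Juju model where cos-lite is deployed. The application name `scrape-target` and the address `10.0.0.5:9100` are hypothetical placeholders; the `targets` and `metrics_path` options and the `metrics-endpoint` relation name are given as we understand the charms to define them, so confirm them against the charm documentation for your revision.

```shell
# Hypothetical workload address (host:port) exposing Prometheus metrics.
TARGET="10.0.0.5:9100"

# Deploy the scrape-target charm and point it at the workload.
juju deploy prometheus-scrape-target-k8s scrape-target \
  --config targets="$TARGET" \
  --config metrics_path="/metrics"

# Hand the scrape job over to the Prometheus instance in cos-lite.
juju integrate scrape-target:metrics-endpoint prometheus:metrics-endpoint
```

With this approach Prometheus pulls metrics from the workload, so the workload's metrics port must be reachable from the cos-lite cluster.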

@tmihoc can you help get the TOC to work? @jose added it earlier yesterday but for some reason it doesn’t seem to work

Done. (Also fixed some typos.)

Hey, can you provide a sample grafana-agent.yaml that connects to COS Lite? I can’t really figure out which of the proxied endpoints I should use, and where I should set it in the agent.

Having a simple metric collection from the host we are monitoring, for example, CPU usage, would be very desirable.

Good point! I’ll see if someone from the team can chip in a sample config for you.