Observability Team Updates - Week 47-48 (2022)

Hi everyone!

Below you’ll find the biweekly updates from the team for weeks 47 and 48.

The Team

The observability team at Canonical consists of Dylan, Jose, Leon, Luca, Pietro, Ryan, and Simme. Our goal is to provide you with the best open-source observability stack possible, turning your day-2 operations into smooth sailing.

COS Lite

COS Lite is a lightweight, highly integrated observability suite, powered by Python operators and running on Juju. Find more information on Charmhub or go straight to GitHub.

Features

We’ve continued working on an upstream contribution to Grafana Loki that allows a size-based retention threshold to be configured in addition to the existing age-based one.

Fixes

  • Make the rule file detection of prometheus_scrape more permissive by allowing for .yml and .yaml file extensions in addition to .rule and .rules #400
  • Disable injection of filter selectors for Grafana dashboards contributed through the COS Configuration charm #30 #33
  • Add tests to verify prometheus_scrape allows for paths per unit #402
  • Make Loki provide the ingressed URL, when one is available, over the prometheus_scrape relation #214 #397
  • Fix timing issues that caused Prometheus to intermittently go into an error state during removal #392
  • Fix a timing issue where Prometheus tried to access a relation databag that no longer existed, causing it to enter an error state during relation break #407
  • Fix an issue where rebooting the Kubernetes cluster node(s) would wipe existing alert rules from Prometheus #398
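
To illustrate the more permissive rule file detection from #400, here is a minimal sketch of suffix-based matching. It is not the actual prometheus_scrape code: the names RULE_SUFFIXES, is_rule_file, and find_rule_files are hypothetical, and only the four extensions come from the fix description above.

```python
from pathlib import Path

# Accepted suffixes, taken from the fix description above.
# The constant and function names are hypothetical, for illustration only.
RULE_SUFFIXES = {".rule", ".rules", ".yml", ".yaml"}


def is_rule_file(path):
    """Return True if the final suffix of the path is one of the accepted ones."""
    return Path(path).suffix in RULE_SUFFIXES


def find_rule_files(directory):
    """Recursively collect matching files below `directory`, sorted for stable order."""
    return sorted(
        p for p in Path(directory).rglob("*") if p.is_file() and is_rule_file(p)
    )
```

With this sketch, is_rule_file("alerts/cpu.yaml") and is_rule_file("alerts/cpu.rules") both match, while is_rule_file("README.md") does not.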

Have a nice week!
