How to selectively drop metrics prior to ingestion

Applications are sometimes instrumented with more metrics than we can afford to store, from a resource perspective. In such cases, we can selectively drop some metrics before they are ingested.

Discover high-cardinality metrics

Prometheus’s /tsdb-status endpoint is handy for discovering high-cardinality metrics that you may want to drop prior to ingestion.

In this guide, we will show how to drop all the go_gc_* metrics. To see which go_gc_* metrics you currently have, and how many time series each one holds, run the following query:

count by (__name__) ({__name__=~"go_gc_.+"})
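The same discovery step can be scripted. The following is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 (an assumption, adjust for your deployment) and using the /api/v1/status/tsdb HTTP API, which reports series counts per metric name. The function names top_metrics and fetch_tsdb_status are hypothetical helpers, not part of any library.

```python
import json
import urllib.request

# Assumption: address of the Prometheus server; not taken from this guide.
PROMETHEUS_URL = "http://localhost:9090"


def top_metrics(tsdb_status: dict, limit: int = 10) -> list:
    """Return (metric name, series count) pairs, largest count first."""
    entries = tsdb_status["data"]["seriesCountByMetricName"]
    pairs = [(entry["name"], int(entry["value"])) for entry in entries]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:limit]


def fetch_tsdb_status(base_url: str = PROMETHEUS_URL) -> dict:
    """Fetch the TSDB status document from the Prometheus HTTP API."""
    url = f"{base_url}/api/v1/status/tsdb"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

Calling top_metrics(fetch_tsdb_status()) against a live server should list the metric names holding the most series, which are the usual candidates for dropping.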

Means for dropping metrics

Metrics can be dropped by using the drop action in several different places:

  • Under the <scrape_config> section (<metric_relabel_configs> subsection). This applies, for example, to the self-monitoring scrape jobs that COS Lite has in place.
  • Under the <remote_write> section (<write_relabel_configs> subsection). For example, Prometheus can be told to drop metrics before pushing them to another Prometheus over the remote-write API. This use case is not covered in this guide.
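To get a feel for what a drop rule on __name__ removes, here is a rough simulation in Python. It is only a sketch: Prometheus uses RE2 and anchors relabel regexes at both ends, which re.fullmatch approximates for simple patterns like the one in this guide. The helper name apply_drop and the sample metric list are made up for illustration.

```python
import re


def apply_drop(metric_names, regex):
    """Return the names that a `drop` rule on `__name__` would let through.

    Relabel regexes are fully anchored in Prometheus, so a name is dropped
    only when the whole name matches; re.fullmatch mimics that here.
    """
    pattern = re.compile(regex)
    return [name for name in metric_names if not pattern.fullmatch(name)]


# Hypothetical sample of scraped metric names.
scraped = [
    "go_gc_duration_seconds",
    "go_gc_heap_goal_bytes",
    "go_goroutines",
    "alertmanager_alerts",
]
kept = apply_drop(scraped, "go_gc_.+")
print(kept)
```

Because the match is anchored, go_goroutines survives even though it shares the go_ prefix; only names starting with go_gc_ followed by at least one character are dropped.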

Drop metrics via metric name in scrape config

Charms that integrate with Prometheus or Grafana Agent provide a “scrape config” to MetricsEndpointProvider (imported from charms.prometheus_k8s.v0.prometheus_scrape).

Take, for example, the Alertmanager self-metrics that Prometheus scrapes. If we do not want Prometheus or Grafana Agent to ingest any go_gc_* metrics from Alertmanager, we need to adjust the scrape job specified in the alertmanager charm:

diff --git a/src/charm.py b/src/charm.py
index fa3678c..f0e943b 100755
--- a/src/charm.py
+++ b/src/charm.py
@@ -250,6 +250,13 @@ class AlertmanagerCharm(CharmBase):
             "scheme": metrics_endpoint.scheme,
             "metrics_path": metrics_path,
             "static_configs": [{"targets": [target]}],
+            "metric_relabel_configs": [
+                {
+                    "source_labels": ["__name__"],
+                    "regex": "go_gc_.+",
+                    "action": "drop",
+                }
+            ]
         }
 
         return [config]
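For readers who want to see the resulting job as a whole rather than as a diff, the following standalone sketch builds an equivalent scrape job dictionary. The function names drop_by_name and build_scrape_job and the target address are hypothetical; the actual charm code is structured differently.

```python
def drop_by_name(regex: str) -> dict:
    """Build a metric relabel rule that drops series whose name matches regex."""
    return {"source_labels": ["__name__"], "regex": regex, "action": "drop"}


def build_scrape_job(target, metrics_path="/metrics", scheme="http", drop_regexes=()):
    """Assemble a scrape job dict, adding drop rules only when requested."""
    job = {
        "scheme": scheme,
        "metrics_path": metrics_path,
        "static_configs": [{"targets": [target]}],
    }
    if drop_regexes:
        job["metric_relabel_configs"] = [drop_by_name(r) for r in drop_regexes]
    return job


# Hypothetical target address, for illustration only.
job = build_scrape_job("alertmanager-0:9093", drop_regexes=["go_gc_.+"])
```

Keeping the rule construction in one small helper makes it easy to add further name patterns later without repeating the source_labels and action keys.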

Drop metrics via the scrape-config charm

Consider a typical scrape-config deployment such as:

graph LR
some-external-target --- scrape-target --- scrape-config --- prometheus

We can specify the drop action via a config option for the scrape-config charm:

$ juju config sc metric_relabel_configs="$(cat <<EOF
- source_labels: ["__name__"]
  regex: "go_gc_.+"
  action: "drop"
EOF
)"

Nice @sed-i !

These configs are on the MetricsProvider side of the relation (not Prometheus)… I was wondering whether we should let a COS Lite admin add these configs to Prometheus jobs. WDYT @0x12b?