As we explained in a previous post, high cardinality in Loki can lead to significant performance and storage issues, because each unique combination of labels creates a new log stream. When the value of a label such as filename changes frequently, the number of streams can grow rapidly, affecting the scalability and efficiency of the system. This is where structured_metadata becomes essential: instead of indexing dynamic labels like filename, we can store them as unindexed metadata, reducing cardinality while still retaining the ability to filter or search by those values.
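To make the cardinality effect concrete, here is a minimal Python sketch that counts streams as distinct sets of indexed labels. It is only an illustration of the idea, not of Loki's internals: the helper count_streams, the sample file paths, and the figure of 1000 files are all made up for this example.

```python
# Illustrative sketch: in Loki, each unique set of indexed labels is one stream.
# The helper name, label values, and file count below are hypothetical sample data.

def count_streams(entries, indexed_keys):
    """Count distinct streams when only `indexed_keys` are kept as labels."""
    streams = set()
    for labels in entries:
        # A stream is identified by its full set of indexed label pairs.
        stream_id = frozenset(
            (key, value) for key, value in labels.items() if key in indexed_keys
        )
        streams.add(stream_id)
    return len(streams)


# 1000 hypothetical log files, all belonging to the same job.
entries = [{"job": "app", "filename": f"/var/log/app-{i}.log"} for i in range(1000)]

with_filename = count_streams(entries, {"job", "filename"})  # filename indexed
without_filename = count_streams(entries, {"job"})           # filename not indexed

print(with_filename, without_filename)
```

With filename indexed, every file becomes its own stream; with only job indexed, all entries fall into a single stream.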
Configuring Grafana Agent to move filename into structured_metadata
To prevent labels like filename from generating unnecessary streams, Grafana Agent allows you to move these labels to structured_metadata. This can be done using log processing pipeline stages. A basic example is as follows:
scrape_configs:
  - job_name: log_scraper
    pipeline_stages:
      - structured_metadata:
          filename: filename
      - labeldrop:
          - filename
    static_configs:
      - labels:
          __path__: /var/log/app.log
          job: app
        targets:
          - localhost
This pipeline extracts filename from the labels and places it into structured_metadata, removing the filename label from logs sent to Loki but preserving the information in unindexed metadata.
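To picture what changes on the wire, the sketch below builds the same log entry in the JSON shape accepted by Loki's push endpoint, once with filename as a stream label and once with it attached as structured metadata (the optional third element of a value). This is an illustration only: the helper push_entry, the timestamp, and the log line are invented for the example, and agents typically push a compressed protobuf payload rather than JSON.

```python
import json

# Illustrative sketch of the JSON shape used by Loki's push API.
# push_entry, the timestamp, and the log line are hypothetical.

def push_entry(labels, timestamp_ns, line, structured_metadata=None):
    """Build a push payload containing a single log entry."""
    value = [str(timestamp_ns), line]
    if structured_metadata:
        # Structured metadata travels with the entry, not with the stream labels.
        value.append(structured_metadata)
    return {"streams": [{"stream": labels, "values": [value]}]}


ts = 1700000000000000000  # arbitrary nanosecond timestamp

# filename as an indexed label: it is part of the stream identity.
before = push_entry({"job": "app", "filename": "/var/log/app.log"}, ts, "request handled")

# filename as structured metadata: the stream is identified by job alone.
after = push_entry({"job": "app"}, ts, "request handled", {"filename": "/var/log/app.log"})

print(json.dumps(after, indent=2))
```

In the second payload the stream labels contain only job, so the entry no longer contributes a new stream per file.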
How queries need to be modified
When using structured metadata instead of labels, queries in Loki change slightly. Because filename is no longer an indexed label, it cannot be matched in the stream selector. Instead of querying filename="/var/log/app.log" inside the braces, you keep only the indexed labels in the selector, such as {job="app"}, and add a label filter expression after the | operator: | filename="/var/log/app.log". Note that this filters on the structured metadata value attached to each entry; it is not a text search of the log line itself. This approach reduces index load and helps avoid high cardinality while still allowing flexible filtering.
Before using structured metadata (with filename as a label):
{job="app", filename="/var/log/app.log"}
After using structured metadata (with filename moved to structured metadata):
{job="app"} | filename="/var/log/app.log"
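If you run these queries from a script rather than from Grafana, the only thing that changes is the LogQL string. The sketch below builds a request URL for Loki's query_range endpoint with the rewritten query. The base URL http://localhost:3100 and the helper query_range_url are assumptions for this example, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a Loki instance reachable at this address.
LOKI_URL = "http://localhost:3100"


def query_range_url(base_url, logql, limit=100):
    """Build a query_range URL for a LogQL query (hypothetical helper)."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


# filename matched in the stream selector (as an indexed label).
before_query = '{job="app", filename="/var/log/app.log"}'

# filename matched with a label filter after the selector (structured metadata).
after_query = '{job="app"} | filename="/var/log/app.log"'

url = query_range_url(LOKI_URL, after_query)

# Decode the URL again to confirm the query survives URL encoding unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]

print(url)
print(decoded)
```

The URL-encoded form is hard to read, so decoding it back is a quick way to check that the selector and the filter were passed through as intended.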
The limitation in architectures with multiple Grafana Agent instances
However, this solution encounters a limitation in more complex architectures. If you have a deployment where Promtail or Grafana Agent sends logs to an intermediate Grafana Agent, which then forwards the logs to Loki:
promtail (or grafana-agent) --> grafana-agent --> Loki
the structured_metadata will not be transmitted correctly. Grafana Agent, in its current state, does not forward the structured_metadata it receives from another source, so this information is lost before it reaches Loki.
The need for alternatives in more complex configurations
In such cases, one option would be to connect Promtail or the first Grafana Agent instance directly to Loki, ensuring that the structured_metadata reaches its destination intact. Alternatively, if the architecture requires Grafana Agent to function as an intermediary, it may be necessary to explore other solutions, such as avoiding the use of structured_metadata at this stage and managing the data differently to retain key information without affecting Loki’s performance.