The purpose of this document is to get an idea of how Tempo HA performs under different loads and to settle on sensible CPU and memory request values for both the coordinator and the worker.
Environment
- No resource limit set
- Microk8s v1.28.12
- Juju 3.4.5
- x86_64 8 core CPU
- 16GB RAM
- SSD disk
The results were obtained with the following charm versions:
App | Version | Charm | Channel | Rev |
---|---|---|---|---|
alertmanager | 0.27.0 | alertmanager-k8s | edge | 129 |
catalogue | | catalogue-k8s | edge | 59 |
grafana | 9.5.3 | grafana-k8s | edge | 118 |
grafana-agent-k8s | 0.40.4 | grafana-agent-k8s | edge | 86 |
loki | 2.9.6 | loki-k8s | edge | 163 |
prometheus | 2.52.0 | prometheus-k8s | | |
traefik | 2.11.0 | traefik-k8s | edge | 203 |
tempo-coordinator | | tempo-coordinator-k8s | edge | 2 |
tempo-worker | 2.4.0 | tempo-worker-k8s | edge | 6 |
minio | | minio | edge | 357 |
s3-integrator | | s3-integrator | edge | 33 |
Method
We’ll use Tempo workload metrics and MicroK8s’ kubelet cAdvisor metrics, scraped by Prometheus, to monitor the workloads and inspect them in Grafana.
Metrics
Tempo metrics
Tempo exposes a set of metrics that are useful for our testing purposes. Below are the expressions used with those metrics:
# ingestion throughput in KiB/s
rate(tempo_distributor_bytes_received_total[5m]) / 1024
# ingestion rate in spans/s
rate(tempo_distributor_spans_received_total[5m])
# size of the data stored in the backend, in GiB
tempodb_backend_bytes_total / 1024 / 1024 / 1024
cAdvisor metrics
cAdvisor is a component integrated into Kubernetes’ kubelet that exposes container-level metrics, including CPU and memory usage, for every container deployed on the Kubernetes cluster. Below are the expressions used with those metrics:
# container memory usage in MiB
container_memory_usage_bytes{pod=~"<POD NAME>"} / 1024 / 1024
# container CPU usage in millicores
rate(container_cpu_usage_seconds_total{pod=~"<POD NAME>"}[5m]) * 1000
Ingestion
We’ll ingest spans into the Tempo coordinator with a trace generation script at three different rates and observe how each rate affects CPU and memory usage (see the sketch after this list):
- 100 spans/sec
- 500 spans/sec
- 1000 spans/sec
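The report does not include the generation script itself, so the following is only a sketch of how such a load can be produced, here with the OpenTelemetry telemetrygen utility pointed at the coordinator’s OTLP gRPC receiver. The endpoint address is a placeholder and the flag names are assumptions that may differ between telemetrygen versions, so check telemetrygen traces --help before running it:
# push a steady trace load at the target rate for 10 minutes
# (<coordinator-ip> is a placeholder; rate semantics (traces vs spans per second)
#  and flag names should be verified against your telemetrygen version)
telemetrygen traces \
  --otlp-endpoint <coordinator-ip>:4317 \
  --otlp-insecure \
  --rate 100 \
  --duration 10m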
Setup
First, deploy cos-lite and grafana-agent-k8s:
juju deploy cos-lite --channel=latest/edge --trust
juju deploy grafana-agent-k8s --channel=latest/edge
Deploy tempo-bundle
Note that there are two modes of operation for the Tempo cluster:
- Monolithic mode
- Distributed microservices mode
Both modes will be used to run the same set of loads.
Deploy Tempo in Monolithic Mode
git clone https://github.com/canonical/tempo-bundle.git
cd tempo-bundle
tox -e render-bundle
juju deploy ./bundle.yaml --trust
Deploy Tempo in Microservices Mode
git clone https://github.com/canonical/tempo-bundle.git
cd tempo-bundle
tox -e render-bundle -- --mode=recommended-microservices
juju deploy ./bundle.yaml --trust
In order for the Tempo cluster to fully work, you need to integrate it with an S3-compatible object store, as shown here.
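For a test environment backed by the minio and s3-integrator charms from the table above, that integration amounts to pointing s3-integrator at a bucket, handing it credentials, and relating it to the coordinator. The commands below are a sketch only: the config keys, action name, endpoint, and credentials are assumptions/placeholders to be checked against the s3-integrator documentation:
# point s3-integrator at the minio service and a bucket for Tempo's blocks
juju config s3-integrator endpoint=http://<minio-ip>:9000 bucket=tempo
# hand over the (placeholder) credentials via the charm's action
juju run s3-integrator/leader sync-s3-credentials access-key=<ACCESS_KEY> secret-key=<SECRET_KEY>
# relate the object storage provider to the Tempo coordinator
juju integrate s3-integrator tempo-coordinator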
Scrape Microk8s metrics
Currently, the Prometheus charm does not have a mechanism to scrape cAdvisor metrics. In order to collect them, modify Prometheus’s configuration and add the scrape job below:
- job_name: kubernetes-cadvisor
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc.cluster.local:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
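Once Prometheus picks up the new configuration, you can confirm that the cAdvisor job is actually being scraped, for example by querying the targets API from a machine that can reach the Prometheus unit (the address is a placeholder):
# check that the kubernetes-cadvisor job shows up among the active targets
curl -s http://<prometheus-ip>:9090/api/v1/targets | grep -c kubernetes-cadvisor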
Integrate
jhack imatrix fill
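jhack imatrix fill creates every available integration between the applications deployed in the model in one go. If you prefer to wire things up by hand, the equivalent is a series of juju integrate commands; the pairs below are illustrative only, and Juju will prompt for an explicit endpoint whenever a pair is ambiguous:
# illustrative only: relate Tempo to the rest of the observability stack one pair at a time
juju integrate tempo-coordinator prometheus
juju integrate tempo-coordinator grafana
juju integrate tempo-coordinator loki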
Results
Note that the CPU and memory request values below are for the highest resource-consuming container in each pod, which is charm in the coordinator and tempo in the worker.
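To see for yourself which container dominates inside a pod, the cAdvisor queries above can be cross-checked against live usage, for example as follows (assuming the MicroK8s metrics-server addon is enabled; the pod and namespace names are placeholders):
# per-container CPU and memory usage for a worker pod in the Juju model's namespace
microk8s kubectl top pod <tempo-worker-pod> -n <juju-model> --containers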
Monolithic mode
100 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~10-30 | ~220-497 |
Coordinator | ~2.4-9.2 | ~107-110 |
500 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~25-82 | ~172-584 |
Coordinator | ~3-10 | ~90-107 |
1000 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~34-100 | ~186-684 |
Coordinator | ~4-12 | ~86-118 |
Microservices mode
100 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~6-48 | ~117-445 |
Ingester | ~5.6-10.6 | ~145-183 |
Distributor | ~4-8 | ~99-103 |
Querier | ~2-4.5 | ~115-121 |
Metrics Generator | ~1-3.7 | ~70 |
Query Frontend | ~1.8-4.5 | ~94-99 |
Coordinator | ~2.6-15.8 | ~94-100 |
500 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~4-60 | ~114-528 |
Ingester | ~7.5-14 | ~126-187 |
Distributor | ~6-11 | ~93-99 |
Querier | ~3-6 | ~112-117 |
Metrics Generator | ~1-4 | ~64-68 |
Query Frontend | ~1.7-5 | ~89-98 |
Coordinator | ~3-16 | ~92-100 |
1000 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~7-66 | ~116-550 |
Ingester | ~10-17 | ~127-190 |
Distributor | ~9-12 | ~88-97 |
Querier | ~1.6-5 | ~107-115 |
Metrics Generator | ~1.2-4 | ~62-67 |
Query Frontend | ~4-12 | ~79-93 |
Coordinator | ~3-20 | ~88-97 |
Observations
- The most resource-consuming components are the compactor and the ingester. Shifting from monolithic to microservices mode clearly showed the resource spikes of each component separately.
- Per-pod resource consumption is a bit lower in microservices mode than in monolithic mode. However, the overall resource consumption is higher.
- Coordinator CPU and memory usage is not highly affected by the rate of spans ingested per second.
Compactor
The compactor is responsible for most of the memory and CPU consumption among the Tempo workers. There is a significant gap between the minimum and maximum CPU/memory consumed during the compactor’s lifecycle, which is due to Tempo’s compaction cycle, as shown below:
[Graph: compactor CPU usage]
[Graph: compactor memory usage]
Ingester
The second most CPU-consuming component is the ingester. In microservices mode there are 3 ingester units with the workload distributed among them, as this is the recommended microservices deployment. The spikes occur when the ingester cuts a block, i.e. when the memory-buffered data reaches a predefined size or a time threshold is met and the block is ready to be flushed to the backend.
Conclusions
The CPU and memory values chosen below are meant to be used as requests, not as limits, on the pod’s resources. They are sensible estimates of what the workload might need and do not guarantee that the workload will never need more resources.
Coordinator
From the above observations, the coordinator is not highly affected by the mode of operation or the rate of ingested spans per second.
From the above results, the CPU consumption varies between 2-20 millicores. A reasonable CPU request for the container might be 50m, since it is a relatively small value that would cover the coordinator’s CPU needs without consuming much of the cluster’s pool.
From the above results, the memory consumption varies between 85-100 MiB. A reasonable memory request for the container might be 100MiB, to cover the pod’s observed minimum needs.
Worker
From the above results, the CPU consumption varies between 10-100 millicores in monolithic mode and 1-70 millicores in microservices mode. A reasonable CPU request for each workload container might be 50m.
From the above results, the memory consumption varies between 170-700 MiB in monolithic mode and 100-550 MiB in microservices mode. A 500 MiB request for each pod, sized to cover the compactor spikes, seems overkill, so a reasonable memory request for each workload container might be 200 MiB: it covers the needs of most worker roles, and the compactor can still consume more RAM during its spikes since the value is a request, not a limit.
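As a purely illustrative way of applying these numbers, the commands below set the suggested requests on the heaviest container of each pod with kubectl. The statefulset, container, and namespace names are placeholders, and on a charmed deployment you would normally rely on whatever resource-setting mechanism the charms themselves expose rather than patching workloads by hand:
# requests only, no limits; names and namespace are placeholders
microk8s kubectl -n <juju-model> set resources statefulset/<tempo-worker-app> \
  -c tempo --requests=cpu=50m,memory=200Mi
microk8s kubectl -n <juju-model> set resources statefulset/<tempo-coordinator-app> \
  -c charm --requests=cpu=50m,memory=100Mi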