The purpose of this document is to get an idea of how Tempo HA performs under different loads and to settle on sensible CPU and memory request values for both the coordinator and the worker.
Environment
- No resource limit set
- Microk8s v1.28.12
- Juju 3.4.5
- x86_64 8 core CPU
- 16GB RAM
- SSD disk
The results were obtained with the following charm versions:
App | Version | Charm | Channel | Rev |
---|---|---|---|---|
alertmanager | 0.27.0 | alertmanager-k8s | edge | 129 |
catalogue | | catalogue-k8s | edge | 59 |
grafana | 9.5.3 | grafana-k8s | edge | 118 |
grafana-agent-k8s | 0.40.4 | grafana-agent-k8s | edge | 86 |
loki | 2.9.6 | loki-k8s | edge | 163 |
prometheus | 2.52.0 | prometheus-k8s | | |
traefik | 2.11.0 | traefik-k8s | edge | 203 |
tempo-coordinator | | tempo-coordinator-k8s | edge | 2 |
tempo-worker | 2.4.0 | tempo-worker-k8s | edge | 6 |
minio | | minio | edge | 357 |
s3-integrator | | s3-integrator | edge | 33 |
Method
We’ll use Tempo workload metrics and MicroK8s’ kubelet cAdvisor metrics, scraped by Prometheus, to monitor the workloads and inspect them in Grafana.
Metrics
Tempo metrics
Tempo exposes a set of metrics that are useful for our testing purposes. Below are the expressions used with those metrics:
# ingestion throughput in KiB/s
rate(tempo_distributor_bytes_received_total[5m]) / 1024
# ingestion rate in spans/s
rate(tempo_distributor_spans_received_total[5m])
# size of the data stored in the backend, in GiB
tempodb_backend_bytes_total / 1024 / 1024 / 1024
cAdvisor metrics
cAdvisor is a component integrated into Kubernetes’ kubelet that exposes container-level metrics, including CPU and memory usage, for every container deployed on the Kubernetes cluster. Below are the expressions used with those metrics:
# container memory usage in MiB
container_memory_usage_bytes{pod=~"<POD NAME>"} / 1024 / 1024
# container CPU usage in millicores
rate(container_cpu_usage_seconds_total{pod=~"<POD NAME>"}[5m]) * 1000
Ingestion
We’ll ingest spans into the Tempo coordinator with a trace generation script at three different rates and observe how each rate affects CPU and memory usage (see the sketch after this list):
- 100 spans/sec
- 500 spans/sec
- 1000 spans/sec
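The report does not include the generation script itself, so the following is only a sketch of how such a load can be produced, here with the OpenTelemetry telemetrygen utility pointed at the coordinator’s OTLP gRPC receiver. The endpoint address is a placeholder and the flag names are assumptions that may differ between telemetrygen versions, so check telemetrygen traces --help before running it:
# push a steady trace load at the target rate for 10 minutes
# (<coordinator-ip> is a placeholder; rate semantics (traces vs spans per second)
#  and flag names should be verified against your telemetrygen version)
telemetrygen traces \
  --otlp-endpoint <coordinator-ip>:4317 \
  --otlp-insecure \
  --rate 100 \
  --duration 10m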
Setup
First, deploy cos-lite and grafana-agent-k8s:
juju deploy cos-lite --channel=latest/edge --trust
juju deploy grafana-agent-k8s --channel=latest/edge
Deploy tempo-bundle
Note that there are two modes of operation for the Tempo cluster:
- Monolithic mode
- Distributed microservices mode
Both modes will be used to run the same set of loads.
Deploy Tempo in Monolithic Mode
git clone https://github.com/canonical/tempo-bundle.git
cd tempo-bundle
tox -e render-bundle
juju deploy ./bundle.yaml --trust
Deploy Tempo in Microservices Mode
git clone https://github.com/canonical/tempo-bundle.git
cd tempo-bundle
tox -e render-bundle -- --mode=recommended-microservices
juju deploy ./bundle.yaml --trust
In order for the Tempo cluster to fully work, you need to integrate it with an S3-compatible object store, as shown here.
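For a test environment backed by the minio and s3-integrator charms from the table above, that integration amounts to pointing s3-integrator at a bucket, handing it credentials, and relating it to the coordinator. The commands below are a sketch only: the config keys, action name, endpoint, and credentials are assumptions/placeholders to be checked against the s3-integrator documentation:
# point s3-integrator at the minio service and a bucket for Tempo's blocks
juju config s3-integrator endpoint=http://<minio-ip>:9000 bucket=tempo
# hand over the (placeholder) credentials via the charm's action
juju run s3-integrator/leader sync-s3-credentials access-key=<ACCESS_KEY> secret-key=<SECRET_KEY>
# relate the object storage provider to the Tempo coordinator
juju integrate s3-integrator tempo-coordinator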
Scrape Microk8s metrics
Currently, the Prometheus charm does not have a mechanism to scrape cAdvisor metrics. In order to collect them, modify Prometheus’s configuration and add the scrape job below:
- job_name: kubernetes-cadvisor
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __address__
      replacement: kubernetes.default.svc.cluster.local:443
    - source_labels: [__meta_kubernetes_node_name]
      regex: (.+)
      target_label: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
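Once Prometheus picks up the new configuration, you can confirm that the cAdvisor job is actually being scraped, for example by querying the targets API from a machine that can reach the Prometheus unit (the address is a placeholder):
# check that the kubernetes-cadvisor job shows up among the active targets
curl -s http://<prometheus-ip>:9090/api/v1/targets | grep -c kubernetes-cadvisor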
Integrate
jhack imatrix fill
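jhack imatrix fill creates every available integration between the applications deployed in the model in one go. If you prefer to wire things up by hand, the equivalent is a series of juju integrate commands; the pairs below are illustrative only, and Juju will prompt for an explicit endpoint whenever a pair is ambiguous:
# illustrative only: relate Tempo to the rest of the observability stack one pair at a time
juju integrate tempo-coordinator prometheus
juju integrate tempo-coordinator grafana
juju integrate tempo-coordinator loki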
Results
Note that the CPU and memory request values below are for the highest resource-consuming container in each pod, which is charm in the coordinator and tempo in the worker.
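To see for yourself which container dominates inside a pod, the cAdvisor queries above can be cross-checked against live usage, for example as follows (assuming the MicroK8s metrics-server addon is enabled; the pod and namespace names are placeholders):
# per-container CPU and memory usage for a worker pod in the Juju model's namespace
microk8s kubectl top pod <tempo-worker-pod> -n <juju-model> --containers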
Monolithic mode
100 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~10-30 | ~220-497 |
Coordinator | ~2.4-9.2 | ~107-110 |
500 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~25-82 | ~172-584 |
Coordinator | ~3-10 | ~90-107 |
1000 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Worker | ~34-100 | ~186-684 |
Coordinator | ~4-12 | ~86-118 |
Microservices mode
100 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~6-48 | ~117-445 |
Ingester | ~5.6-10.6 | ~145-183 |
Distributor | ~4-8 | ~99-103 |
Querier | ~2-4.5 | ~115-121 |
Metrics Generator | ~1-3.7 | ~70 |
Query Frontend | ~1.8-4.5 | ~94-99 |
Coordinator | ~2.6-15.8 | ~94-100 |
500 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~4-60 | ~114-528 |
Ingester | ~7.5-14 | ~126-187 |
Distributor | ~6-11 | ~93-99 |
Querier | ~3-6 | ~112-117 |
Metrics Generator | ~1-4 | ~64-68 |
Query Frontend | ~1.7-5 | ~89-98 |
Coordinator | ~3-16 | ~92-100 |
1000 spans/sec
Component | CPU (min-max millicores) | RAM (min-max MiB) |
---|---|---|
Compactor | ~7-66 | ~116-550 |
Ingester | ~10-17 | ~127-190 |
Distributor | ~9-12 | ~88-97 |
Querier | ~1.6-5 | ~107-115 |
Metrics Generator | ~1.2-4 | ~62-67 |
Query Frontend | ~4-12 | ~79-93 |
Coordinator | ~3-20 | ~88-97 |
Observations
- The most resource-consuming components are the compactor and the ingester. Shifting from monolithic to microservices mode clearly showed the resource spikes of each component separately.
- Per-pod resource consumption is a bit lower in microservices mode than in monolithic mode. However, the overall resource consumption is higher.
- Coordinator CPU and memory usage is not highly affected by the rate of spans ingested per second.
Compactor
The compactor is responsible for most of the memory and CPU consumption among the Tempo workers. There is a significant gap between the minimum and maximum CPU/memory consumed during the compactor’s lifecycle, which is due to Tempo’s compaction cycle, as shown below:
[Graph: compactor CPU usage]
[Graph: compactor memory usage]
Ingester
The second most CPU-consuming component is the ingester. In microservices mode there are 3 ingester units with the workload distributed among them, as this is the recommended microservices deployment. The spikes occur when the ingester cuts a block, i.e. when the memory-buffered data reaches a predefined size or a time threshold is met and the block is ready to be flushed to the backend.
Conclusions
The CPU and memory values chosen below are meant to be used as requests, not as limits, on the pod’s resources. They are sensible estimates of what the workload might need and do not guarantee that the workload will never need more resources.
Coordinator
From the above observations, the coordinator is not highly affected by the mode of operation or the rate of ingested spans per second.
From the above results, the CPU consumption varies between 2-20 millicores. A reasonable CPU request for the container might be 50m, since it is a relatively small value that would cover the coordinator’s CPU needs without consuming much of the cluster’s pool.
From the above results, the memory consumption varies between 85-100 MiB. A reasonable memory request for the container might be 100MiB, to cover the pod’s observed minimum needs.
Worker
From the above results, the CPU consumption varies between 10-100 millicores in monolithic mode and 1-70 millicores in microservices mode. A reasonable CPU request for each workload container might be 50m.
From the above results, the memory consumption varies between 170-700 MiB in monolithic mode and 100-550 MiB in microservices mode. A 500 MiB request for each pod, sized to cover the compactor spikes, seems overkill, so a reasonable memory request for each workload container might be 200 MiB: it covers the needs of most worker roles, and the compactor can still consume more RAM during its spikes since the value is a request, not a limit.
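As a purely illustrative way of applying these numbers, the commands below set the suggested requests on the heaviest container of each pod with kubectl. The statefulset, container, and namespace names are placeholders, and on a charmed deployment you would normally rely on whatever resource-setting mechanism the charms themselves expose rather than patching workloads by hand:
# requests only, no limits; names and namespace are placeholders
microk8s kubectl -n <juju-model> set resources statefulset/<tempo-worker-app> \
  -c tempo --requests=cpu=50m,memory=200Mi
microk8s kubectl -n <juju-model> set resources statefulset/<tempo-coordinator-app> \
  -c charm --requests=cpu=50m,memory=100Mi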