Tutorial: Deploy Tempo HA on top of COS-Lite

This is a work in progress and will be updated.

Introduction

The Tempo HA solution consists of two Juju charms, tempo-coordinator-k8s and tempo-worker-k8s, that together deploy and operate Tempo, a distributed tracing backend by Grafana. The solution can operate independently, but we recommend running it alongside COS Lite.

Prerequisites

This tutorial assumes you already have a Juju model with COS Lite deployed on a Kubernetes cloud. To deploy one, follow the COS Lite tutorial first. If you just want to try out deploying Tempo, you can skip this part and the Relating to COS-Lite steps; however, you will then need to access your Tempo instance using juju ssh. If you don’t know how to do that, you will likely benefit more from going through the COS Lite tutorial first.

The Tempo charm uses S3-compatible object storage as its storage backend. In most production deployments, this means either deploying Ceph or connecting to an S3 backend via the s3-integrator charm. For testing and development, we recommend using the minio charm. To deploy and configure Minio to provide an S3 bucket to Tempo, follow this guide.

In the following steps we will assume that you have, in a model called cos:

  • a COS-Lite deployment
  • an application called s3 providing an S3 endpoint.

Deploy monolithic setup

To deploy Tempo HA in monolithic mode, where a single worker node runs all components of distributed Tempo, you need to deploy a single instance of the tempo-coordinator-k8s charm as tempo and a single instance of the tempo-worker-k8s charm as tempo-worker.

Integrate them over cluster and integrate the tempo application with s3. At that point tempo should go to active status.

code
juju deploy tempo-coordinator-k8s --channel edge --trust tempo
juju deploy tempo-worker-k8s --channel edge --trust tempo-worker
juju integrate tempo tempo-worker
juju integrate tempo s3
juju integrate tempo:ingress traefik:traefik-route

# if you want to set up tracing for COS-Lite charms and the workloads that support it:
juju integrate tempo traefik:tracing
juju integrate tempo loki:tracing
juju integrate tempo grafana:tracing
juju integrate tempo prometheus:tracing

# grafana datasource integration
juju integrate tempo:grafana-source grafana:grafana-source

# self-monitoring integrations
juju integrate tempo:grafana-dashboard grafana:grafana-dashboard
juju integrate tempo:metrics-endpoint prometheus:metrics-endpoint
juju integrate tempo:logging loki:logging
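
To check that the applications have settled after these integrations, something like the following sketch may help. It assumes a Juju 3.x client, where the `juju wait-for` command is available; the `wait_active` helper and the `JUJU` runner variable are hypothetical names introduced here for illustration and are not part of the charms.

```shell
# Runner for the Juju CLI. Setting JUJU=echo gives a dry run that only
# prints the arguments (a convention assumed for this sketch).
JUJU="${JUJU:-juju}"

# wait_active APP...: block until each named application reports
# active status, or fail after the timeout.
wait_active() {
  for app in "$@"; do
    "$JUJU" wait-for application "$app" \
      --query='status=="active"' --timeout 10m || return 1
  done
}
```

Calling `wait_active tempo tempo-worker` would then block until both applications are active. Running `juju status` gives the same information interactively.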

Add dedicated worker roles

You can turn the monolithic deployment you have now into a distributed one by assigning some or all Tempo roles to individual worker units. See this doc for an explanation of the architecture, and this document for a reference of the Tempo roles specifically. For Tempo to work, each of the required roles must be assigned to a worker. This requirement is trivially satisfied if a worker has the role all.

So, without removing the tempo-worker application, we are going to start by adding an ingester worker node. Deploy another instance of the tempo-worker-k8s charm as tempo-ingester and add it to the same tempo coordinator instance to have it join the cluster. Configure the ingester application, as you deploy it or afterwards, to only have the ingester role enabled.

code
juju deploy tempo-worker-k8s --channel edge --trust tempo-ingester --config role-all=false --config role-ingester=true
juju integrate tempo tempo-ingester

Wait for the new worker to reach active status. You now have two worker nodes running the ingester component, one of which (tempo-worker) also runs all other components.
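
The deploy command above sets the role at deployment time. If you would rather assign the role afterwards, a minimal sketch using `juju config` could look like this. The `set_single_role` helper and the `JUJU` runner variable are hypothetical names used only for illustration; the `role-all` and `role-ingester` options are the ones shown in the deploy command.

```shell
# Runner for the Juju CLI. Setting JUJU=echo gives a dry run that only
# prints the arguments (a convention assumed for this sketch).
JUJU="${JUJU:-juju}"

# set_single_role APP ROLE: turn off the catch-all role and enable
# exactly one role on an already deployed worker application.
set_single_role() {
  "$JUJU" config "$1" role-all=false "role-$2=true"
}
```

For example, `set_single_role tempo-ingester ingester` would run `juju config tempo-ingester role-all=false role-ingester=true`.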

Repeat the above step for each of the other required roles: querier, query-frontend, distributor, and compactor. When you’re done, remove the tempo-worker application: each of the required roles it initially held has by then been transferred to the new worker applications.
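
The repetition can be scripted. The sketch below assumes that each role's config option follows the `role-<name>` pattern seen with `role-ingester`, and that each worker application is named `tempo-<role>`. Both are assumptions made for illustration, as are the helper names and the `JUJU` runner variable.

```shell
# Runner for the Juju CLI. Setting JUJU=echo gives a dry run that only
# prints the arguments (a convention assumed for this sketch).
JUJU="${JUJU:-juju}"

# Required roles still to be assigned (the ingester is already deployed).
REMAINING_ROLES="querier query-frontend distributor compactor"

# Deploy one dedicated worker application per remaining role and
# integrate it with the tempo coordinator.
deploy_role_workers() {
  for role in $REMAINING_ROLES; do
    "$JUJU" deploy tempo-worker-k8s --channel edge --trust "tempo-$role" \
      --config role-all=false --config "role-$role=true" || return 1
    "$JUJU" integrate tempo "tempo-$role" || return 1
  done
}

# Remove the original all-roles worker once every dedicated worker is active.
retire_monolithic_worker() {
  "$JUJU" remove-application tempo-worker
}
```

Run `deploy_role_workers` first, and call `retire_monolithic_worker` only after all the new workers have reached active status. Removing tempo-worker earlier would leave some required roles unassigned.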

Further reading