How to deploy a distributed COS with Terraform

The reference COS terraform module makes use of the following Terraform modules:

Setup requirements

The first step is to create a new model dedicated to COS on a Kubernetes controller:

juju add-model cos

An S3-compatible object store is needed so that Loki, Mimir and Tempo can store logs, metrics and traces. MicroCeph is the easiest way to get up and running with Ceph. You can follow this guide to deploy MicroCeph.

After deploying MicroCeph you will need this information in order to deploy COS:

  • The S3 endpoint, for the s3_endpoint variable.
  • The credentials (access_key and secret_key), for the s3_user and s3_password variables respectively.
  • The names of the buckets, for the loki_bucket, mimir_bucket and tempo_bucket variables.
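As a sketch, these values can be collected in a variables file. Terraform automatically loads a file named terraform.tfvars from the working directory. The values below are placeholders (the same ones used in the example apply command in this post), not real credentials. Replace them with the details of your own MicroCeph setup.

```hcl
# terraform.tfvars (placeholder values; replace with your own)
s3_endpoint  = "http://192.168.1.145" # S3 endpoint exposed by MicroCeph
s3_user      = "foo"                  # access_key
s3_password  = "bar"                  # secret_key
loki_bucket  = "loki"
mimir_bucket = "mimir"
tempo_bucket = "tempo"
```

Since this file holds credentials, it is best kept out of version control.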

Deploy

Create a main.tf file, importing the cos module:

# COS module that deploys the whole Canonical Observability Stack
module "cos" {
  source       = "git::https://github.com/canonical/observability//terraform/modules/cos"
  model_name   = var.model_name
  channel      = var.channel
  s3_endpoint  = var.s3_endpoint
  s3_password  = var.s3_password
  s3_user      = var.s3_user
  loki_bucket  = var.loki_bucket
  mimir_bucket = var.mimir_bucket
  tempo_bucket = var.tempo_bucket

  loki_backend_units            = var.loki_backend_units
  loki_read_units               = var.loki_read_units
  loki_write_units              = var.loki_write_units
  mimir_backend_units           = var.mimir_backend_units
  mimir_read_units              = var.mimir_read_units
  mimir_write_units             = var.mimir_write_units
  tempo_compactor_units         = var.tempo_compactor_units
  tempo_distributor_units       = var.tempo_distributor_units
  tempo_ingester_units          = var.tempo_ingester_units
  tempo_metrics_generator_units = var.tempo_metrics_generator_units
  tempo_querier_units           = var.tempo_querier_units
  tempo_query_frontend_units    = var.tempo_query_frontend_units
}

variable "channel" {
  description = "Charms channel"
  type        = string
  default     = "latest/edge"
}

variable "model_name" {
  description = "Model name"
  type        = string
}

variable "use_tls" {
  description = "Specify whether to use TLS or not for coordinator-worker communication. By default, TLS is enabled through self-signed-certificates"
  type        = bool
  default     = true
}

variable "s3_endpoint" {
  description = "S3 endpoint"
  type        = string
}

variable "s3_user" {
  description = "S3 user"
  type        = string
  sensitive   = true
}

variable "s3_password" {
  description = "S3 password"
  type        = string
  sensitive   = true
}

variable "loki_bucket" {
  description = "Loki bucket name"
  type        = string
  sensitive   = true
}

variable "mimir_bucket" {
  description = "Mimir bucket name"
  type        = string
  sensitive   = true
}

variable "tempo_bucket" {
  description = "Tempo bucket name"
  type        = string
  sensitive   = true
}

variable "loki_backend_units" {
  description = "Number of Loki worker units with backend role"
  type        = number
  default     = 1
}

variable "loki_read_units" {
  description = "Number of Loki worker units with read role"
  type        = number
  default     = 1
}

variable "loki_write_units" {
  description = "Number of Loki worker units with write role"
  type        = number
  default     = 1
}

variable "mimir_backend_units" {
  description = "Number of Mimir worker units with backend role"
  type        = number
  default     = 1
}

variable "mimir_read_units" {
  description = "Number of Mimir worker units with read role"
  type        = number
  default     = 1
}

variable "mimir_write_units" {
  description = "Number of Mimir worker units with write role"
  type        = number
  default     = 1
}

variable "tempo_compactor_units" {
  description = "Number of Tempo worker units with compactor role"
  type        = number
  default     = 1
}

variable "tempo_distributor_units" {
  description = "Number of Tempo worker units with distributor role"
  type        = number
  default     = 1
}

variable "tempo_ingester_units" {
  description = "Number of Tempo worker units with ingester role"
  type        = number
  default     = 1
}

variable "tempo_metrics_generator_units" {
  description = "Number of Tempo worker units with metrics-generator role"
  type        = number
  default     = 1
}

variable "tempo_querier_units" {
  description = "Number of Tempo worker units with querier role"
  type        = number
  default     = 1
}

variable "tempo_query_frontend_units" {
  description = "Number of Tempo worker units with query-frontend role"
  type        = number
  default     = 1
}
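Optionally, a unit-count variable can reject bad input before anything is deployed. The sketch below is not part of the reference module: it assumes that fewer than one unit per role is never wanted, and it would replace the existing loki_backend_units declaration above, since Terraform does not allow the same variable to be declared twice. The same pattern could be repeated for the other *_units variables.

```hcl
# Hypothetical hardening of one variable; the "at least 1 unit" rule is an assumption.
variable "loki_backend_units" {
  description = "Number of Loki worker units with backend role"
  type        = number
  default     = 1

  validation {
    # Accept only whole numbers greater than or equal to 1.
    condition     = var.loki_backend_units >= 1 && floor(var.loki_backend_units) == var.loki_backend_units
    error_message = "The loki_backend_units value must be a whole number greater than or equal to 1."
  }
}
```

With this in place, terraform plan fails early with the error message above if, for example, -var='loki_backend_units=0' is passed.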

Then use Terraform to deploy the module with 3 units per worker role:

terraform init

terraform apply -var='s3_password=bar' -var='s3_user=foo' -var='s3_endpoint=http://192.168.1.145' \
-var='loki_bucket=loki' -var='model_name=cos' -var='mimir_bucket=mimir' -var='tempo_bucket=tempo' \
-var='loki_backend_units=3' -var='loki_read_units=3' -var='loki_write_units=3' \
-var='mimir_backend_units=3' -var='mimir_read_units=3' -var='mimir_write_units=3' \
-var='tempo_compactor_units=3' -var='tempo_distributor_units=3' -var='tempo_ingester_units=3' \
-var='tempo_metrics_generator_units=3' -var='tempo_querier_units=3' -var='tempo_query_frontend_units=3'

A few minutes after running these two commands, you will have a distributed COS deployment!
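The long list of -var flags can also be written as a variables file, which is easier to review and reuse. A minimal sketch, assuming a file named ha.tfvars (the name is arbitrary) with the same 3-units-per-worker settings:

```hcl
# ha.tfvars: 3 units for every worker role (the file name is an arbitrary choice)
loki_backend_units            = 3
loki_read_units               = 3
loki_write_units              = 3
mimir_backend_units           = 3
mimir_read_units              = 3
mimir_write_units             = 3
tempo_compactor_units         = 3
tempo_distributor_units       = 3
tempo_ingester_units          = 3
tempo_metrics_generator_units = 3
tempo_querier_units           = 3
tempo_query_frontend_units    = 3
```

The file would then be passed with terraform apply -var-file=ha.tfvars. The model name and S3 settings still need to be supplied, either as -var flags or from another variables file. The same -var-file option also works with terraform destroy.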

Cleanup

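To tear down the deployment, run the command below. Terraform prompts for any required variable that is not passed on the command line:

terraform destroy -var model_name=cos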

Feedback welcome!

As always, feedback is very welcome! Feel free to let us know your thoughts, questions, or suggestions either here or as an issue.
