Over the last year we have been working hard to understand the COS-light stack, slowly working our way towards deploying it in a production-like environment at our company.
We have worked with and learned from @0x12b to deploy it on a MicroK8s stack. We have watched the Grafana Agent charm on Charmhub mature, and have now also developed a few example charms to learn how to work with it.
We are now taking the last few steps into production and have some remaining questions that we would appreciate help getting clarity on:
- What are the technical details of how cross-controller, cross-model integrations work?
- What prerequisites are there for a user to offer and consume integrations across two separate controllers, for example?
- What are the benefits/drawbacks of using separate controllers for k8s clouds and VM clouds, as opposed to adding “lxd-cloud -> k8s-controller” or “k8s-cloud -> lxd-controller”?
- Loki: We haven’t yet been able to figure out how to get the COSAgentProvider to point to specific log files. How do we do that? Perhaps there are some docs on this?
- COS-light uses an old version of Grafana. Will the charms in the COS-light bundle always handle this, or how do upgrades of Grafana and the other components in the COS-light stack work?
- We would like to go to production on the “stable channel”, but Traefik currently has an issue that causes it to lose its network address after a system reboot. Is there a hotfix that would allow us to go into production on a stable channel?
- Backup/restore procedures. Is there anything documented on this for the COS-light stack?
- TLS: how can we set up certificates that allow us to expose Grafana/Prometheus/Traefik externally? Is there anything written on this? Any guides to follow?
- Are there any examples of, or information on, how an alert rule needs to be constructed for Prometheus and Loki? We have looked at the repo https://github.com/canonical/cos-configuration-k8s-operator/tree/main/tests/samples, but it does not explain the general method/process for creating our own dashboards, rules, etc. It is hard to create your own.
- Are there any descriptions of how we create integrations with:
  - PagerDuty: https://www.pagerduty.com/
  - Zammad: https://zammad.com/en
  - Node-RED: https://nodered.org/
- How do you actually manage to UPDATE the Grafana dashboards as part of upgrading a charm that uses the COSAgentProvider, since this is not supported by the Grafana API? We are very interested in learning the details, since it would help us in developing our own dashboards.
We’ll work hard from our end to figure out the above, but I expect some of you have already covered this ground, or at least some of it.
Thanx!