How to get TLS to a HA juju controller (vm/lxd/aws)

I’ve just bootstrapped a juju controller in aws:

juju bootstrap my-controller aws/eu-north-1

After the controller comes online, I’ve enabled high availability (ha):

juju switch admin/controller
juju enable-ha

All good - my controller is now running on 3 instances in aws.

Now I’m looking to get this controller under a proper domain, for example my-controller.example.com, which means I need to get proper TLS certificates installed and distributed to all 3 instances. Typically that would involve something like acme.sh or Let’s Encrypt.

Is there anyone that can guide me through the “best” ™ way to get this in place so I can get my controller all SSL:ed? Pointers to docs?


So the “best” way to get there is to start with the DNS name that you want, and use that at bootstrap time. We’d have to dig a little bit, because at one point Juju could automate getting a signed certificate from Let’s Encrypt. You had to set up the DNS record to point at the IP address, but Juju could handle the Let’s Encrypt authentication bits. (We originally put the work together for JAAS a number of years ago.) However, IIRC Let’s Encrypt changed how they verify users, so it is plausible that the mechanism we were using is no longer supported by them.

I do know that you can bootstrap a juju controller with a pre-defined certificate authority (we use a CA because we use a different cert for each controller, and want to be able to update those certificates when a controller gets a new IP address, or you add more, etc).

We also pass that root certificate to all of the clients (and agents) so that they know the certificate they are presented with can be trusted. (If you look at ~/.local/share/juju/controllers.yaml, it includes the expected CA that will have signed the certificate that the client connects to. That is also reproduced in /var/lib/juju/agents/*/agent.conf.)
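To make the CA part a bit more concrete: one way to check that a client and an agent really hold the same CA is to compare fingerprints of the PEM blocks. Below is a minimal sketch, assuming you have copied the PEM text out of the ca-cert field yourself (the field name and the function name pem_fingerprint are my assumptions for illustration, not anything Juju ships). The colon-separated SHA-256 format is a common convention; whether it matches exactly what Juju prints is also an assumption.

```python
import hashlib
import ssl


def pem_fingerprint(pem_text):
    """Return the SHA-256 fingerprint of a single PEM certificate block.

    pem_text is assumed to be one '-----BEGIN CERTIFICATE-----' block,
    e.g. pasted from a client's controllers.yaml (hypothetical input).
    """
    # Convert the PEM wrapper to raw DER bytes (base64 decode only,
    # the certificate contents are not validated here).
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der).hexdigest().upper()
    # Format as colon-separated byte pairs: "AB:CD:...".
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

If the fingerprint computed from the client side and the one computed from an agent.conf differ, the two are not trusting the same CA.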

I don’t think it is a field that we support changing after bootstrap (because of the distribution problem telling all the existing clients that there was an old certificate, but we’re now using the new certificate). It would be possible to do, but would be database and config file surgery.

I also know that JAAS doesn’t use that functionality today, but instead runs HAProxy instances to terminate external TLS. (We still TLS encrypt internally; the external TLS session terminates at HAProxy, which then handles the second TLS connection inside the LAN.) I haven’t personally set that up, but I do know it is a route that has been done.


Thanx for the pointer @jameinel.

What I’ll try first is to deploy the juju-controller charm (which I know works with both 2.9 and 2.0 controllers from this discussion), which has support for the http interface.

I might need to juju config controller-url='my-controller.example.com' in the juju-controller charm to match the domain name for the cert. Not sure, but it seems likely. @wallyworld might know. There isn’t really much documentation about it yet.

With that I can first deploy the HAProxy charm from Charmhub and then relate it to juju-controller:

juju relate haproxy juju-controller

Finally, I can deploy and relate a certbot/acme subordinate to haproxy. (We have rolled our own for this purpose, but there might be other public ones.)

Note: There is a certbot subordinate charm written by Martin Hilton that might be useful… I guess this part is where things get messy, since it may be very different for different users. I think a “basic way to add a certificate for haproxy/tls” like this would need to use haproxy’s built-in capability to “import” certificates via “juju config ssl_cert” and “juju config ssl_key” to demonstrate the concept, and then refer to existing certbot/acme charms for specific needs/setups.

I’ll see if I can get this through…

Definitely there should be some “best practice” published on this topic.

So, my above attempt didn’t work out as I thought.

This is because the traffic on port :17070 is already SSL and I can’t terminate that at haproxy.

I tried doing “pass-through” but I have hit an obstacle at the moment. Your suggestion above is kind of out of my league for the moment.

I need to fight this more I guess. Definitely challenging here.

@jameinel - would you be able to set me up with someone that might assist me getting the haproxy method working?

Yesterday, I deployed a 2.9.45 controller and ran enable-ha, which scales up to a total of 3 instances and automatically places the controller in HA mode (awesome feature). I was careful to name the controller after my intended DNS name, let’s say foo.example.com. I’m not sure if it matters, but I did so anyway.

After the controller was up and available in AWS, I added the three IP addresses to our DNS. The DNS returned all three public addresses in a round-robin way, which was what I expected and wanted. This means that if one controller IP stops working, the controller is still available (ha).
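Since plain round-robin DNS does no health checking by itself, it can be handy to see which of the returned addresses actually answer on the API port. Here is a rough sketch of how that check could look, assuming the controller listens on 17070 as above. The function names (resolve_all, probe, check_controller) are made up for this example, and it only tests TCP reachability, not TLS or Juju login.

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses DNS gives for host."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def probe(address, port, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_controller(host, port=17070, resolver=socket.getaddrinfo, prober=probe):
    """Map every address behind host to whether it accepts connections."""
    return {addr: prober(addr, port) for addr in resolve_all(host, port, resolver)}
```

Calling check_controller("foo.example.com") would then return a dict with one entry per controller address, with False for any instance that does not accept connections. The resolver and prober parameters are only there so the logic can be exercised without touching the network.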

Now, after this, I wanted to see if I could register a user and get someone else to test the controller. So I called @hallback and asked him to test, and created his user:

juju add-user hallback

Which returned a registration token string that worked just fine for him - WITHOUT any added SSL-cert or haproxy etc (!) Happy days.

We then REMOVED the controller from @hallback’s controllers.yaml and had him try again to log in to the controller, this time supplying just the name of the controller and its DNS name (FQDN).

juju login -c foo.example.com foo.example.com:17070

This also works just fine.

This situation is rather OK, without having to mess with haproxy and certificates etc. Since the traffic seems to be all encrypted, I’m not risking my password and data being leaked at some internet cafe, and all is good.

Is this secure and “good enough” to be considered a best practice? It matters since I’m also working on an “administration guide” where this kind of information might be added.

What are your thoughts on this @jameinel @hmlanigan ?

As a side note, the notable difference between the client controllers.yaml file after being updated by “register vs login” is that:

  • When using the “register” way, it populates the file with a list of all the IPs that the controller has, while
  • … the same file after “login” contains only the domain-name fqdn (foo.example.com).
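To tell at a glance which kind of entry a client ended up with, something like the following could be used. It is only a sketch: it assumes you have already pulled the endpoint strings (in host:port form) out of controllers.yaml yourself, and the function names are invented for this example.

```python
import ipaddress


def split_host_port(endpoint):
    """Split 'host:port' or '[v6addr]:port' into (host, port)."""
    host, _, port = endpoint.rpartition(":")
    return host.strip("[]"), int(port)


def classify(endpoints):
    """Separate endpoints pinned to IP addresses from DNS-name endpoints."""
    ips, names = [], []
    for endpoint in endpoints:
        host, _ = split_host_port(endpoint)
        try:
            ipaddress.ip_address(host)  # raises ValueError for DNS names
            ips.append(endpoint)
        except ValueError:
            names.append(endpoint)
    return ips, names
```

A client that only has the DNS name depends on DNS being up to date, while a client with the IP list depends on those particular addresses staying the same.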

My fight with this continues…