Regarding Vault Certificate Management

Hello Team,
I have deployed OpenStack Ussuri using the base bundle, and all functionality is working as expected.

One of our external vendors needs to talk to the OpenStack API over the public endpoint. I used a self-signed CA cert at deployment time (the cert SAN has the local LXD IP address/Juju hostname) and now wish to use our organization-specific CA cert instead of the self-signed root. As per the docs, I have retrieved the CSR from Juju and signed it using our organization-specific root/intermediate CA. https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-certificate-management.html

I need assistance with the queries below.
[1] Is the command below correct for uploading the new certificate to Vault?
juju run-action vault/0 upload-signed-csr \
  pem="$(cat /tmp/vault-charm-int.pem | base64)" \
  root-ca="$(cat ./ca/example.com.pem | base64)" \
  allowed-domains='example.com'

[2] Can [1] break OpenStack access if the cert upload fails for some reason?
How do we roll back to the original self-signed CA?

[3] What does juju run-action vault/0 reissue-certificates result in?
Does it regenerate the self-signed cert in Vault and issue it to all client apps?

Thank You

Hi Siddhit,

You can switch between the self-signed CA and an intermediate cert signed by an external CA.

It is important to run the following command when switching from one to the other. This clears flags the charm pays attention to and guarantees the charm re-runs some tasks:

juju run-action --wait vault/0 disable-pki

Question 1: I double checked the action parameters [0] and it looks like allowed-domains is only for the generate-root-ca (self-signed) action. This seems like a documentation bug. See all the options for upload-signed-csr in [0]. So, the command will look like this assuming the /tmp/vault-charm-int.pem and ./ca/example.com.pem files have all the potential intermediates per [1]:

juju run-action --wait vault/0 upload-signed-csr \
  pem="$(cat /tmp/vault-charm-int.pem | base64)" \
  root-ca="$(cat ./ca/example.com.pem | base64)"
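
Before uploading, it may be worth checking that the signed intermediate actually validates against your CA bundle. A minimal sketch, assuming openssl is installed and using the same file paths as the command above; the helper name verify_chain is only for illustration and is not part of the charm:

```shell
#!/bin/sh
# Hypothetical helper: verify_chain CA_BUNDLE CERT
# Exits 0 only if CERT validates against the CA certificates in CA_BUNDLE.
verify_chain() {
    ca_bundle=$1
    cert=$2
    openssl verify -CAfile "$ca_bundle" "$cert" >/dev/null 2>&1
}

# Example, using the paths from the upload command:
# verify_chain ./ca/example.com.pem /tmp/vault-charm-int.pem && echo "chain OK"
```

If this check fails, the upload would most likely hand out certificates that clients cannot validate, so it is cheaper to catch that here than after the model has started updating.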

Question 2: When changing a CA, it takes time for the model to update and for each service to get the new CA and begin using a new certificate, so there is always a downtime window to be aware of. And yes, if the cert was signed incorrectly, it is possible to cause communication failures. You can reset and create a new self-signed cert with the following procedure:

juju run-action --wait vault/0 disable-pki
juju run-action --wait vault/0 generate-root-ca

Question 3: reissue-certificates is an action for manually recreating certificates from the requesters’ CSRs. It creates new certificates for all the requesting services and notifies them of the new certificates. It is used when certificates are about to expire and need to be re-created. Both the generate-root-ca and the upload-signed-csr actions do this automatically at the end of their execution, so there is no need to run it manually in this context.

So the procedure will look like the following:

juju run-action --wait vault/0 disable-pki
juju run-action --wait vault/0 upload-signed-csr \
  pem="$(cat /tmp/vault-charm-int.pem | base64)" \
  root-ca="$(cat ./ca/example.com.pem | base64)"

If it becomes necessary to go back to a self-signed CA:

juju run-action --wait vault/0 disable-pki
juju run-action --wait vault/0 generate-root-ca
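
If you script either procedure, it is probably safer to check the reported action status between steps. A rough sketch, assuming the action output contains a line reading "status: completed" on success (that output format is my assumption for Juju 2.x, and I am not certain the exit code alone reflects a failed action); run_vault_action is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: run an action on vault/0 and succeed only if the
# output reports "status: completed" (assumed success marker).
run_vault_action() {
    out=$(juju run-action --wait vault/0 "$@") || return 1
    # Show the full action output so nothing is hidden from the operator.
    printf '%s\n' "$out"
    printf '%s\n' "$out" | grep -Eq '^[[:space:]]*status:[[:space:]]*completed'
}

# Roll back to a self-signed CA, stopping at the first failed action:
# run_vault_action disable-pki && run_vault_action generate-root-ca
```

Chaining with && means generate-root-ca is only attempted once disable-pki has reported success.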

[0] https://api.jujucharms.com/charmstore/v5/vault-41/archive/actions.yaml
[1] https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-certificate-management.html

Thank you,


David Ames


Hello David,

Thank you for the detailed clarification.

I attempted upload-signed-csr with our CA-signed cert, and it resulted in the error below.
Can you please assist?
example.com.pem contains the root and intermediate chain for our org domain.

nxtgen@nxtgen-poc-maas:~$ juju run-action --wait vault/0 upload-signed-csr \
  pem="$(cat /tmp/vault-charm-int.pem | base64)" \
  root-ca="$(cat ./ca/nxtgen.com.pem | base64)"
unit-vault-0:
  UnitId: vault/0
  id: "88"
  message: 'hvac.exceptions.InvalidPath: no handler for route ''charm-pki-local/intermediate/set-signed'''
  results:
    Stderr: |
      /var/lib/juju/agents/unit-vault-0/charm/lib/charm/vault_pki.py:207: DeprecationWarning: Call to deprecated function '_post'. This method will be removed in version '0.8.0' Please use the 'post' method on the 'hvac.adapters' class moving forward.
        client._post(
  status: failed
  timing:
    completed: 2021-01-05 07:17:10 +0000 UTC
    enqueued: 2021-01-05 07:17:08 +0000 UTC
    started: 2021-01-05 07:17:09 +0000 UTC

I have the same problem. The original certificate expired and I can’t replace it. The Vault Root Certificate Authority (charm-pki-local) expired on March 11, 2023 at 10:34:43 PM. After trying David’s suggestions, Vault will not even start anymore.

juju status now shows etcd blocked
etcd/0*                   blocked   idle   3        172.17.50.157   2379/tcp          Missing relation to certificate authority.
etcd/1                    blocked   idle   4        172.17.50.159   2379/tcp          Missing relation to certificate authority.
etcd/2                    blocked   idle   5        172.17.50.158   2379/tcp          Missing relation to certificate authority.

kubeapi-load-balancer/0*  active    idle   6        172.17.50.160   443/tcp,6443/tcp  Loadbalancer ready.
kubernetes-master/2       waiting   idle   17       172.17.50.168   6443/tcp          Waiting for certificates authority.
  containerd/7            active    idle            172.17.50.168                     Container runtime available
  flannel/10              waiting   idle            172.17.50.168                     Waiting for Flannel
kubernetes-master/3*      waiting   idle   19       172.17.0.1      6443/tcp          Waiting for certificates authority.
  containerd/17           active    idle            172.17.0.1                        Container runtime available
  flannel/28              active    idle            172.17.0.1                        Flannel subnet 10.1.55.1/24
kubernetes-worker/0*      blocked   idle   9        172.17.50.161   80/tcp,443/tcp    Missing relation to certificate authority.
  containerd/0*           active    idle            172.17.50.161                     Container runtime available
  flannel/0*              active    idle            172.17.50.161                     Flannel subnet 10.1.81.1/24
kubernetes-worker/4       blocked   idle   15       172.17.50.167   80/tcp,443/tcp    Missing relation to certificate authority.
  containerd/6            active    idle            172.17.50.167                     Container runtime available
  flannel/9               active    idle            172.17.50.167                     Flannel subnet 10.1.21.1/24
kubernetes-worker/5       blocked   idle   18       172.17.50.169   80/tcp,443/tcp    Missing relation to certificate authority.
  containerd/16           active    idle            172.17.50.169                     Container runtime available
  flannel/27              active    idle            172.17.50.169                     Flannel subnet 10.1.51.1/24
kubernetes-worker/7       blocked   idle   21       172.17.0.3      80/tcp,443/tcp    Missing relation to certificate authority.
  containerd/18           active    idle            172.17.0.3                        Container runtime available
  flannel/29              active    idle            172.17.0.3                        Flannel subnet 10.1.43.1/24
kubernetes-worker/8       blocked   idle   23       172.17.0.4      80/tcp,443/tcp    Missing relation to certificate authority.
  containerd/19           active    idle            172.17.0.4                        Container runtime available
  flannel/30              active    idle            172.17.0.4                        Flannel subnet 10.1.5.1/24
postgresql/0*             active    idle   12       172.17.50.164   5432/tcp          Live secondary (12.9)
postgresql/1              active    idle   13       172.17.50.165   5432/tcp          Live master (12.9)
vault/0*                  error     idle   14       172.17.50.166   8200/tcp          hook failed: "start"

Machine  State    Address        Inst id               Series  AZ       Message
0        started  172.17.50.120  vm30-1                focal   default  Deployed
0/lxd/0  started  172.17.50.152  juju-c69cb7-0-lxd-0   focal   default  Container started
1        started  172.17.50.130  vm30-2                focal   default  Deployed
1/lxd/0  started  172.17.50.155  juju-c69cb7-1-lxd-0   focal   default  Container started
2        started  172.17.50.151  vm30-3                focal   default  Deployed
2/lxd/0  started  172.17.50.156  juju-c69cb7-2-lxd-0   focal   default  Container started
3        started  172.17.50.157  tight-elk             focal   default  Deployed
4        started  172.17.50.159  sacred-snake          focal   default  Deployed
5        started  172.17.50.158  sacred-sheep          focal   default  Deployed
6        started  172.17.50.160  kube-worker-test24-1  focal   default  Deployed
9        started  172.17.50.161  stable-mammal         focal   default  Deployed
12       started  172.17.50.164  gentle-magpie         focal   default  Deployed
13       started  172.17.50.165  rare-wombat           focal   default  Deployed
14       started  172.17.50.166  holy-lemur            focal   default  Deployed
15       started  172.17.50.167  kube-worker-30-1      focal   default  Deployed
17       started  172.17.50.168  square-calf           focal   default  Deployed
18       started  172.17.50.169  kube-worker-30-2      focal   default  Deployed
19       started  172.17.0.1     wired-marlin          focal   default  Deployed
20       started  172.17.0.2     one-lynx              focal   default  Deployed
21       started  172.17.0.3     kube-worker-30-3      focal   default  Deployed
23       started  172.17.0.4     kube-worker-test24-2  focal   default  Deployed

Relation provider                    Requirer                                 Interface          Type         Message
ceph-mon:admin                       kubernetes-master:ceph-storage           ceph-admin         regular
ceph-mon:client                      kubernetes-master:ceph-client            ceph-client        regular
ceph-mon:mds                         ceph-fs:ceph-mds                         ceph-mds           regular
ceph-mon:mon                         ceph-mon:mon                             ceph               peer
ceph-mon:osd                         ceph-osd:mon                             ceph-osd           regular
etcd:cluster                         etcd:cluster                             etcd               peer
etcd:db                              flannel:etcd                             etcd               regular
etcd:db                              kubernetes-master:etcd                   etcd               regular
etcd:db                              vault:etcd                               etcd               regular
kubeapi-load-balancer:lb-consumers   kubernetes-master:loadbalancer-external  loadbalancer       regular
kubeapi-load-balancer:lb-consumers   kubernetes-master:loadbalancer-internal  loadbalancer       regular
kubernetes-master:cni                flannel:cni                              kubernetes-cni     subordinate
kubernetes-master:container-runtime  containerd:containerd                    container-runtime  subordinate
kubernetes-master:coordinator        kubernetes-master:coordinator            coordinator        peer
kubernetes-master:kube-control       kubernetes-worker:kube-control           kube-control       regular
kubernetes-master:kube-masters       kubernetes-master:kube-masters           kube-masters       peer
kubernetes-worker:cni                flannel:cni                              kubernetes-cni     subordinate
kubernetes-worker:container-runtime  containerd:containerd                    container-runtime  subordinate
kubernetes-worker:coordinator        kubernetes-worker:coordinator            coordinator        peer
postgresql:coordinator               postgresql:coordinator                   coordinator        peer
postgresql:db                        vault:db                                 pgsql              regular
postgresql:replication               postgresql:replication                   pgpeer             peer
vault:certificates                   etcd:certificates                        tls-certificates   regular      joining
vault:certificates                   kubeapi-load-balancer:certificates       tls-certificates   regular
vault:certificates                   kubernetes-master:certificates           tls-certificates   regular
vault:certificates                   kubernetes-worker:certificates           tls-certificates   regular
vault:cluster                        vault:cluster                            vault-ha           peer


juju run --unit vault/0 'relation-ids certificates'
certificates:18
certificates:19
certificates:20
certificates:22

ubuntu@holy-lemur:~$ systemctl status vault.service
● vault.service - HashiCorp Vault
     Loaded: loaded (/etc/systemd/system/vault.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-05-25 01:55:23 UTC; 4 days ago
   Main PID: 96566 (code=exited, status=1/FAILURE)

May 25 01:55:23 holy-lemur systemd[1]: vault.service: Main process exited, code=exited, status=1/FAILURE
May 25 01:55:23 holy-lemur systemd[1]: vault.service: Failed with result 'exit-code'.
May 25 01:55:23 holy-lemur systemd[1]: vault.service: Scheduled restart job, restart counter is at 5.
May 25 01:55:23 holy-lemur systemd[1]: Stopped HashiCorp Vault.
May 25 01:55:23 holy-lemur systemd[1]: vault.service: Start request repeated too quickly.
May 25 01:55:23 holy-lemur systemd[1]: vault.service: Failed with result 'exit-code'.
May 25 01:55:23 holy-lemur systemd[1]: Failed to start HashiCorp Vault.



@ubumadmin - sorry for the delayed response here. Were you able to resolve your problem?

It’s unclear what you were trying to do. It’s clear that you had certificates expire, but were you trying to switch from a self-signed CA cert to an organization-specific CA cert?
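
If it helps to confirm exactly which certificate has expired, here is a minimal sketch using openssl. It assumes you have a local copy of the certificate in PEM format (ca.pem below is a hypothetical file name), and the helper names are only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: cert_valid_for SECONDS CERT
# Exits 0 if CERT will still be valid SECONDS from now, non-zero otherwise.
cert_valid_for() {
    openssl x509 -in "$2" -noout -checkend "$1" >/dev/null 2>&1
}

# Hypothetical helper: print the expiry date of CERT as "notAfter=<date>".
cert_enddate() {
    openssl x509 -in "$1" -noout -enddate
}

# Example, with ca.pem being a local copy of the CA certificate:
# cert_enddate ca.pem
# cert_valid_for 0 ca.pem || echo "certificate has already expired"
```

Passing a larger number of seconds (for example 2592000 for 30 days) gives an early warning before the expiry date is reached.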

Vault was deployed by Juju and the relations were set up. Vault was set up as the CA with an offline root, like this: https://docs.openstack.org/charm-guide/latest/admin/security/tls.html#add-a-ca-certificate

The Vault self-signed cert expired and now everything is broken, and Vault will not start.

I just want to create a new cert and get it running again.

Did you ever get this resolved? I find myself in the same predicament…