Lost connectivity to MAAS DNS; OpenStack is now not functioning because of a certificate problem

I am unsure where to start. This OpenStack deployment has been functional, and instance accessibility is still OK for the most part. We had an issue with a network being blocked, and the undercloud could not see the DNS from MAAS. When this was corrected, Horizon gave errors when trying to log in, and then the OpenStack CLI gave issues as well. Digging into the logs, I can see neutron not being able to connect to elements of the stack or start properly due to an SSL issue. After doing some digging on forums, reading logs, and even poking around in configs to determine a fix, I found a forum post that suggested using Vault to re-create a self-signed cert. I did that and pushed the cert to the machines. Some functionality came back but has since been lost. I checked each machine, and the certificate that was generated is in each LXD container. Any help on a direction would be good, as I am grasping at straws at this point. The instances are running, and some of them are mission critical. Two have turned off so far and I cannot re-activate them.

Here is one good error when I try to run “openstack network list”:

Failed to discover available identity versions when contacting https://10.0.7.22:5000/v3. Attempting to parse version from URL. SSL exception connecting to https://10.0.7.22:5000/v3/auth/tokens: HTTPSConnectionPool(host='10.0.7.22', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)'),))

Let me know what log files may be helpful as well, or anything else. I'm at a brick wall. I am trying to avoid re-rolling this stack, as it would be weeks of work to restore all the instances.

As for the openstack network list command not working, does the openstack client have access to the correct CA certificate?
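One way to test this is to point the client explicitly at the new root CA through the OS_CACERT environment variable, which the openstack client reads. The sketch below assumes the root CA saved from generate-root-ca is available as a PEM file; the helper name run_with_ca and the ./root-ca.pem path are hypothetical.

```shell
# run_with_ca CA_FILE COMMAND [ARGS...]
# Runs COMMAND with OS_CACERT pointing at CA_FILE, for that command only.
# (Helper name and file path are illustrative, not part of any tool.)
run_with_ca() {
    ca_file=$1
    shift
    OS_CACERT="$ca_file" "$@"
}

# Usage (hypothetical path to the saved Vault root CA):
#   run_with_ca ./root-ca.pem openstack network list
```

If the command succeeds with the CA set this way but fails without it, the client is probably still trusting the old CA.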

Yes. I can check other things, but the neutron agent and nova, for example, are throwing SSL errors in their logs.

Neutron: 2023-11-30 20:19:15.041 1496481 ERROR neutron ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1131)

Nova: 2023-11-30 20:21:02.292 2529081 ERROR nova keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. SSL exception connecting to https://keystone.dev:35357: HTTPSConnectionPool(host='keystone.dev', port=35357): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
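These two log lines point at two different failures: neutron is being handed a certificate that has expired, while nova cannot find the CA that issued the certificate it is shown. A minimal sketch for telling the two apart offline, assuming the Vault root CA and a service certificate have both been saved as PEM files (the function name check_cert and any file paths are hypothetical):

```shell
# check_cert CA_FILE CERT_FILE
# Prints the validity window of CERT_FILE, then verifies it against CA_FILE.
# Returns non-zero if verification fails (wrong CA, expired certificate, ...).
check_cert() {
    ca_file=$1
    cert_file=$2

    # notBefore/notAfter show whether the certificate has expired.
    openssl x509 -in "$cert_file" -noout -dates || return 1

    # Reports "unable to get local issuer certificate" when CA_FILE
    # is not the CA that signed CERT_FILE.
    openssl verify -CAfile "$ca_file" "$cert_file"
}

# Usage (hypothetical file names):
#   check_cert ./root-ca.pem ./keystone.pem
```

Running this inside each LXD container against the certificate the service actually loads would show whether every unit picked up the reissued certificate and the matching CA.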

What exact commands for interacting with Vault and certificates did you run?

I used the command list from https://charmhub.io/vault/actions and ran juju run-action --wait vault/leader disable-pki, then generate-root-ca. I saved that output and ran reissue-certificates, then waited. It brought the connectivity back part way, but it vanished the next day. I have looked at the juju ca-cert on each system and it's the correct one, but then I got token errors. When I hit 3 pages of over 50 tabs on forums and help docs, I called it, and am here.

What is the output of

openstack endpoint list

Hi Paul,

Should keystone be using “keystone.dev” domain or the bare ip address (10.0.7.22)?

Can you run this command to get the details of the certificate that’s being presented to the client?

echo | \
    openssl s_client  -connect 10.0.7.22:5000 2>/dev/null | \
    openssl x509 -text

This way you can see which Subject and Subject Alternative Name values are present in the certificate.
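If the full -text dump is too noisy, a narrower variant prints only the fields that matter here: subject, issuer, validity dates, and Subject Alternative Name. This is a sketch that assumes OpenSSL 1.1.1 or newer (the -ext option is not available in older releases); the function name cert_summary is made up for illustration.

```shell
# cert_summary: read a PEM certificate on stdin and print its subject,
# issuer, validity dates, and Subject Alternative Name entries.
# Assumes OpenSSL 1.1.1+ for the -ext option.
cert_summary() {
    openssl x509 -noout -subject -issuer -dates -ext subjectAltName
}

# Usage against the keystone endpoint from this thread:
#   echo | openssl s_client -connect 10.0.7.22:5000 2>/dev/null | cert_summary
```

The notAfter date answers the "certificate has expired" question, and the Subject Alternative Name list shows whether the certificate covers keystone.dev, 10.0.7.22, or both.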