Hi there, my name is Felix. I am currently setting up a Charmed OpenStack testing environment for future datacenter and application modernization projects. The goal of this project is to implement a hyperconverged testing facility.
To achieve this, I modified the OpenStack Base charm bundle to meet our HA requirements and my own preferences, and then tested it using a MAAS rack controller. The release is OpenStack Zed, deployed on Ubuntu Server 22.04 LTS (Jammy).
For Ceph, I used the quincy/stable channels, in line with the UCA. I got exactly one fully working deployment, the very first one. Since then, while trying to reproduce some things (I fully wiped the environment beforehand but retained the configuration), ceph-radosgw never becomes fully functional after the deployment, even after waiting for a long time.
The charm ceph-radosgw is stuck in “waiting” state while reporting “Incomplete relations: identity”:
Model      Controller            Cloud/Region  Version  SLA          Timestamp
openstack  juju-maas-controller  maas/default  3.1.6    unsupported  12:25:10+01:00

App                     Version  Status   Scale  Charm         Channel        Rev  Exposed  Message
ceph-radosgw            17.2.6   waiting      3  ceph-radosgw  quincy/stable  564  no       Incomplete relations: identity
ceph-radosgw-hacluster  2.1.2    active       3  hacluster     2.4/stable     131  no       Unit is ready and clustered

Unit                         Workload  Agent      Machine  Public address  Ports   Message
ceph-radosgw/0               waiting   executing  0/lxd/1  192.168.80.10   80/tcp  Incomplete relations: identity
  ceph-radosgw-hacluster/1   active    idle                192.168.80.10           Unit is ready and clustered
ceph-radosgw/1               waiting   executing  1/lxd/1  192.168.80.92   80/tcp  Incomplete relations: identity
  ceph-radosgw-hacluster/2   active    idle                192.168.80.92           Unit is ready and clustered
ceph-radosgw/2*              waiting   executing  2/lxd/1  192.168.80.146  80/tcp  Incomplete relations: identity
  ceph-radosgw-hacluster/0*  active    idle                192.168.80.146          Unit is ready and clustered

Machine  State    Address         Inst id              Base          AZ          Message
0        started  192.168.80.212  one-muc-02           ubuntu@22.04  testcenter  Deployed
0/lxd/1  started  192.168.80.10   juju-e16cdc-0-lxd-1  ubuntu@22.04  testcenter  Container started
1        started  192.168.80.211  one-muc-01           ubuntu@22.04  testcenter  Deployed
1/lxd/1  started  192.168.80.92   juju-e16cdc-1-lxd-1  ubuntu@22.04  testcenter  Container started
2        started  192.168.80.213  one-muc-03           ubuntu@22.04  testcenter  Deployed
2/lxd/1  started  192.168.80.146  juju-e16cdc-2-lxd-1  ubuntu@22.04  testcenter  Container started
Removing and restarting the deployment for these components results in either the ceph-radosgw service not being recognized as running by juju status, or hacluster reporting “vip not running”. Removing and re-adding the relation does not seem to help either (see the warnings at 11:53 below).
To diagnose this, I replayed the logs of the leader unit:
unit-ceph-radosgw-2: 11:53:16 INFO juju Starting unit workers for "ceph-radosgw/2"
unit-ceph-radosgw-2: 11:53:16 INFO juju.worker.apicaller [48050a] "unit-ceph-radosgw-2" successfully connected to "192.168.80.217:17070"
unit-ceph-radosgw-2: 11:53:16 INFO juju.worker.migrationminion migration phase is now: NONE
unit-ceph-radosgw-2: 11:53:16 INFO juju.worker.logger logger worker started
unit-ceph-radosgw-2: 11:53:16 INFO juju.worker.upgrader no waiter, upgrader is done
unit-ceph-radosgw-2: 11:53:17 WARNING juju.worker.uniter.relation unit keystone/1 in relation 51 no longer exists
unit-ceph-radosgw-2: 11:53:17 WARNING juju.worker.uniter.relation unit keystone/2 in relation 51 no longer exists
unit-ceph-radosgw-2: 11:53:17 WARNING juju.worker.uniter.relation unit keystone/0 in relation 51 no longer exists
unit-ceph-radosgw-2: 11:53:18 INFO juju.worker.uniter unit "ceph-radosgw/2" started
unit-ceph-radosgw-2: 11:53:18 INFO juju.worker.uniter hooks are retried true
unit-ceph-radosgw-2: 11:53:18 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:53:23 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/ceph/ceph.conf
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/apache2/sites-available/openstack_https_frontend.conf
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Loaded template from /var/lib/juju/agents/unit-ceph-radosgw-2/charm/hooks/charmhelpers/contrib/openstack/templates/haproxy.cfg
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Rendering from template: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:53:24 INFO unit.ceph-radosgw/2.juju-log mon:50: Wrote template /etc/haproxy/haproxy.cfg.
unit-ceph-radosgw-2: 11:55:06 INFO juju Starting unit workers for "ceph-radosgw/2"
unit-ceph-radosgw-2: 11:55:06 INFO juju.worker.apicaller [48050a] "unit-ceph-radosgw-2" successfully connected to "192.168.80.217:17070"
unit-ceph-radosgw-2: 11:55:06 INFO juju.worker.apicaller [48050a] "unit-ceph-radosgw-2" successfully connected to "192.168.80.217:17070"
unit-ceph-radosgw-2: 11:55:06 INFO juju.worker.migrationminion migration phase is now: NONE
unit-ceph-radosgw-2: 11:55:06 INFO juju.worker.logger logger worker started
unit-ceph-radosgw-2: 11:55:06 INFO juju.worker.upgrader no waiter, upgrader is done
unit-ceph-radosgw-2: 11:55:07 INFO juju.worker.uniter unit "ceph-radosgw/2" started
unit-ceph-radosgw-2: 11:55:07 INFO juju.worker.uniter hooks are retried true
unit-ceph-radosgw-2: 11:55:08 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:55:13 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:55:13 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:55:13 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/ceph/ceph.conf
unit-ceph-radosgw-2: 11:55:13 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/apache2/sites-available/openstack_https_frontend.conf
unit-ceph-radosgw-2: 11:55:14 INFO unit.ceph-radosgw/2.juju-log mon:50: Loaded template from /var/lib/juju/agents/unit-ceph-radosgw-2/charm/hooks/charmhelpers/contrib/openstack/templates/haproxy.cfg
unit-ceph-radosgw-2: 11:55:14 INFO unit.ceph-radosgw/2.juju-log mon:50: Rendering from template: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:55:14 INFO unit.ceph-radosgw/2.juju-log mon:50: Wrote template /etc/haproxy/haproxy.cfg.
unit-ceph-radosgw-2: 11:59:43 INFO juju Starting unit workers for "ceph-radosgw/2"
unit-ceph-radosgw-2: 11:59:43 INFO juju.worker.apicaller [48050a] "unit-ceph-radosgw-2" successfully connected to "192.168.80.217:17070"
unit-ceph-radosgw-2: 11:59:43 INFO juju.worker.apicaller [48050a] "unit-ceph-radosgw-2" successfully connected to "192.168.80.217:17070"
unit-ceph-radosgw-2: 11:59:43 INFO juju.worker.migrationminion migration phase is now: NONE
unit-ceph-radosgw-2: 11:59:43 INFO juju.worker.logger logger worker started
unit-ceph-radosgw-2: 11:59:43 INFO juju.worker.upgrader no waiter, upgrader is done
unit-ceph-radosgw-2: 11:59:44 INFO juju.worker.uniter unit "ceph-radosgw/2" started
unit-ceph-radosgw-2: 11:59:44 INFO juju.worker.uniter hooks are retried true
unit-ceph-radosgw-2: 11:59:44 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:59:49 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/ceph/ceph.conf
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Registered config file: /etc/apache2/sites-available/openstack_https_frontend.conf
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Loaded template from /var/lib/juju/agents/unit-ceph-radosgw-2/charm/hooks/charmhelpers/contrib/openstack/templates/haproxy.cfg
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Rendering from template: /etc/haproxy/haproxy.cfg
unit-ceph-radosgw-2: 11:59:50 INFO unit.ceph-radosgw/2.juju-log mon:50: Wrote template /etc/haproxy/haproxy.cfg.
I have absolutely no clue which error the “relation-changed” hook is waiting to have resolved here. The warnings at 11:53 are expected, since we tried removing and re-adding the relation. We also tried rebooting and restarting services, to no avail.
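To get an overview of how often the unit agent restarted and which hooks it reported as awaiting error resolution, the replayed log can be summarized with a few lines of Python. This is only a minimal sketch: it assumes the line format shown in the excerpt above, and the names `summarize`, `LINE_RE` and `AWAITING_RE` are made up for illustration and are not part of Juju or the charm.

```python
import re
from collections import Counter

# Assumed line format, taken from the replayed log above:
#   <entity>: <HH:MM:SS> <LEVEL> <module> <message>
LINE_RE = re.compile(
    r"^(?P<entity>\S+): (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<module>\S+) (?P<message>.*)$"
)
AWAITING_RE = re.compile(r'awaiting error resolution for "(?P<hook>[^"]+)" hook')


def summarize(lines):
    """Count agent restarts and hooks reported as awaiting error resolution."""
    restarts = 0
    awaiting = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a log line in the assumed format
        message = match.group("message")
        if message.startswith("Starting unit workers"):
            restarts += 1
        hook = AWAITING_RE.search(message)
        if hook:
            awaiting[hook.group("hook")] += 1
    return restarts, dict(awaiting)


# Three lines copied from the excerpt above.
sample = [
    'unit-ceph-radosgw-2: 11:53:16 INFO juju Starting unit workers for "ceph-radosgw/2"',
    'unit-ceph-radosgw-2: 11:53:18 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook',
    'unit-ceph-radosgw-2: 11:53:23 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook',
]
print(summarize(sample))  # (1, {'relation-changed': 2})
```

Run against the full excerpt, this would only confirm the pattern visible by eye (each agent restart is followed by the same unresolved "relation-changed" hook); it does not reveal which relation the hook belongs to.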
To look into this further, I SSHed into the machine and checked the leader unit's log file. There I observed something that repeated itself at deployment time but no longer appears in the current logs. See this excerpt, which ends where the logs for this time period abruptly cut off:
2023-12-12 15:33:13 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: identity relation's interface, identity-service, is related awaiting the following data from the relationship: service_port, service_host, auth_host, auth_port, internal_host, internal_port, admin_tenant_name, admin_user, admin_password.
2023-12-12 15:33:14 INFO juju.worker.uniter.operation runhook.go:186 ran "mon-relation-changed" hook (via explicit, bespoke hook script)
2023-12-12 15:34:04 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Registered config file: /etc/haproxy/haproxy.cfg
2023-12-12 15:34:04 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Registered config file: /etc/ceph/ceph.conf
2023-12-12 15:34:06 WARNING unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Package python-keystonemiddleware has no installation candidate.
2023-12-12 15:34:06 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:06 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:06 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:06 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: identity relation's interface, identity-service, is related awaiting the following data from the relationship: service_port, service_host, auth_host, auth_port, internal_host, internal_port, admin_tenant_name, admin_user, admin_password.
2023-12-12 15:34:06 INFO juju.worker.uniter.operation runhook.go:186 ran "mon-relation-changed" hook (via explicit, bespoke hook script)
2023-12-12 15:34:29 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Registered config file: /etc/haproxy/haproxy.cfg
2023-12-12 15:34:29 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Registered config file: /etc/ceph/ceph.conf
2023-12-12 15:34:29 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Cluster configured, notifying other services andupdating keystone endpoint configuration
2023-12-12 15:34:30 WARNING unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Package python-keystonemiddleware has no installation candidate.
2023-12-12 15:34:30 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:30 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:30 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:30 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: identity relation's interface, identity-service, is related awaiting the following data from the relationship: service_port, service_host, auth_host, auth_port, internal_host, internal_port, admin_tenant_name, admin_user, admin_password.
2023-12-12 15:34:31 INFO juju.worker.uniter.operation runhook.go:186 ran "ha-relation-changed" hook (via explicit, bespoke hook script)
2023-12-12 15:34:55 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Registered config file: /etc/haproxy/haproxy.cfg
2023-12-12 15:34:55 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Registered config file: /etc/ceph/ceph.conf
2023-12-12 15:34:55 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Loaded template from /var/lib/juju/agents/unit-ceph-radosgw-2/charm/hooks/charmhelpers/contrib/openstack/templates/haproxy.cfg
2023-12-12 15:34:55 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Rendering from template: /etc/haproxy/haproxy.cfg
2023-12-12 15:34:55 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Wrote template /etc/haproxy/haproxy.cfg.
2023-12-12 15:34:56 WARNING unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Package python-keystonemiddleware has no installation candidate.
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Missing required data: service_port service_host auth_host auth_port internal_host internal_port admin_tenant_name admin_user admin_password
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Loaded template from templates/ceph.conf
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Rendering from template: /etc/ceph/ceph.conf
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Wrote template /etc/ceph/ceph.conf.
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Making dir /var/lib/ceph/radosgw/ceph-rgw.juju-e16cdc-2-lxd-1 ceph:ceph 750
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Symlinking /var/lib/ceph/radosgw/ceph-rgw.juju-e16cdc-2-lxd-1/keyring as /etc/ceph/ceph.client.rgw.juju-e16cdc-2-lxd-1.keyring
2023-12-12 15:34:56 WARNING unit.ceph-radosgw/2.mon-relation-changed logger.go:60 Synchronizing state of radosgw.service with SysV service script with /lib/systemd/systemd-sysv-install.
2023-12-12 15:34:56 WARNING unit.ceph-radosgw/2.mon-relation-changed logger.go:60 Executing: /lib/systemd/systemd-sysv-install disable radosgw
2023-12-12 15:34:56 WARNING unit.ceph-radosgw/2.mon-relation-changed logger.go:60 Unit /etc/systemd/system/radosgw.service is masked, ignoring.
2023-12-12 15:34:56 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Installing python-dbus with options: ['--option=Dpkg::Options::=--force-confold']
2023-12-12 15:34:57 WARNING unit.ceph-radosgw/2.mon-relation-changed logger.go:60 E: Package 'python-dbus' has no installation candidate
2023-12-12 15:34:57 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Check command not found: check_disk
2023-12-12 15:34:57 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: removed check_haproxy. This service will be monitored by check_crm
2023-12-12 15:34:57 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Check command not found: check_systemd.py
2023-12-12 15:34:57 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: Nagios user not set up, nrpe checks not updated
These events had been present since the deployment and abruptly stop there. I cannot reproduce them, but I suspect they stem from keystone becoming available later than ceph-radosgw.
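One way to test this hypothesis against a saved copy of the unit log is to collect which identity keys were reported missing and when they were last reported. The following is a minimal sketch under the assumption that the log lines look like the excerpt above; `missing_identity_keys` and `MISSING_RE` are hypothetical names, not something the charm provides.

```python
import re

# Assumed message format, taken from the unit log excerpt above.
MISSING_RE = re.compile(r"Missing required data: (?P<keys>.+)$")


def missing_identity_keys(lines):
    """Return the timestamp of the last report and the sorted set of missing keys."""
    last_seen = None
    keys = set()
    for line in lines:
        text = line.strip()
        match = MISSING_RE.search(text)
        if not match:
            continue
        keys.update(match.group("keys").split())
        last_seen = text[:19]  # leading "YYYY-MM-DD HH:MM:SS" timestamp
    return last_seen, sorted(keys)


# Two lines copied from the excerpt above.
sample = [
    "2023-12-12 15:34:06 INFO unit.ceph-radosgw/2.juju-log server.go:325 mon:50: "
    "Missing required data: service_port service_host auth_host auth_port "
    "internal_host internal_port admin_tenant_name admin_user admin_password",
    "2023-12-12 15:34:30 INFO unit.ceph-radosgw/2.juju-log server.go:325 ha:92: "
    "Missing required data: service_port service_host auth_host auth_port "
    "internal_host internal_port admin_tenant_name admin_user admin_password",
]
last_seen, keys = missing_identity_keys(sample)
print(last_seen, keys)
```

If the last report predates the point at which keystone became active, that would be consistent with a pure ordering issue at deployment time. If the keys are still reported missing afterwards, keystone may never have published them on the identity relation.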
I have already searched several pages and forums, without success.
If anyone has experience with this charm, could you please help me troubleshoot this?
Thank you very much.