I have had Ceph up and running for a few years, deployed as part of a focal-ussuri charmed OpenStack deployment. Recently a single OSD failed.

I followed the guide here for replacing a drive, but missed this step after inserting the new drive:

juju run-action --wait $OSD_UNIT add-disk osd-devices=$OSD_DNAME

Instead I ran:

juju run-action --wait ceph-osd/5 add-disk osd-devices=/dev/sdo
Now 4 of the 6 OSD units show the following status in juju status ceph-osd:
Non-pristine devices detected, consult list-disks, zap-disk and blacklist-* actions.
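For reference, this is how I understand the actions named in that message are invoked (a sketch based on the ceph-osd charm's action list; the unit and device below are only examples from my environment, not something I have run yet):

# show which devices the charm considers non-pristine on a unit
juju run-action --wait ceph-osd/0 list-disks

# wipe a device so the charm treats it as pristine again (destructive, so not run)
juju run-action --wait ceph-osd/0 zap-disk devices=/dev/sdh i-really-mean-it=true

# or tell the charm to ignore a device entirely
juju run-action --wait ceph-osd/0 blacklist-add-disk osd-devices=/dev/sdh

I am hesitant to run zap-disk against devices that appear to be backing in-service OSDs.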
Ceph is healthy and operating as expected, but Juju seems to really want me to perform an action that was already performed when the cluster was first deployed.
Any hints or pointers on how to rectify this situation would be much appreciated.
Here are some bits of information that seem relevant.
Output of list-disks from one of the impacted units:
juju run-action --wait ceph-osd/0 list-disks
unit-ceph-osd-0:
  UnitId: ceph-osd/0
  id: "831"
  results:
    Stderr: |2
      Failed to find physical volume "/dev/sdh".
      Failed to find physical volume "/dev/sdj".
      Failed to find physical volume "/dev/sdc".
      Failed to find physical volume "/dev/nvme0n1".
      Failed to find physical volume "/dev/sdi".
      Failed to find physical volume "/dev/sdk".
      Failed to find physical volume "/dev/sdl".
      Failed to find physical volume "/dev/sdg".
      Failed to find physical volume "/dev/sda".
      Failed to find physical volume "/dev/sdb".
      Failed to find physical volume "/dev/sde".
      Failed to find physical volume "/dev/sdd".
      Failed to find physical volume "/dev/sdf".
    blacklist: '[]'
    disks: '[''/dev/sdh'', ''/dev/sdj'', ''/dev/sdc'', ''/dev/nvme0n1'', ''/dev/sdi'',
      ''/dev/sdk'', ''/dev/sdl'', ''/dev/sdg'', ''/dev/sda'', ''/dev/sdb'', ''/dev/sde'',
      ''/dev/sdd'', ''/dev/sdf'']'
    non-pristine: '[''/dev/sdh'', ''/dev/sdj'', ''/dev/sdc'', ''/dev/nvme0n1'', ''/dev/sdi'',
      ''/dev/sdk'', ''/dev/sdl'', ''/dev/sdg'', ''/dev/sda'', ''/dev/sdb'', ''/dev/sde'',
      ''/dev/sdd'', ''/dev/sdf'']'
  status: completed
  timing:
    completed: 2022-10-21 21:55:25 +0000 UTC
    enqueued: 2022-10-21 21:55:21 +0000 UTC
    started: 2022-10-21 21:55:21 +0000 UTC
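If it would help, I can also gather the LVM view from one of the flagged units. This is a sketch of what I would run (not output I have captured yet) to confirm whether the non-pristine devices are the same ones backing the in-service OSDs:

# list the OSDs and the devices/LVs backing them on the unit
juju ssh ceph-osd/0 sudo ceph-volume lvm list

# quick look at the block device and LVM layout
juju ssh ceph-osd/0 sudo lsblk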
Charm information from juju status:

App       Version  Status   Scale  Charm     Channel  Rev
ceph-osd  15.2.16  blocked  6      ceph-osd  stable   304
Output of juju ssh ceph-mon/leader sudo ceph -s:

  cluster:
    id:     <cluster uuid>
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon01,ceph-mon02,ceph-mon03 (age 2d)
    mgr: ceph-mon01(active, since 26h), standbys: ceph-mon02, ceph-mon03
    osd: 73 osds: 72 up (since 21m), 72 in (since 23h);
    rgw: 3 daemons active (juju-46ccdb-11-lxd-0, juju-46ccdb-12-lxd-0, juju-46ccdb-9-lxd-0)

  task status:

  data:
    pools:   21 pools, 3097 pgs
    objects: 10.17M objects, 39 TiB
    usage:   118 TiB used, 277 TiB / 395 TiB avail
    pgs:     3097 active+clean