Error description
One of the storage node's OSDs shows as down, but when we try to start it, the following error is reported (even though the OSD definitely exists):
```
[root@stor-21 ~]# service ceph start osd.1
/etc/init.d/ceph: osd.1 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
```
Information verification
osd status
Command: ceph osd tree
This checks whether the OSD status is down. More importantly, it confirms that the host and its OSDs still exist in the cluster's CRUSH tree.
```
[root@stor-21 ~]# ceph osd tree | grep -A 9 stor-21
-3 18.20000     host stor-21
 1  1.81999         osd.1     down  0.88808  1.00000
 3  1.81999         osd.3     down  1.00000  1.00000
 6  1.81999         osd.6     down  1.00000  1.00000
 9  1.81999         osd.9     down  0.96855  1.00000
12  1.81999         osd.12    down  1.00000  1.00000
15  1.81999         osd.15    down  0.90279  1.00000
18  1.81999         osd.18    down  1.00000  1.00000
21  1.81999         osd.21    down  0.86520  1.00000
27  1.81999         osd.27    down  0.89455  1.00000
[root@stor-21 ~]#
```
Do the OSD data directories exist
The path is /var/lib/ceph/osd. It contains one directory per OSD on this host, and they should correspond to the OSD tree output above.
```
[root@stor-21 osd]# pwd
/var/lib/ceph/osd
[root@stor-21 osd]#
[root@stor-21 osd]# ls
ceph-1  ceph-12  ceph-15  ceph-18  ceph-21  ceph-24  ceph-27  ceph-3  ceph-6  ceph-9
[root@stor-21 osd]#
```
Do the hard disks exist
Use the lsblk command to check whether the number of disks is correct. If the mount point next to a disk is missing at this point, that is normal, because the OSD is down (the output below was captured after everything was back to normal).
```
[root@stor-21 osd]# lsblk | tail -n 15
└─sdp1   8:241  1  1.8T  0 part /var/lib/ceph/osd/ceph-12
sdq     65:0    1  1.8T  0 disk
└─sdq1  65:1    1  1.8T  0 part /var/lib/ceph/osd/ceph-15
sdr     65:16   1  1.8T  0 disk
└─sdr1  65:17   1  1.8T  0 part /var/lib/ceph/osd/ceph-18
sds     65:32   1  1.8T  0 disk
└─sds1  65:33   1  1.8T  0 part /var/lib/ceph/osd/ceph-21
sdt     65:48   1  1.8T  0 disk
└─sdt1  65:49   1  1.8T  0 part /var/lib/ceph/osd/ceph-24
sdu     65:64   1  1.8T  0 disk
└─sdu1  65:65   1  1.8T  0 part /var/lib/ceph/osd/ceph-27
sr0     11:0    1 1024M  0 rom
sr1     11:1    1 1024M  0 rom
sr2     11:2    1 1024M  0 rom
sr3     11:3    1 1024M  0 rom
[root@stor-21 osd]#
```
mon node status
- Whether the current host is a monitor node is something you can verify in whatever way you prefer; there is more than one method, so I won't go through them all here (one possible check is sketched right below these two points).
- In my case, host stor-21 is a mon node.
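As one hedged example, assuming a default deployment where the mon id is the short hostname and the mon data lives under /var/lib/ceph/mon/ceph-&lt;hostname&gt;, either of the following checks could confirm mon membership:

```bash
# List the monitors in the monmap and check whether this host appears in it
ceph mon stat

# Or check for a local mon data directory
# (assumes the default cluster name "ceph" and mon id = short hostname)
ls -d /var/lib/ceph/mon/ceph-$(hostname -s)
```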
The command to check the mon daemon's status is: service ceph status mon.&lt;hostname&gt;
```
[root@stor-21 osd]# service ceph status mon.stor-21
=== mon.stor-21 ===
mon.stor-21: running {"version":"0.94.6"}
[root@stor-21 osd]#
```
- If the status is not running, start it.
Command: service ceph start mon.&lt;hostname&gt;
Solution
- To recap: if starting the OSD with the service command fails with the error above, use the methods below.
- There are two equivalent command-line ways to start an OSD (run them on the host the OSD belongs to); both forms are shown concretely right after this item:
  - 1: service ceph start osd.&lt;id&gt; (the id can be seen in ceph osd tree)
  - 2: /etc/init.d/ceph start osd.&lt;id&gt; (the id can be seen in ceph osd tree)
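For example, using id 1 from the ceph osd tree output above, the two forms would look like this (shown only as an illustration; both go through the same init script):

```bash
# Form 1: via the service wrapper
service ceph start osd.1

# Form 2: calling the init script directly
/etc/init.d/ceph start osd.1
```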
- In this case, we can bring the OSD up by activating it with ceph-disk. The command is:
  ceph-disk activate /dev/&lt;partition name&gt;   (the partition name can be seen with lsblk)
  See below for details.
```
# Here are the historical commands
  944  ceph-disk activate /dev/sdm1
  945  ceph-disk activate /dev/sdn1
  946  service ceph status
  947  ceph-disk activate /dev/sdo1
  948  ceph-disk activate /dev/sdp1
  949  service ceph status
  950  service ceph start mon.stor-21
  951  service ceph status
  952  ceph-disk activate /dev/sdq1
  953  ceph-disk activate /dev/sdr1
  954  ceph-disk activate /dev/sds1
  955  ceph-disk activate /dev/sdt1
  956  lsblk

# The following are the disk devices seen by lsblk
sdl      8:176  1  1.8T  0 disk
└─sdl1   8:177  1  1.8T  0 part /var/lib/ceph/osd/ceph-1
sdm      8:192  1  1.8T  0 disk
└─sdm1   8:193  1  1.8T  0 part /var/lib/ceph/osd/ceph-3
sdn      8:208  1  1.8T  0 disk
└─sdn1   8:209  1  1.8T  0 part /var/lib/ceph/osd/ceph-6
sdo      8:224  1  1.8T  0 disk
└─sdo1   8:225  1  1.8T  0 part /var/lib/ceph/osd/ceph-9
sdp      8:240  1  1.8T  0 disk
└─sdp1   8:241  1  1.8T  0 part /var/lib/ceph/osd/ceph-12
sdq     65:0    1  1.8T  0 disk
└─sdq1  65:1    1  1.8T  0 part /var/lib/ceph/osd/ceph-15
sdr     65:16   1  1.8T  0 disk
└─sdr1  65:17   1  1.8T  0 part /var/lib/ceph/osd/ceph-18
sds     65:32   1  1.8T  0 disk
└─sds1  65:33   1  1.8T  0 part /var/lib/ceph/osd/ceph-21
sdt     65:48   1  1.8T  0 disk
└─sdt1  65:49   1  1.8T  0 part /var/lib/ceph/osd/ceph-24
sdu     65:64   1  1.8T  0 disk
└─sdu1  65:65   1  1.8T  0 part /var/lib/ceph/osd/ceph-27
```
- As a last resort, if the method above still fails to mount the disk: we know the disk normally gets mounted onto the OSD's directory (either automatically or by the command above), so we can simply mount it onto the OSD directory by hand with mount. The disk-to-OSD mapping is fixed, so if you have no record of the previous mount relationships, do not do this; if you do have that record, it is worth a try. In the worst case only that one OSD ends up broken, and it can always be removed and rejoined to the cluster. (A sketch follows right after this point.)
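A minimal sketch of that manual mount, assuming (purely as an example taken from the lsblk output above) that /dev/sdl1 is the partition belonging to osd.1:

```bash
# Hypothetical mapping /dev/sdl1 <-> osd.1 taken from the lsblk output above;
# verify the mapping for your own disks before mounting anything.
mount /dev/sdl1 /var/lib/ceph/osd/ceph-1

# Confirm the mount, then try starting the OSD again
lsblk | grep sdl1
service ceph start osd.1
```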
- The standard approach: if none of the above works, you can consider removing the OSD from the cluster and then adding it back. The only downside is that more data will have to be resynchronized. (A sketch of the removal sequence follows.)
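For reference, a sketch of the usual removal sequence, using osd.1 only as an assumed example (re-adding the disk afterwards follows the normal ceph-disk prepare/activate flow):

```bash
# Mark the OSD out so its data migrates to other OSDs, then stop the daemon
ceph osd out 1
service ceph stop osd.1

# Remove it from the CRUSH map, delete its auth key, and remove it from the cluster
ceph osd crush remove osd.1
ceph auth del osd.1
ceph osd rm 1

# Re-add it by preparing and activating the disk again, e.g.:
# ceph-disk prepare /dev/sdl
# ceph-disk activate /dev/sdl1
```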
osd status view
Command: service ceph status
This shows the status of every Ceph daemon on this host, i.e. the mon and each OSD.
```
[root@stor-21 ~]# service ceph status
=== mon.stor-21 ===
mon.stor-21: running {"version":"0.94.6"}
=== osd.1 ===
osd.1: running {"version":"0.94.6"}
=== osd.3 ===
osd.3: running {"version":"0.94.6"}
=== osd.6 ===
osd.6: running {"version":"0.94.6"}
=== osd.9 ===
osd.9: running {"version":"0.94.6"}
=== osd.12 ===
osd.12: running {"version":"0.94.6"}
=== osd.15 ===
osd.15: running {"version":"0.94.6"}
=== osd.18 ===
osd.18: running {"version":"0.94.6"}
=== osd.21 ===
osd.21: running {"version":"0.94.6"}
=== osd.24 ===
osd.24: running {"version":"0.94.6"}
=== osd.27 ===
osd.27: running {"version":"0.94.6"}
[root@stor-21 ~]#
```