Building a distributed storage cluster on K8s (Rook/Ceph)

Posted by sentient0169 on Fri, 10 Dec 2021 05:47:49 +0100

This article walks you step by step through building a distributed storage cluster (Rook/Ceph) on K8s.

1 Environment preparation

1.1 basic environment

  • 3 virtual machines with identical configuration
  • Virtual machine configuration: 4 cores, 8 GB RAM
  • Virtual machine operating system: CentOS 7
  • Hard disks: vda: 40G, vdb: 20G
  • Kubernetes version: 1.20.0
  • Docker version: 20.10.7

K8s is assumed to be already installed, using a kubeadm (containerized) installation.

1.2 installed rook/ceph version:

ceph: v15.2.11

rook: 1.6.3

1.3 prerequisites

  • A multi-node K8s cluster in normal operation, with two or more worker nodes
  • Rook versions later than 1.3 cannot create clusters from directories. Use a separate raw disk instead: attach a new disk to the host and use it directly, without formatting it. To check:
lsblk -f
NAME   FSTYPE LABEL UUID                                 MOUNTPOINT
└─vda1 xfs          6f15c206-f516-4ee8-a4b7-89ad880647db /
  • A disk with an empty FSTYPE is available for use. Any existing data on it must be wiped, and it must not be formatted.
  • This experiment is resource-intensive: each worker node needs at least 2 cores and 4 GB of RAM, and the master node at least 4 cores and 8 GB.
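
The disk check above can be scripted. The following is a minimal sketch, not part of the original procedure: find_raw_disks is a hypothetical helper that reads `lsblk -P` output and prints whole disks that have neither a filesystem signature nor partitions. It assumes a util-linux version whose lsblk supports the PKNAME column.

```shell
# Hypothetical helper: list disks that look usable as raw OSD disks.
# Input: lines of KEY="value" pairs as printed by
#   lsblk -P -o NAME,TYPE,FSTYPE,PKNAME
find_raw_disks() {
  awk '
    {
      name = ""; dtype = ""; fstype = ""; pkname = ""
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        val = kv[2]
        gsub(/"/, "", val)
        if (kv[1] == "NAME") name = val
        else if (kv[1] == "TYPE") dtype = val
        else if (kv[1] == "FSTYPE") fstype = val
        else if (kv[1] == "PKNAME") pkname = val
      }
      # Remember every device that is the parent of another (has partitions).
      if (pkname != "") has_child[pkname] = 1
      # A candidate is a whole disk with an empty FSTYPE.
      if (dtype == "disk" && fstype == "") candidates[++count] = name
    }
    END {
      for (i = 1; i <= count; i++)
        if (!(candidates[i] in has_child)) print candidates[i]
    }
  '
}

# Usage on a node:
#   lsblk -P -o NAME,TYPE,FSTYPE,PKNAME | find_raw_disks
```

A disk that appears in the output still has to be checked by hand before it is given to Ceph; the sketch only automates the FSTYPE inspection.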

2 construction process

2.1 what is rook?

  • Rook is not itself a distributed storage system. It uses the capabilities of the Kubernetes platform to provide services for each storage provider through a Kubernetes Operator. It is a storage orchestrator that can use different back ends (such as Ceph and EdgeFS) to do the heavy lifting of storage management, abstracting away much of the complexity.
  • Rook turns a distributed storage system into a self-managing, self-scaling and self-healing storage service. It automates the storage administrator's tasks: deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring and resource management.
  • Rook orchestrates multiple storage solutions, each with a dedicated Kubernetes Operator for automated management. Ceph, Cassandra and NFS are currently supported.
  • The mainstream back end today is Ceph, which provides more than block storage: it also offers S3/Swift-compatible object storage and a distributed file system. Ceph can spread a volume's data across multiple disks, so a volume can use more space than any single disk holds. When disks are added to the cluster, Ceph automatically rebalances data across them.

2.2 how Ceph, Rook and K8s integrate

  • Rook is an open-source cloud-native storage orchestrator. It provides the platform, framework and support that let various storage solutions integrate natively with cloud-native environments.
  • Rook turns storage software into a self-managing, self-scaling and self-healing storage service by automating deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring and resource management.
  • Rook builds on the facilities of the underlying cloud-native container management, scheduling and orchestration platform to implement its own functions.
  • Rook currently supports Ceph, NFS, Minio Object Store and CockroachDB.
  • Rook uses Kubernetes primitives to run the Ceph storage system on Kubernetes.

3 installation and deployment

3.1 preparation before installation

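#Make sure lvm2 is installed
yum install lvm2 -y
#Load the rbd kernel module
modprobe rbd
#Script that runs every executable *.modules file
cat > /etc/rc.sysinit << EOF
#!/bin/bash
for file in /etc/sysconfig/modules/*.modules
do
  [ -x \$file ] && \$file
done
EOF
#Module script that loads rbd
cat > /etc/sysconfig/modules/rbd.modules << EOF
modprobe rbd
EOF
chmod 755 /etc/sysconfig/modules/rbd.modules
#Confirm that the module is loaded
lsmod |grep rbd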
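
On systemd-based systems such as CentOS 7, a kernel module can also be loaded at boot by listing it in a file under /etc/modules-load.d. The following is a minimal sketch of that alternative, offered as an assumption rather than as the method used in this article. persist_module is a hypothetical helper, and its optional second argument exists only so that it can be tried against a scratch directory.

```shell
# Hypothetical helper: make a kernel module load at every boot.
# $1 = module name, $2 = config directory (defaults to /etc/modules-load.d)
persist_module() {
  module="$1"
  conf_dir="${2:-/etc/modules-load.d}"
  # systemd-modules-load reads one module name per line from *.conf files.
  printf '%s\n' "$module" > "$conf_dir/$module.conf"
}

# Usage (as root):
#   persist_module rbd
```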

3.2 download Rook installation file

git clone --single-branch --branch v1.6.3 https://github.com/rook/rook.git

3.3 change configuration

cd rook/cluster/examples/kubernetes/ceph

Modify the Rook CSI image addresses. The defaults may point to gcr images, which cannot be pulled from within mainland China, so the images need to be synchronized to an Alibaba Cloud image registry. This has already been done for this article, so you can edit the file directly as follows:

vim operator.yaml


Replace with:

  ROOK_CSI_REGISTRAR_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-node-driver-registrar:v2.0.1"
  ROOK_CSI_RESIZER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-resizer:v1.0.1"
  ROOK_CSI_PROVISIONER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-provisioner:v2.0.4"
  ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-snapshotter:v4.0.0"
  ROOK_CSI_ATTACHER_IMAGE: "registry.cn-beijing.aliyuncs.com/dotbalo/csi-attacher:v3.0.2"
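
The same edit can be scripted instead of made in vim. This is a minimal sketch under a stated assumption: that operator.yaml contains the five entries as commented-out lines of the form `# ROOK_CSI_..._IMAGE: "<registry>/<image>:<tag>"`. mirror_csi_images is a hypothetical helper; it uncomments those lines and swaps the registry prefix while keeping whatever image name and tag the file already has, so the tags may differ from the ones listed above.

```shell
# Hypothetical helper: point the five CSI sidecar images at a mirror registry.
# $1 = path to operator.yaml, $2 = mirror prefix (registry/namespace)
mirror_csi_images() {
  file="$1"
  mirror="$2"
  # Keeps a backup as <file>.bak. Only the five sidecar entries are touched;
  # any other ROOK_CSI_*_IMAGE line is left as it is.
  sed -E -i.bak \
    "s@^([[:space:]]*)#[[:space:]]*(ROOK_CSI_(REGISTRAR|RESIZER|PROVISIONER|SNAPSHOTTER|ATTACHER)_IMAGE:[[:space:]]*\")[^\"]*/([^/\"]+\")@\1\2${mirror}/\4@" \
    "$file"
}

# Usage:
#   mirror_csi_images operator.yaml registry.cn-beijing.aliyuncs.com/dotbalo
```

Review the result with `diff operator.yaml.bak operator.yaml` before deploying, since the mirror has to carry the exact tags that end up in the file.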

Still in operator.yaml: newer versions of Rook do not deploy the automatic discovery container by default. Find ROOK_ENABLE_DISCOVERY_DAEMON and change it to true.
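
A minimal sketch of that change as a one-line edit, assuming the setting appears in operator.yaml as a key/value line such as `ROOK_ENABLE_DISCOVERY_DAEMON: "false"`. If your copy of the file stores it differently (for example as a name/value pair under env), edit it by hand instead. enable_discovery_daemon is a hypothetical helper.

```shell
# Hypothetical helper: switch the discovery daemon setting to "true".
# $1 = path to operator.yaml (a backup is kept as <file>.bak)
enable_discovery_daemon() {
  sed -E -i.bak \
    's/^([[:space:]]*ROOK_ENABLE_DISCOVERY_DAEMON:[[:space:]]*)"?false"?/\1"true"/' \
    "$1"
}

# Usage:
#   enable_discovery_daemon operator.yaml
#   grep -n ROOK_ENABLE_DISCOVERY_DAEMON operator.yaml   # confirm the result
```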

3.4 deploy Rook

cd cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml -f operator.yaml

Wait for the containers to start. Proceed to the next step only once they are all Running.

[root@k8s-master01 ceph]# kubectl -n rook-ceph get pod
NAME                                                     READY   STATUS      RESTARTS   AGE
rook-ceph-operator-675f59664d-b9nch                      1/1     Running     0          32m
rook-discover-4m68r                                      1/1     Running     0          40m
rook-discover-chscc                                      1/1     Running     0          40m
rook-discover-mmk69                                      1/1     Running     0          40m
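
Rather than polling by hand, the wait can be delegated to `kubectl wait`. A minimal sketch, assuming the operator pod carries the label app=rook-ceph-operator; wait_for_operator is a hypothetical helper name and the default timeout is arbitrary.

```shell
# Hypothetical helper: block until the Rook operator pod reports Ready.
# $1 = timeout (defaults to 600s)
wait_for_operator() {
  timeout="${1:-600s}"
  kubectl -n rook-ceph wait --for=condition=Ready pod \
    -l app=rook-ceph-operator --timeout="$timeout"
}

# Usage:
#   wait_for_operator && kubectl create -f cluster.yaml
```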

3.5 creating ceph clusters

kubectl create -f cluster.yaml

After creation, you can view the status of the pod:

[root@k8s-master01 ceph]# kubectl -n rook-ceph get pod
NAME                                                     READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-8d6zn                                   3/3     Running     0          39m
csi-cephfsplugin-dr6wd                                   3/3     Running     0          39m
csi-cephfsplugin-gblpg                                   3/3     Running     0          39m
csi-cephfsplugin-provisioner-846ffc6cb4-qjv7s            6/6     Running     0          39m
csi-cephfsplugin-provisioner-846ffc6cb4-wbjzg            6/6     Running     0          39m
csi-rbdplugin-6bd9t                                      3/3     Running     0          39m
csi-rbdplugin-9b6gt                                      3/3     Running     0          39m
csi-rbdplugin-9vtpp                                      3/3     Running     0          39m
csi-rbdplugin-provisioner-75fd5c779f-9989z               6/6     Running     0          39m
csi-rbdplugin-provisioner-75fd5c779f-zx49t               6/6     Running     0          39m
rook-ceph-crashcollector-k8s-master01-75bb6c6dd9-lnncg   1/1     Running     0          38m
rook-ceph-crashcollector-k8s-node-90-84b555c8c8-5vt72    1/1     Running     0          38m
rook-ceph-crashcollector-k8s-node-94-798667dd4b-dzvbw    1/1     Running     0          31m
rook-ceph-mgr-a-86d4459f5b-8bk49                         1/1     Running     0          38m
rook-ceph-mon-a-847d986b98-tff45                         1/1     Running     0          39m
rook-ceph-mon-b-566894d545-nbw2t                         1/1     Running     0          39m
rook-ceph-mon-c-58c5789c6-xz5l7                          1/1     Running     0          38m
rook-ceph-operator-675f59664d-b9nch                      1/1     Running     0          32m
rook-ceph-osd-0-76db9d477d-dz9kf                         1/1     Running     0          38m
rook-ceph-osd-1-768487dbc8-g7zq9                         1/1     Running     0          31m
rook-ceph-osd-2-5d9f8d6fb-bfwtk                          1/1     Running     0          31m
rook-ceph-osd-prepare-k8s-master01-4b4mp                 0/1     Completed   0          31m
rook-ceph-osd-prepare-k8s-node-90-7jg4n                  0/1     Completed   0          31m
rook-ceph-osd-prepare-k8s-node-94-4mb7g                  0/1     Completed   0          31m
rook-discover-4m68r                                      1/1     Running     0          40m
rook-discover-chscc                                      1/1     Running     0          40m
rook-discover-mmk69                                      1/1     Running     0          40m

The osd-0, osd-1 and osd-2 containers must exist and be working normally. If all of the pods above are running successfully, the cluster installation is considered successful.
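
With this many pods, a small filter makes it easier to spot the ones that are not healthy yet. This is a minimal sketch that works on the default `kubectl get pod` table shown above; not_ready_pods is a hypothetical helper name.

```shell
# Hypothetical helper: print pods that are neither Completed nor fully Ready.
# Input: the default table printed by `kubectl get pod` (header on line 1).
not_ready_pods() {
  awk '
    NR > 1 {
      if ($3 == "Completed") next          # finished jobs such as osd-prepare
      split($2, ready, "/")                # READY column, e.g. 3/3
      if ($3 != "Running" || ready[1] != ready[2]) print $1
    }
  '
}

# Usage:
#   kubectl -n rook-ceph get pod | not_ready_pods
```

Empty output means that every pod is either Completed or Running with all of its containers ready.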

3.6 installing ceph client tools

The toolbox.yaml file is also in the ceph folder.

kubectl  create -f toolbox.yaml -n rook-ceph

After the container is Running, you can execute the relevant commands:

[root@k8s-master01 ~]# kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- bash
[root@rook-ceph-tools-fc5f9586c-m2wf5 /]# ceph status
    id:     9016340d-7f90-4634-9877-aadc927c4e81
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            clock skew detected on mon.b

    mon: 3 daemons, quorum a,b,c (age 3m)
    mgr: a(active, since 44m)
    osd: 3 osds: 3 up (since 38m), 3 in (since 38m)

    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   3.0 GiB used, 57 GiB / 60 GiB avail
    pgs:     1 active+clean

Common commands:

ceph status
ceph osd status
ceph df 
rados df
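
These commands can also be run from the master node without opening an interactive shell in the toolbox. A minimal sketch, assuming the toolbox Deployment is named rook-ceph-tools as in the session above; ceph_tools is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: run a command inside the rook-ceph-tools pod.
ceph_tools() {
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- "$@"
}

# Usage:
#   ceph_tools ceph status
#   ceph_tools ceph osd status
#   ceph_tools rados df
```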

3.7 configuring ceph dashboard

The Ceph dashboard is installed by default, but its Service is of type ClusterIP and cannot be accessed from outside the cluster.

kubectl apply -f dashboard-external-https.yaml

A Service of type NodePort can be accessed externally:

[root@k8s-master01 ~]# kubectl get svc -n rook-ceph|grep dashboard
rook-ceph-mgr-dashboard                  ClusterIP   <none>        8443/TCP            49m
rook-ceph-mgr-dashboard-external-https   NodePort    <none>        8443:32529/TCP      49m

Browser access (replace the master01 IP with the IP of a node in your own cluster):
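
The dashboard is served over HTTPS on a node IP plus the NodePort of the external Service (32529 in the output above; yours will differ). A minimal sketch that looks the port up and prints the address to open; dashboard_url is a hypothetical helper, and the node IP in the usage line is a placeholder.

```shell
# Hypothetical helper: print the dashboard address for a given node IP.
# $1 = IP address of any node in the cluster
dashboard_url() {
  node_ip="$1"
  port=$(kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-https \
    -o jsonpath='{.spec.ports[0].nodePort}')
  printf 'https://%s:%s\n' "$node_ip" "$port"
}

# Usage (placeholder IP):
#   dashboard_url 192.0.2.10
```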


The user name is admin by default, and the password can be obtained with the following command:

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}"|base64 --decode && echo

4 delete cluster and clear data

4.1 delete the CephCluster resource

kubectl -n rook-ceph delete cephcluster rook-ceph

Then query to confirm that the deletion in the previous step has completed:

kubectl -n rook-ceph get cephcluster
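
Instead of repeating the query, `kubectl wait --for=delete` can block until the resource is gone. A minimal sketch; wait_for_cephcluster_deletion is a hypothetical helper name, the default timeout is arbitrary, and depending on the kubectl version the command may report an error if the resource has already disappeared.

```shell
# Hypothetical helper: block until the rook-ceph CephCluster is deleted.
# $1 = timeout (defaults to 300s)
wait_for_cephcluster_deletion() {
  kubectl -n rook-ceph wait --for=delete cephcluster/rook-ceph \
    --timeout="${1:-300s}"
}

# Usage:
#   wait_for_cephcluster_deletion 600s
```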

4.2 delete Operator and related resources

kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml

4.3 deleting data on the host

When Rook creates a cluster, it writes some data to /var/lib/rook on the local machine (the directory specified by dataDirHostPath). If this data is not deleted, it will interfere with the next cluster deployment. A later version of Rook is said to add support for K8s local storage, so that this data is no longer stored directly on the host disk.

rm -rf /var/lib/rook

4.4 erasing data on hard disk

Creating an OSD writes data to the disk, and that data must be erased; otherwise the Ceph cluster cannot be created again. The script contains several disk-erase commands. Not all of them need to succeed; which ones apply depends on the disks in the current machine.

vim clean-ceph.sh

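#!/usr/bin/env bash

#Disk to wipe, passed as the first argument, for example /dev/vdb
DISK="${1:?usage: clean-ceph.sh <disk>}"

#Zap the partition tables (GPT and MBR) on the disk
sgdisk --zap-all "$DISK"

#Overwrite the start of the disk with zeros (for hard disks)
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync

#Discard all blocks (for SSDs; may fail on disks without discard support)
blkdiscard "$DISK"

#Remove device-mapper entries left behind by ceph-volume, which lock the disk
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %

#Remove leftover ceph-<UUID> entries in /dev and /dev/mapper
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*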


5.1 after uninstalling Rook/Ceph, kubectl get ns shows rook-ceph stuck in Terminating and it cannot be deleted


NAMESPACE=rook-ceph
kubectl proxy &

kubectl get namespace $NAMESPACE -o json |jq '.spec = {"finalizers":[]}' >temp.json

curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json$NAMESPACE/finalize
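
A fuller sketch of this workaround as a pair of functions. It rests on two assumptions that the commands above do not state: `kubectl proxy` is listening on its default address, 127.0.0.1:8001, and jq is installed. finalize_url and clear_namespace_finalizers are hypothetical helper names.

```shell
# Hypothetical helper: build the namespace finalize endpoint.
# $1 = namespace, $2 = proxy address (assumed default: 127.0.0.1:8001)
finalize_url() {
  printf 'http://%s/api/v1/namespaces/%s/finalize\n' "${2:-127.0.0.1:8001}" "$1"
}

# Hypothetical helper: empty spec.finalizers and PUT the result back.
# Requires `kubectl proxy &` to be running in the same session.
clear_namespace_finalizers() {
  ns="$1"
  kubectl get namespace "$ns" -o json \
    | jq '.spec = {"finalizers":[]}' > temp.json
  curl -k -H "Content-Type: application/json" -X PUT \
    --data-binary @temp.json "$(finalize_url "$ns")"
}

# Usage:
#   kubectl proxy &
#   clear_namespace_finalizers rook-ceph
```

Forcing finalizers off skips whatever cleanup they were guarding, so this is a last resort after the resources inside the namespace have been checked.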

5.2 another after-effect of uninstalling an OSD or the cluster: the rook-ceph namespace is deleted, but the cephcluster cannot be deleted

#Check the namespaces: rook-ceph has been deleted
[root@k8s-master01 ~]# kubectl get ns
NAME              STATUS   AGE
default           Active   22h
kube-node-lease   Active   22h
kube-public       Active   22h
kube-system       Active   22h
#See if the cluster still exists
[root@k8s-master01 ~]# kubectl -n rook-ceph get cephcluster
rook-ceph   /var/lib/rook   3        20h Progressing Configuring Ceph Mons
[root@k8s-master01 ~]# kubectl api-resources --namespaced=true -o name|xargs -n 1 kubectl get --show-kind --ignore-not-found -n rook-ceph
Error from server (MethodNotAllowed): the server does not allow this method on the requested resource
NAME                         TYPE                                  DATA   AGE
secret/default-token-lz6wh   kubernetes.io/service-account-token   3      8m34s
NAME                     SECRETS   AGE
serviceaccount/default   1         8m34s
Error from server (MethodNotAllowed): the server does not allow this method on the requested resource
NAME                                 DATADIRHOSTPATH   MONCOUNT   AGE   PHASE         MESSAGE                 HEALTH
cephcluster.ceph.rook.io/rook-ceph   /var/lib/rook     3          20h   Progressing   Configuring Ceph Mons   

#Solution:
kubectl edit  cephcluster.ceph.rook.io -n rook-ceph
#Delete the value of finalizers; cephcluster.ceph.rook.io will then be deleted automatically
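
The same change can be made without an interactive editor by patching the finalizers to an empty list. A minimal sketch, assuming the resource is the cephcluster named rook-ceph in the rook-ceph namespace, as in the output above; remove_cephcluster_finalizers is a hypothetical helper name.

```shell
# Hypothetical helper: clear the finalizers on the rook-ceph CephCluster
# so that the pending deletion can complete.
remove_cephcluster_finalizers() {
  kubectl -n rook-ceph patch cephcluster rook-ceph --type merge \
    -p '{"metadata":{"finalizers":[]}}'
}

# Usage:
#   remove_cephcluster_finalizers
#   kubectl -n rook-ceph get cephcluster   # should no longer list rook-ceph
```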

5.3 the dashboard displays a HEALTH_WARN warning

Enter the rook-ceph-tools container and execute the following command:

ceph config set mon auth_allow_insecure_global_id_reclaim false

