Deploy a Ceph cluster using Rook (rook.io) in Kubernetes

[Updated on 20-Jun-2020: Rook Ceph has changed quite a bit in recent releases, so I am revisiting this blog article to accommodate those changes, prompted by a ping on Slack 🙂 ]

In this article we will talk about how to deploy a Ceph (software-defined storage) cluster using a Kubernetes operator called Rook.
Before we get into the step-by-step process of deploying it and getting it working, let me touch upon what Ceph and Rook are.
Obviously we should know something about them before we try them out 🙂

Ceph

Ceph is an open, massively scalable storage solution for modern workloads like cloud infrastructure, data analytics, media repositories, and backup and restore systems. It can free you from the expensive lock-in of proprietary, hardware-based storage solutions.

If you want to know more about Ceph, refer to https://ceph.com/.

Rook

Rook is an open source cloud-native storage orchestrator for Kubernetes, providing the platform, framework, and support for a diverse set of storage solutions to natively integrate with cloud-native environments.

Rook turns storage software into self-managing, self-scaling, and self-healing storage services. It does this by automating deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management. Rook uses the facilities provided by the underlying cloud-native container management, scheduling and orchestration platform to perform its duties.

Rook integrates deeply into cloud-native environments, leveraging extension points and providing a seamless experience for scheduling, lifecycle management, resource management, security, monitoring, and user experience.

Refer to https://rook.io/ for more details.

Installing a Ceph cluster in a working Kubernetes cluster has been made very easy by Rook! I would say it is just a matter of two or three commands if you have a working Kubernetes cluster.

I have a Kubernetes cluster with one master node and two worker nodes.
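Before installing anything, it is worth confirming that all nodes are Ready. A quick sanity check looks like this (the node names and versions below are just placeholders for my setup):

[terminal]
[humble@node]# kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
master     Ready    master   10d   v1.18.3
worker-1   Ready    <none>   10d   v1.18.3
worker-2   Ready    <none>   10d   v1.18.3
[/terminal]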

Now it is time to install Rook and get the Ceph cluster deployed in this Kubernetes cluster.

Let us clone the Rook repo first. Considering Rook version 1.3 is the latest version available as of today, let's use that branch/release:

[terminal]

[root@node]# git clone https://github.com/rook/rook.git
Cloning into 'rook'...
remote: Enumerating objects: 29, done.
remote: Counting objects: 100% (29/29), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 53810 (delta 6), reused 10 (delta 0), pack-reused 53781
Receiving objects: 100% (53810/53810), 34.29 MiB | 513.00 KiB/s, done.
Resolving deltas: 100% (36995/36995), done.
[root@node]# cd rook
[root@node rook]# git checkout release-1.3
Branch 'release-1.3' set up to track remote branch 'release-1.3' from 'origin'.
Switched to a new branch 'release-1.3'
[/terminal]

Now that the cloning is done, let's go inside the "rook" directory and look at its contents, as shown below:

As you already read, Rook is a Kubernetes operator that is capable of deploying various storage software, including Ceph. In this article we are
interested in Ceph, so let's deploy Ceph using the Rook operator.

[terminal]

[humble@node kubernetes]# pwd
/root/rook/cluster/examples/kubernetes
[humble@node kubernetes]#

[humble@node kubernetes]# ls
cassandra ceph cockroachdb edgefs mysql.yaml nfs README.md wordpress.yaml yugabytedb

[humble@node]# cd ceph
[humble@node]# pwd
/root/rook/cluster/examples/kubernetes/ceph
[humble@node ceph]# ls
ceph-client.yaml dashboard-external-https.yaml nfs.yaml pool-test.yaml
cluster-external-management.yaml dashboard-external-http.yaml object-bucket-claim-delete.yaml pool.yaml
cluster-external.yaml dashboard-ingress-https.yaml object-bucket-claim-retain.yaml rgw-external.yaml
cluster-on-pvc.yaml dashboard-loadbalancer.yaml object-ec.yaml scc.yaml
cluster-test.yaml direct-mount.yaml object-openshift.yaml storageclass-bucket-delete.yaml
cluster.yaml filesystem-ec.yaml object-test.yaml storageclass-bucket-retain-external.yaml
common-external.yaml filesystem-test.yaml object-user.yaml storageclass-bucket-retain.yaml
common.yaml filesystem.yaml object.yaml toolbox.yaml
create-external-cluster-resources.py flex operator-openshift.yaml upgrade-from-v1.2-apply.yaml
create-external-cluster-resources.sh import-external-cluster.sh operator.yaml upgrade-from-v1.2-crds.yaml
csi monitoring pool-ec.yaml
[humble@node ceph]#
[/terminal]

Creating the Rook operator is pretty easy, as shown below:

[terminal]
[humble@node rook]# cd cluster/examples/kubernetes/ceph
[humble@node ceph]# kubectl create -f common.yaml
namespace/rook-ceph created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclients.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephrbdmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectrealms.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzonegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzones.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
customresourcedefinition.apiextensions.k8s.io/objectbuckets.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/objectbucketclaims.objectbucket.io created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-object-bucket created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt-rules created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global-rules created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster-rules created
clusterrole.rbac.authorization.k8s.io/rook-ceph-object-bucket created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system-rules created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
podsecuritypolicy.policy/00-rook-privileged created
clusterrole.rbac.authorization.k8s.io/psp:rook created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-default-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter-psp created
serviceaccount/rook-csi-cephfs-plugin-sa created
serviceaccount/rook-csi-cephfs-provisioner-sa created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin-rules created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
serviceaccount/rook-csi-rbd-plugin-sa created
serviceaccount/rook-csi-rbd-provisioner-sa created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin-rules created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner-rules created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created

[humble@node ceph]# kubectl create -f operator.yaml
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created

[humble@node ceph]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-567d7945d6-2tlkf 1/1 Running 0 6s
rook-discover-6m5mj 1/1 Running 0 5s
rook-discover-csstp 1/1 Running 0 5s
rook-discover-flflh 1/1 Running 0 5s
[humble@node ceph]#

[/terminal]

The next step is bringing up the Ceph cluster! We have a sample/default template YAML in the same directory, so let's list out its default configuration (with the comment lines stripped):

[terminal]
[humble@node ceph]# cat cluster.yaml | grep -v "#"

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.9
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: false
    rulesNamespace: rook-ceph
  network:
  rbdMirroring:
    workers: 0
  crashCollector:
    disable: false
  cleanupPolicy:
    confirmation: ""
  annotations:
  resources:
  removeOSDsIfOutAndSafeToRemove: false
  storage:
    useAllNodes: true
    useAllDevices: true
    config:
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
[/terminal]
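If you just want to experiment on a single node (for example on minikube), a trimmed-down CephCluster spec along the lines below should do; note this is only a sketch for illustration, and the repo ships a ready-made cluster-test.yaml for exactly this purpose:

[terminal]
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.9
  dataDirHostPath: /var/lib/rook
  mon:
    # a single mon is fine for testing, never for production
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: true
  storage:
    useAllNodes: true
    useAllDevices: true
[/terminal]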

Why wait more? Let's go ahead and create a Ceph cluster.

[terminal]
[humble@node ceph]# kubectl apply -f cluster.yaml
cephcluster.ceph.rook.io/rook-ceph created
[/terminal]

Your Ceph cluster is coming up! Watch for new pods and see what appears in your cluster:

[terminal]

[humble@node ceph]# kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-2qntx 3/3 Running 0 22m
csi-cephfsplugin-8k2dx 3/3 Running 0 22m
csi-cephfsplugin-provisioner-7469b99d4b-ghrvg 5/5 Running 0 22m
csi-cephfsplugin-provisioner-7469b99d4b-tdhgd 5/5 Running 0 22m
csi-cephfsplugin-rbjlr 3/3 Running 0 22m
csi-rbdplugin-cv7b5 3/3 Running 0 3m28s
csi-rbdplugin-ks4hj 3/3 Running 0 3m35s
csi-rbdplugin-mpfc9 3/3 Running 0 3m35s
csi-rbdplugin-provisioner-865f4d8d-bx888 6/6 Running 0 22m
csi-rbdplugin-provisioner-865f4d8d-v846g 6/6 Running 0 22m
rook-ceph-crashcollector-dhcp53-136.lab.eng.blr.redhat.comqpfx2 1/1 Running 0 21m
rook-ceph-crashcollector-dhcp53-147.lab.eng.blr.redhat.comjqtp9 1/1 Running 0 21m
rook-ceph-crashcollector-dhcp53-148.lab.eng.blr.redhat.com728kh 1/1 Running 0 21m
rook-ceph-mgr-a-67b8954597-ljl4r 1/1 Running 0 21m
rook-ceph-mon-a-6c9d54899d-k2gzh 1/1 Running 0 21m
rook-ceph-mon-b-755dc649c7-gvwtg 1/1 Running 0 21m
rook-ceph-mon-c-7db9b6fd5c-fd5nj 1/1 Running 0 21m
rook-ceph-operator-5b6674cb6-hj9cq 1/1 Running 0 84m
rook-ceph-osd-prepare-dhcp53-136.lab.eng.blr.redhat.com-s4dzw 0/1 Completed 0 20m
rook-ceph-osd-prepare-dhcp53-147.lab.eng.blr.redhat.com-s6f4w 0/1 Completed 0 20m
rook-ceph-osd-prepare-dhcp53-148.lab.eng.blr.redhat.com-lb46h 0/1 Completed 0 20m
rook-discover-fzfpx 1/1 Running 0 64m
rook-discover-jmcxz 1/1 Running 0 42m
rook-discover-zrjt4 1/1 Running 0 84m

[/terminal]

Additionally, you can deploy the Ceph toolbox pod:

[terminal]
humble@node:$ kubectl create -f toolbox.yaml
deployment.apps/rook-ceph-tools created
[/terminal]
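Once the toolbox pod is running, you can exec into it and run the usual Ceph commands to verify cluster health, for example:

[terminal]
[humble@node ceph]# kubectl -n rook-ceph exec -it \
  $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" \
  -o jsonpath='{.items[0].metadata.name}') -- bash
[root@rook-ceph-tools /]# ceph status
[root@rook-ceph-tools /]# ceph osd status
[root@rook-ceph-tools /]# ceph df
[/terminal]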

To list out the services deployed in this cluster:

[terminal]
[humble@node ceph]# kubectl get svc -n rook-ceph
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
csi-cephfsplugin-metrics ClusterIP 10.96.9.179 8080/TCP,8081/TCP 53m
csi-rbdplugin-metrics ClusterIP 10.101.48.14 8080/TCP,8081/TCP 53m
rook-ceph-mgr ClusterIP 10.98.8.199 9283/TCP 52m
rook-ceph-mgr-dashboard ClusterIP 10.110.217.174 8443/TCP 52m
rook-ceph-mon-a ClusterIP 10.107.195.221 6789/TCP,3300/TCP 53m
rook-ceph-mon-b ClusterIP 10.101.23.143 6789/TCP,3300/TCP 52m
rook-ceph-mon-c ClusterIP 10.96.166.220 6789/TCP,3300/TCP 52m
[/terminal]

In the above output you can see the monitor services and their cluster IPs, which match the mon endpoints the toolbox pod writes to its ceph.conf:

[terminal]
[humble@node ceph]# kubectl logs rook-ceph-tools-67788f4dd7-5rc7s -n rook-ceph
Sat Jun 20 15:08:43 UTC 2020 writing mon endpoints to /etc/ceph/ceph.conf: c=10.96.166.220:6789,a=10.107.195.221:6789,b=10.101.23.143:6789
[/terminal]

Just to mention, below are some details about the CephFS and RBD pods and the containers running in each:

CephFS provisioner pod (csi-cephfsplugin-provisioner-*) : [csi-attacher csi-resizer csi-provisioner csi-cephfsplugin liveness-prometheus]

RBD provisioner pod (csi-rbdplugin-provisioner-*) : [csi-attacher csi-resizer csi-provisioner csi-rbdplugin liveness-prometheus csi-snapshotter]

CephFS node plugin pod (csi-cephfsplugin-*) : [driver-registrar csi-cephfsplugin liveness-prometheus]

RBD node plugin pod (csi-rbdplugin-*) : [driver-registrar csi-rbdplugin liveness-prometheus]

The CephFS and RBD provisioner container logs look like this:

[terminal]
[humble@node ceph]# kubectl logs csi-cephfsplugin-provisioner-7469b99d4b-ghrvg -c csi-cephfsplugin-provisioner -n rook-ceph
error: container csi-cephfsplugin-provisioner is not valid for pod csi-cephfsplugin-provisioner-7469b99d4b-ghrvg
[humble@node ceph]# kubectl logs csi-cephfsplugin-provisioner-7469b99d4b-ghrvg -c csi-provisioner -n rook-ceph
I0620 14:17:13.243633 1 csi-provisioner.go:98] Version: v1.4.0-0-g1d9bad3
I0620 14:17:13.243708 1 csi-provisioner.go:112] Building kube configs for running in cluster…
I0620 14:17:13.251183 1 connection.go:151] Connecting to unix:///csi/csi-provisioner.sock
I0620 14:17:14.252274 1 connection.go:261] Probing CSI driver for readiness
I0620 14:17:14.253996 1 leaderelection.go:241] attempting to acquire leader lease rook-ceph/rook-ceph-cephfs-csi-ceph-com…

[humble@node ceph]# kubectl logs csi-rbdplugin-provisioner-865f4d8d-bx888 -c csi-provisioner -n rook-ceph
I0620 14:16:46.098838 1 csi-provisioner.go:98] Version: v1.4.0-0-g1d9bad3
I0620 14:16:46.098893 1 csi-provisioner.go:112] Building kube configs for running in cluster…
I0620 14:16:46.106968 1 connection.go:151] Connecting to unix:///csi/csi-provisioner.sock
W0620 14:16:56.107257 1 connection.go:170] Still connecting to unix:///csi/csi-provisioner.sock
W0620 14:17:06.107378 1 connection.go:170] Still connecting to unix:///csi/csi-provisioner.sock
I0620 14:17:12.145464 1 connection.go:261] Probing CSI driver for readiness
I0620 14:17:12.147946 1 leaderelection.go:241] attempting to acquire leader lease rook-ceph/rook-ceph-rbd-csi-ceph-com…
I0620 14:17:12.155246 1 leaderelection.go:251] successfully acquired lease rook-ceph/rook-ceph-rbd-csi-ceph-com
I0620 14:17:12.155899 1 controller.go:770] Starting provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-865f4d8d-bx888_112ea0fe-2030-450b-8bdf-cf906b88300c!
I0620 14:17:12.156169 1 volume_store.go:97] Starting save volume queue
I0620 14:17:12.256701 1 controller.go:819] Started provisioner controller rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-865f4d8d-bx888_112ea0fe-2030-450b-8bdf-cf906b88300c!

[humble@node ceph]#

[/terminal]

Now that you have a working Ceph cluster, you can create a storage class and a persistent volume claim (PVC) backed by this cluster:

[terminal]
humble@node:$ kubectl create -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
[/terminal]
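For reference, a storageclass.yaml like the one used above creates a replicated CephBlockPool and a StorageClass on top of it. A trimmed sketch of the CSI RBD variant (the one shipped under the csi/rbd directory in the repo) looks roughly like this; note that the flex-driver variant of the file uses a different provisioner name, so treat the below as illustrative rather than the exact file contents:

[terminal]
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
[/terminal]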

Make sure a storage class (SC) is available to consume:

[terminal]
humble@node$ kubectl get sc
NAME PROVISIONER AGE
rook-ceph-block ceph.rook.io/block 27s
humble@node:$
[/terminal]

Once the SC is created, you can create a PVC that points to this SC and consume the claim! I don't want to repeat these steps in every blog article, so I am skipping them for now.

Copyright secured by Digiprove © 2019-2020 Humble Chirammal
