OpenShift is a platform as a service product from Red Hat. The software that runs the service is open-sourced under the name OpenShift Origin, and is available on GitHub.
OpenShift v3 is a layered system designed to expose underlying Docker and Kubernetes concepts as accurately as possible, with a focus on easy composition of applications by a developer. For example, install Ruby, push code, and add MySQL.
Docker is an open platform for developing, shipping, and running applications. With Docker you can separate your applications from your infrastructure and treat your infrastructure like a managed application. Docker does this by combining kernel containerization features with workflows and tooling that help you manage and deploy your applications. Docker containers wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries – anything you can install on a server. Available on GitHub.
Kubernetes is an open-source system for automating deployment, operations, and scaling of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes builds upon a decade and a half of experience of running production workloads at Google, combined with best-of-breed ideas and practices from the community. Available on GitHub.
GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. GlusterFS is free and open source software. Available on GitHub.
Assuming you know a little about each of the technologies above, let's jump right into the topic: Persistent Volumes and Persistent Volume Claims in Kubernetes and OpenShift v3, backed by a GlusterFS volume. So what is a Persistent Volume? Why do we need it? How does it work with the GlusterFS volume plugin?
In Kubernetes, managing storage is a distinct problem from managing compute. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, Kubernetes introduces two API resources: PersistentVolume and PersistentVolumeClaim.
A PersistentVolume (PV) is a piece of networked storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory). Claims can request a specific size and access modes (e.g., can be mounted once read/write or many times read-only).
In simple words, containers in a Kubernetes cluster need storage that persists even if the container goes down or is no longer needed. So the Kubernetes administrator creates the storage (GlusterFS storage, in this case) and creates a PV for it. When a developer (a Kubernetes cluster user) needs persistent storage in a container, they create a Persistent Volume Claim. The claim contains the options the developer needs for the pods. From the list of available Persistent Volumes, the best match is selected and bound to the claim. Now the developer can use the claim in pods.
1) You need a Kubernetes or OpenShift cluster. My setup has one master and three nodes.
Note: you can use kubectl in place of oc. oc is the OpenShift command-line client; it supports the kubectl commands used here and adds OpenShift-specific ones.
# oc get nodes
NAME LABELS STATUS AGE
dhcp42-144.example.com kubernetes.io/hostname=dhcp42-144.example.com,name=node3 Ready 15d
dhcp42-235.example.com kubernetes.io/hostname=dhcp42-235.example.com,name=node1 Ready 15d
dhcp43-174.example.com kubernetes.io/hostname=dhcp43-174.example.com,name=node2 Ready 15d
dhcp43-183.example.com kubernetes.io/hostname=dhcp43-183.example.com,name=master Ready,SchedulingDisabled 15d
2) Have a GlusterFS cluster set up, then create and start a GlusterFS volume.
# gluster v status
Status of volume: gluster_vol
Gluster process TCP Port RDMA Port Online Pid
Brick 188.8.131.52:/gluster_brick 49152 0 Y 8771
Brick 184.108.40.206:/gluster_brick 49152 0 Y 7443
NFS Server on localhost 2049 0 Y 7463
NFS Server on 220.127.116.11 2049 0 Y 8792
Task Status of Volume gluster_vol
There are no active volume tasks
3) All nodes in the Kubernetes cluster must have the GlusterFS client package installed.
Now we have the prerequisites \o/ …
On the Kubernetes master, the administrator has to write the required YAML files, which are given as input to the cluster.
There are three files to be written by the administrator and one by the developer.
Service: keeps the endpoint persistent (active).
Endpoint: the file that points to the location of the GlusterFS cluster.
PV: the Persistent Volume, where the administrator defines the Gluster volume name, the capacity of the volume, and the access mode.
PVC: the Persistent Volume Claim, where the developer defines the type of storage needed.
STEP 1: Create a service for the gluster volume.
# cat gluster_pod/gluster-service.yaml
- port: 1
# oc create -f gluster_pod/gluster-service.yaml
service "glusterfs-cluster" created
# oc get service
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
glusterfs-cluster 172.30.251.13 <none> 1/TCP <none> 9m
kubernetes 172.30.0.1 <none> 443/TCP,53/UDP,53/TCP <none> 16d
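The listing of gluster-service.yaml above shows only its port entry. As a minimal sketch of what such a file could look like, assuming the standard v1 Service schema, with the name taken from the "oc create" output and the port from the surviving line:

```yaml
# Sketch only: reconstructed from the command output above, not the author's original file.
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster   # name reported by "oc create"
spec:
  ports:
  - port: 1                 # dummy port; the service exists only to keep the endpoints around
```

There is no selector here, which matches the <none> shown in the SELECTOR column; the endpoints are supplied by hand in the next step.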
STEP 2: Create an Endpoint for the gluster service
# cat gluster_pod/gluster-endpoints.yaml
- ip: 18.104.22.168
- port: 1
The IP here is the IP address of a node in the GlusterFS cluster.
# oc create -f gluster_pod/gluster-endpoints.yaml
endpoints "glusterfs-cluster" created
# oc get endpoints
NAME ENDPOINTS AGE
glusterfs-cluster 22.214.171.124:1 3m
kubernetes 126.96.36.199:8053,188.8.131.52:8443,184.108.40.206:8053 16d
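Only the ip and port lines of gluster-endpoints.yaml are visible above. A hedged sketch of a complete file, assuming the standard v1 Endpoints schema, might look like the following. The address 192.0.2.10 is a placeholder from the documentation range, not the address used in this setup; replace it with the IP of one of your GlusterFS nodes.

```yaml
# Sketch only: the IP below is a hypothetical placeholder.
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster   # must match the service name from STEP 1
subsets:
- addresses:
  - ip: 192.0.2.10          # placeholder: IP of a GlusterFS node
  ports:
  - port: 1                 # same dummy port as the service
```

More GlusterFS nodes can be listed as additional entries under addresses.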
STEP 3: Create a PV for the gluster volume.
# cat gluster_pod/gluster-pv.yaml
Note: path here is the Gluster volume name. The access mode specifies how the volume can be accessed. Capacity is the storage size of the GlusterFS volume.
# oc create -f gluster_pod/gluster-pv.yaml
persistentvolume "gluster-default-volume" created
# oc get pv
NAME LABELS CAPACITY ACCESSMODES STATUS CLAIM REASON AGE
gluster-default-volume <none> 8Gi RWX Available 36s
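The contents of gluster-pv.yaml are not shown above. Based on the note and the "oc get pv" output, a minimal sketch could look like this, assuming the v1 PersistentVolume schema with the glusterfs volume source; the field values are inferred from the output in this post rather than copied from the original file.

```yaml
# Sketch only: values inferred from the "oc get pv" and "gluster v status" output.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-default-volume
spec:
  capacity:
    storage: 8Gi                  # size advertised to claims
  accessModes:
  - ReadWriteMany                 # displayed as RWX
  glusterfs:
    endpoints: glusterfs-cluster  # the Endpoints object from STEP 2
    path: gluster_vol             # the Gluster volume name
    readOnly: false
```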
STEP 4: Create a PVC for the gluster PV.
# cat gluster_pod/gluster-pvc.yaml
Note: the developer requests 8 GiB of storage with access mode RWX (ReadWriteMany).
# oc create -f gluster_pod/gluster-pvc.yaml
persistentvolumeclaim "glusterfs-claim" created
# oc get pvc
NAME LABELS STATUS VOLUME CAPACITY ACCESSMODES AGE
glusterfs-claim <none> Bound gluster-default-volume 8Gi RWX 14s
Here the PVC is bound as soon as it is created, because it found a PV that satisfies its requirements. Now let's check the PV status.
# oc get pv
NAME LABELS CAPACITY ACCESSMODES STATUS CLAIM REASON AGE
gluster-default-volume <none> 8Gi RWX Bound default/glusterfs-claim 5m
See, the PV is now bound to “default/glusterfs-claim”. At this point the developer's Persistent Volume Claim is bound successfully, and the developer can use the claim as shown below.
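The contents of gluster-pvc.yaml from STEP 4 are not shown. A minimal sketch that would produce a claim like the one in the output, assuming the v1 PersistentVolumeClaim schema, could be:

```yaml
# Sketch only: values inferred from the note and the "oc get pvc" output.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: glusterfs-claim
spec:
  accessModes:
  - ReadWriteMany      # displayed as RWX
  resources:
    requests:
      storage: 8Gi     # matched against the capacity of available PVs
```

The claim does not name a PV. Kubernetes picks one whose capacity and access modes satisfy the request, which is why gluster-default-volume was bound.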
STEP 5: Use the persistent Volume Claim in a Pod defined by the Developer.
# cat gluster_pod/gluster_pod.yaml
- name: mygluster
- mountPath: "/home"
- name: gluster-default-volume
The above pod definition pulls the humble/gluster-client image (a private image) and starts its init script. The Gluster volume is mounted on the host machine by the GlusterFS volume plugin available in Kubernetes and then bind-mounted to the container’s /home. That is why all the Kubernetes cluster nodes must have the glusterfs-client packages.
Let's try running it.
# oc create -f gluster_pod/gluster_pod.yaml
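Only three lines of the pod definition are visible in the listing above. A hedged sketch of a full definition that is consistent with them, assuming the v1 Pod schema, is given here. The pod name, image, and command are taken from the command output later in this post; how the pieces fit together is an assumption.

```yaml
# Sketch only: assembled from the surviving lines and the command output in this post.
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mygluster
    image: humble/gluster-client
    command: ["/usr/sbin/init"]
    volumeMounts:
    - mountPath: "/home"              # where the volume appears inside the container
      name: gluster-default-volume
  volumes:
  - name: gluster-default-volume      # must match the volumeMounts name above
    persistentVolumeClaim:
      claimName: glusterfs-claim      # the claim created in STEP 4
```

The pod refers only to the claim, not to the PV or to GlusterFS, so the developer does not need to know how the storage is provided.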
pod "mypod" created
# oc get pods
NAME READY STATUS RESTARTS AGE
mypod 1/1 Running 0 1m
Wow, it's running… let's go and check where it is running.
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ec57d62e3837 humble/gluster-client "/usr/sbin/init" 4 minutes ago Up 4 minutes k8s_myfedora.dc1f7d7a_mypod_default_5d301443-ec20-11e5-9076-5254002e937b_ed2eb8e5
1439dd72fb1d openshift3/ose-pod:v220.127.116.11 "/pod" 4 minutes ago Up 4 minutes k8s_POD.e071dbf6_mypod_default_5d301443-ec20-11e5-9076-5254002e937b_4d6a7afb
Found the pod running successfully on one of the Kubernetes nodes.
On the host:
# df -h | grep gluster_vol
18.104.22.168:gluster_vol 35G 4.0G 31G 12% /var/lib/origin/openshift.local.volumes/pods/5d301443-ec20-11e5-9076-5254002e937b/volumes/kubernetes.io~glusterfs/gluster-default-volume
I can see the Gluster volume being mounted on the host \o/. Let's check inside the container. Note that the hexadecimal string in the next command is the container ID from the docker ps output.
# docker exec -it ec57d62e3837 /bin/bash
[root@mypod /]# df -h | grep gluster_vol
22.214.171.124:gluster_vol 35G 4.0G 31G 12% /home
Yippee, the GlusterFS volume has been mounted inside the container on /home, as specified in the pod definition. Let's try writing something to it.
[root@mypod /]# mkdir /home/demo
[root@mypod /]# ls /home/
Since the AccessMode is RWX I am able to write to the mount point.
That’s all Folks.