Kudos to the Ceph CSI community for all the hard work to reach this critical milestone. This is our first official release (tracked at https://github.com/ceph/ceph-csi/issues/353) and it was released on 12-Jul-2019. This is a huge release with many improvements in CSI-based volume provisioning; it makes use of the latest Ceph release (Nautilus) and targets production use with Kubernetes clusters. One of the main highlights of this release is CephFS subvolume based volume provisioning and deletion.
Highlights of this release:
*) CephFS subvolume/manager based volume provisioning and deletion.
*) E2E test support for PVC creation, app pod mounting, etc.
*) CSI spec v1.1 support.
*) Added support for Kubernetes v1.15.
*) Configuration store change from ConfigMap to RADOS omap.
*) Mount options support for CephFS and RBD volumes.
*) Move to more granular locking instead of CPU count based locks.
*) RBD support for ReadWriteMany PVCs in block mode.
*) Unary plugin code for ‘CephFS and RBD’ drivers.
*) Driver name updated to the CSI spec standard.
*) Helm chart updates.
*) Sidecar updates to the latest available versions.
*) RBAC corrections and aggregated role addition.
*) Lock protection for create volume, delete volume, and similar operations.
*) Added support for the external snapshotter.
*) Added support for the CSIDriver CRD.
*) Support matrix table availability.
*) Many linter fixes and error code fixes.
*) Removal of dead code paths.
*) StripSecretInArgs in pkg/util.
*) Migration to klog from glog.
……….
Many other bug fixes, code improvements, and README updates are also part of this release. The container image is tagged “v1.1.0” and can be downloaded with docker pull quay.io/cephcsi/cephcsi:v1.1.0
We have also updated the support matrix to better communicate the available CSI features and their upstream status.
https://github.com/ceph/ceph-csi#Support-Matrix
We would like to thank the Rook team for unblocking the CSI project at various stages!
We are not stopping here; we are moving forward at a good pace towards our next feature-rich release, tracked at https://github.com/ceph/ceph-csi/issues/393. If you would like to see some features or bug fixes in the next release, please help us by mentioning them in that release tracker.
We are also kickstarting an upstream bug triage call from next week, so please be part of it. More details about this call are available at https://github.com/ceph/ceph-csi/issues/463
Happy Hacking!
PS/NOTE: This release needs the latest Ceph Nautilus cluster to support CephFS subvolume provisioning. This version of the cluster is available if you deploy CSI with Rook master.
I have recently published a blog on how to deploy Ceph Cluster in a kube setup. If you don’t have this cluster up and running please refer this article. For this attempt we need below components/software deployed successfully in a setup. Kubernetes Ceph cluster Ceph CSI driver The first two deployments ( Kubernetes cluster and …
[Updated on 20-Jun-2020: Many changes landed in Rook Ceph in previous releases, so I am revisiting this blog article to accommodate the changes, based on a ping in Slack 🙂 ] In this article we will talk about how to deploy a Ceph (software-defined storage) cluster using a Kubernetes operator called ‘rook’. Before we get into …
We are pleased to announce v1.0.0 (pre) release of GlusterFS CSI driver. The release source code can be downloaded from github.com/gluster/gluster-csi-driver/archive/1.0.0-pre.0.tar.gz. Compared to the previous beta version of the driver, this release makes Gluster CSI driver fully compatible with CSI spec v1.0.0 and Kubernetes release 1.13 ( kubernetes.io/blog/2018/12/03/kubernetes-1-13-release-announcement/ ) The CSI driver deployment can be …
In the past I have advised many users on various channels (mail, Slack, etc.) on how to accomplish PVC migration, or how to reattach an existing PV to a new PVC. The use cases include attaching a new PVC to an older/existing PV, or migrating a PVC from one namespace to another. This hack/workaround has always remained outside any support contract; it helped folks who wanted to achieve the end result in some manner, so keep that in mind before you attempt it.
The PVC-to-PV binding is always a 1:1 mapping, and at times users want to attach an existing PV to a new PVC, which could be in another namespace.
Let’s start. As you know, a PV bound to a PVC has a reclaim policy, which defaults to “Delete” for dynamically provisioned volumes. If the PV that you want to attach to a new PVC has the “Delete” policy, you need to edit the PV spec and set persistentVolumeReclaimPolicy to “Retain”:
"persistentVolumeReclaimPolicy": "Retain",
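Instead of editing the PV by hand, the same change can be expressed as a patch. The following is a minimal Python sketch, not part of the original workflow: it only builds the patch body and the `kubectl patch pv` argument list and executes nothing. The PV name "my-old-pv" and both helper function names are hypothetical.

```python
import json


def reclaim_policy_patch(policy="Retain"):
    """Build a merge-patch body that sets the reclaim policy on a PV."""
    return {"spec": {"persistentVolumeReclaimPolicy": policy}}


def kubectl_patch_pv_args(pv_name, patch):
    """Return the argument list for `kubectl patch pv`; nothing is executed."""
    return ["kubectl", "patch", "pv", pv_name, "-p", json.dumps(patch)]


# "my-old-pv" is a hypothetical PV name used only for illustration.
args = kubectl_patch_pv_args("my-old-pv", reclaim_policy_patch())
print(" ".join(args))
```

When running the printed command in a shell, the JSON argument needs to be quoted.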
Before you begin this process, let’s back up the existing PVC as YAML/JSON:
oc get pvc -o yaml > backup_pvc.yaml
As with any hack on storage, I recommend backing up the data on the volume that is mapped to the PV. Data is always critical! Depending on how critical yours is, back it up from the storage backend. The storage backend could be anything; in my case it is GlusterFS.
Once the data is backed up, let’s delete the original PVC.
kubectl delete pvc pvcname
When you delete the PVC, the PV state changes along with it and should soon transition to the “Released” state. Wait for the PV status to show “Released”; once it does, edit the PV and delete the claimRef field, which refers to the now-deleted PVC, from the PV spec/definition.
Note down the PV name; we need it for the new PVC. Once you have it, create a new PVC in the desired namespace whose volumeName field is set to the old PV name.
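The last two steps can be sketched in Python as plain manifest builders. This is an illustration under assumptions, not the exact objects from my cluster: the PVC name, namespace, PV name, size, access mode, and storage class below are all hypothetical placeholders, and the function names are made up for this sketch. The claimRef patch relies on JSON merge-patch semantics, where a null value removes a field.

```python
import json


def claim_ref_removal_patch():
    """Merge-patch body that drops claimRef from a Released PV (null deletes the field)."""
    return {"spec": {"claimRef": None}}


def new_pvc_for_existing_pv(name, namespace, pv_name, size, access_modes, storage_class):
    """Build a PVC manifest that binds to an existing PV through spec.volumeName."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": access_modes,
            "resources": {"requests": {"storage": size}},
            # Should match the storage class recorded on the PV.
            "storageClassName": storage_class,
            # The name of the old PV that was noted down earlier.
            "volumeName": pv_name,
        },
    }


# All values below are hypothetical placeholders.
pvc = new_pvc_for_existing_pv(
    "claim1-new", "target-namespace", "my-old-pv",
    "1Gi", ["ReadWriteMany"], "glusterfile",
)
print(json.dumps(claim_ref_removal_patch()))
print(json.dumps(pvc, indent=2))
```

The requested size should not exceed the capacity of the old PV, otherwise the claim cannot bind to it.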
You are here, so there are two possibilities: either you already know about the terms below or you want to know more about them. In any case, I have to touch upon these terms before we proceed further: Custom Resource Definitions (CRD), Custom Resources (CR), Operators/Controllers, Operator SDK. Custom Resource Definitions/CRDs: In the …
What is this all about? I would like to write a short blog article about the glusterfs endpoint and service creation that happens under the hood when you dynamically provision glusterfs volumes in a Kubernetes/OpenShift cluster.
1) The provisioner takes the request.
2) It calls the Heketi volume create API.
3) Heketi talks to Gluster and creates a volume.
4) Upon successful completion of volume creation, the provisioner names the endpoint/service in the form glusterfs-dynamic-PVCNAME.
5) The provisioner then tries to create the endpoint/service.
6) Upon successful endpoint/service creation, the PVC moves to the BOUND state.
Let us pay attention to step 4: here, the endpoint and service name is created by adding the prefix “glusterfs-dynamic” to the PVC name.
However, if you create a PVC with the latest Kubernetes version (1.13), you will see that the endpoint and service names are in the format below:
The “claim1” PVC got the UID “a71fa382-f77d-11e8-b229-005056a5c1c4”, and the endpoint/service “glusterfs-dynamic-a71fa382-f77d-11e8-b229-005056a5c1c4” belongs to this claim!
Let us wrap it up. Why this change?
This change was introduced mainly because of:
1) There was an issue if you deleted and recreated a PVC with the same name without any delay. Due to a race, the second PVC could end up without an endpoint and service.
2) If a user wanted to create a PVC with a name in the range of 45-63 characters, it was not possible to create the endpoint and service because of the prefix (glusterfs-dynamic) added to it. The API server rejects the overly long endpoint and service name.
Disadvantages:
With the glusterfs-dynamic-pvcname format it was very easy to find the endpoint/service in a namespace from the PVC name; now it needs a few more steps, because you first have to look up the PVC UID.
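The length problem and the new naming can be checked with a few lines of Python. This is a sketch under one assumption: that the name limit enforced by the API server for these objects is 63 characters, which is what the 45-63 range above implies (the prefix plus its trailing hyphen is 18 characters). The function names are made up for this illustration.

```python
PREFIX = "glusterfs-dynamic-"
MAX_NAME_LEN = 63  # assumed limit, implied by the 45-63 character range above


def old_style_name(pvc_name):
    """Endpoint/service name before the change: prefix + PVC name."""
    return PREFIX + pvc_name


def new_style_name(pvc_uid):
    """Endpoint/service name after the change: prefix + PVC UID."""
    return PREFIX + pvc_uid


def fits(name):
    """True if the name is within the assumed length limit."""
    return len(name) <= MAX_NAME_LEN


uid = "a71fa382-f77d-11e8-b229-005056a5c1c4"  # UID of claim1 from the text
long_pvc_name = "x" * 50                      # hypothetical 50-character PVC name

print(len(PREFIX))
print(fits(old_style_name(long_pvc_name)))
print(new_style_name(uid), fits(new_style_name(uid)))
```

Because a UID always has the same length, the UID-based name has a fixed length regardless of how long the PVC name is.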
On occasion, an OpenShift/Kubernetes admin or user wants to clone a Persistent Volume (PV) in a cluster to satisfy some requirement of an application pod. The standard way of cloning is still under development in Kubernetes; however, there are techniques that can be used to create a clone of a PV. In this blog article, I would like to discuss one method: taking a clone via an annotation in the Persistent Volume Claim object. Very convenient, isn’t it?
As a user you don’t need to know what is happening in the backend; you are given a new PVC object that has the contents of the PVC you referenced in the annotation.
Even though the process is detailed in the demo video, I would like to mention the core step here:
Suppose you have a PVC called claim1 and would like to clone this volume. The only step you need is to create a new PVC file with the annotation "k8s.io/CloneRequest": "claim1" in it, as shown below.
[root@node]# oc get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
claim1   Bound    pvc-8cc1e066-32c0-11e8-915c-5254001e667e   1Gi        RWX            glusterfile    2m
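As a rough illustration of what such a clone request could look like, here is a Python sketch that builds the new PVC manifest. Only the annotation key, the source claim name, and the size, access mode, and storage class of claim1 come from the text above; the clone name "claim1-clone" and the function name are hypothetical, and the exact file used in the demo may differ.

```python
import json

CLONE_ANNOTATION = "k8s.io/CloneRequest"


def clone_pvc(source_claim, clone_name, size, access_modes, storage_class):
    """Build a PVC manifest that asks for a clone of `source_claim` via annotation."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": clone_name,
            "annotations": {CLONE_ANNOTATION: source_claim},
        },
        "spec": {
            "accessModes": access_modes,
            "resources": {"requests": {"storage": size}},
            "storageClassName": storage_class,
        },
    }


# "claim1-clone" is a hypothetical name; the other values mirror claim1 above.
manifest = clone_pvc("claim1", "claim1-clone", "1Gi", ["ReadWriteMany"], "glusterfile")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and created with oc create -f, since the API accepts JSON manifests as well as YAML.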
If your application pods use dynamically provisioned glusterfs volumes in your Kubernetes/OpenShift cluster, the volumes are probably provisioned using the in-tree gluster provisioner. Confused by “in-tree”? By in-tree I mean the plugin/driver compiled and shipped with the Kubernetes/OpenShift release. When using in-tree drivers, the admin does not have to worry about deploying them, because by default they are running or registered with the Kube API server and listening for persistent volume actions. However, Kubernetes sig-storage also maintains an external-storage repo, which hosts other controllers and provisioners.
The external storage repo does not just host external provisioners; it also hosts other controllers like the snapshot controller. This repo is hosted under the incubator, and its beauty is the flexibility to update the code! If you have ever contributed a patch to Kubernetes you know how difficult it is :); you need to take care of many things, including the release timelines of Kubernetes.
We also had a requirement that had to be satisfied quickly for integrating gluster with another important project, and getting things done on the dependent project’s timeline was a bit difficult. I will explain which feature we were chasing in my next blog; till then let it hide. To achieve this we started to implement an external file provisioner for gluster file. We already have an external gluster block provisioner in this same repo; if you are already using the gluster block provisioner, you know what an external provisioner is and how to use it.
The external provisioner for gluster file was merged into the external storage repo a couple of months back! You can pull this provisioner and run it in Kubernetes/OpenShift as a pod. The workflow for creating and deleting persistent volumes remains the same as with the in-tree plugins or drivers you use today. The only difference is that you need to use the external provisioner name in the storage class instead of the in-tree provisioner name.
For example, to provision gluster file volumes in Kubernetes/OpenShift you would have used the provisioner name kubernetes.io/glusterfs; in this case, the provisioner name is different and you could use gluster.org/glusterfile. The latter, the external provisioner name, is configurable and you can give it any name. It is a configuration parameter that has to be set in your pod template; the provisioner listens for requests that match the name specified in the storage class.
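To make the difference concrete, here is a small Python sketch that builds two StorageClass manifests which differ only in the provisioner field. The storage class names and the function name are hypothetical, and the parameters are deliberately left empty because the real values depend on your deployment.

```python
import json

IN_TREE_PROVISIONER = "kubernetes.io/glusterfs"
EXTERNAL_PROVISIONER = "gluster.org/glusterfile"  # configurable, as noted above


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest for the given provisioner name."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Deployment-specific parameters are intentionally not guessed here.
        "parameters": parameters or {},
    }


# Both storage class names are hypothetical.
in_tree = storage_class("gluster-intree", IN_TREE_PROVISIONER)
external = storage_class("gluster-external", EXTERNAL_PROVISIONER)
print(json.dumps(external, indent=2))
```

A PVC that references the second storage class is picked up by the external provisioner pod, as long as the pod is configured with the same provisioner name.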
The demo below shows how to provision and delete volumes using this external file provisioner. All the artifacts used in this demo are available here. Follow this doc for file provisioner deployment.
Demo:
All good: you used the external file provisioner and provisioned/deleted gluster file volumes. However, a question may come to mind: why do we have one more provisioner for gluster file when we already have in-tree plugins to provision and delete volumes? The answer will be covered in the next blog post on this same topic 🙂
I will walk you through the setup to deploy GlusterFS containers in an atomic host environment with Flannel as overlay networking.
In this implementation, we will deploy two GlusterFS containers, one on each of two atomic hosts. Networking between the atomic hosts happens through the tunnel created by “Flannel” overlay networking. The “etcd” service (etcd is an open-source distributed key-value store that provides shared configuration and service discovery) is used as the key-value store for this setup. The flannel daemon runs on both atomic hosts; it contacts the etcd server and fetches the networking configuration.
Once flanneld is configured on the atomic hosts, the default docker network falls into the same network as flannel.
The containers spawned on these atomic hosts get IP addresses from their host’s flannel subnet. This ensures that containers running on different atomic hosts can communicate with each other.
In this configuration, we persist the GlusterFS configuration data by exporting the host filesystem to the GlusterFS containers, i.e. the directories (e.g. /etc/glusterfs, /var/lib/glusterd, /var/log/glusterfs) from the atomic hosts are mounted in the containers to ensure persistence of the trusted pool metadata. The containers also bind mount an atomic host filesystem mount point (e.g. /mnt/brick1), which serves as the brick for the gluster volume created in this trusted pool. Once the gluster volume is created, glusterfs clients can mount the volume using the FUSE and NFS protocols.
Setup:
*) CentOS 7.1 atomic hosts, deployed inside KVM VMs.
Below diagram gives more details about this setup.
NOTE: If you already have an atomic host setup, skip ‘Section 1’ and proceed from Section 2.
First of all, “Cloud-Init” has to be configured for the atomic installation. The Cloud-Init ISO requires {user,meta}-data files.
Create a meta-data file with your desired hostname and instance-id.
$ vi meta-data
instance-id: atomic-test1
local-hostname: atomictest1
Create a user-data file. The #cloud-config directive at the beginning of the file is mandatory; it is not a comment. If you have multiple admins whose ssh keys should be able to access the default user, add a new ssh-rsa line for each key.
After creating the user-data and meta-data files, generate an ISO file. Make sure the user running libvirt has the proper permissions to read the generated image.
I: -input-charset not specified, using utf-8 (detected in locale settings)
Total translation table size: 0
Total rockridge attributes bytes: 331
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
183 extents written (0 MB)
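The two input files can also be generated with a short script. The Python sketch below is an assumption-laden illustration: the meta-data values, the password, and the init.iso name come from this article, while the user-data keys (password, ssh_pwauth, chpasswd, ssh_authorized_keys), the placeholder ssh key, the function name, and the genisoimage argument list are my assumptions about a typical cloud-init NoCloud setup, not a copy of the exact files and command used here. The ISO command is only printed, not executed.

```python
import os
import tempfile


def write_cloud_init_files(directory, instance_id, hostname, password, ssh_keys):
    """Write meta-data and user-data into `directory` and return their paths."""
    meta_path = os.path.join(directory, "meta-data")
    user_path = os.path.join(directory, "user-data")

    with open(meta_path, "w") as f:
        f.write(f"instance-id: {instance_id}\n")
        f.write(f"local-hostname: {hostname}\n")

    lines = [
        "#cloud-config",  # mandatory first line, not a comment
        f"password: {password}",
        "ssh_pwauth: True",
        "chpasswd: { expire: False }",
    ]
    if ssh_keys:
        lines.append("ssh_authorized_keys:")
        lines.extend(f"  - {key}" for key in ssh_keys)
    with open(user_path, "w") as f:
        f.write("\n".join(lines) + "\n")

    return meta_path, user_path


workdir = tempfile.mkdtemp()
meta_path, user_path = write_cloud_init_files(
    workdir, "atomic-test1", "atomictest1", "atomic",
    ["ssh-rsa AAAA...placeholder user@example"],  # placeholder key
)

# Assumed ISO generation command, to be run inside `workdir`; printed only.
iso_command = ["genisoimage", "-output", "init.iso", "-volid", "cidata",
               "-joliet", "-rock", "user-data", "meta-data"]
print(" ".join(iso_command))
```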
NOTE: This example runs on a CentOS 7 atomic host; the qcow2 image can be downloaded from https://wiki.centos.org/SpecialInterestGroup/Atomic/Download/
If you are creating atomic hosts in KVM VMs, please follow the process below.
Creating with virt-manager
Here’s how to get started with Atomic on your machine using virt-manager on Linux. The instructions below are for running virt-manager on Fedora 21 or above. The steps may vary slightly when running older distributions of virt-manager.
Select File -> New Virtual Machine from the menu bar. The New VM dialog box will open.
Select the Import existing disk image option and click Forward. The next page in the New VM dialog will appear.
Click Browse. The Locate or create storage volume dialog will open.
Click Browse Local. The Locate existing storage dialog will open.
Navigate to the downloaded virtual machine file, select it, and click Open.
In the New VM dialog, select Linux for the OS type, Fedora 21 (or later) for the Version, and click Forward.
Adjust the VM’s RAM and CPU settings (if needed) and click Forward.
Select the checkbox next to Customize configuration before install and click Forward. This will allow you to add the metadata ISO device before booting the VM.
Note: When running virt-manager on Red Hat Enterprise Linux 6 or CentOS 6, the VM will not boot until the disk storage format is changed from raw to qcow2.
Adding the CD-ROM device for the metadata source:
In the virt-manager GUI, click to open your Atomic machine. Then on the top bar click View > Details
Click on Add Hardware on the bottom left corner.
Choose Storage, and Select managed or other existing storage. Browse and select the init.iso image you created. Change the Device type to CD-ROM device. Click on Finish to create and attach this storage.
Then start the atomic installation. Cloud-init will come into play and the VM will prompt for the “atomic host” login.
username: centos
password: atomic
Note: The above is based on the cloud-init configuration. If you have customized the cloud-init configuration with a different username and password, please supply those instead.
Once you log in to the server (atomic host 1):
[centos@atomictest1 ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
By default etcd listens on “localhost” port 2379. Make etcd listen on all interfaces, so that the other atomic hosts can reach etcd and fetch the flannel configuration data.
cu
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 222 100 222 0 0 1468 0 --:--:-- --:--:-- --:--:-- 1470
{
    "action": "get",
    "node": {
        "createdIndex": 3,
        "key": "/atomic01/network/config",
        "modifiedIndex": 3,
        "value": "{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n"
    }
}
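The "value" field in this response is itself a JSON document stored as a string, so it has to be decoded a second time. The Python sketch below does that for the response shown above and works out how many per-host subnets the configuration allows; the variable names are mine, the data is copied from the output.

```python
import ipaddress
import json

# The "value" string from the etcd response above (a JSON document inside a string).
value = '{\n"Network": "10.0.0.0/16",\n"SubnetLen": 24,\n"Backend": {\n"Type": "vxlan",\n"VNI": 1\n }\n}\n\n'
response = {
    "action": "get",
    "node": {"key": "/atomic01/network/config", "value": value},
}

# Second decoding step: parse the embedded flannel configuration.
config = json.loads(response["node"]["value"])
network = ipaddress.ip_network(config["Network"])

# Each host leases one /SubnetLen network out of the overlay network.
subnet_count = 2 ** (config["SubnetLen"] - network.prefixlen)

print(network, config["Backend"]["Type"], subnet_count)
```

With a /16 overlay and /24 per-host subnets there is room for 256 hosts, far more than the three servers used in this setup.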
NOTE: It is *not* required to configure flannel on the ETCD server. However, in this setup we configure flannel on the ETCD server so that, if needed, this server can be used as a client for gluster volumes: we can mount the gluster volume and perform tests.
Set the flanneld network configuration service file as shown below.
[centos@atomicetcd ~]$ cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://192.168.122.7:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic01/network"
# Any additional options that you want to pass
FLANNEL_OPTIONS="--iface=eth0 -ip-masq=true"
If both are in same network, we are good to proceed 🙂
Now, let’s make sure the other two atomic hosts can connect and fetch the ETCD configuration data from the atomicetcd server:
Make sure flanneld running on the “atomictest{1,2}” servers contacts the “atomicetcd” server (192.168.122.7) to fetch the flannel configuration.
[centos@atomictest2 ~]$ cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://192.168.122.7:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic01/network"
# Any additional options that you want to pass
FLANNEL_OPTIONS="--iface=eth0 -ip-masq=true"
NOTE: Because glusterd has an issue forming a trusted pool when flannel is configured as the overlay networking solution, we need a hack in the flannel configuration and the docker configuration file for IP masquerading. In the FLANNEL_OPTIONS value above, replace “eth0” with the network interface name of your atomic server, and set the “-ip-masq” option to ‘true’ to overcome this limitation. We also have to configure a docker option as shown below.
Reboot the ‘atomictest1’ and ‘atomictest2’ servers. Once they are back, both the ‘docker’ and ‘flanneld’ services should be up and running, and you should see that ‘docker0’ and ‘flannel’ are in the same network.
[centos@atomictest1 ~]$ systemctl status docker flanneld
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-flanneld-network.conf
/usr/lib/systemd/system/docker.service.d
└─flannel.conf
Active: active (running) since Sat 2015-09-26 10:57:12
…
flanneld.service - Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled)
Active: active (running) since Sat 2015-09-26 10:57:08 ……
[centos@atomictest1 ~]$ ifconfig |grep flags -A 2
docker0: flags=4099 mtu 1500
inet 10.0.80.1 netmask 255.255.255.0 broadcast 0.0.0.0
--
eth0: flags=4163 mtu 1500
inet 192.168.122.133 netmask 255.255.255.0 broadcast 192.168.122.255
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 908 100 908 0 0 5987 0 --:--:-- --:--:-- --:--:-- 6013
{
    "action": "get",
    "node": {
        "createdIndex": 5,
        "dir": true,
        "key": "/atomic01/network/subnets",
        "modifiedIndex": 5,
        "nodes": [
            {
                "createdIndex": 5,
                "expiration": "2015-09-27T10:38:13.318286933Z",
                "key": "/atomic01/network/subnets/10.0.12.0-24",
                "modifiedIndex": 5,
                "ttl": 84248,
                "value": "{\"PublicIP\":\"192.168.122.7\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"96:1e:41:4a:aa:ce\"}}"
            },
            {
                "createdIndex": 6,
                "expiration": "2015-09-27T10:57:08.763771526Z",
                "key": "/atomic01/network/subnets/10.0.80.0-24",
                "modifiedIndex": 6,
                "ttl": 85384,
                "value": "{\"PublicIP\":\"192.168.122.133\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"de:c1:3a:e4:64:fc\"}}"
            },
            {
                "createdIndex": 7,
                "expiration": "2015-09-27T11:09:55.54906845Z",
                "key": "/atomic01/network/subnets/10.0.55.0-24",
                "modifiedIndex": 7,
                "ttl": 86150,
                "value": "{\"PublicIP\":\"192.168.122.188\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"c2:29:e0:13:d9:40\"}}"
            }
        ]
    }
}
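Each lease key in this listing encodes a per-host subnet, with a hyphen in place of the slash. A quick way to confirm that docker0 on atomictest1 (10.0.80.1 in the ifconfig output) falls into the subnet leased to that host is sketched below in Python. The lease keys and PublicIP values are copied from the listing and the overlay network from the flannel configuration; the function names are made up for this illustration.

```python
import ipaddress

# Lease keys and PublicIP values copied from the etcd listing above.
LEASES = {
    "/atomic01/network/subnets/10.0.12.0-24": "192.168.122.7",
    "/atomic01/network/subnets/10.0.80.0-24": "192.168.122.133",
    "/atomic01/network/subnets/10.0.55.0-24": "192.168.122.188",
}
OVERLAY = ipaddress.ip_network("10.0.0.0/16")  # "Network" from the flannel config


def lease_network(key):
    """Turn a lease key ending in '10.0.80.0-24' into the network 10.0.80.0/24."""
    address, prefix = key.rsplit("/", 1)[-1].split("-")
    return ipaddress.ip_network(f"{address}/{prefix}")


def host_owning(ip, leases):
    """Return the PublicIP of the host whose leased subnet contains `ip`, or None."""
    address = ipaddress.ip_address(ip)
    for key, public_ip in leases.items():
        if address in lease_network(key):
            return public_ip
    return None


docker0_ip = "10.0.80.1"  # docker0 address from the ifconfig output on atomictest1
print(host_owning(docker0_ip, LEASES))
```

The result matches the eth0 address of atomictest1 (192.168.122.133), which is what we expect when docker0 has picked up its flannel subnet.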
Section 4: Run RHGS containers
Once flannel is configured and it’s up, we have to pull the RHGS image from Red Hat internal registry via docker pull command as shown below. In order to obtain this image, one has to: