We are pleased to announce the v1.0.0 (pre) release of the GlusterFS CSI driver. The release source code can be downloaded from github.com/gluster/gluster-csi-driver/archive/1.0.0-pre.0.tar.gz. Compared to the previous beta version of the driver, this release makes the Gluster CSI driver fully compatible with CSI spec v1.0.0 and Kubernetes release 1.13 ( kubernetes.io/blog/2018/12/03/kubernetes-1-13-release-announcement/ ). The CSI driver deployment can be …
What is this all about? I would like to write a short blog article about the glusterfs endpoint and service creation that happens under the hood when you dynamically provision glusterfs volumes in a Kubernetes/OpenShift cluster.
1) The provisioner takes the request
2) It calls the Heketi volume create API
3) Heketi talks to Gluster and creates a volume
4) Upon successful volume creation, the provisioner names the endpoint/service in the form:
`glusterfs-dynamic-PVCNAME`
5) The provisioner then tries to create the endpoint/service
6) Upon successful endpoint/service creation, the PVC moves to the Bound state
Let us pay attention to step 4: the endpoint and service are created by adding the prefix “glusterfs-dynamic-” to the PVC name.
However, if you create a PVC with the latest Kubernetes version (1.13), you will see that the endpoints and services are in the format below.
If I list the endpoints and services in my system:
[terminal]
[root@Node ~]# kubectl get ep
NAME ENDPOINTS AGE
[root@Node ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
glusterfs-dynamic-8e155928-f77f-11e8-b229-005056a5c1c4 ClusterIP 172.31.221.49 <none> 1/TCP 3h
glusterfs-dynamic-a71fa382-f77d-11e8-b229-005056a5c1c4 ClusterIP 172.31.147.132 <none> 1/TCP 4h
heketi-db-storage-endpoints ClusterIP 172.31.202.20 <none> 1/TCP 5d
heketi-storage ClusterIP 172.31.183.17 <none> 8080/TCP 5d
[/terminal]
Here I would like to map the endpoint/service for claim1. How would I do that?
In short:
*) Fetch the PVC UID
*) Look for this UID in the endpoint/service output
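The two steps above can be sketched with kubectl. This is only an illustration: `claim1` is a hypothetical claim name and the commands require access to a live cluster.

```shell
# Sketch: map a PVC to its endpoint/service under the new naming scheme,
# where the suffix is the PVC's UID rather than its name.
uid=$(kubectl get pvc claim1 -o jsonpath='{.metadata.uid}')
kubectl get ep,svc | grep "glusterfs-dynamic-${uid}"
```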
1) There is an issue if you delete and recreate a PVC with the same name without any delay: due to a race condition, the second PVC creation can end up without an endpoint and service.
2) If a user wanted to create a PVC with a name 45-63 characters long, it was not possible to create the endpoint and service because of the prefix (glusterfs-dynamic-) added to it: the API server rejects the overly long endpoint and service names.
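A quick way to see why the old scheme hit this limit: Kubernetes service names are DNS labels capped at 63 characters, and the 18-character prefix leaves only 45 for the PVC name. A small sketch (the PVC name below is made up):

```shell
# The old scheme prefixed "glusterfs-dynamic-" (18 chars) to the PVC name.
prefix="glusterfs-dynamic-"
pvc_name="a-very-long-persistent-volume-claim-name-that-is-sixty-chars"
ep_name="${prefix}${pvc_name}"
echo "endpoint/service name length: ${#ep_name}"
# Anything over 63 characters is rejected by the API server.
if [ "${#ep_name}" -gt 63 ]; then
  echo "REJECTED: exceeds the 63-character limit"
fi
```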
Disadvantages:
It was very easy to fetch an endpoint/service in a namespace from the PVC name with the naming format `glusterfs-dynamic-pvcname`, but now it needs a few more steps.
On occasions, an OpenShift/Kubernetes admin or user wants to make a clone of a Persistent Volume (PV) in a cluster to satisfy some requirement of the application pod. The standard way of doing cloning is still under development in Kubernetes; however, there are some techniques that can be used to create a clone of a PV. In this blog article, I would like to discuss one method, which is nothing but taking a clone via an annotation in the Persistent Volume Claim object. Very convenient, isn’t it?
As a user, you don’t need to know what’s happening in the backend; you will be given a new PVC object which has the contents of the PVC you referenced in the annotation.
Even though the process is detailed in the demo video, I would like to mention the core step here:
Suppose you have a PVC called `claim1` and would like to take a clone of this volume. The only step you need to do here is to create a new PVC file with an annotation `"k8s.io/CloneRequest": "claim1"` in it, as shown below.
[terminal]
[root@node]# oc get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
claim1   Bound    pvc-8cc1e066-32c0-11e8-915c-5254001e667e   1Gi        RWX            glusterfile    2m
[/terminal]
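A sketch of such an annotated PVC manifest follows. The annotation key is the one discussed above; the clone’s name, size, and access mode are illustrative assumptions.

```shell
# Write a PVC manifest that requests a clone of claim1 via the
# k8s.io/CloneRequest annotation. Names and sizes are illustrative.
cat > claim1-clone.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1-clone
  annotations:
    "k8s.io/CloneRequest": "claim1"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
# Then create it in the cluster:
# oc create -f claim1-clone.yaml
```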
If your application pods use dynamically provisioned glusterfs volumes in your Kubernetes/OpenShift cluster, chances are the volumes were provisioned using the `intree` gluster provisioner. Are you confused when I say “intree”? By `intree` I mean the plugin/driver compiled and shipped with the Kubernetes/OpenShift release. While using `intree` drivers in the cluster, the admin does not need to worry about deploying them, because by default they are registered with the Kube API server and listening for persistent volume actions. However, Kubernetes sig-storage also maintains an external-storage repo which hosts other controllers and provisioners.
The external-storage repo does not just host `external` provisioners; it also hosts other controllers, like the snapshot controller. This repo is hosted under the incubator, and the beauty of it is the flexibility to update the code! If you have ever contributed a patch to Kubernetes, you know how difficult that is :) you need to take care of many things, including the release timelines of Kubernetes, etc.
We also had a requirement that needed to be satisfied quickly to integrate gluster with another important project, and getting things done on the dependent project’s timeline was a bit difficult. I will explain which feature we were chasing in my next blog; till then, let it hide. To achieve this, we started to implement an external file provisioner for gluster file. We already have an external gluster-block provisioner in this same repo; if you are already using the gluster-block provisioner, you know what an external provisioner is and how to use it.
The external provisioner for gluster file was merged into the external-storage repo a couple of months back! You can pull this provisioner and run it in Kubernetes/OpenShift as a pod! The workflow of creating and deleting persistent volumes remains the same as with the intree plugins or drivers you use today. The only difference is that you need to use the ‘external provisioner’ name in the storage class instead of the ‘intree provisioner’ name.
For example, in Kubernetes/OpenShift, to provision gluster file volumes you would have used the provisioner name `kubernetes.io/glusterfs`; in this case, however, the provisioner name is different, and you could use `gluster.org/glusterfile`. The latter, external provisioner name is configurable and you can give it any name: it is a configuration parameter which has to be set in your pod template, and the provisioner will listen for requests which match the name specified in the storage class.
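As a sketch, a StorageClass using the external provisioner name might look like this. The class name and the `resturl` value (pointing at a hypothetical heketi endpoint) are assumptions for illustration.

```shell
# StorageClass pointing at the external provisioner name instead of the
# in-tree kubernetes.io/glusterfs. resturl targets a hypothetical heketi.
cat > glusterfile-sc.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfile-external
provisioner: gluster.org/glusterfile
parameters:
  resturl: "http://heketi-storage.default.svc:8080"
EOF
# kubectl create -f glusterfile-sc.yaml
```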
The demo below shows how to provision and delete volumes using this external file provisioner. All the artifacts used in this demo are available here. Follow this doc for file provisioner deployment.
Demo:
All good: you used the external file provisioner and provisioned/deleted gluster file volumes. However, a question may come to your mind: why do we have one more provisioner for gluster file when we already have an intree plugin to provision and delete volumes? The answer will be covered in the next blog post on this same topic 🙂
I will walk you through the setup to deploy GlusterFS containers in an atomic host environment with Flannel as the overlay network.
In this implementation, we will deploy two GlusterFS containers on two different atomic hosts. The networking between the atomic hosts happens through a tunnel created by “Flannel” overlay networking. The “etcd” service (etcd is an open-source distributed key-value store that provides shared configuration and service discovery) is used as the key-value store for this setup. The flannel daemon runs on both atomic hosts; flannel contacts the etcd server and fetches the networking configuration.
Once flanneld is configured on the atomic hosts, the default docker network will fall into the same network as flannel.
The containers spawned on these atomic hosts get IP addresses from their host’s flannel subnet, which ensures that containers running on different atomic hosts can communicate.
In this configuration, we persist the GlusterFS configuration data by exporting the host filesystem into the GlusterFS containers, i.e. the directories (`/etc/glusterfs, /var/lib/glusterd, /var/log/glusterfs`) from the atomic hosts are mounted in the containers to ensure persistence of the trusted pool metadata. The container also bind mounts an atomic host filesystem mount point (for example `/mnt/brick1`), which serves as the brick for the gluster volume created in this trusted pool. Once the gluster volume is created, glusterfs clients will be able to mount the volume using the FUSE and NFS protocols.
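The bind mounts described above can be sketched as a docker run invocation. The paths are the ones from this article; the image name and container name are illustrative assumptions, and the command requires docker on the atomic host.

```shell
# Run a GlusterFS container with host directories bind-mounted so the
# trusted-pool metadata and the brick survive container restarts.
# Image and container names are illustrative.
docker run -d --privileged --net=host --name gluster1 \
  -v /etc/glusterfs:/etc/glusterfs \
  -v /var/lib/glusterd:/var/lib/glusterd \
  -v /var/log/glusterfs:/var/log/glusterfs \
  -v /mnt/brick1:/mnt/brick1 \
  gluster/gluster-centos
```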
Setup:
*) CentOS 7.1 atomic hosts, deployed inside KVM VMs.
The diagram below gives more details about this setup.
NOTE: If you already have an atomic host setup, skip ‘Section 1’ and proceed from Section 2.
First of all, “Cloud-Init” has to be configured for the atomic installation. The Cloud-Init ISO requires {user,meta}-data files.
Create a meta-data file with your desired hostname and instance-id.
[terminal]
$ vi meta-data
instance-id: atomic-test1
local-hostname: atomictest1
[/terminal]
Create a user-data file. The #cloud-config directive at the beginning of the file is mandatory; it is not a comment. If you have multiple admins and ssh keys that should have access to the default user, you can add additional ssh-rsa lines.
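A minimal user-data sketch follows; the password and the ssh key are placeholders, so adjust them to your environment.

```shell
# Write a minimal cloud-init user-data file. The leading #cloud-config
# line is the mandatory directive, not a comment.
cat > user-data <<'EOF'
#cloud-config
password: atomic
ssh_pwauth: True
chpasswd: { expire: False }
ssh_authorized_keys:
  - ssh-rsa AAAAB3Nza...placeholder admin@example
EOF
```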
After creating the user-data and meta-data files, generate an ISO file. Make sure the user running libvirt has the proper permissions to read the generated image.
[terminal]
I: -input-charset not specified, using utf-8 (detected in locale settings)
Total translation table size: 0
Total rockridge attributes bytes: 331
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
183 extents written (0 MB)
[/terminal]
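The output above comes from an ISO-generation run; a sketch of the usual command, assuming genisoimage is available (the `cidata` volume ID is what cloud-init's NoCloud datasource looks for):

```shell
# Build the cloud-init seed ISO from the user-data and meta-data files.
genisoimage -output init.iso -volid cidata -joliet -rock user-data meta-data
```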
NOTE: This example runs on a CentOS 7 atomic host; the qcow2 image can be downloaded from https://wiki.centos.org/SpecialInterestGroup/Atomic/Download/
If you are creating atomic hosts in KVM VMs, please follow the process below.
Creating with virt-manager
Here’s how to get started with Atomic on your machine using virt-manager on Linux. The instructions below are for running virt-manager on Fedora 21 or above. The steps may vary slightly when running older distributions of virt-manager.
Select File -> New Virtual Machine from the menu bar. The New VM dialog box will open.
Select the Import existing disk image option and click Forward. The next page in the New VM dialog will appear.
Click Browse. The Locate or create storage volume dialog will open.
Click Browse Local. The Locate existing storage dialog will open.
Navigate to the downloaded virtual machine file, select it, and click Open.
In the New VM dialog, select Linux for the OS type, Fedora 21 (or later) for the Version, and click Forward.
Adjust the VM’s RAM and CPU settings (if needed) and click Forward.
Select the checkbox next to Customize configuration before install and click Forward. This will allow you to add the metadata ISO device before booting the VM.
Note: When running virt-manager on Red Hat Enterprise Linux 6 or CentOS 6, the VM will not boot until the disk storage format is changed from raw to qcow2.
Adding the CD-ROM device for the metadata source:
In the virt-manager GUI, click to open your Atomic machine. Then on the top bar click `View > Details`
Click on `Add Hardware` on the bottom left corner.
Choose `Storage`, and `Select managed or other existing storage`. Browse and select the init.iso image you created. Change the Device type to CD-ROM device. Click on Finish to create and attach this storage.
Then start the atomic installation; cloud-init will come into play, and the VM will present the “atomic host” login prompt.
username: centos
password: atomic
Note: The above is based on the cloud-init configuration. If you have customized the cloud-init configuration with a different username and password, please supply those instead.
Once you log in to the server on atomic host 1:
[terminal]
[centos@atomictest1 ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
By default, etcd listens on “localhost” port 2379. Make etcd listen on all interfaces so that the other atomic hosts can reach etcd and fetch the flannel configuration data.
[centos@atomicetcd ~]$ curl -L http://localhost:2379/v2/keys/atomic01/network/config
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   222  100   222    0     0   1468      0 --:--:-- --:--:-- --:--:--  1470
{
  "action": "get",
  "node": {
    "createdIndex": 3,
    "key": "/atomic01/network/config",
    "modifiedIndex": 3,
    "value": "{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n"
  }
}
[/terminal]
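Making etcd listen on all interfaces, as mentioned above, typically means editing its client listen URL. A sketch assuming the stock sysconfig layout; the file path and variable name may differ on your distribution.

```shell
# Point etcd's client listen URL at all interfaces instead of localhost,
# then restart the service. Path and variable are assumptions for CentOS.
sudo sed -i 's|^ETCD_LISTEN_CLIENT_URLS=.*|ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"|' /etc/etcd/etcd.conf
sudo systemctl restart etcd
```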
NOTE: It is *not* required to configure flannel on the etcd server; however, in this setup we configure flannel on the etcd server so that, if needed, this server can also be used as a client for gluster volumes: we can mount the gluster volume and perform tests.
Set the flanneld network configuration in its service configuration file as shown below.
[terminal]
[centos@atomicetcd ~]$ cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://192.168.122.7:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic01/network"
# Any additional options that you want to pass
FLANNEL_OPTIONS="--iface=eth0 --ip-masq=true"
[/terminal]
If both are in same network, we are good to proceed 🙂
Now, let’s make sure the other two atomic hosts can connect and fetch the etcd configuration data from the atomicetcd server:
Make sure flanneld running on the atomictest{1,2} servers contacts the atomicetcd server (192.168.122.7) to fetch the flannel configuration.
[terminal]
[centos@atomictest2 ~]$ cat /etc/sysconfig/flanneld
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://192.168.122.7:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic01/network"
# Any additional options that you want to pass
FLANNEL_OPTIONS="--iface=eth0 --ip-masq=true"
[/terminal]
NOTE: Due to an issue with glusterd forming a trusted pool when flannel is configured as the overlay networking solution, we need a hack in the flannel configuration and in the docker configuration file for IP masquerading. Please note that, in the above FLANNEL_OPTIONS value, “eth0” should be replaced with the network interface name of your atomic server, and the “--ip-masq” option should be set to ‘true’ to overcome the above-mentioned limitation. We also have to configure the corresponding docker option.
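The docker side of the masquerading hack can be sketched as follows: let flannel, not docker, handle masquerading. The file path and the OPTIONS variable are assumptions for an atomic/CentOS host.

```shell
# Disable docker's own IP masquerading so flannel's ip-masq rules apply.
# Path and variable name are assumptions for an atomic/CentOS host.
echo 'OPTIONS="--ip-masq=false"' | sudo tee -a /etc/sysconfig/docker
sudo systemctl restart docker
```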
Reboot the ‘atomictest1’ and ‘atomictest2’ servers. Once these servers are back, both the ‘docker’ and ‘flanneld’ services should be up and running, and ‘docker0’ and ‘flannel’ should be in the same network.
[terminal]
[centos@atomictest1 ~]$ systemctl status docker flanneld
docker.service – Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-flanneld-network.conf
/usr/lib/systemd/system/docker.service.d
└─flannel.conf
Active: active (running) since Sat 2015-09-26 10:57:12
…
flanneld.service – Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled)
Active: active (running) since Sat 2015-09-26 10:57:08 ……
[centos@atomictest1 ~]$ ifconfig |grep flags -A 2
docker0: flags=4099 mtu 1500
inet 10.0.80.1 netmask 255.255.255.0 broadcast 0.0.0.0
—
eth0: flags=4163 mtu 1500
inet 192.168.122.133 netmask 255.255.255.0 broadcast 192.168.122.255
[centos@atomictest1 ~]$ curl -L http://192.168.122.7:2379/v2/keys/atomic01/network/subnets
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   908  100   908    0     0   5987      0 --:--:-- --:--:-- --:--:--  6013
{
  "action": "get",
  "node": {
    "createdIndex": 5,
    "dir": true,
    "key": "/atomic01/network/subnets",
    "modifiedIndex": 5,
    "nodes": [
      {
        "createdIndex": 5,
        "expiration": "2015-09-27T10:38:13.318286933Z",
        "key": "/atomic01/network/subnets/10.0.12.0-24",
        "modifiedIndex": 5,
        "ttl": 84248,
        "value": "{\"PublicIP\":\"192.168.122.7\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"96:1e:41:4a:aa:ce\"}}"
      },
      {
        "createdIndex": 6,
        "expiration": "2015-09-27T10:57:08.763771526Z",
        "key": "/atomic01/network/subnets/10.0.80.0-24",
        "modifiedIndex": 6,
        "ttl": 85384,
        "value": "{\"PublicIP\":\"192.168.122.133\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"de:c1:3a:e4:64:fc\"}}"
      },
      {
        "createdIndex": 7,
        "expiration": "2015-09-27T11:09:55.54906845Z",
        "key": "/atomic01/network/subnets/10.0.55.0-24",
        "modifiedIndex": 7,
        "ttl": 86150,
        "value": "{\"PublicIP\":\"192.168.122.188\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"c2:29:e0:13:d9:40\"}}"
      }
    ]
  }
}
[/terminal]
Section 4: Run RHGS containers
Once flannel is configured and up, we have to pull the RHGS image from the Red Hat internal registry via the docker pull command. In order to obtain this image, one has to: