Ceph CSI v3.0.0 released (Snapshot, Clone, Multi arch, ROX…)

We are excited to announce the third major release of ceph-csi, v3.0.0!
With v3.0.0 the Ceph-CSI team has reached its next milestone [1]. The release is not limited to features: many critical bug fixes and documentation updates are part of it too, and it brings many improvements to the Ceph and CSI integration for production use with Kubernetes/OpenShift clusters. Of the many features and bug fixes, here are just a few highlights.


New Features:

Create/Delete snapshot for RBD
Create PVC from RBD snapshot
Create PVC from RBD PVC
Add support for multiple CephFS subvolume groups
Multi Architecture docker images(amd64 and arm64)
Support ROX(ReadOnlyMany) PVC for RBD
Support ROX(ReadOnlyMany) PVC for CephFS

Enhancement:

Move to go-ceph binding from RBD CLI
Move to go-ceph binding from RADOS CLI
Add Upgrade E2E testing from 2.1.2 to 3.0.0
Update Sidecars to the latest version
Improve locking to create a parallel clone and snapshot restore
Simplify Error Handling
Update golangci-lint version in CI
Update gosec version in CI
Add support to track cephfs PVC and subvolumes
Introduce build.env for configuration of the environment variables
Update go-ceph to v0.4.0
Update E2E testing to test with latest kubernetes versions
Split out CephFS and RBD E2E tests
Integration with Centos CI to run containerized builds
Update Rook to 1.2.7 for E2E testing
Disable reflink when creating xfs filesystem for RBD
Replace klog with klog v2
Reduce RBAC for kubernetes sidecar containers
Add option to compile e2e tests in containerized
Add commitlint bot in CI
Add Stale bot to the repo
Add E2E and documentation for CephFS PVC
Update kubernetes dependency to v1.18.6

Bug Fix:

Fix issue in CephFS Volume Not found

Breaking Changes

Remove support for v1.x.x PVC
Remove support for Mimic
Snapshot Alpha is no longer supported

Let's touch upon some of the coolest features introduced in this release:

“Snapshot and Clone functionality made available with RBD”:

Ceph CSI has had RBD snapshots since the v1.0.0 release, but we marked the feature as “Alpha” for a few reasons. One was that snapshot support in the upstream Kubernetes CSI components was itself still evolving and stabilizing on the API side. Another was an image lockup issue that we hit when trying to consume the snapshot or untangle the parent source volume from the snapshot volume/object. We revamped the implementation in this version, and we are happy to say that the issue is solved with this release. From now on, RBD snapshots should work smoothly.

It is capable of:

*) Creating a snapshot from an RBD volume
*) Restoring to a new volume from an existing snapshot
*) Deleting snapshot and parent volume objects independently.

We also enabled a cool piece of functionality here with the Volume Cloning feature. It is the capability of provisioning a new volume from another, existing PVC.

Don't confuse this with “Restore of a snapshot”.

The main difference is that when you restore from a snapshot, the DataSource referred to by the new PVC is a “Snapshot”, whereas for a clone operation the DataSource is an existing PVC.
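To make the DataSource difference concrete, here is a minimal Python sketch that builds both kinds of PVC manifest as plain dictionaries and prints them as JSON (which kubectl also accepts). The object names (rbd-pvc, rbd-pvc-snapshot, csi-rbd-sc) and the helper function are hypothetical placeholders, not names taken from the release.

```python
import json


def pvc_from_source(name, size, storage_class, source_kind, source_name):
    """Build a PVC manifest whose dataSource is a snapshot or another PVC."""
    data_source = {"kind": source_kind, "name": source_name}
    if source_kind == "VolumeSnapshot":
        # Snapshots live in their own API group; PVCs are in the core group.
        data_source["apiGroup"] = "snapshot.storage.k8s.io"
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": data_source,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Restore: the data source is a snapshot object (hypothetical name).
restore = pvc_from_source(
    "rbd-pvc-restore", "1Gi", "csi-rbd-sc", "VolumeSnapshot", "rbd-pvc-snapshot"
)
# Clone: the data source is an existing PVC (hypothetical name).
clone = pvc_from_source(
    "rbd-pvc-clone", "1Gi", "csi-rbd-sc", "PersistentVolumeClaim", "rbd-pvc"
)

print(json.dumps(restore, indent=2))
print(json.dumps(clone, indent=2))
```

Apart from the dataSource stanza the two manifests are identical, which is why the two operations are easy to mix up.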


Multiple subvolumegroup support

When the CephFS CSI driver creates a volume, it defaults to a subvolumegroup called “csi” and places its subvolumes in this group. These subvolumes are the backend CephFS volumes that map to the PVCs in an OpenShift or Kubernetes namespace. With this release, you are allowed to specify multiple subvolume groups! This comes in handy in setups where you want to segregate the subvolumes for different purposes.
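As a rough illustration only, the sketch below builds a cluster configuration list in which each cluster entry names its own subvolume group. The key layout (clusterID, monitors, cephFS.subvolumeGroup) is my recollection of the ceph-csi sample config map and should be treated as an assumption; the cluster IDs, monitor addresses, and group names are made up.

```python
import json


def csi_cluster_entry(cluster_id, monitors, subvolume_group=None):
    """Build one entry of the (assumed) ceph-csi cluster config list."""
    entry = {"clusterID": cluster_id, "monitors": list(monitors)}
    if subvolume_group is not None:
        # Assumed key layout; when omitted, the default group "csi" applies.
        entry["cephFS"] = {"subvolumeGroup": subvolume_group}
    return entry


config = [
    csi_cluster_entry("cluster-a", ["10.0.0.1:6789"], "team-a"),
    csi_cluster_entry("cluster-b", ["10.0.0.2:6789"], "team-b"),
    csi_cluster_entry("cluster-c", ["10.0.0.3:6789"]),  # keeps the default
]

print(json.dumps(config, indent=2))
```

Check the deployment documentation of the release you run for the exact keys before relying on this shape.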

ROX support for both RBD and CephFS

ROX is an access mode which you can define while requesting a PVC from Kubernetes/OpenShift. To an end user it means that the workload gets a read-only share. In a plain dynamic provisioning workflow this does not make much sense: you request a volume of size “X” and the backend driver provisions it from the storage cluster, but the volume is empty, and if you attach it to a workload as read-only you can never write data to it! So the question was, “What is the use of ROX volumes?”

With snapshots and clones, however, there is a big use case behind it. Think about a scenario where you have a “VM template” which you want to consume as a read-only image. For such use cases ROX support adds much value, and here we are with that support!
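The “VM template” scenario could look roughly like the following Python sketch: a ReadOnlyMany PVC populated from a snapshot, plus a pod that mounts it read-only. All names (vm-template-snap, csi-rbd-sc, the busybox image, the helper functions) are hypothetical and only serve the illustration.

```python
import json


def rox_pvc_from_snapshot(name, snapshot_name, size, storage_class):
    """PVC that requests ReadOnlyMany access to data restored from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadOnlyMany"],
            "dataSource": {
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "resources": {"requests": {"storage": size}},
        },
    }


def read_only_pod(pod_name, claim_name, image="busybox"):
    """Pod that mounts the claim read-only at /data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": "reader",
                "image": image,
                "command": ["sleep", "3600"],
                "volumeMounts": [
                    {"name": "template", "mountPath": "/data", "readOnly": True}
                ],
            }],
            "volumes": [{
                "name": "template",
                "persistentVolumeClaim": {"claimName": claim_name, "readOnly": True},
            }],
        },
    }


pvc = rox_pvc_from_snapshot("vm-template-rox", "vm-template-snap", "10Gi", "csi-rbd-sc")
pods = [read_only_pod(f"reader-{i}", "vm-template-rox") for i in range(2)]
print(json.dumps([pvc] + pods, indent=2))
```

Because the volume already carries the template data from the snapshot, several pods can share it without any of them being able to modify it.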

Multi arch image support for Ceph CSI

Our community users want to use multi arch images for amd64, arm64, etc. These images were in fact available for the last few versions, but the manifest files were not carrying them properly. This is corrected in this release, and the images are available on quay.io.

Updated sidecars & Kubernetes dependency lift to 1.18.6

We have upgraded the CSI community provided sidecars to their latest versions and also brought the Kubernetes dependency chain to version 1.18.6. This by itself brings many bug fixes, improvements, and features. I don't want to list them here as the list is really huge!

Performance improvements – Especially on go ceph bindings

We kept improving the performance of the CSI driver, and especially its connection with the backend cluster, by making use of the latest go-ceph version (v0.4.0). We have seen great improvement in the backend connection workflow and in the overall life cycle of volume management, so it is worth a mention here!

Code Cleanup, Better E2E, what not?

A great amount of code cleanup, E2E improvements, documentation updates, etc. are part of this release!

…so on

The container image is tagged “v3.0.0” and it can be downloaded with #docker pull ..

Kudos to the Ceph CSI community for all the hard work to reach this critical milestone!
The Ceph-CSI project (https://github.com/ceph/ceph-csi/), as well as its thriving community, has continued to grow, and we are happy to share that this is our 11th release since Jul 12, 2019!

We are not stopping here; we are marching towards v3.1.0 (https://github.com/ceph/ceph-csi/issues/1272) with more feature enhancements, as tracked in the release issue.
One very important feature we are targeting for the v3.1.0 release is CephFS snapshot and clone functionality. Watch this space for more updates!

Reach us at slack (https://cephcsi.slack.com) or at github: https://github.com/ceph/ceph-csi/

Happy Hacking!

[1]
Release Issue: https://github.com/ceph/ceph-csi/issues/865
ceph-csi v3.0.0 tag: https://github.com/ceph/ceph-csi/releases/tag/v3.0.0
Release Images: https://quay.io/repository/cephcsi/cephcsi?tab=tags

Ceph-CSI v2.0.0 released with multi arch support, encryption, expansion..etc!

We are excited to announce the second major release of ceph-CSI, v2.0.0 !!

In the last few months, the ceph-csi community has been working tirelessly to improve the project with many new features, bug fixes, usability improvements, etc. Some of the important features of this release include on-demand resizing of RBD and CephFS volumes, encryption with LUKS support for RBD PVCs, multi arch support (ceph-csi Arm64 image), compatibility with kube 1.17, and upgraded sidecar containers. This release is not limited to features: many critical bug fixes and documentation updates are also part of it [1][2].

[1]https://github.com/ceph/ceph-csi/issues/557
[2]https://github.com/ceph/ceph-csi/releases/tag/v2.0.0

An excerpt from the changelog:


Added dynamic resize support for CephFS PVCs
Added dynamic resize support for RBD PVCs
Added encryption with LUKS support for RBD PVCs
Multi arch support (ceph-csi Arm64 image)
Upgrade documentation from v1.2.2 to v2.0.0
Updated code base to kube v1.17
leader election enabled in deployment
Added Version flag to cephcsi
Removed Kubernetes 1.13.x support with 2.0.0 release
CSI: run all containers as privileged in daemonset pods
Upgrade: csi-attacher sidecar from v1.2.0 to v2.1.0
Upgrade: csi-snapshotter sidecar from v1.2.1 to v1.2.2
Upgrade: csi-node-driver-registrar sidecar from v1.1.0 to v1.2.0
Upgrade: csi-resizer sidecar from v0.3.0 to v0.4.0
Update csi-provisioner sidecar from v1.3.0 to v1.4.0
Remove deprecated containerized flag in rbd
Discard umount error if the directory is not mounted
Use EmptyDir to store provisioner socket
Add ContentSource to the CreateVolume response
Rbd: only load nbd module if not available yet
Enhance scripts to deploy ceph cluster using rook
Add e2e tests for RBD resizer
Update minikube to latest released version
Update golangci-lint version to v1.21.0
Fix to use kubectl create not kubectl apply in the e2e
Add volume size roundoff for expand request
Add E2E for cephfs resize functionality
Add Documentation for PVC resize
Fix block resize issue in RBD
Add 13.0.0 Mimic supported version to the readme
update Metrics supported version in Readme
Remove hard-coded UpdateStrategy from templates
Add E2E for block PVC resize
Enable logging in E2E if the test fails
Enable Block E2E for rbd
Add ID-based logging for ExpandVolume
Validate rbd image name in NodeExpand

We deferred some features from this release to the next. Reach out to us via https://github.com/ceph/ceph-csi/issues/806 if you would like to see any feature, bug fix, or update as part of the next ceph-csi release. The CSI community has been growing since our first release (v1.0.0), and we are glad to share that many new contributors have come together to make ceph-csi a production-grade CSI driver with this release!

Thanks to all!

Ceph CSI v1.2.0 , v1.2.1 releases and so on

If you care about CSI and the Ceph plugin, you will have noticed the massive improvement going on in the ceph-csi repo over the last 4 to 5 months! We started to engage heavily with this repo to address upstream user issues, land many bug fixes and improvements, streamline communication in the community, plan and roll out releases, and integrate with the Rook project (https://rook.io/) to make ceph-csi the default storage driver in Rook. With all these efforts, we are hearing lots of very positive feedback from community users.

We rolled out the very first release of CSI, v1.1.0, around 3 months back with many changes and solid code in the project. We then continued our efforts on the CSI v1.2.0 series, and at present we have rolled out v1.2.0 and v1.2.1!

I would like to list down some of the highlights of these releases here:

Ceph CSI v1.2.0 Release

Release Issue: https://github.com/ceph/ceph-csi/issues/393

Changelog or highlights:

*) Cephfs: Use ceph kernel client if kernel version >= 4.17
*) implement grpc metrics for ceph-csi
*) Add xfs fstype as default type in storageclass
*) Add support to use ceph manager rbd command to delete an image
*) e2e: correct log format in execCommandInPod()
*) Add 'gosec' to the static-checks
*) switch cephfs, utils, and csicommon to new logging system
*) utility to trace backend volume from RBD pvc
*) Implement context-based logging
*) implement klog wrapper
*) unmap rbd image if connection timeout.
*) start controller or node server based on config
*) fix: Adds liveness sidecar to v1.14+ helm charts
*) Prometheus liveness probe sidecar
*) Wrap error if failed to fetch mon
*) provisioners: add reconfiguring of PID limit
*) Use "rbd device list" to list and find rbd images and their device paths
*) Update Unstage transaction to undo steps done in Stage
*) Move mounting staging instance to a sub-path within staging path
*) e2e: do not fail to delete resources when "resource not found"
*) remove post validation of rbd device

Many other bug fixes, code improvements, README updates are also part of this release.


CSI v1.2.1 Release.

Release Issue # https://github.com/ceph/ceph-csi/issues/600

*) Change the recommended/default FS for RBD to ext4
*) Use nodiscard option while formatting RBD devices.
*) Use provisioner socket while probing liveness.
*) Reject request if the operation is in progress
*) Fix pod termination issue due to stale mount after node plugin restart.

….etc

As you can see in the changelog, a great number of features, bug fixes, etc. are part of the v1.2 release series.
We are not stopping here; we are marching towards the next minor release of v1.2 without much delay. You can track the release items at https://github.com/ceph/ceph-csi/issues/639 .

Kudos to the Ceph CSI community for all the hard work to reach this critical milestone.

To summarize: your participation is highly encouraged for future releases of Ceph CSI!

Talk to us via GitHub issues/PRs or Slack https://cephcsi.slack.com/ or other channels.

Happy hacking.

Deploy a ceph cluster using Rook (rook.io) in kubernetes

[Updated on 20-Jun-2020: Rook Ceph changed a lot in previous releases, so I am revisiting this blog article to accommodate the changes, based on a ping in the Slack 🙂 ] In this article we will talk about how to deploy a Ceph (software-defined storage) cluster using a Kubernetes operator called ‘rook’. Before we get into …


Gluster CSI driver 1.0.0 (pre) release is out!!

We are pleased to announce v1.0.0 (pre) release of GlusterFS CSI driver. The release source code can be downloaded from github.com/gluster/gluster-csi-driver/archive/1.0.0-pre.0.tar.gz. Compared to the previous beta version of the driver, this release makes Gluster CSI driver fully compatible with CSI spec v1.0.0 and Kubernetes release 1.13 ( kubernetes.io/blog/2018/12/03/kubernetes-1-13-release-announcement/ ) The CSI driver deployment can be …


How to reattach a PVC to an existing PV or migrate PVC from one namespace to another in Kubernetes/Openshift cluster

I have been advising many users on various channels (mail, Slack, etc.) on how to accomplish PVC migration or how to reattach an existing PV to a new PVC. The use cases include attaching a new PVC to an older, existing PV, or migrating a PVC from one namespace to another. Be aware that this hack/workaround has always been outside any support contract; it has helped folks who wanted to reach the end result in some manner, so keep that in mind before you attempt it.

The PVC to PV binding is always a 1:1 mapping, and at times users want to attach an existing PV to a new PVC, which could be in another namespace.

Let's start. As you know, the PV bound to a PVC has a reclaim policy, which defaults to “Delete” for dynamically provisioned volumes. If the PV which you want to attach to a new PVC has the “Delete” policy, you need to edit the PV spec and set persistentVolumeReclaimPolicy to “Retain”.

"persistentVolumeReclaimPolicy": "Retain",
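If you prefer patching over interactive editing, the change can be expressed as a small merge patch. The Python sketch below only assembles the command line; the PV name is the example name used later in this article, and the helper function is hypothetical.

```python
import json
import shlex


def retain_patch_command(pv_name, cli="kubectl"):
    """Return the argv for a patch that switches a PV to the Retain policy."""
    patch = {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}
    return [cli, "patch", "pv", pv_name, "-p", json.dumps(patch)]


cmd = retain_patch_command("pvc-3466cff4-g4gb-12e9-962b-54009bg11116")

# Print a copy-pasteable, shell-quoted version of the command.
print(shlex.join(cmd))
```

Pass cli="oc" to get the OpenShift flavour of the same command.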

Before you begin this process, let's back up the existing PVC yaml/json:

oc get pvc -o yaml > backup_pvc.yaml

As with any hack on storage, I recommend backing up the data on the volume which is mapped to the PV. Data is always critical! Depending on how critical yours is, back it up from the storage backend. The storage backend could be anything; in my case it is GlusterFS.

Once the data is backed up, let’s delete the original PVC.

kubectl delete pvc pvcname

When you delete the PVC, the PV state changes with it and should soon transition to the “Released” state. Wait for the PV status to show “Released”; once it does, edit the PV and delete the claimRef field from the PV spec, which still refers to the now-deleted PVC.

Note down the PV name, because we need it for the new PVC. Once we have it, create a new PVC in the desired namespace whose volumeName field refers to the old PV name.

For example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: newclaim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: glusterfs
  volumeMode: Filesystem
  volumeName: pvc-3466cff4-g4gb-12e9-962b-54009bg11116

The new PVC should bind to the existing PV, and the volume should become available in the new namespace!
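The claimRef cleanup from the earlier step is the part that is easiest to get wrong, so here is a small Python sketch of the checks involved. It works on a PV object shaped like the JSON returned by the API; the sample object and the helper name are hypothetical, and the sketch does not talk to a cluster.

```python
import copy


def prepare_pv_for_rebind(pv):
    """Return a copy of a PV object with claimRef removed, or raise if unsafe."""
    if pv["spec"].get("persistentVolumeReclaimPolicy") != "Retain":
        raise ValueError("set persistentVolumeReclaimPolicy to Retain first")
    if pv.get("status", {}).get("phase") != "Released":
        raise ValueError("PV is not in the Released state yet")
    cleaned = copy.deepcopy(pv)
    # Dropping claimRef is what lets a new PVC bind to this PV.
    cleaned["spec"].pop("claimRef", None)
    return cleaned


# Hypothetical PV, trimmed to the fields that matter here.
released_pv = {
    "metadata": {"name": "pvc-3466cff4-g4gb-12e9-962b-54009bg11116"},
    "spec": {
        "persistentVolumeReclaimPolicy": "Retain",
        "claimRef": {"namespace": "old-namespace", "name": "oldclaim"},
    },
    "status": {"phase": "Released"},
}

cleaned_pv = prepare_pv_for_rebind(released_pv)
print("claimRef" in cleaned_pv["spec"])
```

The two guards mirror the order of the steps in this article: change the reclaim policy first, delete the PVC, and only then remove claimRef.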

glusterfs: Clone glusterfs volume (PV) in a kubernetes/openshift cluster via PVC annotation

On occasion, an OpenShift/Kubernetes admin or user wants to make a clone of a Persistent Volume (PV) in a cluster to satisfy some requirement of the application pod. The standard way of cloning is still under development in Kubernetes; however, there are some techniques that can be used to create a clone of a PV. In this blog article, I would like to discuss one method: taking a clone via an annotation in the Persistent Volume Claim object. Very convenient, isn't it?

As a user you don’t need to know what’s happening in the backend, but you will be given a new PVC object which has the contents of the PVC you were referencing in the annotation.

Even though the process is detailed in the demo video, I would like to mention the core step here:

Suppose you have a PVC called claim1 and would like to take a clone of this volume. The only step you need to do is to create a new PVC file with an annotation called "k8s.io/CloneRequest": "claim1" in it, as shown below.

[root@node]# oc get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
claim1   Bound    pvc-8cc1e066-32c0-11e8-915c-5254001e667e   1Gi        RWX            glusterfile    2m

[root@node]# cat claim1clone.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim1clone
  annotations:
    volume.beta.kubernetes.io/storage-class: "glusterfile"
    "k8s.io/CloneRequest": "claim1"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Cool, isn't it? See it in action in the demo video:

Demo

I hope this helps some admins/users; if so, please try it out.

ps # As always please ask your questions or provide feedback, I value it a lot!

[glusterfs] An external file provisioner to dynamically provision Gluster File Volumes.

If your application pods use dynamically provisioned glusterfs volumes in your Kubernetes/OpenShift cluster, the volumes are most likely provisioned by the intree gluster provisioner. Confused by “intree”? By intree I mean a plugin/driver that is compiled into and shipped with the Kubernetes/OpenShift release. With intree drivers the admin does not have to worry about deployment, because the driver runs by default, registered with the Kube API server and listening for persistent volume actions. However, Kubernetes sig-storage also maintains an external-storage repo which hosts other controllers and provisioners.

The external-storage repo does not just host external provisioners; it also hosts other controllers, such as the snapshot controller. This repo is hosted under the incubator, and its beauty is the flexibility to update the code! If you have ever contributed a patch to Kubernetes you know how difficult that is :). You need to take care of many things, including the release timelines of Kubernetes.

We also had a requirement that had to be satisfied quickly in order to integrate gluster with another important project, and getting things done on the dependent project's timeline was a bit difficult. I will explain which feature we were chasing in my next blog; till then, let it stay hidden. To get there, we started to implement an external file provisioner for gluster file. We already have an external gluster block provisioner in this same repo, so if you are already using the gluster block provisioner you know what an external provisioner is and how to use it.

The external provisioner for gluster file was merged in the external-storage repo a couple of months back! You can pull this provisioner and run it in Kubernetes/OpenShift as a pod! The workflow of creating and deleting persistent volumes remains the same as with the intree plugins or drivers you use today. The only difference is that you need to use the external provisioner name in the storage class instead of the intree provisioner name.

For example, to provision gluster file volumes in Kubernetes/OpenShift you would have used the provisioner name kubernetes.io/glusterfs. In this case the provisioner name is different, and you could use gluster.org/glusterfile. The external provisioner name is configurable and you can give it any name. It is a configuration parameter which has to be set in your pod template; the provisioner listens for requests which match the name specified in the storage class.
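The only visible difference between the two setups is the provisioner field of the storage class. The Python sketch below builds both variants side by side. The class names, the helper function, and the resturl value are hypothetical, and I am assuming the external provisioner accepts the same resturl parameter as the intree one.

```python
import json


def storage_class(name, provisioner, parameters=None):
    """Build a StorageClass manifest for the given provisioner name."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": dict(parameters or {}),
    }


# Hypothetical endpoint; assumed to be valid for both provisioners.
params = {"resturl": "http://heketi.example.com:8080"}

intree = storage_class("gluster-intree", "kubernetes.io/glusterfs", params)
external = storage_class("gluster-external", "gluster.org/glusterfile", params)

print(json.dumps([intree, external], indent=2))
```

PVCs then select one or the other purely through the storage class name; nothing changes in the claim itself.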

The demo below shows how to provision and delete volumes using this external file provisioner. All the artifacts which are used in this demo are available here . Follow this doc for file provisioner deployment .

Demo:

All good: you used the external file provisioner and provisioned and deleted gluster file volumes. However, you may be wondering why we have one more provisioner for gluster file when we already have intree plugins to provision and delete volumes. The answer will be covered in the next blog post on this same topic 🙂

Deploy GlusterFS containers in an atomic host environment with Flannel as overlay networking.

I will walk you through the setup to deploy GlusterFS containers in an atomic host environment with Flannel as overlay networking.

In this implementation, we will deploy 2 GlusterFS containers, one on each of two atomic hosts. Networking between the atomic hosts happens through the tunnel created by the “Flannel” overlay network. The “etcd” service (etcd is an open-source distributed key-value store that provides shared configuration and service discovery) is used as the key-value store for this setup. The flannel daemon runs on both atomic hosts; it contacts the etcd server and fetches the networking configuration.

Once flanneld is configured on the atomic hosts, the default docker network falls into the flannel network.
The containers spawned on each atomic host get IP addresses from that host's flannel subnet, which ensures that containers running on different atomic hosts can communicate.

In this configuration, we persist the GlusterFS configuration data by exporting the host filesystem to the GlusterFS containers, i.e. directories such as /etc/glusterfs, /var/lib/glusterd, and /var/log/glusterfs on the atomic hosts are mounted in the containers to ensure persistence of the trusted pool metadata. The containers also bind mount an atomic host filesystem mount point (for example /mnt/brick1), which serves as the brick for the gluster volume created in this trusted pool. Once the gluster volume is created, glusterfs clients will be able to mount the volume using the FUSE and NFS protocols.

Setup:

*) CentOS 7.1 atomic hosts and the atomic hosts are deployed inside KVM VMs.

Below diagram gives more details about this setup.

NOTE: If you already have an atomic host setup, skip ‘Section 1’ and proceed from Section 2.

Section 1: Configuration of atomic hosts :

PLATFORM/ HOST OS :

[root@humble-server]# cat /etc/redhat-release
Fedora release 22 (Twenty Two)

[root@humble-server]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.7 atomicetcd → Atomic ETCD server
192.168.122.133 atomictest1 → Atomic HOST1
192.168.122.188 atomictest2 → Atomic HOST2
[root@humble-server]#

First of all, “Cloud-Init” has to be configured for atomic installation. Cloud-Init iso requires {user,meta}-data files.

Create a meta-data file with your desired hostname and instance-id.

$ vi meta-data
instance-id: atomic-test1
local-hostname: atomictest1

Create a user-data file. The #cloud-config directive at the beginning of the file is mandatory, not a comment. If you have multiple admins whose ssh keys should be able to access the default user, add one ssh-rsa line per key.

$ vi user-data
#cloud-config
password: atomic
ssh_pwauth: True
chpasswd: { expire: False }

ssh_authorized_keys:
  - ssh-rsa … ..

After creating the user-data and meta-data files, generate an ISO file. Make sure the user running libvirt has the proper permissions to read the generated image.

[root@humble-server /]# genisoimage -output init.iso -volid cidata -joliet -rock user-data meta-data

I: -input-charset not specified, using utf-8 (detected in locale settings)
Total translation table size: 0
Total rockridge attributes bytes: 331
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
183 extents written (0 MB)

NOTE: This example runs on a CentOS 7 atomic host; the qcow2 image can be downloaded from https://wiki.centos.org/SpecialInterestGroup/Atomic/Download/

If you are creating atomic hosts in KVM VMs , please follow below process.
Creating with virt-manager
Here’s how to get started with Atomic on your machine using virt-manager on Linux. The instructions below are for running virt-manager on Fedora 21 or above. The steps may vary slightly when running older distributions of virt-manager.

Select File -> New Virtual Machine from the menu bar. The New VM dialog box will open.
Select the Import existing disk image option and click Forward. The next page in the New VM dialog will appear.
Click Browse. The Locate or create storage volume dialog will open.
Click Browse Local. The Locate existing storage dialog will open.
Navigate to the downloaded virtual machine file, select it, and click Open.
In the New VM dialog, select Linux for the OS type, Fedora 21 (or later) for the Version, and click Forward.
Adjust the VM’s RAM and CPU settings (if needed) and click Forward.
Select the checkbox next to Customize configuration before install and click Forward. This will allow you to add the metadata ISO device before booting the VM.

Note: When running virt-manager on Red Hat Enterprise Linux 6 or CentOS 6, the VM will not boot until the disk storage format is changed from raw to qcow2.

Adding the CD-ROM device for the metadata source:

In the virt-manager GUI, click to open your Atomic machine. Then on the top bar click View > Details
Click on Add Hardware on the bottom left corner.
Choose Storage, and Select managed or other existing storage. Browse and select the init.iso image you created. Change the Device type to CD-ROM device. Click on Finish to create and attach this storage.

Then start the atomic installation. Cloud-init will come into play and the VM will prompt for the “atomic host” login.

username: centos
password: atomic

Note: The above is based on the cloud-init configuration. If you have customized the cloud-init configuration with a different username and password, please supply those instead.

Once you log in: Server: Atomic host 1:

[centos@atomictest1 ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

[centos@atomictest1 ~]$ hostname
atomictest1.localdomain

[centos@atomictest1 ~]$ ip a |grep inet
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
inet 192.168.122.133/24 brd 192.168.122.255 scope global dynamic eth0
inet6 fe80::5054:ff:fe49:3b95/64 scope link
inet 172.17.42.1/16 scope global docker0

[centos@atomictest1 ~]$ sudo rpm-ostree upgrade
Updating from: centos-atomic-host:centos-atomic-host/7/x86_64/standard
No upgrade available.
[centos@atomictest1 ~]$ rpm -qa |egrep 'docker|flannel|atomic'
docker-1.7.1-108.el7.centos.x86_64
atomic-1.0-108.el7.centos.x86_64
flannel-0.2.0-10.el7.x86_64
….
[centos@atomictest1 ~]$ brctl show
bridge name     bridge id           STP enabled     interfaces
docker0         8000.56847afe9799   no
[centos@atomictest1 ~]$ ps aux |grep docker
root 1929 0.0 1.2 289084 12644 ? Ssl 18:47 0:00 /usr/bin/docker -d --selinux-enabled --storage-driver devicemapper --storage-opt dm.fs=xfs --storage-opt dm.thinpooldev=/dev/mapper/atomicos-docker--pool --storage-opt dm.use_deferred_removal=true

Server: Atomic host 2

[centos@atomictest2 ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[centos@atomictest2 ~]$ hostname
atomictest2.localdomain
[centos@atomictest2 ~]$ ip a |grep inet
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
inet 192.168.122.188/24 brd 192.168.122.255 scope global dynamic ens3
inet6 fe80::5054:ff:fec6:4241/64 scope link
inet 172.17.42.1/16 scope global docker0
[centos@atomictest2 ~]$

Flannel requires an etcd server. In this example we configure etcd on another atomic host which runs the same CentOS atomic host image.

Section 2: Configuration of etcd server

Server : Atomic-etcd:

[centos@atomicetcd ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)

[centos@atomicetcd ~]$ rpm -qa |grep etcd
etcd-2.0.13-2.el7.x86_64

[centos@atomicetcd ~]$ ifconfig |grep inet
inet 172.17.42.1 netmask 255.255.0.0 broadcast 0.0.0.0
inet 192.168.122.7 netmask 255.255.255.0 broadcast 192.168.122.255
inet6 fe80::5054:ff:fef6:5c5e prefixlen 64 scopeid 0x20
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10

[centos@atomicetcd ~]$ sudo systemctl start etcd
[centos@atomicetcd ~]$ sudo systemctl status etcd
etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; disabled)
   Active: active (running) since Sat 2015-09-26 10:11:11 UTC; 5s ago
 Main PID: 10512 (etcd)
   CGroup: /system.slice/etcd.service
           └─10512 /usr/bin/etcd
….

Section 3: Configure flannel for overlay networking

Make sure the “etcd” service has started successfully. Once etcd is running, create a flannel configuration JSON file to feed to etcd.

[centos@atomicetcd ~]$ cat flannel-config.json
{
  "Network": "10.0.0.0/16",
  "SubnetLen": 24,
  "Backend": {
    "Type": "vxlan",
    "VNI": 1
  }
}

By default etcd listens on “localhost” port 2379. Make etcd listen on all interfaces so that the other atomic hosts can reach etcd and fetch the flannel configuration data.
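To see what the Network and SubnetLen values mean in practice, the following Python sketch carves the configured network into per-host candidate subnets with the standard ipaddress module. It only does the address arithmetic; which subnet flannel actually leases to a given host is decided by flannel itself, and its allocator may skip some of these candidates.

```python
import ipaddress
import json

# Same content as flannel-config.json above.
config = json.loads(
    '{"Network": "10.0.0.0/16", "SubnetLen": 24,'
    ' "Backend": {"Type": "vxlan", "VNI": 1}}'
)

network = ipaddress.ip_network(config["Network"])
# Every host gets one subnet of length SubnetLen out of the overall network.
subnets = list(network.subnets(new_prefix=config["SubnetLen"]))

print(len(subnets))                    # 256
print(subnets[12])                     # 10.0.12.0/24
print(subnets[12].num_addresses - 2)   # 254 usable addresses per host
```

With a /16 network and /24 host subnets there is room for up to 256 hosts, each with roughly 254 container addresses.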

Default configuration of etcd looks like below:

[centos@atomicetcd ~]$ cat /etc/etcd/etcd.conf |grep -v "#"
ETCD_NAME=default
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"


Change the above file so that the client URLs use all interfaces (0.0.0.0) instead of localhost.

[centos@atomicetcd ~]$ cat /etc/etcd/etcd.conf |grep -v "#"
ETCD_NAME=default
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="http://0.0.0.0:2379"


Set the network key in etcd server via curl.

[centos@atomicetcd ~]$ curl -L http://localhost:2379/v2/keys/atomic01/network/config -XPUT --data-urlencode value@flannel-config.json

{"action":"set","node":{"key":"/atomic01/network/config","value":"{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n","modifiedIndex":3,"createdIndex":3}}
[centos@atomicetcd ~]$

Retrieve the data from the etcd server to make sure it is recorded properly.

[centos@atomicetcd ~]$ curl -L http://localhost:2379/v2/keys/atomic01/network/config | python -m json.tool

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   222  100   222    0     0   1468      0 --:--:-- --:--:-- --:--:--  1470
{
    "action": "get",
    "node": {
        "createdIndex": 3,
        "key": "/atomic01/network/config",
        "modifiedIndex": 3,
        "value": "{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n"
    }
}
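Note that the stored value is itself a JSON document kept as a string inside the etcd response, so it has to be decoded twice. A minimal Python sketch, using the response shown above as sample input (the helper name is hypothetical):

```python
import json

# Response body copied from the curl output above (raw string keeps the escapes).
response_text = r'''{"action":"get","node":{"key":"/atomic01/network/config","value":"{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n","modifiedIndex":3,"createdIndex":3}}'''


def flannel_config_from_etcd(text):
    """Decode the etcd v2 response, then decode the JSON stored in node.value."""
    node = json.loads(text)["node"]
    return json.loads(node["value"])


flannel_config = flannel_config_from_etcd(response_text)
print(flannel_config["Network"], flannel_config["Backend"]["Type"])
```

If the second json.loads fails, the file that was uploaded with --data-urlencode was not valid JSON, and flanneld would not be able to use it either.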

NOTE: It is *not* required to configure flannel on the ETCD server. However,
in this setup we configure flannel on the ETCD server so that, if needed,
this server can be used as a client for gluster volumes. We can mount the gluster volume and perform tests.

Set the flanneld network configuration service file as shown below.

[centos@atomicetcd ~]$ cat /etc/systemd/system/docker.service.d/10-flanneld-network.conf [Unit] After=flanneld.service Requires=flanneld.service

[Service] EnvironmentFile=/run/flannel/subnet.env ExecStartPre=-/usr/sbin/ip link del docker0 ExecStart= ExecStart=/usr/bin/docker -d \ –bip=${FLANNEL_SUBNET} \ –mtu=${FLANNEL_MTU} \ $OPTIONS \ $DOCKER_STORAGE_OPTIONS \ $DOCKER_NETWORK_OPTIONS \ $INSECURE_REGISTRY

[centos@atomicetcd ~]$ cat /etc/sysconfig/flanneld # Flanneld configuration options # etcd url location. Point this to the server where etcd runs FLANNEL_ETCD=”http://192.168.122.7:2379″ # etcd config key. This is the configuration key that flannel queries # For address range assignment FLANNEL_ETCD_KEY=”/atomic01/network” # Any additional options that you want to pass FLANNEL_OPTIONS=”–iface=eth0 -ip-masq=true”

[centos@atomicetcd glusterd]$ cat /etc/sysconfig/docker |grep -v "#"

OPTIONS='--selinux-enabled --ip-masq=false'
DOCKER_CERT_PATH=/etc/docker

Enable flannel and etcd so that they start at boot time, then reboot.

[centos@atomicetcd ~]$ sudo systemctl daemon-reload
[centos@atomicetcd ~]$ sudo systemctl enable flanneld
[centos@atomicetcd ~]$ sudo systemctl enable etcd
[centos@atomicetcd ~]$ sudo systemctl reboot


After the reboot, check the docker0 bridge and the flannel network interface and validate that both are in the same network.

[centos@atomicetcd ~]$ ifconfig
docker0: flags=4099 mtu 1500
        inet 10.0.12.1 netmask 255.255.255.0 broadcast 0.0.0.0
        …….

eth0: flags=4163 mtu 1500
        inet 192.168.122.7 netmask 255.255.255.0 broadcast 192.168.122.255
        ……

flannel.1: flags=4163 mtu 1450
        inet 10.0.12.0 netmask 255.255.0.0 broadcast 0.0.0.0
        ….

……..
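The "same network" check can also be done programmatically rather than by eye. The sketch below uses Python's standard `ipaddress` module with the addresses and netmasks from the ifconfig output above; the function name `bridge_in_overlay` is hypothetical, and the rule it encodes (the docker0 subnet lies inside the flannel network, and the flannel.1 address lies inside the docker0 subnet) is one reasonable reading of this check, not a flannel API.

```python
import ipaddress


def bridge_in_overlay(bridge_ip, bridge_mask, overlay_ip, overlay_mask):
    """Return True when the bridge subnet sits inside the overlay network.

    Hypothetical helper: also requires the overlay interface address to fall
    inside the bridge subnet, as it does in the output shown above.
    """
    bridge = ipaddress.ip_interface(f"{bridge_ip}/{bridge_mask}")
    overlay = ipaddress.ip_interface(f"{overlay_ip}/{overlay_mask}")
    return (
        bridge.network.subnet_of(overlay.network)
        and overlay.ip in bridge.network
    )


# docker0 and flannel.1 as reported on atomicetcd
docker0_ok = bridge_in_overlay("10.0.12.1", "255.255.255.0",
                               "10.0.12.0", "255.255.0.0")
# eth0 is on the host network, so the same check should fail for it
eth0_ok = bridge_in_overlay("192.168.122.7", "255.255.255.0",
                            "10.0.12.0", "255.255.0.0")
print(docker0_ok, eth0_ok)
```

`subnet_of` requires Python 3.7 or later.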


If both are in the same network, we are good to proceed 🙂
Now, let's make sure the other two atomic hosts can connect to the atomicetcd server and fetch the etcd configuration data:

FROM atomictest1:

[centos@atomictest1 ~]$ curl -L http://atomicetcd:2379/v2/keys/atomic01/network/config
{"action":"get","node":{"key":"/atomic01/network/config","value":"{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n","modifiedIndex":3,"createdIndex":3}}
[centos@atomictest1 ~]$

FROM atomictest2:

[centos@atomictest2 ~]$ curl -L http://atomicetcd:2379/v2/keys/atomic01/network/config
{"action":"get","node":{"key":"/atomic01/network/config","value":"{\n\"Network\": \"10.0.0.0/16\",\n\"SubnetLen\": 24,\n\"Backend\": {\n\"Type\": \"vxlan\",\n\"VNI\": 1\n }\n}\n\n","modifiedIndex":3,"createdIndex":3}}
[centos@atomictest2 ~]$

Make sure that flanneld running on the atomictest{1,2} servers contacts the atomicetcd server (192.168.122.7) to fetch the flannel configuration.

[centos@atomictest2 ~]$ cat /etc/sysconfig/flanneld
# Flanneld configuration options

# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD="http://192.168.122.7:2379"

# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_KEY="/atomic01/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="--iface=eth0 -ip-masq=true"

NOTE: Because of an issue that prevents glusterd from forming a trusted pool when flannel is configured as the overlay networking solution, we need a workaround for IP masquerading in both the flannel and the docker configuration files. In the FLANNEL_OPTIONS value above, replace “eth0” with the network interface name of your atomic server and set the “-ip-masq” option to ‘true’ to overcome this limitation. We also have to configure the docker option as shown below.

[centos@atomictest2 glusterd]$ cat /etc/sysconfig/docker |grep -v "#"

OPTIONS='--selinux-enabled --ip-masq=false'
DOCKER_CERT_PATH=/etc/docker
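Because the workaround depends on two settings in two different files, it is easy to get one of them wrong. The sketch below parses both sysconfig files and checks that flannel has masquerading turned on while docker has it turned off. The file contents are embedded as strings copied from the listings above; `parse_sysconfig` and `option_value` are hypothetical helpers written for this illustration.

```python
import shlex

FLANNELD_SYSCONFIG = '''
FLANNEL_ETCD="http://192.168.122.7:2379"
FLANNEL_ETCD_KEY="/atomic01/network"
FLANNEL_OPTIONS="--iface=eth0 -ip-masq=true"
'''

DOCKER_SYSCONFIG = '''
OPTIONS='--selinux-enabled --ip-masq=false'
DOCKER_CERT_PATH=/etc/docker
'''


def parse_sysconfig(text):
    """Parse KEY=value lines, skipping blanks and comments (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        tokens = shlex.split(raw)  # removes the surrounding shell quotes
        settings[key.strip()] = tokens[0] if tokens else ""
    return settings


def option_value(options, name):
    """Return the value of -name=... or --name=... in an option string, else None."""
    for token in options.split():
        key, _, value = token.lstrip("-").partition("=")
        if key == name:
            return value
    return None


flannel = parse_sysconfig(FLANNELD_SYSCONFIG)
docker = parse_sysconfig(DOCKER_SYSCONFIG)

flannel_masq = option_value(flannel["FLANNEL_OPTIONS"], "ip-masq")
docker_masq = option_value(docker["OPTIONS"], "ip-masq")
print("flannel ip-masq:", flannel_masq, "| docker ip-masq:", docker_masq)
```

On a real host the two strings would be read from /etc/sysconfig/flanneld and /etc/sysconfig/docker instead of being embedded.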

Enable flanneld on both servers:

[centos@atomictest1 ~]$ sudo systemctl enable flanneld
ln -s '/usr/lib/systemd/system/flanneld.service' '/etc/systemd/system/docker.service.requires/flanneld.service'
[centos@atomictest1 ~]$

[centos@atomictest2 ~]$ sudo systemctl enable flanneld
ln -s '/usr/lib/systemd/system/flanneld.service' '/etc/systemd/system/docker.service.requires/flanneld.service'
[centos@atomictest2 ~]$

Repeat the steps below on ‘atomictest1’ and ‘atomictest2’:

[centos@atomictest2 ~]$ sudo mkdir -p /etc/systemd/system/docker.service.d/

[centos@atomictest2 ~]$ cat /etc/systemd/system/docker.service.d/10-flanneld-network.conf
[Unit]
After=flanneld.service
Requires=flanneld.service

[Service]
EnvironmentFile=/run/flannel/subnet.env
ExecStartPre=-/usr/sbin/ip link del docker0
ExecStart=
ExecStart=/usr/bin/docker -d \
    --bip=${FLANNEL_SUBNET} \
    --mtu=${FLANNEL_MTU} \
    $OPTIONS \
    $DOCKER_STORAGE_OPTIONS \
    $DOCKER_NETWORK_OPTIONS \
    $INSECURE_REGISTRY
[centos@atomictest2 ~]$
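The drop-in works by having systemd load /run/flannel/subnet.env and substitute FLANNEL_SUBNET and FLANNEL_MTU into the docker command line. The sketch below mimics that substitution to show which --bip and --mtu arguments docker would end up with. The contents of subnet.env are not shown in this post, so the sample values are assumptions chosen to resemble the addresses seen elsewhere in this walkthrough; `parse_env_file` and `docker_network_args` are hypothetical helpers.

```python
# Assumed sample contents of /run/flannel/subnet.env (not taken from a real host).
SAMPLE_SUBNET_ENV = """\
FLANNEL_SUBNET=10.0.80.1/24
FLANNEL_MTU=1450
"""


def parse_env_file(text):
    """Parse KEY=value lines the way an EnvironmentFile is read, roughly."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            env[key.strip()] = value.strip().strip('"')
    return env


def docker_network_args(env):
    """Build the two flannel-derived arguments used in the ExecStart line."""
    return ["--bip=" + env["FLANNEL_SUBNET"], "--mtu=" + env["FLANNEL_MTU"]]


args = docker_network_args(parse_env_file(SAMPLE_SUBNET_ENV))
print(" ".join(args))
```

With --bip set from the flannel lease, docker0 gets an address inside the subnet that flannel assigned to the host, which is why the two interfaces end up in the same network after the reboot.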

Reboot the ‘atomictest1’ and ‘atomictest2’ servers. Once they are back, both the ‘docker’ and ‘flanneld’ services should be up and running, and ‘docker0’ and ‘flannel’ should be in the same network.

[centos@atomictest1 ~]$ systemctl status docker flanneld
docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─10-flanneld-network.conf
           /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: active (running) since Sat 2015-09-26 10:57:12 …
flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled)
   Active: active (running) since Sat 2015-09-26 10:57:08 ……

[centos@atomictest1 ~]$ ifconfig |grep flags -A 2
docker0: flags=4099 mtu 1500
        inet 10.0.80.1 netmask 255.255.255.0 broadcast 0.0.0.0
--
eth0: flags=4163 mtu 1500
        inet 192.168.122.133 netmask 255.255.255.0 broadcast 192.168.122.255
--
flannel.1: flags=4163 mtu 1450
        inet 10.0.80.0 netmask 255.255.0.0 broadcast 0.0.0.0
--
lo: flags=73 mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10
[centos@atomictest1 ~]$

You can validate that the atomic hosts' subnets are allocated properly by running the command below from any of the nodes:

[centos@atomicetcd ~]$ curl -L http://atomicetcd:2379/v2/keys/atomic01/network/subnets | python -m json.tool

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   908  100   908    0     0   5987      0 --:--:-- --:--:-- --:--:--  6013
{
    "action": "get",
    "node": {
        "createdIndex": 5,
        "dir": true,
        "key": "/atomic01/network/subnets",
        "modifiedIndex": 5,
        "nodes": [
            {
                "createdIndex": 5,
                "expiration": "2015-09-27T10:38:13.318286933Z",
                "key": "/atomic01/network/subnets/10.0.12.0-24",
                "modifiedIndex": 5,
                "ttl": 84248,
                "value": "{\"PublicIP\":\"192.168.122.7\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"96:1e:41:4a:aa:ce\"}}"
            },
            {
                "createdIndex": 6,
                "expiration": "2015-09-27T10:57:08.763771526Z",
                "key": "/atomic01/network/subnets/10.0.80.0-24",
                "modifiedIndex": 6,
                "ttl": 85384,
                "value": "{\"PublicIP\":\"192.168.122.133\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"de:c1:3a:e4:64:fc\"}}"
            },
            {
                "createdIndex": 7,
                "expiration": "2015-09-27T11:09:55.54906845Z",
                "key": "/atomic01/network/subnets/10.0.55.0-24",
                "modifiedIndex": 7,
                "ttl": 86150,
                "value": "{\"PublicIP\":\"192.168.122.188\",\"BackendType\":\"vxlan\",\"BackendData\":{\"VtepMAC\":\"c2:29:e0:13:d9:40\"}}"
            }
        ]
    }
}

Section 4: Run RHGS containers

Once flannel is configured and running, we have to pull the RHGS image from the Red Hat internal registry via the docker pull command as shown below. In order to obtain this image, one has to:

glusterfs: “[heketi] failed to create volume: signature is invalid”

Oh. You got that error and want to know why? Short answer: the heketi credentials you submitted do not match the credentials specified in the heketi configuration file. Long answer: 1) How did you deploy heketi? As a service using a service file? If yes, check the entries in the /etc/heketi/heketi.json file for the admin password. …
