Monday, November 25, 2013

Initial work on Gluster integration with CloudStack

Last week there was a CloudStack Conference at the Beurs van Berlage in Amsterdam. I attended the first day and joined the Hackathon. Although I had no prior knowledge of CloudStack, some people from the Gluster community asked me to have a look at adding support for Gluster in CloudStack. An interesting topic, and of course I'll happily have a go at it.
CloudStack seems quite a nice project. The conference showed an awesome part of the community, loads of workshops and a surprising number of companies that sponsor and contribute to CloudStack. Very impressive!
One of the attendees at the CloudStack Conference was Wido den Hollander. Wido has experience with integrating Ceph in CloudStack, and gave an explanation and some pointers on how storage is implemented.

Integration Notes


It seems that the most useful way to integrate Gluster with CloudStack is to make sure libvirt knows how to use a Gluster backend. Checking with some of my colleagues who work on libvirt quickly showed that libvirt knows about Gluster already (Add new net filesystem glusterfs).
This suggests that it should be possible to create a storage pool in libvirt that is hosted on a Gluster environment. A little trial and error shows that a command like this creates the pool:

# virsh pool-create-as --name primary_gluster --type netfs --source-host $(hostname) --source-path /primary --source-format glusterfs --target /mnt/libvirt/primary_gluster

The above command uses these components:
  • primary_gluster: the name of the storage pool in libvirt
  • netfs: the type of the pool, netfs mounts the 'pool' under the given --target
  • $(hostname): one of the Gluster servers that is part of the Trusted Storage Pool that provides the Gluster volume
  • /primary: the name of the Gluster volume
  • /mnt/libvirt/primary_gluster: directory where libvirt will mount the Gluster volume
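
The same pool can also be described as XML, which is the form a management layer would hand to libvirt. The following is a minimal sketch, assuming the standard libvirt netfs pool elements; the function name gluster_pool_xml and the file name pool.xml are made up for illustration and are not part of libvirt or CloudStack.

```shell
#!/bin/sh
# Print a libvirt netfs pool definition for a Gluster volume.
# Arguments (all illustrative): pool name, Gluster server, volume name, mount point.
gluster_pool_xml() {
    name=$1 host=$2 volume=$3 target=$4
    cat <<EOF
<pool type='netfs'>
  <name>${name}</name>
  <source>
    <host name='${host}'/>
    <dir path='/${volume}'/>
    <format type='glusterfs'/>
  </source>
  <target>
    <path>${target}</path>
  </target>
</pool>
EOF
}

# Usage sketch (needs a running libvirtd, so it is left commented out):
#   gluster_pool_xml primary_gluster "$(hostname)" primary /mnt/libvirt/primary_gluster > pool.xml
#   virsh pool-define pool.xml && virsh pool-start primary_gluster
```

Unlike pool-create-as, a pool defined this way is persistent and needs an explicit pool-start.
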
Creating a volume (a libvirt volume, which is a file on the Gluster volume) can be done through libvirt:

# virsh vol-create-as --pool primary_gluster --name virsh-created-vol.img --capacity 512M --format raw

This will create the file /mnt/libvirt/primary_gluster/virsh-created-vol.img and that file can be used as a storage backend for a virtual machine. An example of a snippet for the disk that can be attached to a VM:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='gluster' name='/primary/virsh-created-vol.img'>
        <host name='HOSTNAME' port='24007'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>
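
To avoid hand-editing the snippet for every image, the disk element can be generated from its parts. This is only a sketch: gluster_disk_xml and disk.xml are hypothetical names, the port 24007 and the leading slash in the source name are copied from the snippet above, and just-a-vm is the example VM used in this post.

```shell
#!/bin/sh
# Print a <disk> element that points QEMU at an image on a Gluster volume.
# Arguments (illustrative): Gluster server, volume name, image file, target device.
gluster_disk_xml() {
    host=$1 volume=$2 image=$3 dev=${4:-vda}
    cat <<EOF
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster' name='/${volume}/${image}'>
    <host name='${host}' port='24007'/>
  </source>
  <target dev='${dev}' bus='virtio'/>
</disk>
EOF
}

# Usage sketch (needs libvirtd and an existing VM, so it is left commented out):
#   gluster_disk_xml HOSTNAME primary virsh-created-vol.img > disk.xml
#   virsh attach-device just-a-vm disk.xml --config
```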

There are some important prerequisites that need to be applied to the Gluster volume so that libvirt can start a virtual machine with the appropriate user. After setting these options on the Gluster volume and in /etc/glusterfs/glusterd.vol, a test virtual machine can be started. The log of the VM (/var/log/libvirt/qemu/just-a-vm.log) shows the QEMU command line, and this contains the path to the storage:

... /usr/libexec/qemu-kvm -name just-a-vm ... -drive file=gluster+tcp://HOSTNAME:24007/primary/virsh-created-vol.img,if=none,id=drive-virtio-disk0,format=raw,cache=none ...
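
The prerequisites themselves are not listed here. As an assumption, they are most likely the options commonly documented for letting an unprivileged QEMU process reach a volume through libgfapi: allowing connections from non-privileged ports and giving the qemu user ownership of the volume. The sketch below only prints the commands so they can be reviewed before running them; gluster_prereq_cmds is a made-up helper name, and the uid/gid values must be checked on the hypervisor.

```shell
#!/bin/sh
# Print (do not run) the volume options assumed to be needed for QEMU access.
# Arguments: Gluster volume name, uid and gid of the user that runs QEMU.
gluster_prereq_cmds() {
    volume=$1 uid=$2 gid=$3
    echo "gluster volume set ${volume} server.allow-insecure on"
    echo "gluster volume set ${volume} storage.owner-uid ${uid}"
    echo "gluster volume set ${volume} storage.owner-gid ${gid}"
}

# Usage sketch (look up the qemu user on the hypervisor first):
#   gluster_prereq_cmds primary "$(id -u qemu)" "$(id -g qemu)"
#
# Assumed counterpart in /etc/glusterfs/glusterd.vol, followed by a glusterd restart:
#   option rpc-auth-allow-insecure on
```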

Design Overview

Because CloudStack uses libvirt, it should be relatively straightforward to add support for Gluster in CloudStack. A diagram that shows the main components and their interactions looks like this:

                    .--------------.
                    |  CloudStack  |
                    '-------+------'
                            |
                      .-----+-----.
                      |  libvirt  |
                      '-----+-----'
                            |
           .----------------+--------------.
           |                               |
 .---------+----------.         .----------+----------.
 |  / storage pool /  |         |   virtual machine   |
 |  image management  |         |      management     |
 '---------+----------'         | / XML description / |
           |                    '----------+----------'
           V                               |
........................                   V
:     / vfs/fuse /     :     .............................
:  mount -t glusterfs  :     :    / QEMU + libgfapi /    :
:......................:     :  qemu file=gluster://...  :
                             :...........................:

The parts that are already functioning are these:
  • libvirt mounts a Gluster volume as a netfs/fuse-filesystem
  • an XML definition for the disk passes gluster:// on to QEMU
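
Both parts can be checked from a shell on the hypervisor. This is a rough sketch with made-up helper names: the first function looks for a fuse.glusterfs entry in the mount table, the second pulls the gluster drive arguments out of a QEMU log such as the one quoted earlier.

```shell
#!/bin/sh
# Succeed when a fuse.glusterfs filesystem is mounted on the given directory.
# The second argument (mount table to read) defaults to /proc/mounts.
is_gluster_mounted() {
    target=$1 mounts=${2:-/proc/mounts}
    awk -v t="$target" \
        '$2 == t && $3 == "fuse.glusterfs" { found = 1 } END { exit !found }' \
        "$mounts"
}

# Print every file=gluster... drive argument found in a libvirt QEMU log.
gluster_drives_in_log() {
    grep -o 'file=gluster[^,]*' "$1"
}

# Usage sketch:
#   is_gluster_mounted /mnt/libvirt/primary_gluster && echo "pool is mounted"
#   gluster_drives_in_log /var/log/libvirt/qemu/just-a-vm.log
```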

The actual development work will be in teaching CloudStack to instruct libvirt to use a Storage Pool backed by a Gluster Volume and attach disks to a virtual machine with the gluster protocol.

CloudStack Storage Subsystem modifications

Wido pointed out that most of the storage changes will be needed in the LibvirtStoragePoolDef and LibvirtStorageAdapter Java classes. Also the Storage Core would need to know about the new storage backend.
After some browsing and reading the sources, the needed modifications looked straightforward. The Gluster backend is comparable to the NFS backend, which can be used as an example.
Changing the code is the easy part, compared to testing it. Remember that I have no CloudStack background whatsoever... Setting up a CloudStack environment to see if the modifications do anything is far from trivial. Compared to the time I spent on changing the source code, trying to get a minimal test environment functioning took most of my time. At this moment, my patches are untested and therefore I have not posted them for review yet :-/

Setting up a CloudStack environment for testing

Some pointers to set up a development environment:
  • Building CloudStack manually (non RPMs)
  • Maven 3.0.4 has been deprecated, use Maven 3.0.5 instead
  • Installation Guide
  • RHEL6 requires the Optional Channel for jsvc from the jakarta-commons-daemon-jsvc package
  • install the cloudstack-agent (and -common) package
  • set guid and in /etc/cloudstack/agent/

Running the CloudStack Management Server is easy enough once the sources are checked out and built. A command like this works for me:

# mvn -pl :cloud-client-ui jetty:run

To deploy the changes for the cloudstack-agent, I prefer to build and install RPMs. Building these is made easy by the packaging/centos63/ script:

# cd packaging/centos63 ; ./ ; cd -

This script and the resulting packages work well on RHEL-6.5.

Upcoming work

With the test environment in place, I can now start to make changes to the Management Server. The current modifications in the JavaScript code make it possible to select Gluster as a primary storage pool. Unfortunately, I'm no web developer and changing JavaScript isn't something I'm very good at. I will be hacking on it every now and then, and hope to be able to have something suitable for review soon.
Of course, any assistance is welcome! I'm happy to share my work in progress if there is an interest. No guarantees about any working functionality though ;-)