Sunday, July 26, 2015

Gluster News of week #29/2015

Another week has passed, and here is another “Gluster Weekly News” post. Please add topics for the next post to the etherpad. Anything that is worth noting can be added; contributions from anyone are very much appreciated.
GlusterFS 3.5.5 landed in the Fedora 21 updates repository (moved out of updates-testing).

Fedora 23 has been branched from Rawhide and will contain GlusterFS 3.7. Previous Fedora releases will stick with the stable branches, meaning F22 keeps glusterfs-3.6 and F21 will stay with glusterfs-3.5.

Shared Storage for Containers in Cloud66 using Gluster.
Real Internet Solutions from Belgium (Dutch only website) started to deploy a Gluster solution for their multi-datacenter Cloud Storage products.

Wednesday the regular community meeting took place under the guidance of Atin. He posted the minutes so that everyone else can follow what was discussed.

Several Gluster talks have been accepted for LinuxCon/CloudOpen Europe in Dublin. The accepted talks have been added (with links) to our event etherpad. Attendees interested in meeting other Gluster community people should add their names to the list on the etherpad, maybe we can setup a Gluster meetup or something.

More Gluster topics have been proposed for the OpenStack Summit in Tokyo. Go to https://www.openstack.org/summit/tokyo-2015/vote-for-speakers/SearchForm and search for “gluster” to see them all. You can vote for talks you would like to attend.

GlusterFS 3.7.3 is going to be released by Kaushal early next week.

A documentation update describes the different projects for users, developers, the website and feature planning. More feedback on these suggestions is very welcome.

Sunday, July 19, 2015

Gluster News of week #28/2015

Thanks to André Bauer for suggesting a "This week in Gluster" blog post series. This post is the first of its kind, and hopefully we manage to write something every week. Future blog posts are edited on a public etherpad where everyone can contribute snippets. Suggestions for improvement can be shared on the mailing lists or on the etherpad.

As every week, the Gluster Community Meeting took place on Wednesday. The minutes have been posted to the list. The next meeting happens on Wednesday at 12:00 UTC in #gluster-meeting on Freenode IRC.

Proxmox installations of Debian 8 fail when the VM image is stored on a Gluster volume. After many troubleshooting steps and trying different ideas, it was found that there is an issue with the version of Qemu delivered by Proxmox. Qemu on Proxmox 3.4, the Debian 8 kernel with virtio disks and storage on Gluster do not work well together. Initially thought to be a Gluster issue, the problem was identified as being related to Proxmox. In order to install Debian 8 on Proxmox 3.4, a workaround is to configure IDE/SATA disks instead of virtio, or to use NFS instead of Qemu+libgfapi. More details and alternative workarounds can be found in the email thread.
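As a sketch of the NFS-based workaround (hostname, volume name and mount point are placeholders), the VM images can live on a Gluster/NFS mount instead of being accessed through Qemu+libgfapi:

```shell
# Gluster's built-in NFS server speaks NFSv3; adjust names to your setup.
mount -t nfs -o vers=3,mountproto=tcp server1:/vmstore /mnt/vmstore
```

Proxmox can then use /mnt/vmstore as a plain directory-backed storage.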

On IRC, Jampy asked about a problem with Proxmox containers which have their root filesystem on Gluster/NFS. Gluster/NFS has a bug where unix domain sockets are created as pipes/FIFOs. This unsurprisingly causes applications that use unix domain sockets to behave incorrectly. Bug 1235231 had already been filed and fixed in the master branch, and on Friday backports were posted for the release-3.7, 3.6 and 3.5 branches. The next releases are expected to have the fix merged.

Atin sent out a call for "Gluster Office Hours": staffing the #gluster IRC channel and announcing who will (try to) be available on certain days/times. Anyone who is willing to man the IRC channel and help answer (or redirect) questions from users can sign up.

From Douglas Landgraf's report of FISL16:
We also had a talk about Gluster and oVirt by Marcelo Barbosa, who showed how oVirt + Gluster runs in the development company he works at. At the end, people asked how he integrated FreeIPA with oVirt and how well Jenkins and Gerrit servers run on top of oVirt. Next year Marcelo should take two slots for similar talks; people are very interested in Gluster with oVirt and real use cases like those demonstrated.
GlusterFS 3.5.5 and 3.6.4 have been released and packages for different distributions have been made available.

Our Jenkins instance now supports connecting over https; before, only http was available. A temporary self-signed certificate is used, and an official one has been requested from the certificate authority.

Manu has updated the NetBSD slaves that are used for regression testing with Jenkins. The slaves are now running NetBSD 7.0 RC1.

Another stable release, GlusterFS 3.5.5 is ready

Packages for Fedora 21 are available in updates-testing, RPMs and .debs can be found on the main Gluster download site.

This is a bugfix release. The Release Notes for 3.5.0, 3.5.1, 3.5.2, 3.5.3 and 3.5.4 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.5 stable release.

Bugs Fixed:

  • 1166862: rmtab file is a bottleneck when lot of clients are accessing a volume through NFS
  • 1217432: DHT:Quota:- brick process crashed after deleting .glusterfs from backend
  • 1217433: glusterfsd crashed after directory was removed from the mount point, while self-heal and rebalance were running on the volume
  • 1231641: cli crashes when listing quota limits with xml output

Known Issues:

  • The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly:
    1. gluster volume set <volname> server.allow-insecure on
    2. restarting the volume is necessary
       gluster volume stop <volname>
       gluster volume start <volname>
      
    3. Edit /etc/glusterfs/glusterd.vol to contain this line:
       option rpc-auth-allow-insecure on
      
    4. restarting glusterd is necessary
       service glusterd restart
      
      More details are also documented in the Gluster Wiki on the Libgfapi with qemu libvirt page.
  • For Block Device translator based volumes, the open-behind translator on the client side needs to be disabled.
    gluster volume set <volname> performance.open-behind disabled
    
  • libgfapi clients calling glfs_fini before a successful glfs_init will cause the client to hang as reported here. The workaround is NOT to call glfs_fini for error cases encountered before a successful glfs_init. This is being tracked in Bug 1134050 for glusterfs-3.5 and Bug 1093594 for mainline.
  • If the /var/run/gluster directory does not exist enabling quota will likely fail (Bug 1117888).

Thursday, June 4, 2015

Stable releases continue, GlusterFS 3.5.4 is now available

GlusterFS 3.5 is the oldest stable release that is still getting updates. Yesterday GlusterFS 3.5.4 was released, and the volunteer packagers have already provided RPM packages for different Fedora and EPEL versions. If you are running the 3.5 version on Fedora 20 or 21, you are encouraged to install the updates and provide karma.

Release Notes for GlusterFS 3.5.4

This is a bugfix release. The Release Notes for 3.5.0, 3.5.1, 3.5.2 and 3.5.3 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.5 stable release.

Bugs Fixed:

  • 1092037: Issues reported by Cppcheck static analysis tool
  • 1101138: meta-data split-brain prevents entry/data self-heal of dir/file respectively
  • 1115197: Directory quota does not apply on it's sub-directories
  • 1159968: glusterfs.spec.in: deprecate *.logrotate files in dist-git in favor of the upstream logrotate files
  • 1160711: libgfapi: use versioned symbols in libgfapi.so for compatibility
  • 1161102: self heal info logs are filled up with messages reporting split-brain
  • 1162150: AFR gives EROFS when fop fails on all subvolumes when client-quorum is enabled
  • 1162226: bulk remove xattr should not fail if removexattr fails with ENOATTR/ENODATA
  • 1162230: quota xattrs are exposed in lookup and getxattr
  • 1162767: DHT: Rebalance- Rebalance process crash after remove-brick
  • 1166275: Directory fd leaks in index translator
  • 1168173: Regression tests fail in quota-anon-fs-nfs.t
  • 1173515: [HC] - mount.glusterfs fails to check return of mount command.
  • 1174250: Glusterfs outputs a lot of warnings and errors when quota is enabled
  • 1177339: entry self-heal in 3.5 and 3.6 are not compatible
  • 1177928: Directories not visible anymore after add-brick, new brick dirs not part of old bricks
  • 1184528: Some newly created folders have root ownership although created by unprivileged user
  • 1186121: tar on a gluster directory gives message "file changed as we read it" even though no updates to file in progress
  • 1190633: self-heal-algorithm with option "full" doesn't heal sparse files correctly
  • 1191006: Building argp-standalone breaks nightly builds on Fedora Rawhide
  • 1192832: log files get flooded when removexattr() can't find a specified key or value
  • 1200764: [AFR] Core dump and crash observed during disk replacement case
  • 1202675: Perf: readdirp in replicated volumes causes performance degrade
  • 1211841: glusterfs-api.pc versioning breaks QEMU
  • 1222150: readdirp return 64bits inodes even if enable-ino32 is set

Known Issues:

  • The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly:
    1. gluster volume set <volname> server.allow-insecure on
    2. restarting the volume is necessary
      gluster volume stop <volname>
      gluster volume start <volname>
    3. Edit /etc/glusterfs/glusterd.vol to contain this line:
      option rpc-auth-allow-insecure on
    4. restarting glusterd is necessary
      service glusterd restart
    More details are also documented in the Gluster Wiki on the Libgfapi with qemu libvirt page.
  • For Block Device translator based volumes, the open-behind translator on the client side needs to be disabled.
    gluster volume set <volname> performance.open-behind disabled
  • libgfapi clients calling glfs_fini before a successful glfs_init will cause the client to hang as reported here. The workaround is NOT to call glfs_fini for error cases encountered before a successful glfs_init. This is being tracked in Bug 1134050 for glusterfs-3.5 and Bug 1093594 for mainline.
  • If the /var/run/gluster directory does not exist enabling quota will likely fail (Bug 1117888).

Friday, May 15, 2015

GlusterFS 3.7.0 has been released, introducing many new features and improvements

Yesterday morning in Barcelona, the day after the Gluster Summit, GlusterFS 3.7.0 was released. Close to 600 bug reports, a mix of feature requests, enhancements and problems, have been addressed in a little over 1220 patches since July last year.

This release marks the end of life for GlusterFS 3.4.x maintenance; users of the 3.4 stable version should consider upgrading.

The upcoming Fedora 23 (currently Rawhide) will be the first release that comes with 3.7.x by default, older Fedora releases will be kept on their stable Gluster versions. Packages for different distributions and versions will become available shortly on download.gluster.org. Once packages are available, announcements will be made on the Gluster Users list.

These Release Notes below have been put together with help from many engineers that worked on features and testing. You can also find the release notes in the repositories along with the rest of the code.

Major Changes and Features

Documentation about major changes and features is included in the doc/features/ directory of the GlusterFS repository.

Bitrot Detection

Bitrot detection is a technique used to identify an “insidious” type of disk error where data is silently corrupted with no indication from the disk to the storage software layer that an error has occurred. When bitrot detection is enabled on a volume, gluster performs signing of all files/objects in the volume and scrubs data periodically for signature verification. All anomalies observed will be noted in log files.
For more information, refer here.
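The detection idea can be illustrated with plain coreutils. This is only a sketch of the signing/scrubbing concept, not Gluster's actual implementation; paths are placeholders:

```shell
# "Sign": record a checksum per file; "scrub": re-verify it later.
mkdir -p /tmp/brick
echo "hello" > /tmp/brick/file1
sha256sum /tmp/brick/file1 > /tmp/brick.sums   # sign
sha256sum -c /tmp/brick.sums                   # scrub: reports OK
# Simulate silent corruption by overwriting the first byte in place:
printf 'X' | dd of=/tmp/brick/file1 bs=1 count=1 conv=notrunc 2>/dev/null
sha256sum -c /tmp/brick.sums || echo "bitrot detected"   # scrub now fails
```

Gluster does this per file with stored signatures and periodic scrubber runs, logging any mismatch.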

Multi threaded epoll for performance improvements

Gluster 3.7 introduces multiple threads to dequeue and process more requests from epoll queues. This improves performance by processing more I/O requests. Workloads that involve read/write operations on a lot of small files can benefit from this enhancement.
For more information refer here.
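The number of epoll threads is tunable per volume. A hedged example (volume name and thread counts are placeholders; the defaults are usually sensible):

```shell
# Raise the number of event threads for small-file heavy workloads.
gluster volume set myvol client.event-threads 4
gluster volume set myvol server.event-threads 4
```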

Volume Tiering [Experimental]

Policy based tiering for placement of files. This feature will serve as a foundational piece for building support for data classification.
For more information refer here.
Volume Tiering is marked as an experimental feature for this release. It is expected to be fully supported in a 3.7.x minor release.
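As a rough sketch of the CLI (the feature is experimental, so the exact syntax may change; host and brick names are placeholders), a hot tier is attached to an existing volume:

```shell
# Attach a replicated SSD-backed hot tier to an existing volume.
gluster volume attach-tier myvol replica 2 \
    ssd1:/bricks/hot1 ssd2:/bricks/hot2
# Later, migrate data off the hot tier and remove it again.
gluster volume detach-tier myvol start
```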

Trashcan

This feature will enable administrators to temporarily store deleted files from Gluster volumes for a specified time period.
For more information refer here.
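A minimal sketch of enabling it (volume name and limits are placeholders):

```shell
gluster volume set myvol features.trash on
gluster volume set myvol features.trash-dir .trashcan      # directory on the volume
gluster volume set myvol features.trash-max-filesize 1GB   # skip larger files
```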

Efficient Object Count and Inode Quota Support

This improvement enables an easy mechanism to retrieve the number of objects per directory or volume. Count of objects/files within a directory hierarchy is stored as an extended attribute of a directory. The extended attribute can be queried to retrieve the count.
For more information refer here.
This feature has been utilized to add support for inode quotas.
For more details about inode quotas, refer here.
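A brief sketch of the inode-quota CLI (volume name, path and limit are placeholders):

```shell
gluster volume quota myvol enable
gluster volume quota myvol limit-objects /projects 10000   # cap the file/dir count
gluster volume quota myvol list-objects                    # show usage per directory
```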

Pro-active Self healing for Erasure Coding

Gluster 3.7 adds pro-active self healing support for erasure coded volumes.

Exports and Netgroups Authentication for NFS

This feature adds Linux-style exports & netgroups authentication to the native NFS server. This enables administrators to restrict access to specific clients & netgroups for volume/sub-directory NFSv3 exports.
For more information refer here.

GlusterFind

GlusterFind is a new tool that provides a mechanism to monitor data events within a volume. Detection of events like modified files is made easier without having to traverse the entire volume.
For more information refer here.
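A typical session looks roughly like this (session, volume and output file names are placeholders):

```shell
glusterfind create mysession myvol                  # start a change-tracking session
glusterfind pre mysession myvol /tmp/changes.txt    # collect changes since the last run
cat /tmp/changes.txt                                # new/modified/deleted entries
glusterfind post mysession myvol                    # mark this run as processed
```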

Rebalance Performance Improvements

Rebalance and remove brick operations in Gluster get a performance boost by speeding up identification of files needing movement and a multi-threaded mechanism to move all such files.
For more information refer here.

NFSv4 and pNFS support

Gluster 3.7 supports export of volumes through NFSv4, NFSv4.1 and pNFS. This support is enabled via NFS Ganesha. Infrastructure changes done in Gluster 3.7 to support this feature include:
  • Addition of upcall infrastructure for cache invalidation.
  • Support for lease locks and delegations.
  • Support for enabling Ganesha through Gluster CLI.
  • Corosync and pacemaker based implementation providing resource monitoring and failover to accomplish NFS HA.
For more information refer here.
pNFS support for Gluster volumes and NFSv4 delegations are in beta for this release. Infrastructure changes to support lease locks and NFSv4 delegations are targeted for a 3.7.x minor release.

Snapshot Scheduling

With this enhancement, administrators can schedule volume snapshots.
For more information, see here.
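Scheduling is handled by the snap_scheduler.py helper. A sketch assuming the shared storage for the scheduler has already been set up (job name, cron expression and volume are placeholders):

```shell
snap_scheduler.py init                                  # once per node
snap_scheduler.py add "daily-myvol" "0 2 * * *" myvol   # snapshot every day at 02:00
snap_scheduler.py list
```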

Snapshot Cloning

Volume snapshots can now be cloned to create a new writeable volume.
For more information, see here.
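A short sketch (snapshot, clone and volume names are placeholders):

```shell
gluster snapshot create snap1 myvol
gluster snapshot clone myvol-copy snap1   # independent writeable volume
gluster volume start myvol-copy
```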

Sharding [Experimental]

Sharding addresses the problem of fragmentation of space within a volume. This feature adds support for files that are larger than the size of an individual brick. Sharding works by chunking files into blobs of a configurable size.
For more information, see here.
Sharding is an experimental feature for this release. It is expected to be fully supported in a 3.7.x minor release.
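Enabling it looks like this (volume name and block size are placeholders):

```shell
gluster volume set myvol features.shard on
gluster volume set myvol features.shard-block-size 64MB   # size of each chunk
```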

RCU in glusterd

Thread synchronization and critical section access have been improved by introducing userspace RCU (Read-Copy-Update) in glusterd.

Arbiter Volumes

Arbiter volumes are 3-way replicated volumes where the 3rd brick of the replica is automatically configured as an arbiter. The 3rd brick contains only metadata, which provides network partition tolerance and prevents split-brain from happening.
For more information, see here.
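Creation is a one-liner; a sketch with placeholder hosts and brick paths:

```shell
# The third brick automatically becomes the metadata-only arbiter.
gluster volume create myvol replica 3 arbiter 1 \
    server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/b3
```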

Better split-brain resolution

Split-brain resolution can now also be driven by users without administrative intervention.
For more information, see the 'Resolution of split-brain from the mount point' section here.
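A sketch of the mount-point interface (the file name and the replica client identifier are placeholders; run these on a Gluster mount):

```shell
getfattr -n replica.split-brain-status afile                           # inspect the state
setfattr -n replica.split-brain-choice -v myvol-client-0 afile         # read this copy
setfattr -n replica.split-brain-heal-finalize -v myvol-client-0 afile  # pick it as source
```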

Geo-replication improvements

There have been several improvements in geo-replication for stability and performance. For more details, see here.

Minor Improvements

  • Message ID based logging has been added for several translators.
  • Quorum support for reads.
  • Snapshot names contain timestamps by default. Subsequent access to the snapshots should be done by the name listed in gluster snapshot list.
  • Support for gluster volume get <volname> added.
  • libgfapi has added handle based functions to get/set POSIX ACLs based on common libacl structures.

Known Issues

  • Enabling Bitrot on volumes with more than 2 bricks on a node is known to cause problems.
  • Addition of bricks dynamically to cold or hot tiers in a tiered volume is not supported.
  • The following configuration changes are necessary for qemu and samba integration with libgfapi to work seamlessly:
      # gluster volume set <volname> server.allow-insecure on
    
    Edit /etc/glusterfs/glusterd.vol to contain this line: option rpc-auth-allow-insecure on
    After changing the volume option, restarting the volume is necessary:
      # gluster volume stop <volname>
      # gluster volume start <volname>
    
    After editing glusterd.vol, restarting glusterd is necessary:
      # service glusterd restart
    
    or
      # systemctl restart glusterd
    

Upgrading to 3.7.0

Instructions for upgrading from previous versions of GlusterFS are maintained on this wiki page.

Wednesday, May 6, 2015

1st beta of GlusterFS 3.5.4 is available for testing

You can download the packages (tarball and RPMs) for glusterfs-3.5.4beta1 from download.gluster.org.

23 bugs have been fixed. Once the fixes have been confirmed, we will do a real 3.5.4 (non-beta) release and update the packages in Fedora 20/21.

Release Notes for GlusterFS 3.5.4beta1

This is a bugfix release. The Release Notes for 3.5.0, 3.5.1, 3.5.2 and 3.5.3 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.5 stable release.

Bugs Fixed:

  • 1092037: Issues reported by Cppcheck static analysis tool
  • 1101138: meta-data split-brain prevents entry/data self-heal of dir/file respectively
  • 1115197: Directory quota does not apply on it's sub-directories
  • 1159968: glusterfs.spec.in: deprecate *.logrotate files in dist-git in favor of the upstream logrotate files
  • 1160711: libgfapi: use versioned symbols in libgfapi.so for compatibility
  • 1161102: self heal info logs are filled up with messages reporting split-brain
  • 1162150: AFR gives EROFS when fop fails on all subvolumes when client-quorum is enabled
  • 1162226: bulk remove xattr should not fail if removexattr fails with ENOATTR/ENODATA
  • 1162230: quota xattrs are exposed in lookup and getxattr
  • 1162767: DHT: Rebalance- Rebalance process crash after remove-brick
  • 1166275: Directory fd leaks in index translator
  • 1168173: Regression tests fail in quota-anon-fs-nfs.t
  • 1173515: [HC] - mount.glusterfs fails to check return of mount command.
  • 1174250: Glusterfs outputs a lot of warnings and errors when quota is enabled
  • 1177339: entry self-heal in 3.5 and 3.6 are not compatible
  • 1177928: Directories not visible anymore after add-brick, new brick dirs not part of old bricks
  • 1184528: Some newly created folders have root ownership although created by unprivileged user
  • 1186121: tar on a gluster directory gives message "file changed as we read it" even though no updates to file in progress
  • 1190633: self-heal-algorithm with option "full" doesn't heal sparse files correctly
  • 1191006: Building argp-standalone breaks nightly builds on Fedora Rawhide
  • 1192832: log files get flooded when removexattr() can't find a specified key or value
  • 1200764: [AFR] Core dump and crash observed during disk replacement case
  • 1202675: Perf: readdirp in replicated volumes causes performance degrade

Known Issues:

  • The following configuration changes are necessary for 'qemu' and 'samba vfs plugin' integration with libgfapi to work seamlessly:
    1. gluster volume set <volname> server.allow-insecure on
    2. restarting the volume is necessary
      gluster volume stop <volname>
      gluster volume start <volname>
    3. Edit /etc/glusterfs/glusterd.vol to contain this line:
      option rpc-auth-allow-insecure on
    4. restarting glusterd is necessary
      service glusterd restart
    More details are also documented in the Gluster Wiki on the Libgfapi with qemu libvirt page.
  • For Block Device translator based volumes, the open-behind translator on the client side needs to be disabled.
    gluster volume set <volname> performance.open-behind disabled
  • libgfapi clients calling glfs_fini before a successful glfs_init will cause the client to hang as reported here. The workaround is NOT to call glfs_fini for error cases encountered before a successful glfs_init. This is being tracked in Bug 1134050 for glusterfs-3.5 and Bug 1093594 for mainline.
  • If the /var/run/gluster directory does not exist enabling quota will likely fail (Bug 1117888).

Sunday, March 1, 2015

Automatically subscribe RHEL systems for receiving updates and installing more packages

While fixing bugs and testing patches, I often use virtual machines running RHEL. These systems are short-lived, and normally do not survive longer than a day or two. For most tests and development attempts, I have little need to install additional packages or updates. An installation from the DVD contains all that is needed. Mostly...

To install additional packages or updates, the system needs to be registered with the Red Hat Customer Portal. The subscription-manager tool that is installed on all current RHEL systems can be used for that. For simple usage of the utility, a username and password are sufficient. Automating the subscribing process would require saving those credentials in a kickstart or Ansible configuration, and that's not what I want. Manually subscribing the VM whenever I needed to was the annoying workaround.

A few weeks ago, I finally took the time to set up my automated RHEL installations to use subscription-manager for registering at the Red Hat Customer Portal. The Customer Portal offers the possibility to configure Activation Keys. The subscription-manager tool can use such a key with a command like this:

# subscription-manager register \
        --org 123456 \
        --activationkey my-rhel-example-key
# subscription-manager attach --auto

The --org option seems to be required for me; I am not sure everyone needs it. The number (or name) can be found on an installed and registered system by executing:

# subscription-manager identity

After subscribing as above, it may well be that many repositories/channels get enabled. If you know which repositories you need, you can disable all repositories and then enable a select few:

# subscription-manager repos \
        --disable '*'
# subscription-manager repos \
        --enable rhel-7-server-rpms \
        --enable rhel-7-server-optional-rpms

At the moment, I have this done in the %post section of my kickstart configuration. I would prefer to set this up with the redhat_subscription Ansible module, but the --org option is not available there (yet?).
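For reference, the whole registration can be sketched as a kickstart %post fragment; the org ID, activation key and repository names are the same placeholders as above:

```shell
%post --log=/root/ks-post.log
subscription-manager register \
        --org 123456 \
        --activationkey my-rhel-example-key
subscription-manager attach --auto
# Keep only the repositories that are actually needed.
subscription-manager repos --disable '*'
subscription-manager repos \
        --enable rhel-7-server-rpms \
        --enable rhel-7-server-optional-rpms
%end
```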