Recovery from failures on GlusterFS
This covers recovery from failures caused by server crashes, disconnected peers or failed upgrades.
There is a basic Perl script to automatically repair the self-heal failures - gluster-heal.pl.
You will also need the attribute fixer tool gluster-xattr-clear.sh.
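For reference, a minimal sketch of what an xattr-clearing helper along the lines of gluster-xattr-clear.sh might do is shown below. This is an assumption about its behavior, not the actual script: it strips the trusted.afr.* changelog attributes from everything under a brick directory so the next self-heal starts from a clean slate. Run it against the brick path on the server, never the FUSE mount, and only while the volume is stopped.

#!/bin/bash
# Sketch only: clear GlusterFS AFR changelog xattrs under a brick directory.
# Usage: gluster-xattr-clear.sh /path/to/brick/subdir
BRICK_DIR="$1"
find "$BRICK_DIR" | while read -r entry; do
    # list the trusted.afr.* attribute names present on this entry
    getfattr --absolute-names -m 'trusted.afr' -d -e hex "$entry" 2>/dev/null \
        | awk -F= '/^trusted\.afr/ {print $1}' \
        | while read -r attr; do
              setfattr -x "$attr" "$entry"   # drop the pending changelog entry
          done
done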
A Good Recovery
First, let's take a look at what non-errors look like in a Gluster recovery.
A machine failed, or a disk died, or the network disconnected or whatever.
Now that the Brick(s) are back online we need to get them in sync.
You may need to run a rebalance - but before that step I like to use the find trick to re-stat everything and trigger self-heals.
I do this one depth at a time; tail the log file while executing the find.
You will see the self-heal triggered (shown below).
I slowly proceed through the depth of the tree, and I even pause a bit in between runs to give the Gluster servers time to settle.
I've had the situation where a find of the whole tree would hang and cause all kinds of problems; slow and steady.
tail -f /var/log/glusterfs.log &
find /mnt/gluster -maxdepth 1 -noleaf
find /mnt/gluster -maxdepth N -noleaf
find /mnt/gluster -maxdepth N-1 -noleaf
no missing files - /gfs_root/d1234/d4321. proceeding to metadata check
background gfid self-heal completed on /gfs_root/d1234/d4321
background gfid self-heal triggered. path: /gfs_root/d1234/d4321
background gfid self-heal triggered. path: /gfs_root/d1234/d4321
no missing files - /gfs_root/d1234/d4321. proceeding to metadata check
no missing files - /gfs_root/d1234/d4321. proceeding to metadata check
background gfid self-heal completed on /gfs_root/d1234/d4321
background gfid self-heal completed on /gfs_root/d1234/d4321
found anomalies in /gfs_root/d1234/d4321. holes=1 overlaps=0
It's showing some issues on directories in this volume. There will be lots of spew in the brick-specific logs on the Gluster servers.
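The depth-by-depth walk can also be scripted; a rough sketch is below, where MAXDEPTH and the 30-second pause are assumptions you would tune to your own tree.

tail -f /var/log/glusterfs.log &
TAIL_PID=$!
MAXDEPTH=6                     # assumed depth of the tree, adjust to fit
for depth in $(seq 1 "$MAXDEPTH"); do
    find /mnt/gluster -maxdepth "$depth" -noleaf > /dev/null
    sleep 30                   # let the Gluster servers settle between passes
done
kill "$TAIL_PID"               # stop the background tail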
Finding Issues
The Gluster team suggests using find to crawl our gluster mount point and re-stat all the files.
Then gluster magic will happen and the gluster system will self-heal.
This is not always the case.
When using find on a suspect gluster volume, it's best to start shallow and work your way down.
This will help identify sticky points before they become too serious.
Update the volume to at least the ERROR log level.
Then tail/grep the log file on the client looking for errors, while at the same time walking the gluster mount point.
The output is shown indented.
gluster volume VOLUME set diagnostics.client-log-level ERROR
tail -f /var/log/gluster* |grep ' E ' &
find /mnt/gluster -maxdepth 1
find /mnt/gluster -maxdepth 2
find /mnt/gluster -maxdepth 3
    background entry self-heal failed on /file/12345/789/2013/fe/0a
    background entry self-heal failed on /file/12345/789/2013/fd/9f
    background entry self-heal failed on /file/12345/789/2013/f3/32
So now we have to doctor this directory up. Unmount the client, stop the volume, examine the bricks for this file and rsync if necessary. Now, clear the xattrs, start the volume, remount and start the find process again.
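In rough shell terms that sequence looks something like the following; VOLUME, SERVER, goodserver and the brick paths are placeholders for your own layout, not values from this setup.

umount /mnt/gluster                                   # on the client
gluster volume stop VOLUME                            # on a Gluster server
# on each server, inspect that brick's copy of the failing directory
getfattr -d -m . -e hex /path/to/brick/file/12345/789/2013/fe/0a
# pull the good brick's data over to the server that is missing or stale
rsync -av goodserver:/path/to/brick/file/12345/789/2013/fe/0a/ /path/to/brick/file/12345/789/2013/fe/0a/
/opt/edoceo/gluster-xattr-clear.sh /path/to/brick/file/12345/789/2013/fe/0a/
gluster volume start VOLUME
mount -t glusterfs SERVER:/VOLUME /mnt/gluster        # back on the client
find /mnt/gluster -maxdepth 3 -noleaf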
The two errors to watch for are 'background entry self-heal failed on' and 'Conflicting entries for'.
background entry self-heal failed on
For this one issue I make sure that I can find an authoritative copy of the directories/files in question (one way to check is sketched below).
Then I use rsync to replicate that over to another server in the GlusterFS pool.
Once that is finished, I clear the xattrs, restart the volume, re-mount and run a find.
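One way to pick the authoritative copy (a sketch; the hosts and brick path are placeholders) is to dump the AFR changelog xattrs on each brick's copy and compare them along with size and mtime - each copy's trusted.afr.*-client-* counters record operations it believes the other brick is missing.

for host in gfsm3 gfsm4; do
    echo "== $host =="
    # dump the AFR changelog attributes and basic stat info for this brick's copy
    ssh "$host" getfattr --absolute-names -d -m trusted.afr -e hex \
        /opt/gluster/brick/file/12345/789/2013/fe/0a
    ssh "$host" stat /opt/gluster/brick/file/12345/789/2013/fe/0a
done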
background meta-data data entry missing-entry gfid self-heal failed
Conflicting entries for
Skipping entry self-heal because of gfid absence
VOLUME: path PATH on subvolume SUBVOLUME => -1 (No such file or directory)
This can happen if there are not enough copies of a file for the replica to work properly, like a file existing on only one brick in a distribute-replicate system when it should be on at least two. The error message tells you where GlusterFS is looking for the files, in REPLICATE_VOLUME_A:
[afr-self-heal-common.c:1054:afr_sh_common_lookup_resp_handler] REPLICATE_VOLUME_A: path PATH on subvolume SUBVOLUME_B => -1 (No such file or directory)
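A quick way to confirm the 'not enough copies' situation is to look for the path on every brick; the host list and brick path here are placeholders for your pool.

for host in gfsm1 gfsm2 gfsm3 gfsm4; do
    echo -n "$host: "
    # check whether this server's brick holds a copy of the directory
    ssh "$host" "ls -ld /path/to/brick/PATH 2>/dev/null || echo missing"
done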
Examine the GlusterFS configuration of this volume, generally stored in /etc/glusterd/vols/NAME/NAME-fuse.vol.
You will find the REPLICATE_VOLUME_A name and then its subvolume SUBVOLUME_B; manually sync files to that location, wipe xattrs and then find again.
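To trace from a replicate name down to a host and brick I pull the interesting lines out of the fuse volfile; a sketch, with NAME standing in for your volume name:

grep -E '^volume |subvolumes |remote-host|remote-subvolume' \
    /etc/glusterd/vols/NAME/NAME-fuse.vol
# each replicate block lists its client subvolumes, and each client block
# lists the remote-host and remote-subvolume (brick path) behind it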
Identify Failing Files
These file-not-found errors will show up $replica_count times in the logs. The example below has two replica failures; the Gluster volume is named 'gfsb'.
E [afr_sh_common_lookup_resp_handler] 0-gfsb-replicate-2: path /img/d1234/d5678/ab/08 on subvolume gfsb-client-5 => -1 (No such file or directory)
E [afr_sh_common_lookup_resp_handler] 0-gfsb-replicate-2: path /img/d1234/d5678/ab/12 on subvolume gfsb-client-5 => -1 (No such file or directory)
E [afr_sh_common_lookup_resp_handler] 0-gfsb-replicate-1: path /img/d1234/d5678/ab/08 on subvolume gfsb-client-2 => -1 (No such file or directory)
E [afr_sh_common_lookup_resp_handler] 0-gfsb-replicate-1: path /img/d1234/d5678/ab/12 on subvolume gfsb-client-2 => -1 (No such file or directory)
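To pull the failing paths out of the noise and confirm each one shows up $replica_count times, something like this works against the client log used above:

grep ' E ' /var/log/glusterfs.log | grep 'No such file or directory' \
    | grep -o 'path [^ ]*' | sort | uniq -c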
Examining the volfile we find that gfsb-replicate-2 has two subvolumes, gfsb-client-4 and gfsb-client-5. Also in question is gfsb-replicate-1, with subvolume gfsb-client-2. gfsb-client-5 points to gfsm4:/opt/gluster/brick and gfsb-client-2 points to gfsm3:/opt/gluster/brick. So the resolution here is to rsync from a known good path to the paths in question.
rsync -av /path/to/good/img/d1234/d5678/ab/08/ gfsm4:/opt/gluster/brick/img/d1234/d5678/ab/08/
rsync -av /path/to/good/img/d1234/d5678/ab/08/ gfsm3:/opt/gluster/brick/img/d1234/d5678/ab/08/
Now blank out the xattrs on the servers (make sure the volume is stopped!).
gfsm4: ~ # /opt/edoceo/gluster-xattr-clear.sh /opt/gluster/brick/img/d1234/d5678/ab/08/
gfsm3: ~ # /opt/edoceo/gluster-xattr-clear.sh /opt/gluster/brick/img/d1234/d5678/ab/08/
Now, restart the volume, remount from a client and run find . -noleaf on that.
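For this example that last step looks roughly like the following; which server you mount from is your choice.

gluster volume start gfsb
mount -t glusterfs gfsm3:/gfsb /mnt/gluster
find /mnt/gluster/img/d1234/d5678/ab -noleaf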