Geo-replication provides site-to-site master-slave replication

Geo-replication is not Replicated Volumes

Geo-replication is not for real-time, synchronous high-availability clustering. It is a periodic background backup system for disaster recovery.

If you were feeling brave, however, you could use Replicated Volumes across the globe.

Preparing Peers

Synchronise Watches

Make sure your systems have python, ntp, ssh, and rsync installed.
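
A quick way to check, assuming the tools install binaries under these names (ntpd usually lives in /usr/sbin, so run this as root):

for cmd in python ssh rsync ntpd
do
    command -v $cmd > /dev/null || echo "missing: $cmd"
done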

Use a service like ntpd to keep the clocks on all systems within a few milliseconds of each other.
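
If ntpd is in use, ntpq reports the offset from each time source; the offset column is in milliseconds:

ntpq -p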

Allow the Master peer SSH access to the remote Slave.
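
One way to do this is a dedicated key pair on the Master, installed for the gfsd account described in the Slave section below. The key name georep_rsa is only illustrative, and some GlusterFS versions generate their own key (commonly secret.pem) for this purpose:

ssh-keygen -t rsa -N '' -f /root/.ssh/georep_rsa
ssh-copy-id -i /root/.ssh/georep_rsa.pub gfsd@gfss1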

Geo-Replication Master

Configure a six-peer, six-brick, distributed replicated GlusterFS volume. The names gfsm and gfss denote GlusterFS Master and Slave systems respectively.

for server in 1 2 3 4 5 6
do
    gluster peer probe gfsm$server
done
gluster volume create BigData replica 3 transport tcp gfsm1:/brick gfsm2:/brick gfsm3:/brick gfsm4:/brick gfsm5:/brick gfsm6:/brick
gluster volume start BigData
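
From any Master peer you can confirm the trusted pool and the volume:

gluster peer status
gluster volume info BigData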

Geo-Replication Slave

Configure sshd on the Slave to allow only restricted access. The gfsd account is a superuser (UID 0).

# Create gfsd as an additional UID 0 account for the geo-replication daemon.
useradd -c 'GlusterFS Slave Daemon' -d /home/gfsd -g 0 -m -o -u 0 gfsd
# The Master's public key is restricted to running gsyncd via a forced command.
cat /home/gfsd/.ssh/authorized_keys
command="/usr/libexec/glusterfs/gsyncd" ssh-rsa ....
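
A quick sanity check from the Master, using the illustrative georep_rsa key from above: because of the forced command, sshd should run gsyncd regardless of what command is requested.

ssh -i /root/.ssh/georep_rsa gfsd@gfss1 echo hello
# A gsyncd response rather than "hello" means the restriction is working.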

On the Slave servers, configure access control so that only locally originating requests are accepted.

gluster volume geo-replication '/*' config allow-network ::1,127.0.0.1

Mountbroker

I don't know this one yet.

Starting Geo-Replication

On gfsm1, run the following.

gluster volume geo-replication BigData gfss1:/BigData.slave/ start
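
The matching status command confirms the session is running:

gluster volume geo-replication BigData gfss1:/BigData.slave/ status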



See Also