Re: [Gluster-devel] Client side AFR race conditions?

From: Kevan Benson
Subject: Re: [Gluster-devel] Client side AFR race conditions?
Date: Tue, 06 May 2008 12:45:09 -0700
User-agent: Thunderbird (X11/20071220)

Martin Fick wrote:
--- Anand Babu Periasamy <address@hidden> wrote:

I really want to understand the issue and help you out. We always have heated discussions even in our labs. We only take it positively :) Your feedback is very valuable to us.

No prob!  I appreciate it.

Martin Fick wrote:
In other words, what prevents conflicts when client A & B both write to the same file? Could A's write to subvolume A succeed before B's write to subvolume A, and at the same time B's write to subvolume B succeed before A's write to subvolume

It seems to me the major problem here is the use of client side AFR. While it does have its advantages, data integrity does not seem to be one of them. If you plan to use it as a storage system for a specific application, you can assess its strengths (easy to configure, automatic failover and recovery) and weaknesses (hard to ensure back-end data integrity without locking) for that application. For a general purpose file system, its weaknesses (poor data integrity without locking) probably far outweigh its strengths.
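
To make the race in the quoted question concrete, here is a minimal sketch in Python. It is a toy model, not GlusterFS code: the subvolumes are plain dicts, the names apply_writes and replicas_agree are made up for illustration, and I am assuming the cut-off question ends with the second pair of writes landing on subvolume B. It only shows that when two clients each write to both replicas without a lock, the replicas can end up holding different data.

```python
# Toy model only: this is NOT GlusterFS code. Each "subvolume" is a dict
# mapping a file name to its contents, and the last write to land wins.

def apply_writes(schedule):
    """Apply (client, subvolume, data) writes in the given order."""
    subvols = {"A": {}, "B": {}}
    for _client, subvol, data in schedule:
        subvols[subvol]["file"] = data
    return subvols

def replicas_agree(subvols):
    """True if both subvolumes hold the same contents for the file."""
    return subvols["A"]["file"] == subvols["B"]["file"]

# Unlocked interleaving (assumed from the quoted question):
# on subvolume A, client A's write lands before client B's;
# on subvolume B, client B's write lands before client A's.
racy = [
    ("A", "A", "from-A"),
    ("B", "A", "from-B"),
    ("B", "B", "from-B"),
    ("A", "B", "from-A"),
]

# With a lock held across the whole replicated write, each client
# finishes both subvolumes before the other client starts.
locked = [
    ("A", "A", "from-A"),
    ("A", "B", "from-A"),
    ("B", "A", "from-B"),
    ("B", "B", "from-B"),
]

racy_result = apply_writes(racy)
locked_result = apply_writes(locked)
```

In the racy schedule subvolume A ends up with client B's data and subvolume B with client A's, so the replicas disagree even though every individual write succeeded. In the locked schedule both replicas hold the same contents.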

One of the major strengths of glusterfs is its configurability, so if data integrity without locking is really important to you, choose a configuration that can support this, such as server side AFR with some sort of failover solution to keep all clients always writing to a single server (you get a speed boost from this as well). The (coming) HA translator would be another (future) solution.

I'm not saying I don't want to see a more robust solution for client side AFR, just that each configuration has its place, and client side AFR isn't currently (and may never be) capable of serving a share that requires high data integrity.

If you think fixing this current issue will solve your problems, maybe you haven't considered the implications of connectivity problems between some clients and some (not all) servers... Add in some clients with slightly off timestamps and you might have some major problems WITHOUT any reboots.
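
As a rough illustration of the timestamp concern, here is a small Python sketch. It assumes a hypothetical "latest timestamp wins" healing policy and made-up names (Copy, write, heal_by_timestamp); I am not claiming this is how AFR actually picks a copy. Two clients each reach only one server, and one client's clock runs five seconds fast.

```python
from dataclasses import dataclass

# Hypothetical model: NOT the AFR self-heal algorithm. It assumes a
# "latest timestamp wins" policy purely to illustrate clock skew.

@dataclass
class Copy:
    server: str
    data: str
    real_time: float  # when the write actually happened
    stamped: float    # timestamp recorded using the writing client's clock

def write(server, data, real_time, clock_offset):
    """Record a write, stamped with the client's (possibly skewed) clock."""
    return Copy(server, data, real_time, real_time + clock_offset)

def heal_by_timestamp(copies):
    """Pick the copy with the newest recorded timestamp."""
    return max(copies, key=lambda c: c.stamped)

def truly_newest(copies):
    """Pick the copy that was really written last."""
    return max(copies, key=lambda c: c.real_time)

# Client X can only reach server1 and its clock is 5 seconds fast.
copy1 = write("server1", "old data", real_time=100.0, clock_offset=5.0)
# Client Y can only reach server2, writes 2 seconds later, clock is correct.
copy2 = write("server2", "new data", real_time=102.0, clock_offset=0.0)

winner = heal_by_timestamp([copy1, copy2])
```

Under this assumed policy the older write from the client with the fast clock wins, and the genuinely newer data on server2 would be overwritten, with no reboot involved.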


-Kevan Benson
-A-1 Networks

