
Re: [Gluster-devel] HA failover question.


From: Kevan Benson
Subject: Re: [Gluster-devel] HA failover question.
Date: Wed, 17 Oct 2007 09:30:07 -0700
User-agent: Thunderbird 2.0.0.6 (X11/20070728)

Chris Johnson wrote:
On Wed, 17 Oct 2007, Chris Johnson wrote:

     I think a light just came on.  I don't know how 'self healing'
would work on the client side.  That sounds like a server side deal.
Is it possible to set up two AFR servers between two RAIDs and access
them in client side AFR mode?  Would that provide my failover and
'self-healing' when a failed node came back up?  Somehow I think the
servers would need to talk to each other to pull this off.  Or is AFR
something that can only run on one node between two file systems?
That wouldn't be as fault tolerant.

When the afr is run on the client, the self-heal is handled by the client (I assume, I don't see how else it would work when the servers may not even have access to each other). Here's my understanding (glusterfs team please correct me if I'm wrong):

1) File data operation requested from AFR share
2) AFR translator (on the client in this case) requests file information from all its subvolumes
3) AFR translator aggregates the results and finds the latest version of the file
   a) Retrieve the latest version
   b) If the latest version isn't on all subvolumes, overwrite obsolete versions with the latest
   c) If the file isn't shared on enough subvolumes, copy it to new subvolumes
   d) Remove extra copies of the file if there are more than required by the AFR spec? (glfs team care to comment on whether this happens?)
4) Read, write or append to the file as requested in #1
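The aggregation-and-heal logic in steps 2 and 3 above can be sketched roughly as follows. This is just my illustration of the idea, not GlusterFS internals; the function and data shapes are hypothetical:

```python
# Hypothetical sketch of client-side AFR self-heal: pick the newest
# version across subvolumes and refresh stale or missing copies.
# The (version, data) tuples are illustrative, not real AFR metadata.

def self_heal(replicas):
    """replicas: dict mapping subvolume name -> (version, data) or None."""
    # Step 3: aggregate results and find the latest version of the file.
    latest_version, latest_data = max(
        (copy for copy in replicas.values() if copy is not None),
        key=lambda pair: pair[0],
    )
    # Steps 3b/3c: overwrite obsolete copies and fill in missing ones.
    for name, copy in replicas.items():
        if copy is None or copy[0] < latest_version:
            replicas[name] = (latest_version, latest_data)
    return latest_data

replicas = {"server1": (3, b"new"), "server2": (2, b"old"), "server3": None}
data = self_heal(replicas)
```

After the call, every subvolume holds version 3, which is the state a subsequent read or write would see.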

An AFR subvolume can be any other defined volume (except for unify volumes; you can't AFR unify volumes YET, see http://www.mail-archive.com/address@hidden/msg02161.html). That means you can AFR a local and a remote volume together (as in some of the HA examples in the wiki), or multiple remote volumes (as in the example posted to you earlier), or multiple local volumes (in the case you want the data stored on two physical disks).
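For the multiple-remote-volumes case, a minimal client volume spec might look like this (hostnames and volume names are made up for illustration; glusterfs team, correct the option names if I've misremembered any):

```
volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1.example.com   # hypothetical server
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host server2.example.com   # hypothetical server
  option remote-subvolume brick
end-volume

volume afr0
  type cluster/afr
  subvolumes remote1 remote2
end-volume
```

The client mounts afr0 and writes go to both servers; the servers never need to talk to each other.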

You can also AFR other AFR volumes (I believe I read that; it should work), so you could do a tiered AFR structure if you want lots of copies of a file without making any single client or server responsible for writing all of them (think binary tree).
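A sketch of what that tiered layout could look like, assuming AFR-over-AFR works as described (the client volumes client1 through client4 would be protocol/client definitions as in the earlier examples; all names here are hypothetical):

```
volume afr-left
  type cluster/afr
  subvolumes client1 client2
end-volume

volume afr-right
  type cluster/afr
  subvolumes client3 client4
end-volume

volume afr-top
  type cluster/afr
  subvolumes afr-left afr-right
end-volume
```

Four copies total, but each AFR node in the tree only ever writes to two subvolumes.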

Right now, for HA and ease of admin, I think a simple AFR handled on the client is easiest. No unify. Unify will give you better performance, but at the cost of splitting your files up: each server still holds a full set of files, but they're split across two locations, which makes any sort of pre-population of files or direct access complicated without glusterfs. Locking may be problematic with this though; I'll be posting about that shortly...

In short, for HA, if you need the extra performance I suggest you use the config that Daniel posted before. If you just need HA and want easier administration, just use a single AFR on the client. No CARP, heartbeat or any type of shared IP should be required.

P.S.
The transport-timeout option in protocol/client is key to finding a good failover time for your cluster. When there's a failure, the first write from a client will hang for the timeout period before finishing its request with the available subvolumes. A failure mid-write just stalls the same amount of time before finishing the write to the available subvolumes. It's extremely robust.
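The option goes in each protocol/client volume definition; something along these lines (the hostname and the timeout value are illustrative, tune the number to how long you can tolerate a stall on failover):

```
volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host server1.example.com   # hypothetical server
  option remote-subvolume brick
  option transport-timeout 10              # seconds to wait before giving up on a dead server
end-volume
```

A lower value means faster failover but more risk of declaring a merely slow server dead.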

--

-Kevan Benson
-A-1 Networks



