Re: [Gluster-devel] Choice of Translator question

From: Gareth Bult
Subject: Re: [Gluster-devel] Choice of Translator question
Date: Thu, 27 Dec 2007 18:59:04 +0000 (GMT)

>I'm not sure.  It could very well depend on which version you are using, and 
>where you read that.  I'm sure some features listed in the wiki are only 
>implemented in the TLA releases until they put out the next point release.

Sure, however I have noticed that the documentation isn't kept in sync with the 
actual code very well.
It would be *really* nice to be pointed at a document that (for example) lists 
the changes between 1.3.7 and the TLA snapshots,
so I could see whether I'd need to move to TLA to get a working config ..

>Agreed, which is why I just showed the single file self-heal method, since in 
>your case targeted self heal (maybe before a full filesystem self heal) might 
>be more useful.

Sorry, I was mixing two moans .. on the one hand there's no log, hence no 
automatic detection of out-of-date files (which means you need a manual scan), 
and on the other, doing a full self-heal on a large file-system "can" be 
prohibitively "expensive" ...
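For what it's worth, the usual workaround for the "manual scan" is to force a 
read of the first byte of every file, since old-style AFR checks (and heals) a 
file when it's accessed. A rough sketch, where the mount point is an assumption 
for illustration:

```shell
# Hypothetical GlusterFS mount point -- adjust for your setup.
MOUNT="${MOUNT:-/mnt/glusterfs}"

if [ -d "$MOUNT" ]; then
  # Full scan: reading the first byte of each file makes AFR
  # examine it, which is exactly the "expensive" part.
  find "$MOUNT" -type f -exec head -c1 {} \; > /dev/null

  # Targeted variant: touch only a file you know is stale.
  stale="$MOUNT/path/to/stale-file"   # placeholder path
  [ -f "$stale" ] && head -c1 "$stale" > /dev/null
fi
```

The targeted variant is the "single file self-heal method" mentioned above; the 
full walk is what becomes prohibitive on a big file-system.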

I'm vaguely wondering if it would be possible to have a "log" translator that 
wrote changes to a namespace volume for quick recovery following a node 
restart. (as an option of course)

>I would expect AFR over stripe to replicate the whole file on inconsistent AFR 
>versions, but I would have though stripe over AFR would work, as the AFR 
>should only be seeing chunks of files.

Well .. it doesn't "seem" to, unless my config is wrong in some way that still 
works and behaves normally right up until it goes into self-heal mode ... (!)
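For reference, the "stripe over AFR" layout being discussed would look roughly 
like this in 1.3-era volfile syntax (brick names are placeholders, and the 
bricks themselves would be defined elsewhere; this is a sketch, not a tested 
config):

```
# Two replicated pairs, striped over.  In theory each stripe chunk
# lives on its own AFR pair, so self-heal should only need to copy
# the chunks that differ, not the whole file.

volume afr1
  type cluster/afr
  subvolumes brick1a brick1b
end-volume

volume afr2
  type cluster/afr
  subvolumes brick2a brick2b
end-volume

volume stripe0
  type cluster/stripe
  option block-size *:1MB
  subvolumes afr1 afr2
end-volume
```

The reverse ("AFR over stripe") swaps the layers, so AFR sees whole files and 
would be expected to replicate them whole on self-heal.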

>I don't see how the AFR could even be aware the chunks belong to the same 
>file, so how it would know to replicate all the chunks of a file is a bit of a 
>mystery to me.  I will admit I haven't done much with the stripe translator 
>though, so my understanding of its operation may be wrong.

Mmm, the trouble is there's nothing definitive in the documentation either way 
.. I'm wondering whether it's a known critical omission, which is why it's not 
been documented (!) At the moment stripe is pretty useless without self-heal 
(i.e. AFR), and AFR is pretty useless without stripe for anyone with large 
files (which I'm guessing is why stripe was implemented after all the "stripe 
is bad" documentation). If the two don't play well together and a self-heal on 
a large file means a 1TB network data transfer, that would strike me as a show 
stopper.

>Do you mean that a change to a stripe replicates the entire file?

During "normal" operation, stripe updates seem to work fine.
The problem is they don't seem to know what to update on a self-heal, so an 
entire stripe is copied. I guess if you could generate a config file with 
thousands of stripes, that might produce "Ok" self-heal times .. but GBs spread 
over two stripes is a nightmare: restarting glusterfsd after a config change 
means GBs of copying when the self-heal kicks in ...

>Understood.  I'll have to actually try this when I have some time, instead of 
>just doing some armchair theorizing.

Sure .. I think my tests were "proper" .. although I might try them on TLA just 
to make sure.

Just thinking logically for a second, for AFR to do chunk level self-heal, 
there must be a chunk level signature store somewhere.
... where would this be ?
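To make the question concrete: chunk-level self-heal implies keeping a 
signature per chunk (whether in extended attributes, a sidecar index, or a 
namespace volume) and copying only the chunks whose signatures disagree. A 
minimal sketch of that idea; this is an illustration of the concept, not how 
GlusterFS's AFR actually stores its metadata:

```python
import hashlib

CHUNK = 1 << 20  # assume 1 MB stripe chunks


def chunk_signatures(path):
    """Return a per-chunk digest list for a file -- the 'signature store'."""
    sigs = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            sigs.append(hashlib.md5(chunk).hexdigest())
    return sigs


def stale_chunks(good, stale):
    """Indices of chunks that differ and would need re-copying on self-heal."""
    a, b = chunk_signatures(good), chunk_signatures(stale)
    n = max(len(a), len(b))
    return [i for i in range(n)
            if i >= len(a) or i >= len(b) or a[i] != b[i]]
```

With signatures like these, a self-heal of one changed megabyte costs one 
megabyte of transfer instead of the whole file; without them, the healer has no 
choice but to copy everything.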

>Well, it depends on your goal.  I only suggested rsync for when a node was 
>offline for quite a while, which meant a large number of stripe components 
>would have needed to be updated, requiring a long sync time.  If it was a 
>quick outage (glusterfs restart or system reboot), it wouldn't be needed.  
>Think of it as a jumpstart on the self-heal process without blocking.

Ok, goal #1 is not to have specific "rsync" configurations for each node (!)

>This, of course, was assuming that the stripe of AFR setup works.

Which I don't believe it does in this context ...

>Because I'm not a dev, and have no control over this.  ;)  Yes, I would like 
>this feature as well, although I can imagine a couple of snags that can make 
>it problematic to implement.

It's one of those things .. if we crash a node once a month I can take a 50GB 
self-heal hit .. but we're going to be changing the configs daily (or 
potentially more frequently), hence it's unsustainable to run without it.

I guess it comes down to the developer's aims .. (a) clustered fs for research 
or (b) clustered fs for day-to-day usage.

>Was this on AFR over stripe or stripe over AFR?

Logic told me it must be AFR over stripe, but I tried it both ways round ..

>The GlusterFS provided fuse is supposed to have some better default values for 
>certain variables relating to transfer block size or some such that optimize 
>it for glusterfs, and it's probably what they test against, so it's what I've 
>been using.

Sure, but I think that's a performance issue rather than a stability one (?)

