
Re: [Gluster-devel] Config advice?


From: Harald Stürzebecher
Subject: Re: [Gluster-devel] Config advice?
Date: Tue, 13 Jan 2009 13:01:04 +0100

2009/1/13 David Braginsky <address@hidden>:
> Hey guys,
>
> I am trying to set up glusterfs on a few hundred nodes and was hoping for 
> some advice.
>
> I have a few hundred machines, spread across two datacenters, in racks of 40 
> nodes. These machines act as in-memory servers, but during startup need to 
> load their data off disk. Each server is replicated, so several machines 
> serve the same data. Therefore, on startup, each file will be loaded by 
> several machines at once. The data is written via appends to a set of log 
> files, which are periodically rotated.
>
> I don't know how well glusterfs would handle cross-datacenter writes, so my 
> plan is to have a separate fs in each datacenter.

IIRC, AFR did not like network latency in the past.
If you have time and a spare machine in each datacenter I'd suggest
running some tests with server-side AFR across both datacenters:
http://www.gluster.org/docs/index.php/Setting_up_AFR_on_two_servers_with_server_side_replication
http://www.gluster.org/docs/index.php/AFR_(Automatic_File_Replication)_-_Things_to_keep_in_mind_and_gotchas
Things I'd consider:
- setting AFR option "read-subvolume" to the local volume to improve
read performance
- performance translators
- GlusterFS does not use any encryption

> I plan on using the same server machines to run the fs. My naïve approach is 
> to pick a replication factor (4), generate a bunch of AFR clusters (numNodes 
> / replicationFactor), then use DHT to map my data onto the AFR clusters. That 
> way each file will get mapped onto 4 machines, which should handle the read 
> throughput as well as failures. Does that make sense?
>
> Or would it be better to use AFR over DHT? Or HA in some capacity? Is there 
> any way to achieve rack affinity? It'd be nice if reads were done from the 
> local rack when possible.

AFR has an option called "read-subvolume":
http://www.gluster.org/docs/index.php/Understanding_AFR_Translator#General_options
That might be usable to achieve rack affinity.
Beware that AFR write performance is sometimes limited by network
bandwidth - the file has to be sent to all servers.
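
To make the layout concrete, here is a rough Python sketch (not a GlusterFS
config) of the planning arithmetic: it splits a node list into replica groups
of 4, as in the numNodes / replicationFactor plan, picks a same-rack member of
a group as a candidate for "read-subvolume", and estimates the write ceiling
when the client has to send the file to every replica. The node and rack
names, the helper functions and the 1 Gbit/s link are made up for
illustration; with server-side AFR the bandwidth picture would be different.

```python
from typing import Dict, List, Optional


def make_replica_groups(nodes: List[str], replication_factor: int) -> List[List[str]]:
    """Split nodes into consecutive groups, one group per AFR set."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    # Nodes that do not fill a complete group are left out of the layout.
    usable = len(nodes) - len(nodes) % replication_factor
    return [nodes[i:i + replication_factor]
            for i in range(0, usable, replication_factor)]


def pick_read_subvolume(group: List[str], rack_of: Dict[str, str],
                        client_rack: str) -> Optional[str]:
    """Return the first replica in the client's rack, or None if there is none."""
    for node in group:
        if rack_of.get(node) == client_rack:
            return node
    return None


def write_ceiling_mbit(link_mbit: float, replication_factor: int) -> float:
    """Upper bound on write throughput if the client sends every copy itself."""
    return link_mbit / replication_factor


if __name__ == "__main__":
    # Hypothetical inventory: 10 nodes alternating between two racks.
    nodes = [f"node{i:02d}" for i in range(10)]
    rack_of = {name: f"rack{i % 2}" for i, name in enumerate(nodes)}

    for group in make_replica_groups(nodes, 4):
        print(group, "read from:", pick_read_subvolume(group, rack_of, "rack1"))
    print("write ceiling (Mbit/s):", write_ceiling_mbit(1000, 4))
```

Note that rack affinity only works if every replica group has a member in each
rack, so the order in which nodes are grouped matters.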

> I can imagine all of these setups, but some require more complicated config 
> files than others. Any advice would be appreciated.

1) Start simple, test performance. It might be good enough for your application.
2) Find bottleneck, make change, test performance.
3) If not satisfied and not frustrated to the point of giving up, go to 2)
4) Use the config that gives best performance.

I believe that starting at 1) is very important.
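
For the "test performance" steps, something as small as the following Python
sketch may already be enough to compare two configs. It reads one file
sequentially and reports bytes read and elapsed time. The function names and
the 1 MiB block size are arbitrary choices of mine; the path would be a file
on the mounted GlusterFS volume. Keep in mind that a second run on the same
client may be served from cache and look faster than it really is.

```python
import time
from typing import Tuple


def timed_read(path: str, block_size: int = 1 << 20) -> Tuple[int, float]:
    """Read the whole file sequentially; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def mib_per_second(total_bytes: int, seconds: float) -> float:
    """Convert a (bytes, seconds) measurement into MiB/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes / (1024 * 1024) / seconds
```

Running it from several clients at once against the same file would come
closer to the startup load described in the original question.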


Harald Stürzebecher



