Re: [Gluster-devel] Handling huge number of file read requests

From: Amrik Singh
Subject: Re: [Gluster-devel] Handling huge number of file read requests
Date: Fri, 04 May 2007 10:55:38 -0400
User-agent: Thunderbird (Windows/20070221)

Sorry, a correction there: these are not 15 million files. I confused this with a different data set.

The data set I am talking about is around 3.5 TB. I do not have the exact number of files here.

sorry for the confusion....


Amrik Singh wrote:
Well, that makes things quite clear. We are using gig/e. All the clients load different files. We have around 15-20 million files as of now. The situation that I described happens only at peak load, which can last anywhere from 1 to 4 hours at a time.

So I realize that we would need to distribute our files across a lot more bricks. Does 18 * 8 mean 144, or is it an expression that I did not get?

thanks a lot...


Anand Avati wrote:
 Some quick math: you have 300 servers, each asking for 20-40 images (avg
30) per second at 2MB each, so your aggregate I/O requirement is 18
GByte/sec. How many servers are you using with glusterfs to distribute
this load? It would be merciless to expect a single server to handle
this kind of load with *any* filesystem. Also, what interconnect are
you using? If you are using gig/e, you need 18 * 8 nodes to handle
this load smoothly (lower number nodes result in such a less factor of

Also tell me the pattern of usage. Do all the clients read different
files or the same file? How many images do you have in total?

Looking forward to your answers
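
The estimate above can be checked with a short calculation. This is only a rough sketch: it assumes decimal units (2 MB = 2,000,000 bytes, gig/e = 10^9 bit/s), one gig/e link per server, every link running at full wire speed with no protocol overhead, and load spread evenly across servers. The function names are made up for illustration. Under those assumptions, 18 GByte/sec is 18 * 8 = 144 Gbit/sec, which is where the figure of 144 gig/e nodes comes from.

```python
def aggregate_demand_bytes_per_sec(clients, images_per_sec, image_bytes):
    """Total read demand if every client requests at the given rate."""
    return clients * images_per_sec * image_bytes


def links_needed(demand_bytes_per_sec, link_bits_per_sec):
    """Minimum number of links, assuming each link is fully saturated."""
    demand_bits = demand_bytes_per_sec * 8  # bytes -> bits
    # Integer ceiling division, so a partial link counts as a whole one.
    return (demand_bits + link_bits_per_sec - 1) // link_bits_per_sec


CLIENTS = 300
IMAGE_BYTES = 2 * 10**6   # 2 MB per image (decimal units, an assumption)
GIGE_BITS = 10**9         # 1 Gbit/s per server link (an assumption)

for rate in (20, 30, 40):  # low, average, and high request rates
    demand = aggregate_demand_bytes_per_sec(CLIENTS, rate, IMAGE_BYTES)
    print(rate, "images/s:", demand // 10**9, "GB/s,",
          links_needed(demand, GIGE_BITS), "gig/e links")
```

Real servers will not sustain full wire speed, so the actual number of bricks needed would likely be higher than this lower bound.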


On 5/4/07, Amrik Singh <address@hidden> wrote:
Hi Guys,

We are hoping that glusterfs would help us in the particular problem
that we are facing with our cluster. We have a visual search application
that runs on a cluster with around 300 processors. These compute nodes
run a search for images that are hosted on an NFS server. In certain
circumstances all these compute nodes are sending requests for query
images at extremely high rates (20-40 images per second). When 300 nodes
send 20-40 requests per second for these images, the NFS server just
can't cope with it and we start seeing a lot of retransmissions and a
very high wait time on the server as well as on the nodes. The images
are sized at around 2MB each.

With the current application we are not in a position where we can
quickly change the way things are being done so we are looking for a
file system that can handle this kind of situation. We tried glusterfs
with the default settings but we did not see any improvement. Is there a
way to tune glusterfs to handle this kind of situation?

I can provide more details about our setup as needed.


Amrik Singh
Idée Inc.
