From: Mark Mielke
Subject: Re: [Gluster-devel] Problem with autofs configuration - sometimes mount does not complete fast enough?
Date: Sun, 06 Sep 2009 23:59:31 -0400
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20090814 Fedora/3.0-2.6.b3.fc11 Thunderbird/3.0b3

On 09/06/2009 11:19 PM, Mark Mielke wrote:
On 09/06/2009 10:42 PM, Mark Mielke wrote:
This seems to happen about 50% of the time:

address@hidden ~]# ls /gluster/data
ls: cannot open directory /gluster/data: No such file or directory
address@hidden ~]# ls /gluster/data
00  15  32  47  64  07  24  41  56

My current guess is that GlusterFS tells AutoFS the mount is complete before the actual mount operation takes effect. About 50% of the time GlusterFS completes the mount before AutoFS lets the user continue, and all is well. The other 50% of the time, GlusterFS does not quite finish the mount, and AutoFS gives the user a broken directory.

I might try and prove this by adding a sleep 5 to /sbin/mount.glusterfs, although I do not consider this a valid solution, as it just reduces the effect of the race - it does not eliminate the race.

Uhh... Hmm... It already has a "sleep 3", and changing it to "sleep 5" does not reduce the frequency of the problem. Changing it to "sleep 10" also has no effect.

Why does it sometimes work and sometimes not?

I note that the fusermount from the FUSE libraries does not seem to have the same problem:

$ /stage/linux/fuse-2.7.4/example/fusexmp_fh /tmp/t ; ls /tmp/t
backup/ boot/ etc/ lib64/ media/ pccyber/ sbin/ stage/ usr/ backup2/ db/ home/ lost+found/ mnt/ proc/ selinux/ sys/ var/ bin/ dev/ lib/ mail/ opt/ root/ srv/ tmp/ www/

It works immediately. Compare this to:

address@hidden echo hi >/tmp/t/hi
address@hidden time /opt/glusterfs/sbin/glusterfs --volfile=/etc/glusterfs/gluster-data.vol /tmp/t ; ls /tmp/t ; sleep 1 ; ls /tmp/t
/opt/glusterfs/sbin/glusterfs --volfile=/etc/glusterfs/gluster-data.vol /tmp/ 0.00s user 0.00s system 113% cpu 0.003 total
00  15  32  47  64  07  24  41  56
01  16  33  50  65  10  25  42  57
02  17  34  51  66  11  26  43  60
03  20  35  52  67  12  27  44  61
04  21  36  53  lost+found  13  30  45  62
05  22  37  54  14  31  46  63
06  23  40  55

Note that the first 'ls' returns 'hi', and a second later, 'ls' returns the glusterfs content.

fusexmp appears to complete the mount before the command returns. glusterfs seems to complete the mount a short time after the command returns.

I think this is where autofs gets confused and hands the directory handle to the client too early. It thinks glusterfs is done mounting and gives the handle to the client, but the handle is broken and fails. Glusterfs completes the mount, and a short time later the lookups succeed. Adding a 'sleep' in mount.glusterfs does not seem to be good enough, as 'sleep 1' and 'sleep 20' do not change the frequency. The existing 'sleep 3' in /sbin/mount.glusterfs should be completely unnecessary. Instead, we should figure out why GlusterFS cannot ensure the mount is in place before it returns.
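One way to test the "returns before the mount is in place" theory without guessing at sleep values is to poll the kernel's mount table after glusterfs returns. This is only a rough sketch: wait_for_mount and MOUNTS_FILE are names made up for illustration, nothing here is taken from the real /sbin/mount.glusterfs, and it assumes a Linux /proc/mounts. Since fixed sleeps did not change the failure rate, this may turn out to be a diagnostic rather than a fix.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS): poll a mount table until
# the given mount point appears, or give up after a timeout in seconds.
# MOUNTS_FILE defaults to /proc/mounts (Linux); override it for testing.
wait_for_mount() {
    mountpoint=$1
    timeout=${2:-10}
    mounts_file=${MOUNTS_FILE:-/proc/mounts}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Field 2 of each /proc/mounts line is the mount point.
        if awk -v mp="$mountpoint" \
            '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
        then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For example, running the glusterfs command from above followed by `wait_for_mount /tmp/t 10 && ls /tmp/t` would show whether the listing is correct as soon as the mount point is visible in the mount table, or whether something else is still pending after that.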

I'm worn out investigating for today - hopefully somebody can help me? :-)


Mark Mielke <address@hidden>
