[Gluster-devel] AFR+locks bug?
From: Székelyi Szabolcs
Subject: [Gluster-devel] AFR+locks bug?
Date: Thu, 17 Jan 2008 18:08:22 +0100
User-agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080103)
Hi,
AFR with posix-locks has been behaving really strangely lately. GlusterFS is a
fresh TLA checkout (patch-636), and FUSE is the brand-new 2.7.2-glfs8.
I have 4 servers with a 4-way AFR on each and features/posix-locks
loaded just above storage/posix bricks. On each AFR, one replica is the
local storage, the remaining 3 are on the other 3 servers.
The 4 AFR bricks are mounted on each server from 'localhost'.
The machines are freshly booted. Basic FS functions (ls, copy, cat) work
fine.
Now I run a distributed locking test using [1]. On the "master" locker I
get:
> # /tmp/locktests -n 10 -c 3 -f /mnt/glusterfs/testfile
> Init
> process initalization
> ....................
> --------------------------------------
>
> TEST : TRY TO WRITE ON A READ LOCK:==========
> TEST : TRY TO WRITE ON A WRITE LOCK:==========
> TEST : TRY TO READ ON A READ LOCK:==========
> TEST : TRY TO READ ON A WRITE LOCK:==========
> TEST : TRY TO SET A READ LOCK ON A READ LOCK:
After about 5 minutes, another
> RDONLY: fcntl: Transport endpoint is not connected
appears, the locking processes exit on all slave servers, and the master
blocks.
The mount point locks up. Even an `ls` from a different terminal seems
to block forever.
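For a quicker check of the fcntl path without the full locktests setup, a minimal single-process sketch like the one below may help. It is not what locktests does internally, only an illustration of one non-blocking POSIX lock request; the function name `try_lock` and the returned status strings are made up for this example. On Linux, ENOTCONN is the errno behind the "Transport endpoint is not connected" message.

```python
import errno
import fcntl
import os


def try_lock(path, lock_type=fcntl.LOCK_SH):
    """Request a non-blocking fcntl (POSIX) lock on the whole file.

    Hypothetical helper for illustration only. Returns "ok", "busy",
    or "not connected"; any other error is re-raised.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # lockf() issues an fcntl() record lock; LOCK_NB avoids blocking.
        fcntl.lockf(fd, lock_type | fcntl.LOCK_NB)
    except OSError as e:
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return "busy"           # another process holds a conflicting lock
        if e.errno == errno.ENOTCONN:
            return "not connected"  # the error reported by locktests above
        raise
    else:
        fcntl.lockf(fd, fcntl.LOCK_UN)  # release the lock again
        return "ok"
    finally:
        os.close(fd)
```

Calling `try_lock("/mnt/glusterfs/testfile")` for a read lock, or passing `fcntl.LOCK_EX` for a write lock, on each server in turn should show whether a single lock request already fails or only the concurrent case does.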
You can find my server config below. Client configs are simple, just a
protocol/client brick from localhost. I can provide server debug logs if
you need.
Any ideas?
Thanks,
--
Szabolcs
[1] http://nfsv4.bullopensource.org/tools/tests_tools/locktests-net.tar.gz
My server config (from a single node, lu1):
volume data-posix
  type storage/posix
  option directory /srv/glusterfs
end-volume
volume data1
  type features/posix-locks
  subvolumes data-posix
end-volume
volume data2
  type protocol/client
  option transport-type tcp/client
  option remote-host lu2
  option remote-subvolume data2
end-volume
volume data3
  type protocol/client
  option transport-type tcp/client
  option remote-host lu3
  option remote-subvolume data3
end-volume
volume data4
  type protocol/client
  option transport-type tcp/client
  option remote-host lu4
  option remote-subvolume data4
end-volume
volume data-afr
  type cluster/afr
  subvolumes data1 data2 data3 data4
end-volume
volume server
  type protocol/server
  subvolumes data1 data-afr
  option transport-type tcp/server
  option auth.ip.data1.allow 10.0.0.*
  option auth.ip.data-afr.allow 127.0.0.1,10.0.0.*
end-volume