
Re: [Gluster-devel] bug with TLA 313?


From: Brent A Nelson
Subject: Re: [Gluster-devel] bug with TLA 313?
Date: Fri, 20 Jul 2007 17:34:23 -0400 (EDT)

I also get the following in the glusterfs.log:
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror2: (path=/glusterfs/glusterfs-server.vol child=share2-0) op_ret=43 op_errno=2
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1: (path=/glusterfs/beast child=share1-1) op_ret=43 op_errno=2
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror3: (path=/glusterfs/glusterfs-client.vol.sample child=share3-1) op_ret=43 op_errno=2
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror7: (path=/glusterfs/glusterfs-server.vol.sample child=share7-1) op_ret=43 op_errno=2
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror0: (path=/glusterfs/share0 child=share0-0) op_ret=43 op_errno=2
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1: (path=/glusterfs child=share1-1) op_ret=-1 op_errno=61
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror7: (path=/glusterfs2 child=share7-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror5: (path=/glusterfs2 child=share5-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror6: (path=/glusterfs2 child=share6-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror4: (path=/glusterfs2 child=share4-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:574:afr_getxattr_cbk] mirror1: (path=/glusterfs child=share1-1) op_ret=-1 op_errno=61
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror7: (path=/glusterfs2 child=share7-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror5: (path=/glusterfs2 child=share5-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror6: (path=/glusterfs2 child=share6-1) op_ret=-1 op_errno=95
2007-07-20 17:33:04 E [afr.c:513:afr_setxattr_cbk] mirror4: (path=/glusterfs2 child=share4-1) op_ret=-1 op_errno=95
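
For readers following the log excerpts in this thread, the op_errno values can be translated to their Linux errno names. The sketch below hardcodes the four values that appear in the thread (2, 22, 61, 95) with their standard Linux meanings; the table and the `describe` helper are illustrative only and are not part of GlusterFS.

```python
# Linux errno numbers that appear in the log excerpts in this thread.
# Hardcoded for Linux; other platforms number some of these differently.
LINUX_ERRNO = {
    2: ("ENOENT", "No such file or directory"),
    22: ("EINVAL", "Invalid argument"),
    61: ("ENODATA", "No data available"),
    95: ("EOPNOTSUPP", "Operation not supported"),
}


def describe(op_errno):
    """Return a readable description for an op_errno value from the log."""
    name, text = LINUX_ERRNO.get(op_errno, ("UNKNOWN", "not in table"))
    return f"{op_errno} = {name} ({text})"


for code in (2, 22, 61, 95):
    print(describe(code))
```

Read this way, the setxattr failures above (op_errno=95) are "Operation not supported", and the getxattr failures with op_errno=61 are "No data available", which is what Linux reports when the requested attribute is not set.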

Thanks,

Brent

On Fri, 20 Jul 2007, Brent A Nelson wrote:

Copying from a local filesystem to the GlusterFS now works without issue, but copying from the GlusterFS to the GlusterFS still complains. See attached strace.

Note that my local filesystem is not mounted with the acl option, but the underlying mounts that make up my GlusterFS do have the acl mount option.

Thanks,

Brent

PS Are these fixes actually enabling support for ACLs? If they are, that's very cool and well ahead of the roadmap!

On Sat, 21 Jul 2007, Anand Avati wrote:

Brent,
there was a bug in setxattr, with the length being miscalculated by -1 for
binary (non-ASCII) values. Can you please check if your cp goes
through now? I'm very sorry we are unable to test this ourselves, since we don't
have a system which uses POSIX ACLs, though xattrs are now working fine on
binary data (before the fix they worked only for pure ASCII data).
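
One way to exercise binary xattr values without a POSIX ACL setup is a plain round trip in the user.* namespace. The following is a minimal sketch, assuming Python 3 on Linux (where `os.setxattr` and `os.getxattr` are available); the helper names, the attribute names, and the sample values are made up for illustration and are not part of GlusterFS.

```python
import errno
import os
import tempfile


def roundtrip_xattr(path, name, value):
    """Set an xattr, read it back, and report whether the bytes survived.

    Returns True/False for the comparison, or an errno name (a string)
    if the filesystem rejects the call.
    """
    if not hasattr(os, "setxattr"):  # os.setxattr is Linux-only
        return "UNSUPPORTED_PLATFORM"
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name) == value
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


def check_directory(directory=None):
    """Run an ASCII and a binary round trip on a scratch file in `directory`."""
    samples = {
        "user.ascii_test": b"plain ascii",
        # Embedded NUL and 0xff bytes, similar in shape to an ACL value.
        "user.binary_test": b"\x02\x00\x00\x00\xff\xff\x00\x01",
    }
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        return {name: roundtrip_xattr(scratch.name, name, value)
                for name, value in samples.items()}


# directory=None uses the system temp dir; pass a GlusterFS mount point
# to test that filesystem instead.
print(check_directory())
```

A result of True for both entries means the value came back byte for byte. An errno name such as EOPNOTSUPP means the underlying filesystem does not accept user xattrs at all, which says nothing about binary handling.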

thanks,
avati

2007/7/20, Brent A Nelson <address@hidden>:

Nope, it's still there.  Example strace snippet:

setxattr("/beast/glusterfs/beast", "system.posix_acl_access", "\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x04\x00\xff\xff\xff\xff \x00\x04\x00\xff\xff\xff\xff", 28, 0) = -1 EINVAL (Invalid argument)

It presumably should have returned EOPNOTSUPP (Operation not supported),
instead.
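
To see what cp is trying to store in the strace snippet above, the 28-byte value can be decoded with the Linux POSIX ACL xattr layout: a 4-byte little-endian version, followed by 8-byte entries of (tag, permissions, id). This is a sketch under one assumption: the byte at offset 20 is taken to be 0x20 (the ACL_OTHER tag), which strace prints as a space character. The function names are illustrative.

```python
import struct

# Tag values from the Linux POSIX ACL xattr format.
ACL_TAGS = {0x01: "USER_OBJ", 0x02: "USER", 0x04: "GROUP_OBJ",
            0x08: "GROUP", 0x10: "MASK", 0x20: "OTHER"}

# The value from the strace snippet. ASSUMPTION: offset 20 is 0x20.
ACL_BLOB = (b"\x02\x00\x00\x00"
            b"\x01\x00\x06\x00\xff\xff\xff\xff"
            b"\x04\x00\x04\x00\xff\xff\xff\xff"
            b"\x20\x00\x04\x00\xff\xff\xff\xff")


def perm_str(perm):
    """Render the low three permission bits as rwx-style text."""
    return "".join(c if perm & bit else "-"
                   for c, bit in (("r", 4), ("w", 2), ("x", 1)))


def decode_posix_acl(blob):
    """Decode a system.posix_acl_access value into (version, entries)."""
    (version,) = struct.unpack_from("<I", blob, 0)
    entries = []
    for offset in range(4, len(blob), 8):
        tag, perm, ident = struct.unpack_from("<HHI", blob, offset)
        entries.append((ACL_TAGS.get(tag, hex(tag)), perm_str(perm), ident))
    return version, entries


version, entries = decode_posix_acl(ACL_BLOB)
print("version", version)
for tag, perms, ident in entries:
    print(tag, perms, hex(ident))
```

Under that assumption the value is a minimal ACL (owner rw-, group r--, other r--), that is, just the file's ordinary mode bits expressed as an ACL, which is the kind of value "cp -a" writes when preserving permissions.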

Thanks,

Brent

On Fri, 20 Jul 2007, Anand Avati wrote:

> Brent,
> there was a fix in fuse_setxattr in patch-325. please check if it fixes
> your issue. AFR was only reporting the errno's passing via it.
>
> thanks,
> avati
>
> 2007/7/20, Brent A Nelson <address@hidden>:
>>
>> I should point out that this was with the full (AFR/unify) setup, not the
>> stripped-down setup.  I also get a lot of messages such as the following
>> in /var/log/glusterfs/glusterfs.log:
>> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror4: (path=/usr0 child=share4-0) op_ret=-1 op_errno=22
>> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror0: (path=/usr0 child=share0-0) op_ret=-1 op_errno=22
>> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6: (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6: (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7: (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>>
>> Thanks,
>>
>> Brent
>>
>> On Thu, 19 Jul 2007, Brent A Nelson wrote:
>>
>> > Patch 322 seems to have fixed the stray ls errors, but not the cp -a
>> > complaints.  A "cp -a" strace is attached.
>> >
>> > Thanks,
>> >
>> > Brent
>> >
>> > On Wed, 18 Jul 2007, Brent A Nelson wrote:
>> >
>> >> Aha, it looks like GlusterFS is giving odd/varying error responses to
>> >> queries for ACL information (I assume it should be giving an "operation not
>> >> supported" error).  This must be related to my previously reported problem
>> >> copying from GlusterFS to GlusterFS where it was complaining about
>> >> preserving ACLs for every file copied.
>> >>
>> >> See attached strace.
>> >>
>> >> Thanks,
>> >>
>> >> Brent
>> >>
>> >> PS At least in this simple case where glusterfs is directly mounting a
>> >> storage/posix, NFS reexport works fine. I haven't had a chance to test a
>> >> full setup with recent GlusterFS tlas, but I will once the ACL glitch is
>> >> squashed.
>> >>
>> >> On Wed, 18 Jul 2007, Anand Avati wrote:
>> >>
>> >>> Brent,
>> >>> very interesting diagnosis! is it possible for you to re-create the 'posix
>> >>> only' setup (no server/client) and again do 'strace ls -ial /beast'? we are
>> >>> not able to reproduce this error at our setup.
>> >>>
>> >>> thanks
>> >>> avati
>> >>>
>> >>> 2007/7/17, Brent A Nelson <address@hidden>:
>> >>>>
>> >>>> Just a quick note that this doesn't seem to be any sort of corruption
>> >>>> issue.  I completely emptied all my shares (even removing lost+found) and
>> >>>> my namespace and rsynced the corresponding AFR shares and namespace.  The
>> >>>> only thing different between the AFRs would be ctimes.
>> >>>>
>> >>>> I restarted everything, and did:
>> >>>> ls -al /beast
>> >>>> ls: /beast: File exists
>> >>>> ls: /beast/.: File exists
>> >>>> total 8
>> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
>> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
>> >>>>
>> >>>> I also tried disabling readahead and writebehind (my only performance
>> >>>> translators).  It didn't help.  Changing the unify from alu to rr also
>> >>>> didn't help.
>> >>>>
>> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n mirror0 /beast" to
>> >>>> mount a single AFR, no unify.  It STILL produces the same messages.
>> >>>>
>> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n share0-0 /beast" to
>> >>>> mount a simple, single share used as half of an AFR.  Same issue.
>> >>>>
>> >>>> I then stripped down a server to serve out one single storage/posix share,
>> >>>> with no posix locks (I wasn't using any other translators on the server
>> >>>> side, apart from protocol/server, of course).  I mounted that share as in
>> >>>> the previous attempt.  No difference!
>> >>>>
>> >>>> So, this issue occurs even with just protocol/client, protocol/server, and
>> >>>> storage/posix in use.  As barebones as you can get.  Almost.
>> >>>>
>> >>>> One more try.  No glusterfsd, and glusterfs accesses a single
>> >>>> storage/posix directly:
>> >>>>
>> >>>> ls -al /beast
>> >>>> ls: /beast: File exists
>> >>>> ls: /beast/.: File exists
>> >>>> total 8
>> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
>> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
>> >>>>
>> >>>> No difference, even with just glusterfs directly accessing a single, local
>> >>>> storage/posix, with no other translators.  Spec is simply:
>> >>>>
>> >>>> volume share0
>> >>>>    type storage/posix                   # POSIX FS translator
>> >>>>    option directory /share0             # Export this directory
>> >>>> end-volume
>> >>>>
>> >>>> Ubuntu Feisty, Fuse 2.6.3.
>> >>>>
>> >>>> Any ideas?
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Brent
>> >>>>
>> >>>>
>> >>>> On Sat, 14 Jul 2007, Brent A Nelson wrote:
>> >>>>
>> >>>> > It's the same spec I was using previously (AFRed namespace cache, unified
>> >>>> > AFRs spread across four servers, posix-locks, readahead, and writebehind).
>> >>>> > It's not just the top-level directory; it's everywhere.
>> >>>> >
>> >>>> > Thanks,
>> >>>> >
>> >>>> > Brent
>> >>>> >
>> >>>> > On Sat, 14 Jul 2007, Anand Avati wrote:
>> >>>> >
>> >>>> >> Brent,
>> >>>> >> this is strange, we are having patch-313 work pretty smooth so far. are
>> >>>> >> there any changes in your spec? is this behaviour seen only in this
>> >>>> >> particular directory or 'anywhere' in general? please attach your spec so
>> >>>> >> that we can try to reproduce it in our labs.
>> >>>> >>
>> >>>> >> thanks,
>> >>>> >> avati
>> >>>> >>
>> >>>> >> 2007/7/14, Brent A Nelson <address@hidden>:
>> >>>> >>>
>> >>>> >>> Updating to the latest TLA patch, I got odd issues just with "ls":
>> >>>> >>>
>> >>>> >>> Example:
>> >>>> >>>
>> >>>> >>> ls -al /beast/
>> >>>> >>> ls: /beast/: No such file or directory
>> >>>> >>> ls: /beast/.: No such file or directory
>> >>>> >>> ls: /beast/lost+found: No such file or directory
>> >>>> >>> ls: /beast/usr0: No such file or directory
>> >>>> >>> ls: /beast/usr: No such file or directory
>> >>>> >>> total 32
>> >>>> >>> drwxr-xr-x  5 root root  4096 2007-07-13 16:18 .
>> >>>> >>> drwxr-xr-x 27 root root  4096 2007-06-25 18:34 ..
>> >>>> >>> drwx------  2 root root 16384 2007-06-25 17:08 lost+found
>> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr
>> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr0
>> >>>> >>>
>> >>>> >>> I have one machine that is no longer returning from an "ls".  I get other
>> >>>> >>> messages sometimes, not just "No such file or directory", but also "Bad
>> >>>> >>> file descriptor" or even "File exists".  These extraneous messages are
>> >>>> >>> also occurring when copying from the GlusterFS to the GlusterFS.  The
>> >>>> >>> files and directories mentioned do, in fact, exist, no matter what the
>> >>>> >>> extraneous error message says.
>> >>>> >>>
>> >>>> >>> Is there a known issue with the current patchset?
>> >>>> >>>
>> >>>> >>> Thanks,
>> >>>> >>>
>> >>>> >>> Brent
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> _______________________________________________
>> >>>> >>> Gluster-devel mailing list
>> >>>> >>> address@hidden
>> >>>> >>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>> >>>> >>>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> --
>> >>>> >> Anand V. Avati
>> >>>> >>
>> >>>> >
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Anand V. Avati
>> >
>>
>
>
>
> --
> Anand V. Avati
>




--
Anand V. Avati




