Re: [Gluster-devel] servers keep hanging with fuse errors

Date: Wed, 30 Jan 2008 11:31:14 +0100
Yes, everything is up to date. I did a fresh install on a brand-new etch virtual machine and then installed gluster from a development repository with "apt-get install".

These are all the packages I have that relate to either gluster or fuse:

ii  fuse-utils        2.7.2-glfs8   Filesystem in USErspace (utilities)
ii  glusterfs-client  1.3.8-TLA643  GlusterFS fuse client
ii  libfuse2          2.7.2-glfs8   Filesystem in USErspace library
ii  libglusterfs0     1.3.8-TLA643  GlusterFS libraries and translator modules

Do you think I should install from the source code with ./configure instead of using apt-get install? Maybe I can recompile now? I don't know whether the packages in the repository were built with "--enable-kernel-module" or not.
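
Going by their descriptions, the four packages listed above appear to be userspace utilities and libraries, so the list by itself does not show which fuse kernel module is in use. A minimal sketch for gathering that information on a Debian system follows. The function name list_fuse_gluster_pkgs is made up for illustration; dpkg -l and modinfo are the standard tools, and the exact output will differ from system to system.

```shell
# Hypothetical helper: print "name version" for installed (state "ii")
# packages whose name contains "fuse" or "gluster".
# It reads `dpkg -l` output on stdin, so it also works on saved output.
list_fuse_gluster_pkgs() {
    awk '$1 == "ii" && ($2 ~ /fuse/ || $2 ~ /gluster/) { print $2, $3 }'
}

# Typical use (commented out because the result depends on the machine):
#   dpkg -l | list_fuse_gluster_pkgs
#   modinfo fuse   # filename and metadata of the fuse module modprobe would load
```

If the file reported by modinfo comes from the stock kernel rather than from a 2.7.2-glfs8 build, the kernel side may be older than the userspace packages.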

Anand Avati wrote:
 Is the fuse kernel module from 2.7.2-glfs8 as well? (Compiled with
./configure --enable-kernel-module and make install inside the kernel/ subdir.)
The kernel panic you've pasted is a well-known issue with older fuse kernel
modules, and it seems to me that your kernel module is still the old one.


2008/1/30, Jordi Moles <address@hidden>:

I'm running these versions of the packages now:

ii  fuse-utils        2.7.2-glfs8   Filesystem in USErspace
ii  glusterfs-client  1.3.8-TLA643  GlusterFS fuse client
ii  libfuse2          2.7.2-glfs8   Filesystem in USErspace library

I also removed every lock method from dovecot.conf and moved locking
into the filesystem itself, on the server side, with the
"type features/posix-locks" option.
The thing is that now the dovecot machines work great, but the postfix
machines hang very often, with the following error:


Unable to handle kernel paging request at 0000000000100108 RIP:
[<ffffffff88020838>] :fuse:request_end+0x45/0x109
PGD 1f29c067 PUD 1f327067 PMD 0
Oops: 0002 [1] SMP
Modules linked in: ipv6 fuse dm_snapshot dm_mirror dm_mod
Pid: 723, comm: glusterfs Not tainted 2.6.18-xen #1
RIP: e030:[<ffffffff88020838>]  [<ffffffff88020838>]
RSP: e02b:ffff88001ecb1d68  EFLAGS: 00010246
RAX: 0000000000200200 RBX: ffff88001e82af48 RCX: ffff88001e82af58
RDX: 0000000000100100 RSI: ffff88001e82af48 RDI: ffff88001f68d400
RBP: ffff88001f68d400 R08: 000000001f74ab40 R09: ffff88001e82b048
R10: 0000000000000008 R11: ffff88001ecb1cf0 R12: 0000000000000000
R13: ffff88001e82af80 R14: ffff88001ecb1df8 R15: 0000000000000001
FS:  00002ba3e961aae0(0063) GS:ffffffff804cd000(0000)
CS:  e033 DS: 0000 ES: 0000
Process glusterfs (pid: 723, threadinfo ffff88001ecb0000, task
Stack:  ffff88001e82af48 ffff88001f68d400 00000000fffffffe
ffff88001ecb1ef8 000000301fcb8180 00000156fffffff4 ffff88001f72f100
0000000000000015 0000000000000000 ffff88001ecb1e18 0000000041000a90
Call Trace:
[<ffffffff88021056>] :fuse:fuse_dev_readv+0x385/0x435
[<ffffffff802801d3>] do_readv_writev+0x271/0x294
[<ffffffff802274c7>] default_wake_function+0x0/0xe
[<ffffffff88021120>] :fuse:fuse_dev_read+0x1a/0x1f
[<ffffffff802804bc>] vfs_read+0xcb/0x171
[<ffffffff8028089b>] sys_read+0x45/0x6e
[<ffffffff8020a436>] system_call+0x86/0x8b
[<ffffffff8020a3b0>] system_call+0x0/0x8b

Code: 48 89 42 08 48 89 10 48 c7 41 08 00 02 20 00 f6 46 30 08 48
RIP  [<ffffffff88020838>] :fuse:request_end+0x45/0x109
RSP <ffff88001ecb1d68>
CR2: 0000000000100108
  postfix01gluster01 kernel: Oops: 0002 [1] SMP
kernel: CR2: 0000000000100108

<3>BUG: soft lockup detected on CPU#0!

Call Trace:
<IRQ> [<ffffffff80257f78>] softlockup_tick+0xd8/0xea
[<ffffffff8020f110>] timer_interrupt+0x3a9/0x405
[<ffffffff80258264>] handle_IRQ_event+0x4e/0x96
[<ffffffff80258350>] __do_IRQ+0xa4/0x105
[<ffffffff8020b0e8>] call_softirq+0x1c/0x28
[<ffffffff8020cecb>] do_IRQ+0x65/0x73
[<ffffffff8034a8c1>] evtchn_do_upcall+0xac/0x12d
[<ffffffff8020ac1e>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff803f1234>] .text.lock.spinlock+0x2/0x8a
[<ffffffff88020a4f>] :fuse:fuse_dev_writev+0xb8/0x31b
[<ffffffff88020cb2>] :fuse:fuse_dev_write+0x0/0x1f
[<ffffffff802800d7>] do_readv_writev+0x175/0x294
[<ffffffff88020cb2>] :fuse:fuse_dev_write+0x0/0x1f
[<ffffffff803efb3b>] schedule_timeout+0x1e/0xad
[<ffffffff803f0976>] __down_read+0x12/0xec
[<ffffffff80280695>] sys_writev+0x45/0x93
[<ffffffff8020a436>] system_call+0x86/0x8b
[<ffffffff8020a3b0>] system_call+0x0/0x8b
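
One way to read the first oops above, offered as a sketch rather than a diagnosis: the faulting address in CR2 sits a few bytes past 0x00100100, and RDX and RAX hold 0x00100100 and 0x00200200. Those are the values the Linux kernel defines as LIST_POISON1 and LIST_POISON2 in include/linux/poison.h and writes into a list entry once it has been deleted. The pattern is therefore consistent with a list entry being unlinked a second time inside request_end. The arithmetic can be checked in a shell; the variable and function names below are made up for illustration.

```shell
# Poison values as defined in the kernel's include/linux/poison.h.
LIST_POISON1=0x00100100
LIST_POISON2=0x00200200

# Values copied from the first oops in this message.
CR2=0x0000000000100108
RDX=0x0000000000100100
RAX=0x0000000000200200

# Distance of the faulting address from LIST_POISON1, in bytes.
poison_offset() {
    echo $(( $CR2 - $LIST_POISON1 ))
}

# Prints "yes" when RDX and RAX hold the two poison values.
regs_are_poison() {
    if [ $(( $RDX )) -eq $(( $LIST_POISON1 )) ] && [ $(( $RAX )) -eq $(( $LIST_POISON2 )) ]; then
        echo yes
    else
        echo no
    fi
}

poison_offset     # prints 8
regs_are_poison   # prints yes
```

An offset of 8 matches the position of the second pointer in a two-pointer list head on a 64-bit kernel, which fits the write fault (error code 0002) at that address.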


I set the log level for glusterfs on both the server nodes and the clients to WARNING.
However, nothing was written to any of the logs :( .

I keep using virtual machines with Xen 3.1 to test all this, but I
have already tried non-virtual environments and got the same errors.

Do you have any ideas?

Jordi Moles Blanco:

Thanks for the details. I'll give it a try and get back to you to say
whether it has become stable or not.

Thank you very much.

2008/1/29, Anand Avati:

 You should really upgrade your fuse kernel module, which will fix this
issue. Please use the kernel module from -


2008/1/28, Jordi Moles <address@hidden>:


        I'm sorry, but I can't get any newer version from the
        repositories you gave me. After apt-update, apt-upgrade says
        there's nothing to
        And if I try, for example, to get the source code of every
        package, I get this

        fuse_2.5.3-4.4, which is even older than the one installed on
        my system.

