From: Pádraig Brady
Subject: bug#9737: misc/timeout-group: spurious test failure on SLES 10.3 (coreutils 8.14)
Date: Thu, 03 Nov 2011 02:11:27 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

On 10/13/2011 11:27 PM, Voelker, Bernhard wrote:
> Pádraig Brady wrote:
>
>> On 10/13/2011 04:58 PM, Voelker, Bernhard wrote:
>>> reopen 9737
>>> thanks
>>>
>>> Pádraig Brady wrote:
>>>
>>>> Bah, this is just a racy test I think.
>>>> Hopefully the attached fixes it.
>>>
>>> Thank you for the patch.
>>>
>>> I tried it 16 times:
>>>
>>> * 14x PASS, execution time real < 0.4s
>>>
>>> * 1x test failure (in the 5th run)
>>
>> So the command exited without receiving SIGINT.
>> Or perhaps the touch of the 'received.int' file
>> is being done asynch. Anything special about your
>> file system?
>
> It's a virtual host on a ESX server farm in our data center.
>
> address@hidden:~/berny/depot/coreutils-8.14/tests> uname -a
> Linux mchp320a 2.6.16.60-0.74.7-smp #1 SMP Fri Nov 26 09:16:10 UTC 2010
> x86_64 x86_64 x86_64 GNU/Linux
>
> address@hidden:~/berny/depot/coreutils-8.14/tests> df -h .
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/vg01-lvol0
> 50G 15G 33G 31% /user
>
> address@hidden:~/berny/depot/coreutils-8.14/tests> mount | grep /user
> /dev/mapper/vg01-lvol0 on /user type ext3 (rw,acl,user_xattr)
>
>>> * 1x the test lasted 20s (in the 16th run)
>>
>> But this one passed, which means the command
>> did receive the SIGINT, but then didn't exit?
>
> Sounds like one error is shadowing another.
>
>> I'm confused, sorry,
>> Pádraig.
>
> That's strange, indeed.
>
> I repeated the test with < 0.2 load 100 times:
> the run #5, #18, #28, #53, #58 and #71 resulted in FAIL as above,
> and the run #24 and #25 PASSed but took 20 seconds,
> all other PASSed within <=0.3s.

I reproduced this weirdness on openSUSE 10.3 in a VM,
though much less frequently:

  Delays: 10 out of 2750 runs
  Signal handler call failure: 1 out of 2750 runs

The delays might be due to bash, but I updated
to 4.2 and the issue still persists.
I suspect kernel issues too.

Anyway, I've attached two patches to replace the previous one.
The first hopefully addresses any races in the test.
I don't think you hit any of these TBH.
The second should detect the signal issues and skip the test.
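
For illustration only, here is a rough sketch of the kind of check the second
patch is after. It is not taken from the attached diffs: the function name
probe_signal, the marker file location, and the use of a child "sh -c" are all
assumptions. The idea is to install a trap, send the signal to the same shell,
and see whether the handler actually ran (mirroring the 'received.int' marker
used by the test).

```shell
#!/bin/sh
# Hypothetical probe, NOT the code from 2-timeout-skips.diff.
# probe_signal SIG: return 0 if a trap handler for SIG runs in a child shell.
probe_signal() {
  sig=$1
  marker=${TMPDIR:-/tmp}/received.$sig.$$   # assumed marker location
  rm -f "$marker"

  # The child shell traps SIG, signals itself, then runs one more command
  # so the pending trap action gets a chance to execute.
  MARKER=$marker sh -c '
    trap "touch \"\$MARKER\"" "$0"
    kill -s "$0" $$
    :
  ' "$sig"

  if test -f "$marker"; then
    rm -f "$marker"
    return 0          # handler ran: signal delivery looks sane
  fi
  return 1            # handler did not run: the caller should skip the test
}

# Example use: skip rather than fail when the handler is not called.
if probe_signal INT; then
  echo "INT handler ran; running the test makes sense"
else
  echo "skipping: INT handler was not called" >&2
fi
```

Note that this only catches a handler that never runs; it does not detect the
20 second delays, which would need a timed check of their own.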
cheers,
Pádraig.

Attachments:
  1-timeout-races.diff (Description: Text document)
  2-timeout-skips.diff (Description: Text document)