coreutils

Re: false-positive failure of the root-removal test


From: Jim Meyering
Subject: Re: false-positive failure of the root-removal test
Date: Wed, 14 Oct 2015 11:40:01 -0700

On Wed, Oct 14, 2015 at 10:43 AM, Jim Meyering <address@hidden> wrote:
> Running a massively parallel "make very-expensive-check"
> (-j73 on a 48-core system), the rm/r-root.sh test would fail
> about 1-in-2 or 1-in-3 trials due to expiration of the 2-second
> timeout here:
>
> diff --git a/tests/rm/r-root.sh b/tests/rm/r-root.sh
> index c06332a..4e645e6 100755
> --- a/tests/rm/r-root.sh
> +++ b/tests/rm/r-root.sh
> @@ -88,7 +88,7 @@ exercise_rm_r_root ()
>      skip_exit='CU_TEST_SKIP_EXIT=1'
>    fi
>
> -  timeout --signal=KILL 2 \
> +  timeout --signal=KILL 5 \
>      env LD_PRELOAD=$LD_PRELOAD:./k.so $skip_exit \
>        rm -rv --one-file-system "$@" > out 2> err
>
> I made the above change and observed that the whole test then
> succeeded 6 times in a row. Then I read the comment above that change:
>
> # exercise_rm_r_root: shell function to test "rm -r '/'"
> # The caller must provide the FILE to remove as well as any options
> # which should be passed to 'rm'.
> # Paranoia mode on:
> # For the worst case where both rm(1) would fail to refuse to process the "/"
> # argument (in the cases without the --no-preserve-root option), and
> # intercepting the unlinkat(1) system call would fail (which actually already
> # has been proven to work above), and the current non root user has
> # write access to "/", limit the damage to the current file system via
> # the --one-file-system option.
> # Furthermore, run rm(1) via timeout(1) that kills that process after
> # a maximum of 2 seconds.
>
> So maybe compromise at 3 seconds (with that, it's passed 4 times so far)?
> Probably better still: I'll remember this and decrease -j's argument from
> 1+3N/2 to something slightly less abusive.
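
The pass counts quoted above ("6 times in a row", "4 times so far") come from re-running the test by hand. A minimal sketch of automating that follows; the run_trials function is a hypothetical helper, not part of the coreutils test suite, and it only counts non-zero exit statuses:

```shell
# Hypothetical helper (not part of coreutils): run a command N times
# and print how many of those runs exited with a non-zero status.
run_trials() {
  n=$1; shift
  failures=0
  i=0
  while [ "$i" -lt "$n" ]; do
    # Discard the command's output; only its exit status matters here.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, something like "run_trials 6 make check TESTS=tests/rm/r-root.sh" would report how many of six runs failed, assuming the usual TESTS= override of the test harness. Note that a lone run does not reproduce the load of a full parallel check, which is what triggers the timeout here.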

FYI, while trying to confirm that a 3-second timeout is sufficient, I hit
another failure, this time in a different race-susceptible test:

+ diff -u exp out
--- exp 2015-10-14 11:26:05.424685178 -0700
+++ out 2015-10-14 11:26:05.424685178 -0700
@@ -1 +0,0 @@
-line
+ fail=1
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ remove_tmp_
+ __st=1
+ cleanup_
+ :
+ cd /data/users/meyering/w/co/cu
+ chmod -R u+rwx /data/users/meyering/w/co/cu/gt-follow-stdin.sh.y0sA
+ rm -rf /data/users/meyering/w/co/cu/gt-follow-stdin.sh.y0sA
+ exit 1
FAIL tests/tail-2/follow-stdin.sh (exit status: 1)

So I'll just remember to use reduced parallelism for this task.
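
For reference, the -j73 figure follows from the 1+3N/2 formula with N=48 cores. A small sketch of that arithmetic is below; the "reduced" value of N+1 is only an illustrative assumption, since the message does not say which lower value would be used:

```shell
# Core count from the message; on a live system nproc could supply it.
cores=48

# Formula quoted in the message: 1 + 3N/2 (integer arithmetic).
aggressive=$((1 + 3 * cores / 2))

# Assumed, illustrative alternative: one job per core plus one.
reduced=$((cores + 1))

echo "aggressive: -j$aggressive"   # prints "aggressive: -j73"
echo "reduced:    -j$reduced"
```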