Re: fftwf tests running for 25+ hours - is this normal?
From: Chris Marusich
Subject: Re: fftwf tests running for 25+ hours - is this normal?
Date: Tue, 15 Oct 2019 11:32:38 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Hi Bengt and Matteo,
Bengt Richter <address@hidden> writes:
> Have you checked sensors for overheating that might induce CPU clock
> throttling?
Actually, yes, I happened to be watching dmesg output at the time, and I
did notice these messages (no similar messages have been printed since
then, which was many hours ago):
--8<---------------cut here---------------start------------->8---
[180270.045081] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 124733)
[180270.045567] mce: CPU1: Core temperature/speed normal
[180570.044352] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 134902)
[180570.044838] mce: CPU1: Core temperature/speed normal
[180875.663432] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 143897)
[180875.663918] mce: CPU1: Core temperature/speed normal
[181175.748616] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 153503)
[181175.749103] mce: CPU1: Core temperature/speed normal
[181476.496915] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 171377)
[181476.497401] mce: CPU1: Core temperature/speed normal
[181776.914264] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 185828)
[181776.914751] mce: CPU1: Core temperature/speed normal
[182076.914391] mce: CPU1: Core temperature/speed normal
[182377.322112] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 231578)
[182377.322598] mce: CPU1: Core temperature/speed normal
--8<---------------cut here---------------end--------------->8---
Is this throttling permanent, or is the throttling released after the
temperature returns to normal?
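The timestamps in the log give a partial answer. Below is a minimal Python sketch that pairs each "above threshold" line with the next "normal" line on the same CPU and measures the time between them. The function names (parse, throttle_spans) are made up for this illustration, and only the first two pairs from the log are included. For these lines, "normal" is logged roughly 0.5 ms after each throttle message, and consecutive throttle messages are about 300 seconds apart. Note that the "total events" counter grows by thousands between logged lines, so the log probably does not show every throttle event; that reading is an assumption, not something the log states.

```python
import re

# Two throttle/normal pairs copied from the dmesg excerpt above,
# with the wrapped lines rejoined.
LOG = """\
[180270.045081] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 124733)
[180270.045567] mce: CPU1: Core temperature/speed normal
[180570.044352] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 134902)
[180570.044838] mce: CPU1: Core temperature/speed normal
"""

# Expected shape: "[<seconds since boot>] mce: CPU<n>: <message>"
LINE = re.compile(r"\[\s*(\d+\.\d+)\] mce: CPU(\d+): (.+)")


def parse(log):
    """Return (timestamp_seconds, cpu_number, message) for each mce line."""
    events = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        if m:
            events.append((float(m.group(1)), int(m.group(2)), m.group(3)))
    return events


def throttle_spans(events):
    """Pair each 'above threshold' line with the next 'normal' line on the
    same CPU. Returns (start_timestamp, seconds_until_normal) tuples.
    A 'normal' line with no preceding throttle line is ignored."""
    pending = {}
    spans = []
    for ts, cpu, msg in events:
        if "above threshold" in msg:
            pending[cpu] = ts
        elif "normal" in msg and cpu in pending:
            start = pending.pop(cpu)
            spans.append((start, ts - start))
    return spans


spans = throttle_spans(parse(LOG))
for start, length in spans:
    print(f"throttled at {start:.6f}s, normal again after {length * 1000:.3f} ms")

# Time between consecutive logged throttle messages.
gaps = [b[0] - a[0] for a, b in zip(spans, spans[1:])]
print("seconds between logged throttle messages:", [round(g, 3) for g in gaps])
```

The same functions can be run over the full dmesg output by reading it from a file instead of the LOG string.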
Matteo Frigo <address@hidden> writes:
> thanks for reporting this issue. (By the way, did the test ever
> finish?)
It's still running. It's been 38 hours now.
> Most likely, the problem is in FFTW and not in your machine. Would you
> mind reporting the exact configure options (I care specifically about
> --enable-sse2, --enable-avx, etc.) and the exact command line that is
> taking 25h?
Both --enable-sse2 and --enable-avx were included in the configure
invocation; the full command line from config.log is further down.
> The issue is probably related to the funny integer 34320. The FFT
> algorithm depends on the factorization of the size, 34320 in this case,
> which is 34320=2*2*2*2*3*5*11*13. FFTW tries to find the optimal way to
> decompose 34320 into small problems that it can solve. For example, it
> may reduce a DFT of size 34320 into 34320/13 FFTs of size 13 followed by
> 13 FFTs of size 34320/13. Or it may reduce a DFT of size 34320 into
> 34320/11 FFTs of size 11 followed by 11 FFTs of size 34320/11. Because
> there are many distinct prime factors, the number of factorizations is
> really large and FFTW is pretty much trying them all. The problem is
> even harder once you consider the fact that some sizes are odd, and thus
> they don't naturally match the vector length of SSE/AVX instructions.
>
> FFTW has some heuristics to limit the search space in cases like this,
> but these heuristics were implemented in a simpler world (before AVX)
> and maybe they are not adequate any longer, so I want to look into this.
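To give a rough feel for the size of the search space described above, here is a minimal Python sketch. It factors 34320 and counts the ordered ways to write a number as a product of factors greater than 1. This count is only a crude proxy chosen for illustration: it is not FFTW's planner, which also weighs different algorithms and SIMD variants and prunes with its own heuristics. All function names are hypothetical.

```python
from functools import lru_cache


def prime_factors(n):
    """Return the prime factors of n in ascending order, with repetition."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def divisors(n):
    """Return all positive divisors of n (brute force; fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


@lru_cache(maxsize=None)
def ordered_factorizations(n):
    """Count the ways to write n as an ordered product of factors > 1.
    For example, 12 = 12 = 2*6 = 6*2 = 3*4 = 4*3 = 2*2*3 = 2*3*2 = 3*2*2."""
    if n == 1:
        return 1  # the empty product
    # Pick the first factor d, then count factorizations of the remainder.
    return sum(ordered_factorizations(n // d)
               for d in range(2, n + 1) if n % d == 0)


N = 34320
print("prime factors:", prime_factors(N))
print("number of divisors:", len(divisors(N)))
# The two first-level splits mentioned in the quoted message:
print("split by 13:", N // 13, "FFTs of size 13, then 13 FFTs of size", N // 13)
print("split by 11:", N // 11, "FFTs of size 11, then 11 FFTs of size", N // 11)
print("ordered factorizations of 12:", ordered_factorizations(12))
print("ordered factorizations of", N, ":", ordered_factorizations(N))
```

Even this simplified count grows quickly with the number of distinct prime factors, which is consistent with the explanation that a size like 34320 gives the planner many decompositions to consider.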
Thank you for taking a look! Here is how the test is being run.
Output of "pstree -lap 20098":
--8<---------------cut here---------------start------------->8---
make,20098 check -j 2
`-bash,20099 -c fail=; \\\012if (target_option=k; case ${target_option-} in
?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target
option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no;
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; };
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\\\[\\ \\\011]*)
bs=\\\\; sane_makeflags=`printf '%s\\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs
$bs\011]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf
'%s\\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test
$skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*)
continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';;
-*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l)
strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm])
skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*)
has_opt=yes; break;; esac; done; test $has_opt = yes); then \\\012
failcom='fail=yes'; \\\012else \\\012 failcom='exit 1'; \\\012fi;
\\\012dot_seen=no; \\\012target=`echo check-recursive | sed s/-recursive//`;
\\\012case "check-recursive" in \\\012 distclean-* | maintainer-clean-*)
list='support genfft kernel simd-support dft rdft reodft api libbench2 .
threads tests mpi doc tools m4' ;; \\\012 *) list='support kernel
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;;
\\\012esac; \\\012for subdir in $list; do \\\012 echo "Making $target in
$subdir"; \\\012 if test "$subdir" = "."; then \\\012 dot_seen=yes; \\\012
local_target="$target-am"; \\\012 else \\\012 local_target="$target";
\\\012 fi; \\\012 (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make
$local_target) \\\012 || eval $failcom; \\\012done; \\\012if test "$dot_seen"
= "no"; then \\\012 make "$target-am" || exit 1; \\\012fi; test -z "$fail"
`-bash,20267 -c fail=; \\\012if (target_option=k; case ${target_option-}
in ?) ;; *) echo "am__make_running_with_option: internal error: invalid"
"target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no;
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; };
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\\\[\\ \\\011]*)
bs=\\\\; sane_makeflags=`printf '%s\\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs
$bs\011]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf
'%s\\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test
$skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*)
continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';;
-*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l)
strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm])
skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*)
has_opt=yes; break;; esac; done; test $has_opt = yes); then \\\012
failcom='fail=yes'; \\\012else \\\012 failcom='exit 1'; \\\012fi;
\\\012dot_seen=no; \\\012target=`echo check-recursive | sed s/-recursive//`;
\\\012case "check-recursive" in \\\012 distclean-* | maintainer-clean-*)
list='support genfft kernel simd-support dft rdft reodft api libbench2 .
threads tests mpi doc tools m4' ;; \\\012 *) list='support kernel
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;;
\\\012esac; \\\012for subdir in $list; do \\\012 echo "Making $target in
$subdir"; \\\012 if test "$subdir" = "."; then \\\012 dot_seen=yes; \\\012
local_target="$target-am"; \\\012 else \\\012 local_target="$target";
\\\012 fi; \\\012 (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && make
$local_target) \\\012 || eval $failcom; \\\012done; \\\012if test "$dot_seen"
= "no"; then \\\012 make "$target-am" || exit 1; \\\012fi; test -z "$fail"
`-make,20268 check
`-make,20269 check-local
`-perl,20431 -w ./check.pl -r -c=30 -v --nthreads=2
/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/bench
`-bench,20572 -o nthreads=2 --verbose=1 --verify
//obcd34320 --verify //ibcd34320 --verify //ofcd34320 --verify //ifcd34320
--verify obcd34320 --verify ibcd34320 --verify ofcd34320 --verify ifcd34320
--verify okd1200e11 --verify ikd1200e11 --verify obr9x5x6v12 --verify
ibr9x5x6v12 --verify ofr9x5x6v12 --verify ifr9x5x6v12 --verify //obc9x5x6v12
--verify //ibc9x5x6v12 --verify //ofc9x5x6v12 --verify //ifc9x5x6v12 --verify
obc9x5x6v12 --verify ibc9x5x6v12 --verify ofc9x5x6v12 --verify ifc9x5x6v12
--verify ok65e01x36e10 --verify ik65e01x36e10 --verify obr8x9x3*17 --verify
ibr8x9x3*17 --verify ofr8x9x3*17 --verify ifr8x9x3*17 --verify //obc8x9x3*17
--verify //ibc8x9x3*17 --verify //ofc8x9x3*17 --verify //ifc8x9x3*17 --verify
obc8x9x3*17 --verify ibc8x9x3*17 --verify ofc8x9x3*17 --verify ifc8x9x3*17
--verify ok7o10x28e00v21 --verify ik7o10x28e00v21 --verify obr8x4x4x5v3
--verify ibr8x4x4x5v3 --verify ofr8x4x4x5v3 --verify ifr8x4x4x5v3 --verify
//obc8x4x4x5v3 --verify //ibc8x4x4x5v3
`-{bench},20586
--8<---------------cut here---------------end--------------->8---
The pstree output contains a lot of backslash escapes (for example,
"\012" is the octal escape for a newline and "\011" for a tab), which
makes it hard to read. Here is the same process tree from
"ps --forest -ef", which shows far fewer escapes:
--8<---------------cut here---------------start------------->8---
UID PID PPID C STIME TTY TIME CMD
[... omitted ...]
guixbui+ 20098 31417 0 Oct13 ? 00:00:00 \_ make check -j 2
guixbui+ 20099 20098 0 Oct13 ? 00:00:00 \_
/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash -c
fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo
"am__make_running_with_option: internal error: invalid" "target option
'${target_option-}' specified" >&2; exit 1;; esac; has_opt=no;
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; };
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\;
sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;;
esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed
"s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && {
skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt
'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O';
skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l';
skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT])
skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac;
done; test $has_opt = yes); then \ failcom='fail=yes'; \ else \
failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed
s/-recursive//`; \ case "check-recursive" in \ distclean-* |
maintainer-clean-*) list='support genfft kernel simd-support dft rdft reodft
api libbench2 . threads tests mpi doc tools m4' ;; \ *) list='support kernel
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;;
\ esac; \ for subdir in $list; do \ echo "Making $target in $subdir"; \ if
test "$subdir" = "."; then \ dot_seen=yes; \ local_target="$target-am";
\ else \ local_target="$target"; \ fi; \ (CDPATH="${ZSH_VERSION+.}:"
&& cd $subdir && make $local_target) \ || eval $failcom; \ done; \ if test
"$dot_seen" = "no"; then \ make "$target-am" || exit 1; \ fi; test -z "$fail"
guixbui+ 20267 20099 0 Oct13 ? 00:00:00 \_
/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash -c
fail=; \ if (target_option=k; case ${target_option-} in ?) ;; *) echo
"am__make_running_with_option: internal error: invalid" "target option
'${target_option-}' specified" >&2; exit 1;; esac; has_opt=no;
sane_makeflags=$MAKEFLAGS; if { if test -z '0'; then false; elif test -n
'x86_64-unknown-linux-gnu'; then true; elif test -n '4.2.1' && test -n
'/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8'; then true; else false; fi; };
then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \?]*) bs=\\;
sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs?]*//g"`;;
esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed
"s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && {
skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt
'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O';
skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l';
skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT])
skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac;
done; test $has_opt = yes); then \ failcom='fail=yes'; \ else \
failcom='exit 1'; \ fi; \ dot_seen=no; \ target=`echo check-recursive | sed
s/-recursive//`; \ case "check-recursive" in \ distclean-* |
maintainer-clean-*) list='support genfft kernel simd-support dft rdft reodft
api libbench2 . threads tests mpi doc tools m4' ;; \ *) list='support kernel
simd-support dft rdft reodft api libbench2 . threads tests mpi doc tools m4' ;;
\ esac; \ for subdir in $list; do \ echo "Making $target in $subdir"; \ if
test "$subdir" = "."; then \ dot_seen=yes; \ local_target="$target-am";
\ else \ local_target="$target"; \ fi; \ (CDPATH="${ZSH_VERSION+.}:"
&& cd $subdir && make $local_target) \ || eval $failcom; \ done; \ if test
"$dot_seen" = "no"; then \ make "$target-am" || exit 1; \ fi; test -z "$fail"
guixbui+ 20268 20267 0 Oct13 ? 00:00:00 \_ make
check
guixbui+ 20269 20268 0 Oct13 ? 00:00:00 \_
make check-local
guixbui+ 20431 20269 0 Oct13 ? 00:00:00 \_
perl -w ./check.pl -r -c=30 -v --nthreads=2
/tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/bench
guixbui+ 20572 20431 99 Oct13 ? 1-15:47:52
\_ /tmp/guix-build-fftwf-3.3.8.drv-0/fftw-3.3.8/tests/.libs/bench -o
nthreads=2 --verbose=1 --verify //obcd34320 --verify //ibcd34320 --verify
//ofcd34320 --verify //ifcd34320 --verify obcd34320 --verify ibcd34320 --verify
ofcd34320 --verify ifcd34320 --verify okd1200e11 --verify ikd1200e11 --verify
obr9x5x6v12 --verify ibr9x5x6v12 --verify ofr9x5x6v12 --verify ifr9x5x6v12
--verify //obc9x5x6v12 --verify //ibc9x5x6v12 --verify //ofc9x5x6v12 --verify
//ifc9x5x6v12 --verify obc9x5x6v12 --verify ibc9x5x6v12 --verify ofc9x5x6v12
--verify ifc9x5x6v12 --verify ok65e01x36e10 --verify ik65e01x36e10 --verify
obr8x9x3*17 --verify ibr8x9x3*17 --verify ofr8x9x3*17 --verify ifr8x9x3*17
--verify //obc8x9x3*17 --verify //ibc8x9x3*17 --verify //ofc8x9x3*17 --verify
//ifc8x9x3*17 --verify obc8x9x3*17 --verify ibc8x9x3*17 --verify ofc8x9x3*17
--verify ifc8x9x3*17 --verify ok7o10x28e00v21 --verify ik7o10x28e00v21 --verify
obr8x4x4x5v3 --verify ibr8x4x4x5v3 --verify ofr8x4x4x5v3 --verify ifr8x4x4x5v3
--verify //obc8x4x4x5v3 --verify //ibc8x4x4x5v3
--8<---------------cut here---------------end--------------->8---
The configure invocation, according to the config.log, was:
--8<---------------cut here---------------start------------->8---
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by fftw configure 3.3.8, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ ./configure
CONFIG_SHELL=/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash
SHELL=/gnu/store/vd5fcqira1q4ibq5q7bfdfpcmdyy6fxg-bash-minimal-5.0.7/bin/bash
--prefix=/gnu/store/w21mrlm5bl2q54x8xlij0cbf4pjzxcii-fftwf-3.3.8
--enable-fast-install --build=x86_64-unknown-linux-gnu --enable-single
--enable-shared --enable-openmp --enable-threads --enable-sse2 --enable-avx
--enable-avx2 --enable-avx512 --enable-avx-128-fma
ax_cv_c_flags__mtune_native=no
--8<---------------cut here---------------end--------------->8---
I've attached the config.log (config.log.gz) to this email, since it
might be helpful. I've also attached the build log
(fftwf-3.3.8.drv.bz2), which contains all the lines that the build logic
has printed up to this point.
Additionally, if you're familiar with Guix, you might find the package
definition to be helpful, as well. It is defined here:
https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/algebra.scm?id=6c50e1dc0625f89884cff40b22627091efa37708#n774
To be specific, I'm building fftwf. The fftwf package builds a
single-precision version of fftw by customizing the fftw package's
configure options a little bit. I'm building fftwf from a custom Guix
checkout with a few personal commits unrelated to fftw, but the commit
6c50e1dc0625f89884cff40b22627091efa37708 is similar enough that you
should be able to reproduce the issue with Guix if you run "guix pull
--commit=6c50e1dc0625f89884cff40b22627091efa37708" and then run "guix
build --keep-failed fftwf".
For now, I'll keep on waiting and see what happens... If any other info
would be helpful, let me know and I'll see what I can do!
Thank you for your help,
--
Chris
Attachments:
config.log.gz (binary data)
fftwf-3.3.8.drv.bz2 (binary data)
signature.asc (PGP signature)