octave-maintainers

Re: Compiling ATLAS with MinGW


From: Michael Goffioul
Subject: Re: Compiling ATLAS with MinGW
Date: Tue, 21 Feb 2012 21:05:14 +0000

On Sun, Feb 19, 2012 at 9:05 PM, Michael Goffioul
<address@hidden> wrote:
> I managed to complete the compilation process, but it wasn't really a
> success. I specified MinGW as main compiler, but kept cygwin-GCC as
> the XCC compiler. I had to hack a bit the Makefile in tune/sysinfo/,
> because some support executables are compiled with the ICC compiler
> (interface compiler), mingw-in-cygwin in this case, but are then
> called with cygwin-style paths and fail. I also noticed a similar
> issue that prevented a proper creation of files like
> atlas_dtrsmXover.h (empty files are created), although these errors
> were ignored by make.
>
> I created the DLL and tested the result with octave with this snippet:
>
> n=2000; A=randn(n); B=randn(n);tic; C=A*B; t=toc, MFLOPS=2*n^3/t*1e-6
>
> and verified that both CPU's were used (in the task manager). However
> it seems that the generated multi-threaded ATLAS is suboptimal and
> even with the 2 CPU's being used, it was slower than a single-threaded
> ATLAS. I don't know the reason, but it may be related to the fact that
> some files couldn't be properly generated. In the end, it seems like
> trying to compile ATLAS with MinGW is a dead-end...

I've been tricked by Intel HT technology....

After some ATLAS patching, I could eventually compile ATLAS with MinGW
and MSYS only (no cygwin required anymore), which allowed me to
compile a multi-threaded version pretty easily. However, the initial
test I made on my main desktop showed that the MT version was slightly
slower than the ST version. That is because my CPU is a P4 HT, which
looks like a multi-CPU system to the OS but is really a single core.
In the case of ATLAS, HT does not help and actually makes things worse.

I've then tested the same binaries on a Core2 Duo, and this time I got
the expected result: the MT ATLAS was nearly 2x faster than the ST
one.
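
For anyone comparing ST and MT builds the same way, here is a minimal
sketch in Python of the arithmetic behind the benchmark quoted above:
MFLOPS as 2*n^3 floating-point operations over the elapsed time, and
the MT/ST speedup as a ratio of the two timings. The function names and
the sample timings are made up for illustration; they are not
measurements from the machines mentioned in this thread.

```python
def mflops(n, seconds):
    """MFLOPS for an n-by-n matrix product, counting 2*n^3 operations."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 2 * n**3 / seconds / 1e6


def speedup(t_single, t_multi):
    """Ratio of single-threaded to multi-threaded elapsed time."""
    if t_single <= 0 or t_multi <= 0:
        raise ValueError("elapsed times must be positive")
    return t_single / t_multi


if __name__ == "__main__":
    # Hypothetical timings in seconds (assumed values, not real results).
    n = 2000
    t_st, t_mt = 4.0, 2.1
    print("ST: %.0f MFLOPS" % mflops(n, t_st))
    print("MT: %.0f MFLOPS" % mflops(n, t_mt))
    print("MT speedup over ST: %.2fx" % speedup(t_st, t_mt))
```

A speedup close to 2 on a dual-core machine would match the Core2 Duo
observation, while a value below 1 would match the P4 HT case.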

Michael.

PS: I also got a strange result when applying the same procedure on
an Intel Atom N270. When compiling ATLAS, surprisingly, it does not
select SSE2 kernels for double-precision operations, but plain x87
ones. OTOH, it does select SSE kernels for single-precision
operations. The result is that the compiled ATLAS is almost 2.5x
faster for single-precision operations than for double-precision
ones. Any hints on why x87 kernels appear faster than SSE2 kernels
during ATLAS tuning are welcome :)

