help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: strange FFT timing


From: Sergei Steshenko
Subject: Re: strange FFT timing
Date: Sun, 20 Apr 2008 14:07:10 -0700 (PDT)

--- Przemek Klosowski <address@hidden> wrote:

>  I would expect that FFT would be fasterst for datasets of the size
> that is a power of 2, but a quick test doesn't bear this out. I am
> using Octave Version 3.0.0 on Fedora 8 (2.6.24) runing on a 2.1 GHz
> Athlon XP 2800+ with 1GB of memory (no swap activity during the
> calculation). The 1,000,000 element FFT is almost twice as fast as
> 2^20 element FFT (see below for excerpt of the Octave session).
> 
> Anyone would venture what is going on? It could be cache aliasing
> (this Athlon has a 512kB cache). In any case, FFTW seems to be very
> good handling the non-power-of-2 sizes, and seems
> to be optimized for power-of-ten sizes.
> 
> v=rand(2^20,1);  tic;fft(v);toc
> Elapsed time is 0.339144 seconds.
> Elapsed time is 0.290032 seconds.
> Elapsed time is 0.283474 seconds.
> Elapsed time is 0.291376 seconds.
> Elapsed time is 0.285946 seconds.
> v=rand(1e6,1);   tic;fft(v);toc
> Elapsed time is 0.223884 seconds.
> Elapsed time is 0.179272 seconds.
> Elapsed time is 0.179442 seconds.
> Elapsed time is 0.184591 seconds.
> Elapsed time is 0.179012 seconds.
> 
> Similar results obtained with  v=rand(1e6,1);x=cputime;fft(v);cputime-x
> _______________________________________________
> Help-octave mailing list
> address@hidden
> https://www.cae.wisc.edu/mailman/listinfo/help-octave
> 

Well, there are too many factors involved.

I think 'octave' uses FFTW_ESTIMATE rather than FFTW_MEASURE to build plans, 
and it makes
sense since you won't want to wait minutes while the plan is being built.

I myself have noticed than FFTW size of 3 * 3 * (2 ** n) can be somewhat faster 
than
2 * 2 * 2 * (2 ** n), even though 9 > 8.

I also have Athlon.

I wrote and ad-hoc "C" program to benchmark FFTW - if you are interested, I can 
send it to
you.

Even though it's ad-hoc, it was meant to be written clearly and to be easy to 
modify.

Through it I've discovered that F2C is actually faster than F2HC, even though 
the documentation
claims it should be the opposite.

The bottom line - I am _not_surprised after all the benchmarking I've done.

Powers of 10 optimization is not surprising since 10 == 2 * 5, and 5 is among 
well supported
factors.

Regards,
  Sergei.

Applications From Scratch: http://appsfromscratch.berlios.de/


      
____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  
http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ


reply via email to

[Prev in Thread] Current Thread [Next in Thread]