Re: [Fwd: Re: rfftw slower than fftw for "bad" size arrays]


From: Dmitri A. Sergatskov
Subject: Re: [Fwd: Re: rfftw slower than fftw for "bad" size arrays]
Date: Fri, 30 Jan 2004 14:40:09 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040115

David Bateman wrote:
> Ok, I've ported Octave to also use FFTW 3.0.1, but I'm getting some odd
> results. I attach a patch, some rewritten test programs, and some comparisons
> I've done. Basically, in many cases I'm faster than both the old Octave and
> Matlab, but there are some cases where it is slower. I'm really not sure what
> the issue is, so any clues would be appreciated.


I have not tried the patch yet, but I ran the benchmarks from the fftw2 and
fftw3 packages (see below). It seems this problem is an fftw3 "feature": when
run with the ESTIMATE flag, fftw3 is sometimes slower than fftw2.

It is not quite clear to me from the docs whether using "wisdom" created with
more "patient" flags helps when you later plan with the ESTIMATE flag. It is
definitely possible:

http://www.fftw.org/fftw3_doc/Words-of-Wisdom-Saving-Plans.html#Words%20of%20Wisdom-Saving%20Plans

<<<<
  Wisdom is automatically used for any size to which it is applicable, as long
  as the planner flags are not more "patient" than those with which the wisdom
  was created. For example, wisdom created with FFTW_MEASURE can be used if you
  later plan with FFTW_ESTIMATE or FFTW_MEASURE, but not with FFTW_PATIENT.
>>>>
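
For illustration, here is a minimal C sketch of what that passage describes
(the 512x512 size and the whole program are my own example, not anything from
the patch): plan once with FFTW_MEASURE so the library accumulates wisdom,
then a later FFTW_ESTIMATE plan for the same problem picks that wisdom up
automatically.

  #include <fftw3.h>

  int main (void)
  {
    int n = 512;
    fftw_complex *a = fftw_malloc (sizeof (fftw_complex) * n * n);

    /* Expensive planning step: FFTW_MEASURE times candidate algorithms
       and records the winner as wisdom inside the library.  */
    fftw_plan p_measure = fftw_plan_dft_2d (n, n, a, a,
                                            FFTW_FORWARD, FFTW_MEASURE);
    fftw_destroy_plan (p_measure);

    /* Cheap planning step: FFTW_ESTIMATE alone would just guess, but
       wisdom for this exact problem already exists and was not created
       with a more "patient" flag, so it is reused here.  */
    fftw_plan p_estimate = fftw_plan_dft_2d (n, n, a, a,
                                             FFTW_FORWARD, FFTW_ESTIMATE);

    /* Fill the array only after planning: FFTW_MEASURE overwrites it.  */
    for (int i = 0; i < n * n; i++)
      {
        a[i][0] = 1.0;
        a[i][1] = 0.0;
      }
    fftw_execute (p_estimate);

    fftw_destroy_plan (p_estimate);
    fftw_free (a);
    return 0;
  }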

One possibility is to have, say, a "make wisdom_measure" rule in the Octave
makefile which would create a wisdom file (it will take a while, so it should
not be part of the default make, I would guess).
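
A rough sketch of what such a rule might build and run (the output file name
"fftw_wisdom", the list of sizes, and the idea of importing the file at
Octave startup are all my assumptions, not part of the proposal above):

  /* Hypothetical generator for a "make wisdom_measure" rule.  */
  #include <stdio.h>
  #include <fftw3.h>

  int main (void)
  {
    static const int sizes[] = { 256, 512, 1024 };   /* illustrative only */

    for (unsigned i = 0; i < sizeof (sizes) / sizeof (sizes[0]); i++)
      {
        int n = sizes[i];
        fftw_complex *a = fftw_malloc (sizeof (fftw_complex) * n * n);

        /* Slow FFTW_MEASURE planning; its findings are kept as wisdom.  */
        fftw_plan p = fftw_plan_dft_2d (n, n, a, a,
                                        FFTW_FORWARD, FFTW_MEASURE);
        fftw_destroy_plan (p);
        fftw_free (a);
      }

    /* Dump everything learned so far to a file.  */
    FILE *f = fopen ("fftw_wisdom", "w");
    if (!f)
      return 1;
    fftw_export_wisdom_to_file (f);
    fclose (f);
    return 0;
  }

Octave would then only need a single fftw_import_wisdom_from_file () call on
that file at startup, so the planning cost stays in the optional make rule
rather than in every session.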


> Cheers
> David


Regards,
Dmitri.

P.S.: Here are some benchmark numbers (forward, in-place transform):

fftw2 (complex 512x512):

SPEED TEST: 512x512, FFTW_FORWARD, in place, generic (That is what Octave would use)
time for one fft: 147.481125 ms (562.595844 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 159.972742

SPEED TEST: 512x512, FFTW_FORWARD, in place, specific
time for one fft: 48.231141 ms (183.987200 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 489.164463

fftw3 (complex 512x512):
address@hidden tests]$ ./bench -oestimate -s 512x512
Problem: 512x512, setup: 342.00 us, time: 229.17 ms, ``mflops'': 102.95   (XXXXX)

address@hidden tests]$ ./bench -s 512x512   (!! This is using default MEASURE flag !!)
Problem: 512x512, setup: 827.87 ms, time: 96.72 ms, ``mflops'': 243.94

address@hidden tests]$ ./bench -oexhaustive -s 512x512
Problem: 512x512, setup: 340.86 s, time: 47.56 ms, ``mflops'': 496.06

address@hidden tests]$ ./bench -opatient -s 512x512
Problem: 512x512, setup: 33.65 s, time: 50.51 ms, ``mflops'': 467.06
============

fftw2 (real 521x521 <-- prime size):

SPEED TEST: 521x521, FFTW_FORWARD, in place, generic
time for one fft: 444.606000 ms (1.631683 us/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 27.664384

(specific gives the same results)

fftw3:
address@hidden tests]$ ./bench -s r521x521
Problem: r521x521, setup: 1.08 s, time: 75.10 ms, ``mflops'': 163.1

address@hidden tests]$ ./bench -oestimate -s r521x521
Problem: r521x521, setup: 1.09 ms, time: 81.23 ms, ``mflops'': 150.8

address@hidden tests]$ ./bench -oexhaustive -s r521x521
Problem: r521x521, setup: 3.08 s, time: 71.80 ms, ``mflops'': 170.59






