octave-maintainers

reduction funs optimizations + min/max question


From: Jaroslav Hajek
Subject: reduction funs optimizations + min/max question
Date: Fri, 13 Feb 2009 21:22:59 +0100

Hi,

this changeset: http://hg.savannah.gnu.org/hgweb/octave/rev/53b4fdeacc2e
reimplements the reduction and cumulative reduction "cores" (sum, prod,
sumsq, cumsum and cumprod) for better performance.
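
For reference, here is a small sketch of what each of these functions computes. The matrix a is only an illustration; the optional second argument selects the dimension to reduce along (1 = down the columns, which is the default for a matrix, 2 = along the rows).

```octave
a = [1 2 3; 4 5 6];
s1 = sum (a);         % column sums:                   [5 7 9]
s2 = sum (a, 2);      % row sums:                      [6; 15]
p1 = prod (a);        % column products:               [4 10 18]
q1 = sumsq (a);       % column sums of squares:        [17 29 45]
c2 = cumsum (a, 2);   % running sums along rows:       [1 3 6; 4 9 15]
d1 = cumprod (a);     % running products down columns: [1 2 3; 4 10 18]
```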

A short benchmark (set n to a suitable value):
n = 5e3;
c = {1,single(1),1+1i,single(1+1i)};
for i = 1:4
  a = c{i}*rand (n);
  disp (typeinfo (a));
  tic; b = sum (a); toc
  tic; b = sum (a, 2); toc
  tic; b = prod (a); toc
  tic; b = prod (a, 2); toc
  tic; b = sumsq (a); toc
  tic; b = sumsq (a, 2); toc
  tic; b = cumsum (a); toc
  tic; b = cumsum (a, 2); toc
  tic; b = cumprod (a); toc
  tic; b = cumprod (a, 2); toc
endfor

On a Core 2 @ 2.83 GHz, compiled with g++ -O3 -funroll-loops -march=native,
I get the following with a recent tip:

matrix
Elapsed time is 0.177577 seconds.
Elapsed time is 0.408842 seconds.
Elapsed time is 0.190334 seconds.
Elapsed time is 0.42727 seconds.
Elapsed time is 0.167967 seconds.
Elapsed time is 0.400041 seconds.
Elapsed time is 0.273645 seconds.
Elapsed time is 0.734546 seconds.
Elapsed time is 0.301021 seconds.
Elapsed time is 0.747994 seconds.
float matrix
Elapsed time is 0.185524 seconds.
Elapsed time is 0.357071 seconds.
Elapsed time is 0.180458 seconds.
Elapsed time is 0.358543 seconds.
Elapsed time is 0.164953 seconds.
Elapsed time is 0.345552 seconds.
Elapsed time is 0.218582 seconds.
Elapsed time is 0.539203 seconds.
Elapsed time is 0.231107 seconds.
Elapsed time is 0.547846 seconds.
complex matrix
Elapsed time is 0.22692 seconds.
Elapsed time is 0.496821 seconds.
Elapsed time is 0.405787 seconds.
Elapsed time is 0.663733 seconds.
Elapsed time is 0.401873 seconds.
Elapsed time is 0.824625 seconds.
Elapsed time is 0.62486 seconds.
Elapsed time is 1.12527 seconds.
Elapsed time is 0.855924 seconds.
Elapsed time is 1.35541 seconds.
float complex matrix
Elapsed time is 0.247731 seconds.
Elapsed time is 0.490363 seconds.
Elapsed time is 0.449638 seconds.
Elapsed time is 0.647107 seconds.
Elapsed time is 0.455856 seconds.
Elapsed time is 0.744101 seconds.
Elapsed time is 0.409643 seconds.
Elapsed time is 0.85453 seconds.
Elapsed time is 0.667114 seconds.
Elapsed time is 1.24475 seconds.

and with the new patch:

matrix
Elapsed time is 0.033428 seconds.
Elapsed time is 0.0368621 seconds.
Elapsed time is 0.057929 seconds.
Elapsed time is 0.0453949 seconds.
Elapsed time is 0.0323532 seconds.
Elapsed time is 0.0359769 seconds.
Elapsed time is 0.12749 seconds.
Elapsed time is 0.140864 seconds.
Elapsed time is 0.152115 seconds.
Elapsed time is 0.150073 seconds.
float matrix
Elapsed time is 0.0392909 seconds.
Elapsed time is 0.019315 seconds.
Elapsed time is 0.0417621 seconds.
Elapsed time is 0.0207441 seconds.
Elapsed time is 0.028121 seconds.
Elapsed time is 0.018044 seconds.
Elapsed time is 0.070262 seconds.
Elapsed time is 0.070369 seconds.
Elapsed time is 0.0884211 seconds.
Elapsed time is 0.0732369 seconds.
complex matrix
Elapsed time is 0.071722 seconds.
Elapsed time is 0.074126 seconds.
Elapsed time is 0.26618 seconds.
Elapsed time is 0.21809 seconds.
Elapsed time is 0.0640812 seconds.
Elapsed time is 0.069073 seconds.
Elapsed time is 0.412128 seconds.
Elapsed time is 0.429754 seconds.
Elapsed time is 0.529651 seconds.
Elapsed time is 0.524663 seconds.
float complex matrix
Elapsed time is 0.0584619 seconds.
Elapsed time is 0.0509369 seconds.
Elapsed time is 0.343382 seconds.
Elapsed time is 0.312339 seconds.
Elapsed time is 0.0347629 seconds.
Elapsed time is 0.040071 seconds.
Elapsed time is 0.206728 seconds.
Elapsed time is 0.216046 seconds.
Elapsed time is 0.452326 seconds.
Elapsed time is 0.42148 seconds.

The relative speed-ups, measured as (old_time / new_time - 1) * 100%,
in the same order as the timings above:

  431%
 1009%
  229%
  841%
  419%
 1012%
  115%
  421%
   98%
  398%
  372%
 1749%
  332%
 1628%
  487%
 1815%
  211%
  666%
  161%
  648%
  216%
  570%
   52%
  204%
  527%
 1094%
   52%
  162%
   62%
  158%
  324%
  863%
   31%
  107%
 1211%
 1757%
   98%
  296%
   47%
  195%
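
As a quick check of the formula, the first two "matrix" timings (sum (a) and sum (a, 2)) reproduce the first two percentages in the list. The variable names are only illustrative.

```octave
% Timings copied from the "matrix" blocks above: sum (a) and sum (a, 2).
old_time = [0.177577, 0.408842];    % recent tip
new_time = [0.033428, 0.0368621];   % with the new patch
speedup = (old_time ./ new_time - 1) * 100;
printf ("%5.0f%%\n", speedup);      % prints 431% and 1009%
```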

N-d arrays seem to yield even slightly better results.

I would also like to address min & max. I'm not, however, entirely
certain about the expected behaviour of min/max
w.r.t. NaN and NA. Currently, the following holds:
max(NA, NaN) = max (NaN, NA) = NA
max([NA, NaN]) = NaN
max([NaN, NA]) = NA

So it seems that the behaviour is pretty much arbitrary, i.e. NA is
not treated as a "weaker" or "stronger" kind of NaN.
Matlab has no NA, so it is no source of authority here. Is there
anything else to follow?
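
One way to probe the cases above on a given build is to classify each result with isna and isnan: isnan is true for both NaN and NA, while isna is true only for NA. This is only a sketch; no expected output is shown, because the outcome depends on the Octave version, and the cases table is just a helper for the example.

```octave
% Each row holds a label and the result of the corresponding max call.
cases = {"max (NA, NaN)",   max (NA, NaN);
         "max (NaN, NA)",   max (NaN, NA);
         "max ([NA, NaN])", max ([NA, NaN]);
         "max ([NaN, NA])", max ([NaN, NA])};
for k = 1:rows (cases)
  r = cases{k,2};
  % Test isna first, since isnan is also true for NA.
  if isna (r)
    kind = "NA";
  elseif isnan (r)
    kind = "NaN";
  else
    kind = "a number";
  endif
  printf ("%-16s -> %s\n", cases{k,1}, kind);
endfor
```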

cheers

-- 
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

