Re: A faster sum

help-octave

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: A faster sum

From:	Schloegl Alois
Subject:	Re: A faster sum
Date:	Sat, 04 Jun 2005 22:51:46 +0200
User-agent:	Internet Messaging Program (IMP) 4.1-cvs

There is a simple explanation for why sum is currently slow.  It's
related to the fact that it needs to work across any dimentsion for
N-d arrays.  Probably the current code for that is not so good.  We
can probably optimize a few simple cases.  Or maybe someone will have
a smarter way to do what we need, perhaps by doing something more
intelligent with strides in a single loop or something.  I don't have
time to think about how to fix the problem just now, but the code to
look at is in the macro MX_ND_REDUCTION in liboctave/mx-inlines.cc.

jwe

I was looking into this in more detail. When running the followingscript using SUMSKIPNAN.OCT compiled fromhttp://cvs.sourceforge.net/viewcvs.py/*checkout*/octave/octave-forge/extra/NaN/sumskipnan.cc

f = { 'sum(X)', 'sum(X,1)', 'sum(X,2)', 'sumskipnan(X)','sumskipnan(X,1)', 'sumskipnan(X,2)','ones(1,n)*X','X*ones(n,1)'};

K =repmat(NaN,length(f),16);
for N=[12],%0:13];
       n = ceil(2^N*1.4);
       X = rand(n);
       for k = 1:length(f),
               t = cputime;
               s = eval(f{k});
               K(k,N+1) = cputime-t;
               fprintf(1,'%2i %2i\t%15s\t%5.2f\n',k,n,f{k},K(k,N+1)/K(1,N+1));
       end;
end;

one gets this result. 1 5735          sum(X)  1.00
2 5735        sum(X,1)  1.00
3 5735        sum(X,2)  6.33
4 5735   sumskipnan(X)  1.11
5 5735 sumskipnan(X,1)  1.11
6 5735 sumskipnan(X,2)  5.17
7 5735     ones(1,n)*X  0.25
8 5735     X*ones(n,1)  0.70

SUMKSKIPNAN.OCT does not use the macro MX_ND_REDUCTION. Despite, itshows that slowdown for DIM=2; A similar effect is shown in Matlab.Here are the results: 1 5735 sum(X) 1.00

2 5735        sum(X,1)  0.98
3 5735        sum(X,2) 12.68
4 5735   sumskipnan(X)  1.52
5 5735 sumskipnan(X,1)  1.52
6 5735 sumskipnan(X,2) 14.62
7 5735     ones(1,n)*X  0.56
8 5735     X*ones(n,1)  0.64

We can also see that the matrix multiplication is much faster forDIM=2. Hence, the slowdown can not only be explained only by aninefficent implementation of SUM or SUMSKIPNAN. This suggests that theslow down is caused mainly by the cache lines. Using ATLAS/BLASimplementing SUM could provide a significant performance gain. However,I'm still wondering how SUMSKIPNAN could be implemented efficientlyusing ATLAS or BLAS. Any ideas ?



Alois



-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------

[Prev in Thread]

Current Thread

[Next in Thread]

Re: A faster sum, Schloegl Alois <=

Prev by Date: Re: SWIG for Octave: Octave-SWIG.
Next by Date: testing for uint8: a solution
Previous by thread: SWIG for Octave: Octave-SWIG.
Next by thread: testing for uint8: a solution
Index(es):
- Date
- Thread