octave-maintainers

Re: reduction funs optimizations + min/max question


From: Jaroslav Hajek
Subject: Re: reduction funs optimizations + min/max question
Date: Mon, 16 Feb 2009 12:40:07 +0100

On Mon, Feb 16, 2009 at 9:23 AM, Jaroslav Hajek <address@hidden> wrote:
> On Fri, Feb 13, 2009 at 9:22 PM, Jaroslav Hajek <address@hidden> wrote:
>> hi,
>>
>> this changeset: http://hg.savannah.gnu.org/hgweb/octave/rev/53b4fdeacc2e
>> reimplements the reduction and cumulative reduction "cores" sum, prod,
>> sumsq, cumsum and cumprod
>> for better performance.
>>
>
>
> Following similar ideas, I also optimized the min/max reductions and
> any/all. A simplistic benchmark follows, as usual
> (set n to a suitable value).
>
> n = 5e3;
> a = rand (n);
> tic; b = max (a); toc
> tic; b = min (a); toc
> tic; b = max (a, [], 2); toc
> tic; b = min (a, [], 2); toc
> tic; [b, i] = max (a); toc
> tic; [b, i] = min (a); toc
> tic; [b, i] = max (a, [], 2); toc
> tic; [b, i] = min (a, [], 2); toc
>
> tiny = a < 1e-2;
> huge = ! tiny;
> clear a;
>
> tic; any (tiny); toc
> tic; any (tiny, 2); toc
> tic; all (huge); toc
> tic; all (huge, 2); toc
>
> with a recent tip, I get:
>
> Elapsed time is 0.195431 seconds.
> Elapsed time is 0.19919 seconds.
> Elapsed time is 0.335377 seconds.
> Elapsed time is 0.353903 seconds.
> Elapsed time is 0.194535 seconds.
> Elapsed time is 0.199214 seconds.
> Elapsed time is 0.333951 seconds.
> Elapsed time is 0.353886 seconds.
> Elapsed time is 0.242775 seconds.
> Elapsed time is 0.244407 seconds.
> Elapsed time is 0.149176 seconds.
> Elapsed time is 0.150418 seconds.
>
> with the new patches, I get:
>
> Elapsed time is 0.0466709 seconds.
> Elapsed time is 0.0463569 seconds.
> Elapsed time is 0.0438602 seconds.
> Elapsed time is 0.0425069 seconds.
> Elapsed time is 0.0478208 seconds.
> Elapsed time is 0.0480652 seconds.
> Elapsed time is 0.03636 seconds.
> Elapsed time is 0.0462041 seconds.
> Elapsed time is 0.0008111 seconds.
> Elapsed time is 0.0244672 seconds.
> Elapsed time is 0.000770807 seconds.
> Elapsed time is 0.0245111 seconds.
>
> and the relative speed-ups (the usual definition, (t_old / t_new - 1) * 100%):
>
>  319%
>  330%
>  665%
>  733%
>  307%
>  314%
>  818%
>  666%
> 29832%
>  899%
> 19253%
>  514%
>
> One more note: as you may notice, the column-oriented any/all are much
> faster than the row-oriented ones.
> That's because the column-oriented versions short-circuit while
> the row-oriented ones do not.
> In the row-reduction any/all case, there is a trade-off: either work
> by columns in a cache-coherent manner and sacrifice short-circuiting
> (that's what the current version does), or work by rows to get
> short-circuiting and sacrifice cache coherency.
>
> comments?
>
> cheers
>
> --
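
To make the any/all trade-off from the quoted message concrete, here is a
minimal C++ sketch. It is not the actual liboctave code: the function names
and the std::vector<char> storage are assumptions made for illustration only.
It shows the column reduction (contiguous and short-circuiting) and two
possible ways to do the row reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major n-by-m logical matrix: element (i, j) lives at v[i + j*n].

// any (x): reduce down each column. Each column is contiguous, so the scan
// is cache-friendly and can stop at the first nonzero element.
std::vector<bool>
any_cols (const std::vector<char>& v, std::size_t n, std::size_t m)
{
  assert (v.size () == n * m);
  std::vector<bool> r (m, false);
  for (std::size_t j = 0; j < m; j++)
    for (std::size_t i = 0; i < n; i++)
      if (v[i + j*n])
        {
          r[j] = true;
          break;  // short-circuit: the rest of this column is skipped
        }
  return r;
}

// any (x, 2), variant A: sweep the data in memory order (column by column)
// and accumulate into one flag per row. Cache-friendly, but every element
// is visited, so there is no short-circuiting.
std::vector<bool>
any_rows_by_columns (const std::vector<char>& v, std::size_t n, std::size_t m)
{
  assert (v.size () == n * m);
  std::vector<bool> r (n, false);
  for (std::size_t j = 0; j < m; j++)
    for (std::size_t i = 0; i < n; i++)
      if (v[i + j*n])
        r[i] = true;
  return r;
}

// any (x, 2), variant B: walk each row with stride n and stop at the first
// nonzero element. Short-circuits, but the strided access is cache-unfriendly.
std::vector<bool>
any_rows_by_rows (const std::vector<char>& v, std::size_t n, std::size_t m)
{
  assert (v.size () == n * m);
  std::vector<bool> r (n, false);
  for (std::size_t i = 0; i < n; i++)
    for (std::size_t j = 0; j < m; j++)
      if (v[i + j*n])
        {
          r[i] = true;
          break;  // short-circuit: the rest of this row is skipped
        }
  return r;
}
```

Both row variants return the same result; which one is faster presumably
depends on how early the first nonzero element appears in each row and on
the matrix size relative to the cache.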

OK, so the last thing optimized in this respect is the non-native sum
of logical values (i.e. counting them),
which is now done using the fast code and without a copy.

Simplistic benchmark again:

n = 5e3;
a = rand (n) <= 0.5;

tic; sum (a); toc
tic; sum (a, 2); toc

recent tip (prior to reduction optimizations):

Elapsed time is 0.290955 seconds.
Elapsed time is 0.527403 seconds.

with the new patch:

Elapsed time is 0.0102119 seconds.
Elapsed time is 0.0187449 seconds.

i.e. speed-ups of 2749% and 2713%. Note that this includes an internal
conversion from ints to doubles (counting in int is apparently
twice as fast as counting directly in double, even including the
conversion, but this is no big surprise). However, it's the result
(which is one dimension smaller) that is converted, not the source.
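
The idea of counting in int and converting only the result can be sketched
in C++ as follows. This is an illustration under stated assumptions, not the
code from the patch: the function names and the std::vector<char> storage
are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major n-by-m logical matrix: element (i, j) lives at v[i + j*n].

// sum (a): count the true elements of each column with an int accumulator,
// then convert each of the m counts to double once. The n*m source elements
// are never copied or converted.
std::vector<double>
sum_logical_cols (const std::vector<char>& v, std::size_t n, std::size_t m)
{
  assert (v.size () == n * m);
  std::vector<double> r (m);
  for (std::size_t j = 0; j < m; j++)
    {
      int count = 0;
      for (std::size_t i = 0; i < n; i++)
        count += (v[i + j*n] != 0);
      r[j] = count;  // one int -> double conversion per result element
    }
  return r;
}

// sum (a, 2): keep one int counter per row, sweep the data in memory order,
// and convert the n counters to double at the end.
std::vector<double>
sum_logical_rows (const std::vector<char>& v, std::size_t n, std::size_t m)
{
  assert (v.size () == n * m);
  std::vector<int> count (n, 0);
  for (std::size_t j = 0; j < m; j++)
    for (std::size_t i = 0; i < n; i++)
      count[i] += (v[i + j*n] != 0);
  return std::vector<double> (count.begin (), count.end ());
}
```

For an n-by-n input, only n values are converted instead of n^2, which is
why the conversion cost is small compared to the counting itself.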

The new code has apparently orphaned a few macros, so I'll try
to eliminate them.

cheers

-- 
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

