octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

indexing improvements - advice wanted


From: Jaroslav Hajek
Subject: indexing improvements - advice wanted
Date: Sat, 25 Oct 2008 21:12:30 +0200

Hi all,

I'm currently working on a patch improving the indexing & indexed
assignment as well as cleaning up the
related interface in Array<T>.
A short summary of intended changes:

1. idx_vector class is almost completely rewritten. Instead of storing
all possible information at once and using conditionals, polymorphism
is used to represent different types of indexing (colon, range,
scalar, vector). Several redundant operations were removed, resulting
in less overhead when creating idx_vector classes, and caching is done
in constructors.
A pleasant consequence is that const idx_vector& arguments can be used
rather than writable references. Another improvement is that
internally, zero-based index arrays (Array<octave_idx_type>) are
converted to idx_vector without actually copying the data.

2. index & assign methods now take a completely different approach.
Instead of the previous looping over all elements using
increment_index, a recursive procedure is used (this should be more
amenable to possible parallelization in the future). The bottom-level
loops are always specialized for each idx-vector type, translating the
operation into std::copy, std::reverse_copy, std::fill or a fast loop.
A similar approach is taken for resize, which optimally translates to
sequence of std::copy and std::fill operations.

3. before doing the actual work, reductions are applied to multiple
indices, in an attempt to reduce the dimensionality of the operation.
For instance, A(:,:,a:b) is automatically translated into a single
contiguous range, and is thus done by a single std::copy. Similarly,
A(i,j,:,k) is translated to a single range and is thus done using a
single loop (as if using sub2ind).

The current state of the patch is attached. To illustrate the
performance improvements, I tried the following code:
n = 500;
a = ones (n,n,n);
tic; b = a(5:end-5); toc
tic; b = a(:,:,5:end-5); toc
a = ones (2, 3, n^3 / 6);
tic; for i = 1:5; b = a(2,2,:); endfor; toc
a = ones (n,n,n);
tic; a(5:end-5) = 1; toc
tic; a(:,:,5:end-5) = 1; toc
b = zeros (n,n,n-9);
tic; a(:,:,5:end-5) = b; toc
tic; a (n + 10, n + 10, n + 10) = 2; toc

(set n to a suitable value according to available memory)

using recent tip build, I got:
Elapsed time is 2.09076 seconds.
Elapsed time is 38.0217 seconds.
Elapsed time is 34.0547 seconds.
Elapsed time is 0.756007 seconds.
Elapsed time is 32.9049 seconds.
Elapsed time is 33.4742 seconds.
Elapsed time is 7.66801 seconds.

whereas using the new patch, I got:
Elapsed time is 1.25063 seconds.
Elapsed time is 1.34406 seconds.
Elapsed time is 2.45575 seconds.
Elapsed time is 0.509049 seconds.
Elapsed time is 0.490327 seconds.
Elapsed time is 0.761768 seconds.
Elapsed time is 1.46046 seconds.

So, as you can see, the performance speed-ups range from about 80% to
as much as 60x in some cases.

If you try the patch, you can notice a number of test failures in the
test suite. Apart from a couple of persisting bugs (e.g. cell2mat for
some reason cannot concatenate some arrays), some failures are caused
by changes in error mesages
(in an attempt for invalid resize-on-assigment, it is now resize who
gripes, so the message is different).
A number of other failing testcases are triggered by behaviour that
currently works in Octave but fail in Matlab. For instance,
this currently works in Octave but not in Matlab:
x = ones (2,0);
x([],:,:) = ones (2,0);

I believe Matlab is right here because the rule of thumb for
assignment is that, omitting singleton dimensions on the right and
scalar subscripts at the left,  the dimensions of LHS and rhs should
match. Here, obviously the first index has extent zero while the
leding rhs dim is 2, so that results in a mismatch.

So, my questions (or opinion inquiries) are:

1. how much is backward compatibility of error messages desirable? In
the current patch, I let Array<T>::resize () gripe when invalid
resizing is requested (e.g. A(10) = 1 with A = ones(3)). However, the
error message changes in such case; otherwise, an argument to resize
would be needed to distinguish whether it's called from
Array<T>::assign or from elsewhere. This is certainly possible, but is
it worth cluttering interfaces by optional arguments just to keep
error messages backward compatible? (note that they're by no means
Matlab compatible, but that's not even reasonable to want)

2. what about the failing tests relying on Matlab incompatible (and
slightly illogical) behaviour? Should the tests be fixed, or should I
try to adapt the code so that it behaves backward in a
backward-compatible manner? Note that these are rather corner cases,
quite unlikely to occur in real code.

and, of course, if anyone discovers the source of the concatenation
bug, I'll be grateful :)


PS: the attachment is > 100 K, so you can get it here:
http://artax.karlin.mff.cuni.cz/~hajej2am/ulozna/array-new.diff


cheers

-- 
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


reply via email to

[Prev in Thread] Current Thread [Next in Thread]