Re: [Bug-apl] Potential performance optimisation tested

On 15 February 2014 03:18, Juergen Sauermann <address@hidden> wrote:

Hi,

I have added the proposed optimisation, SVN 123.
Hope it does not break anything.

/// Jürgen

On 02/02/2014 03:35 PM, Frederick H. Pitts wrote:

Elias,

This approach to optimizating Gnu APL sounds very reasonable to me. I
hope it succeeds. May Conway's "Game of Life" go faster. :-)

Fred

On Sun, 2014-02-02 at 21:19 +0800, Elias Mårtenson wrote:

I made a test implementation of a performance optimisation for GNU
APL. The short story is that it seems to work for limited tests, but
there may be issues with my approach and I don't want to spend any
more time on it unless Jürgen agrees it's a good idea. After all, I
may have completely misunderstood the architecture of GNU APL and my
approach may be fundamentally broken.

When doing some benchmarking I noticed a lot of time was spent
allocating, initialising and freeing Value instances. I realised that
many such allocations are not actually needed. Take the following
_expression_ for example:

,1+2×foo

In this case, the following things happen:
1. The multiplication function (implemented by eval_skalar_AB)
creates a new array and fills it with the result of the
operation (in this case, bif_multiply) on each cell.
2. The addition function is then called, resulting in an
identical flow as in the multiplication above.
3. The ravel function is then called (Bif_COMMA::ravel),
resulting in yet another array being created which is simply
copied from B. Note that the ravel is identical between Z and
B in this case.
From a performance perspective, I saw a particular low-hanging fruit
in the ravel call in the last point. Since the value that came from
the addition is never used anywhere else, one can simply change the
shape of B in-place and return it immediately.

My experiment involved creating a new VF (value flag) value; VF_temp
that indicates that the value may be reused. This flag is then set for
all values that are only used once (for example, the return value from
a primitive or user function). If a value needs to be preserved, it is
copied, and the flag is not set.

What all of this means is that values that are returned from a
function, can be reused if it makes sense. There are several cases
where this is the case. For example:
* The comma function (as discussed above)
* Reshape, especially when the shape of Z matches that of B
* Scalar functions. If the sizes of the arguments of plus, for
example, matches, then A or B can be reused if they are marked
as temporary.
* The each operator can (maybe?) push the result straight into
the source array.
I've only done very basic testing, but the ravel operator on a large
array went from taking a second or so to pretty much zero. Same should
be true for reshape, although that one crashes for me right now. :-)
Scalar functions still have to process the elements, so their timing
can't go to zero, but at least the should save the time spent setting
up the result array and creating new values.

Please let me know what you think.

Elias

From:	Elias Mårtenson
Subject:	Re: [Bug-apl] Potential performance optimisation tested
Date:	Sat, 15 Feb 2014 19:54:12 +0800