[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Performance optimization (allocation inside a for loop)

From: Jaroslav Hajek
Subject: Re: Performance optimization (allocation inside a for loop)
Date: Sat, 4 Apr 2009 08:47:03 +0200

On Thu, Apr 2, 2009 at 10:06 PM, John W. Eaton <address@hidden> wrote:
> On  2-Apr-2009, r wrote:
> | I think chunks of ~5% of the structure size (perhaps more for smaller
> | arrays) would be affordable by anyone and would effectively fix these
> | allocation issues in practical applications. Whether it is worth the
> | effort is a different question.
> If you think this is important, how about submitting a patch?
> jwe

I gave the idea a second thought - maybe it has some merits. Sometimes
you really want to use an array as a stack in Octave to avoid going
through a loop twice, so Octave can try to optimize such usage.

For everyone interested, try the attached patch.

Summary: The operations

array(end+1) = something (push)


array(end) = []  (pop)

are optimized (the usage of "end" is not necessary), reusing the
existing mechanism for shallow slices.
When a "push" operation is detected, existing over-allocated space
will be reused if possible, otherwise a new array will be
over-allocated twice or by 1024 elements, whichever is smaller. The
"pop" operation simply adjusts the array size.
Since the existing shallow slicing mechanism is reused, no extra data
is introduced. Also, economization will happen as with orphaned
shallow slices, so that the array will be automatically stripped off
any possible over-allocated space when you store it into a variable,
cell or struct element, or return it from a function.
Hence the dangling memory will normally only exist locally.
To facilitate this change, economization no longer happens on every
assignment (which is probably not important anyway).

Here's a simple benchmark:

for i = 1:1e5
  a(i) = i;
a = [];
for i = 1:1e5
  a(i) = i;
  a(i) = [];
  a(i) = i+1;

with recent tip:

Elapsed time is 11.0845 seconds.
Elapsed time is 16.2406 seconds.

with current patch:

Elapsed time is 0.380752 seconds.
Elapsed time is 1.01877 seconds.

a final question: maybe the maximum over-allocation size should be configurable?


RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic

Attachment: stack.diff
Description: Text Data

reply via email to

[Prev in Thread] Current Thread [Next in Thread]