[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Performance optimization (allocation inside a for loop)

From: r
Subject: Re: Performance optimization (allocation inside a for loop)
Date: Thu, 2 Apr 2009 00:18:28 +0100

This is a general problem. It doesn't just affect scripting languages
- you may hit this issue equally well in C/C++.

A common strategy is to allocate a bit more space than currently
needed, so that malloc/realloc doesn't need to be called every single
time the user appends data to the structure. Even better, this margin
can be made dependent on the current structure size (e.g. proportional
to n or to log(n)). This second scheme results in better
performance/memory occupation ratio.

I'm (now) perfectly OK with preallocating memory manually, but it took
me a lot of time to figure out where the bottleneck is. BTW, low "for"
loop performance is commonly blamed on the lack of a JIT compiler, but
I wouldn't be surprised if at least some of the test cases were
actually suffering from issues like memory allocation. After all, JIT
compilers don't typically speed up evaluation by more than a constant


On Wed, Apr 1, 2009 at 10:35 PM, James Sherman Jr. <address@hidden> wrote:
> I'm sure someone will answer this in a more technically precise way, but I
> believe the answer is, in short, no.  Essentially since octave is a script,
> there is no way for octave to "know" that you are going to eventually want a
> vector of size n.  It basically sees (in the first case)
> allocate room for a vector of size 1
> write vector of size 1
> allocate room for a vector of size 2
> copy previous vector
> write new value at end of vector
> and so on.  The second case is:
> allocate room for a vector of size n
> write value in first element
> write value in second element
> and so on.  At least this is my understanding.
> James
> On Wed, Apr 1, 2009 at 4:15 PM, r <address@hidden> wrote:
>> I've recently hit a performance problem with a "for" loop producing
>> vectors of data. Consider the following (deliberately simple) example:
>> function retval = test(n)
>>  # retval = zeros(1, n);
>>  for n = [1:n]
>>    retval(n) = n;
>>  endfor
>> endfunction
>> octave:26> tic;test(10000);toc
>> Elapsed time is 0.8 seconds.
>> octave:27> tic;test(100000);toc
>> Elapsed time is 72 seconds.
>> So the complexity is O(n^2).
>> The same function with a preallocated retval vector:
>> function retval = test2(n)
>>  retval = zeros(1, n);
>>  for n = [1:n]
>>    retval(n) = n;
>>  endfor
>> endfunction
>> has a complexity of O(n):
>> octave:29> tic;test2(10000);toc
>> Elapsed time is 0.16 seconds.
>> octave:30> tic;test2(100000);toc
>> Elapsed time is 1.9 seconds.
>> Is it possible to adjust the Octave's allocation algorithm so that it
>> could allocate larger chunks of data (or growing chunks of data)?
>> Regards,
>> -r.
>> _______________________________________________
>> Help-octave mailing list
>> address@hidden

reply via email to

[Prev in Thread] Current Thread [Next in Thread]