bug-apl

Re: [Bug-apl] Performance issue when manipulating arrays


From: Elias Mårtenson
Subject: Re: [Bug-apl] Performance issue when manipulating arrays
Date: Fri, 25 Apr 2014 14:01:46 +0800

Actually, no, I don't do that. I only resize the array once every 1000 lines (configurable). Also, the time is not spent there.

As I mentioned, I ran it under Callgrind, and the time spent allocating arrays is actually minimal. What does take time is the 2.2 billion cell allocations and the 50 million calls to Value::clone(). Most of these calls clone a value that is immediately discarded afterwards.

The solution is to avoid cloning values that are never stored (that's the core of the "temp" idea). Right now the temp system is only used in a few very specific cases, but once it can be applied to Value::clone() we'll see the big performance boost.

Regards,
Elias


On 25 April 2014 13:53, David B. Lamkins <address@hidden> wrote:
Given a quick read, I get the impression that you're still incrementally
extending the length of the result. This is, by definition, an O(n^2)
operation. There's a lot of catenation in your code; that'll almost
certainly involve copying.

Try this instead:

1. Get the size of the file to be read.
2. Preallocate a vector large enough to hold the entire file.
3. Read the file (I'm assuming that the file_io won't let you read it
all at one go) by chunks and use indexed assignment to copy each chunk
into its position in the preallocated vector.



On Fri, 2014-04-25 at 00:21 +0800, Elias Mårtenson wrote:
> In writing a function that uses lib_file_io to load the content of an
> entire file into an array of strings, I came across a bad performance
> problem that I am having trouble narrowing down.
>
>
> Here is my current version of the
> function: https://github.com/lokedhs/apl-tools/blob/e3e81816f3ccb4d8c56acc8e4012d53f05de96d6/io.apl#L8
>
>
> The first version did not do blocked reads and resized the array after
> each row was read. That was terribly slow, so I preallocate a block of
> 1000 elements, and resize every 1000 lines, giving the version you can
> see linked above.
>
>
> I was testing with a text file containing almost 14000 rows, and on my
> laptop it takes many minutes to load the file. One would expect loading
> such a small file to take no noticeable time at all.
>
>
> One interesting aspect of this is that it takes longer and longer to
> load each row as the loading proceeds. I have no explanation for why
> that is the case. It's not the resizing that takes time; I was
> measuring the time taken to load a block of rows excluding the array
> resize.
>
>
> Any ideas?
>
>
> Regards,
> Elias


