Re: New importdata function testing
From: Rik
Subject: Re: New importdata function testing
Date: Mon, 22 Oct 2012 08:05:18 -0700
On 10/22/2012 05:51 AM, Jordi Gutiérrez Hermoso wrote:
> On 21 October 2012 12:07, Rik <address@hidden> wrote:
>> 10/20/12
>>
>> Erik,
>>
>> I did just a small test with importdata and it doesn't seem to work as
>> expected.
>>
>> For a file, I used import.tst containing
>>
>> 1,2,3
>> 4,5,6
>>
>> And then in Octave, I used
>> importdata ('import.tst', ',')
>> warning: unrecognized escape sequence '\S' -- converting to 'S'
> Oops, my bad:
>
> http://hg.savannah.gnu.org/hgweb/octave/rev/9a455cf96dbe#l2.365
>
>> I am also concerned that the implementation reads the entire file into a
>> string and then uses a number of for loops and regexp which will be slow in
>> Octave. I did a benchmark with the following:
>>
>> x = rand (1e4, 10);
>> dlmwrite ('tst.csv', x, ',')
>> tic; y = dlmread ('tst.csv', ','); toc
>> Elapsed time is 0.209933 seconds.
>> tic; y = importdata ('tst.csv', ','); toc
>> Elapsed time is 3.2 seconds.
>>
>> I believe it would be faster to have importdata check the header lines
>> only and then pass off the work to dlmread if possible. dlmread is written
>> in C++ and, per the benchmarking above, is very fast.
> It would be preferable if we could write some minimum common subset
> of this family of functions as a C++ function and leave the rest in
> m-file language. I consider writing code in C++ a last resort for
> optimisation at the very high cost of making the code less
> understandable for most people. Many of our users are scared by C++,
> but any Octave user understands the m-file language.
I think that is why my proposal makes sense. The header lines could be
parsed in an m-file, because there won't be much work to do there, and the
reading could then be passed off to dlmread, which is already a core Octave
and Matlab function. I don't propose writing any more C++ if it can be
avoided. On that note, there has been talk of having a C++ version of
textscan. When that is done, a number of these functions could switch to
relying on it, because it is the most general and can accept mixed numeric
and text data.
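
A rough sketch of the idea, to make it concrete. This is not the actual
importdata code; the function name read_delimited_sketch and the rule
"a line is header text until every field parses as a number" are
assumptions made only for illustration.

```octave
% Hypothetical helper: scan header lines in m-file code, then let
% dlmread do the numeric parsing. Not the real importdata implementation.
function [data, header] = read_delimited_sketch (fname, delim)
  fid = fopen (fname, "r");
  if (fid < 0)
    error ("read_delimited_sketch: cannot open %s", fname);
  endif

  header = {};
  while (true)
    txt = fgetl (fid);
    if (! ischar (txt))
      break;                       % end of file reached
    endif
    % Assumed heuristic: the first line whose fields all convert to
    % numbers starts the data. A data line containing an empty field or
    % a literal NaN would be misclassified as header text.
    fields = strtrim (strsplit (txt, delim));
    if (! any (isnan (str2double (fields))))
      break;
    endif
    header{end+1} = txt;
  endwhile
  fclose (fid);

  % dlmread takes zero-based row and column offsets, so passing the
  % header count skips exactly the header rows.
  data = dlmread (fname, delim, numel (header), 0);
endfunction
```

It would be called as [data, header] = read_delimited_sketch ('import.tst', ',').
Only the few header lines go through the interpreted loop; the bulk of the
file is read by dlmread, which is where the time was being spent in the
benchmark above.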
--Rik