Re: How do you select only specific rows based on the values in a specific column?
Sun, 26 Oct 2014 09:40:27 +0100
On 10/26/2014 02:29 AM, Thompson, Robert M - (rmt1) wrote:
I may be mistaken, but I don't think what you want to do is possible.
Either you import the whole lot and then let Octave parse the content,
which is fast but requires loading everything into memory, or you import
one row at a time C-style (fopen, fscanf, fclose) and test it, which has
no memory overhead but is very slow.
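For illustration, the row-at-a-time approach might look roughly like the sketch below. It uses fgetl and sscanf rather than a bare fscanf, writes a tiny temporary sample file so it can run on its own, and borrows the thresholds (column 1 > 0.2, column 2 >= 89) from the question. The names fname, txt, row and kept, and all sample rows except the first, are made up for this example.

```octave
% Write a tiny sample file so the sketch is self-contained.
% fname is a hypothetical temporary file, not the real data file.
fname = tempname();
fid = fopen(fname, "w");
fprintf(fid, "0.015625 89.996094 0.018000\n");
fprintf(fid, "0.250000 88.500000 0.020000\n");
fprintf(fid, "0.300000 89.500000 0.019000\n");
fclose(fid);

kept = zeros(0, 3);            % rows that pass the test
fid = fopen(fname, "r");
while true
  txt = fgetl(fid);            % one line of text, or -1 at end of file
  if ~ischar(txt)
    break;
  end
  row = sscanf(txt, "%f %f %f")';   % 1x3 row vector
  % Keep the row only if column 1 > 0.2 and column 2 >= 89.
  if numel(row) == 3 && row(1) > 0.2 && row(2) >= 89
    kept(end+1, :) = row;
  end
end
fclose(fid);
delete(fname);
```

Only the rows that pass the test are ever stored, which is why memory use stays low; the price is one interpreted loop iteration per line.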
I have a huge source file of a million lines, like: (cartographic data)
0.015625 89.996094 0.018000
0.046875 89.996094 0.018000
0.078125 89.996094 0.018000
I was using C to pare the source file down into a smaller file based on values
in first and second column.
The evaluation was like, e.g., keep this row if column 1 is greater than 0.20000
and column 2 is >= 89.00000.
But now I want to cut out the C middleman and import the million-line source
file directly into Octave.
But also select only the rows with first or second columns matching criteria,
before I consume great amounts of memory on records I will not be using.
If what you have is a million rows, I would go for the first option.
C-style reading is only worth it if the file is small. Octave has many
import functions, each suited to a particular context. If your file
contains only numerical data in ASCII, then I would use:
a = dlmread(XYZ);
If that takes a long time, try breaking the original file into chunks
and importing them one at a time, or try other import functions (check
the io package; I found that csv2cell was amazingly fast).
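As a rough sketch of the chunked idea, assuming a three-column numeric file like the sample in the question, fscanf can read a fixed number of rows per call so that each block is filtered before the next one is read. The chunk size, the temporary file, the variable names and the thresholds (taken from the question's example) are all illustrative assumptions.

```octave
% Write a tiny sample file so the sketch is self-contained.
% fname is a hypothetical temporary file, not the real data file.
fname = tempname();
fid = fopen(fname, "w");
fprintf(fid, "0.015625 89.996094 0.018000\n");
fprintf(fid, "0.250000 88.500000 0.020000\n");
fprintf(fid, "0.300000 89.500000 0.019000\n");
fclose(fid);

chunk = 2;                     % rows per read; something much larger in practice
kept = zeros(0, 3);
fid = fopen(fname, "r");
while ~feof(fid)
  % Read up to 3*chunk numbers; fscanf fills a 3-by-k matrix column by
  % column, so each column is one file row. Transpose to get k-by-3.
  block = fscanf(fid, "%f", [3, chunk])';
  if isempty(block)
    break;
  end
  % Filter this block before reading the next one.
  mask = block(:, 1) > 0.2 & block(:, 2) >= 89;
  kept = [kept; block(mask, :)];
end
fclose(fid);
delete(fname);
```

Only one block plus the rows kept so far is held in memory at any time, which sits between the two extremes of reading everything and reading one row at a time.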
After a is loaded into Octave, use Doug's suggestion to keep only
the desired rows.
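Doug's suggestion is not reproduced in this message; one common way to do the selection is logical indexing, sketched below under that assumption. A small literal matrix stands in for the result of dlmread, with made-up values apart from the first row, and the thresholds follow the example in the question.

```octave
% Stand-in for the matrix returned by dlmread (illustrative values).
a = [0.015625 89.996094 0.018000;
     0.250000 88.500000 0.020000;
     0.300000 89.500000 0.019000];

% Logical mask: true where column 1 > 0.2 and column 2 >= 89.
keep = a(:, 1) > 0.2 & a(:, 2) >= 89;

% Keep only the matching rows, with all three columns.
a = a(keep, :);
```

Replacing & with | in the mask would keep rows where either column matches instead of both.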