Re: Import large field-delimited file with strings and numbers

From: Philip Nienhuis
Subject: Re: Import large field-delimited file with strings and numbers
Date: Mon, 8 Sep 2014 09:49:36 -0700 (PDT)

Joao Rodrigues wrote
>>> I need to import a large CSV file with multiple columns with mixed
>>> string and number entries, such as:
>>> field1, field2, field3, field4
>>> A,        a,        1,       1.0,
>>> B,        b,        2,        2.0,
>>> C,        c,        3,        3.0,
>>> and I want to pass this on to something like
>>> cell1 ={[1,1] = A; [2,1] = B; [3,1] = C};
>>> cell2 ={[1,1] = a; [2,1] = b; [3,1] = c};
>>> arr3 =[1 2 3]';
>>> arr4 =[1.0 2.0 3.0]';
>>> furthermore, some columns can be ignored, the total number of entries is
>>> known and there is a header.
> <snip>
> Yet, csv2cell is orders of magnitude faster. I will break the big file 
> into chunks (using fileread, strfind to determine newlines and fprintf) 
> and then apply csv2cell chunk-wise.

Why do you need to break it up for csv2cell? AFAICS it reads the entire
file and directly translates the data into "values" in the output cell
array, using very little temporary storage.
It does read the entire file twice: once to assess the required dimensions
for the cell array, and a second (more intensive) pass to actually read
the data.
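Reading the whole file in one go and then splitting off the columns from the example would be a one-liner per column; a hedged sketch (file name assumed, and indexing from row 2 to skip the header):

```octave
pkg load io                       % for csv2cell
c = csv2cell ("data.csv", ",");   % numeric fields come back as numbers
cell1 = c(2:end, 1);              % string column: {"A"; "B"; "C"}
cell2 = c(2:end, 2);              % string column: {"a"; "b"; "c"}
arr3  = cell2mat (c(2:end, 3));   % numeric column as a vector: [1; 2; 3]
arr4  = cell2mat (c(2:end, 4));   % [1.0; 2.0; 3.0]
```

Unwanted columns can simply never be indexed, since the whole table is already in memory as one cell array.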

BTW, on my TODO list is an old (2+ years?) entry: adding a "headerlines"
parameter to csv2cell... (but OK, my TODO list is looong)
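Until such a "headerlines" parameter exists, one workaround (my suggestion, not from the post) is to read everything and discard the header rows from the returned cell array afterwards:

```octave
pkg load io
c = csv2cell ("data.csv");    % file name assumed
headerlines = 1;              % number of header rows -- an assumption
c(1:headerlines, :) = [];     % drop them after the fact
```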

