help-octave

How to import string and numerical data with arbitrary number of columns


From: João Rodrigues
Subject: How to import string and numerical data with arbitrary number of columns
Date: Sun, 01 Sep 2013 12:30:01 +0100

Hi

I need to import a series of data files with several string columns and one final column with floats. E.g.,

str1 str2 1.234
str3 str4 5.678
...

[a,b,c] = textread(filename,"%s %s %f") looks like a good idea, but the number of string columns differs from file to file and there are many such files. Writing a separate textread statement (with the appropriate number of output variables and format) for each file is out of the question.

I can do this in several ways, but they all take a lot of time. (There may be some spelling errors in the code below, I just wanted to give the general idea.)

Let nstr be the number of string columns of the current file:

%%%%%%%%%%%

1) Using C-style file input:

    fid = fopen(filename);
    k = 0;
    while true
        tmp = fscanf(fid,"%s","C");
        if isempty(tmp)   % no token left (guards against a trailing newline)
            break;
        endif
        k = k + 1;
        strres{k,1} = tmp;
        for i = 2 : nstr
            strres{k,i} = fscanf(fid,"%s","C");
        endfor
        numres(k,1) = fscanf(fid,"%f","C");
    endwhile
    fclose(fid);

    (uses explicit loops, which are slow in Octave)

%%%%%%%

2) Using eval:

    tmpstr1 = "[";
    tmpstr2 = "";
    for i = 1 : nstr
        tmpstr1 = [tmpstr1,"tmp",num2str(i),","];   % output list: tmp1,tmp2,...
        tmpstr2 = [tmpstr2,"%s "];                  % one %s per string column
    endfor
    tmpstr1 = [tmpstr1,"numres]"];                  % last output is the float column
    tmpstr2 = [tmpstr2,"%f"];
    eval([tmpstr1,"=textread(filename,tmpstr2);"]);

    then use eval again to collect tmp1, tmp2, ... into strres.

(The point is to first use a loop to generate the format string and the list of output variables, then call textread inside eval.)
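
A possible way to avoid eval here is to let a cell array collect the outputs through a comma-separated list. The following is only a sketch: the sample file and the names fmt and out are made up for illustration, and it assumes textread is available (on newer Octave releases it may have to be loaded from the io package).

```octave
% Write a small sample file so the sketch is self-contained (hypothetical data).
filename = tempname();
fid = fopen(filename, "w");
fprintf(fid, "str1 str2 1.234\nstr3 str4 5.678\n");
fclose(fid);

nstr = 2;                                % number of string columns
fmt = [repmat("%s ", 1, nstr), "%f"];    % builds "%s %s %f"

out = cell(1, nstr + 1);                 % one slot per column
[out{:}] = textread(filename, fmt);      % out{:} expands to nstr+1 outputs
delete(filename);

strres = [out{1:nstr}];                  % N x nstr cell array of strings
numres = out{nstr + 1};                  % N x 1 numeric vector
```

Because the number of outputs is taken from the size of out, the same two lines work for any nstr.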

%%%%%%%%

3) Using reshape and cellfun:

    tmp = textread(filename,"%s");
    tmp = reshape(tmp,[nstr+1,length(tmp)/(nstr+1)])';

    numres = cellfun(@str2num,tmp(:,nstr+1));
    strres = tmp(:,1:nstr);

(Read all the data as a cell vector of strings, then reshape it and cut out the numerical column. The cellfun operation takes a lot of time.)
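
One thing that might reduce the conversion cost is str2double, which accepts a whole cell array of strings in a single call. This is a sketch on made-up data standing in for the reshaped tmp; whether it is fast enough on the real files is untested.

```octave
% Hypothetical sample standing in for the reshaped cell array tmp (nstr = 2).
nstr = 2;
tmp = {"str1", "str2", "1.234"; "str3", "str4", "5.678"};

% str2double converts every element of the cell column in one call,
% so no cellfun/str2num round trip is needed.
numres = str2double(tmp(:, nstr+1));
strres = tmp(:, 1:nstr);
```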

%%%%%%%%%%

What I really wanted was an alternative to textread that would do something like:

res = textread(filename,"%s %s %f")

and would create res as a cell array whose columns are the different objects returned by textread (in this case two cell vectors of strings and one numerical vector).
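
textscan appears to come close to this: it returns a single cell array with one entry per format field, and the format string can be built from nstr. A minimal sketch, assuming whitespace-separated columns; the sample file and the name fmt are invented for the example.

```octave
% Write a small sample file so the sketch is self-contained (hypothetical data).
filename = tempname();
fid = fopen(filename, "w");
fprintf(fid, "str1 str2 1.234\nstr3 str4 5.678\n");
fclose(fid);

nstr = 2;                                % number of string columns in this file
fmt = [repmat("%s ", 1, nstr), "%f"];    % builds "%s %s %f"

fid = fopen(filename, "r");
res = textscan(fid, fmt);                % 1 x (nstr+1) cell, one entry per column
fclose(fid);
delete(filename);

strres = [res{1:nstr}];                  % N x nstr cell array of strings
numres = res{nstr + 1};                  % N x 1 numeric vector
```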

Can anyone suggest a faster and cleaner method?
Thanks
Joao


