help-octave

Re: How to load data files with mixed str & numerical data with headers


From: Philip Nienhuis
Subject: Re: How to load data files with mixed str & numerical data with headers "*.txt.gz" or "*.xml.tgz"
Date: Mon, 20 Aug 2012 15:45:52 -0700 (PDT)

Shoumei wrote
> 
>  http://octave.1599824.n4.nabble.com/file/n4642993/GSE1-2.txt GSE1-2.txt 
> The extracted data files are text files with the suffix ".soft". I need to
> extract from these files the data matrices, usually repeated 44Kxn
> tables/matrices that start with the line ' !series_matrix_table_begin' and
> end with ' !series_matrix_table_end'. The string data are 44Kx1 and the
> numerical data are 44Kx(n-1). I could possibly ignore the other ! comments.
> I know the exact matrix size of each repeated matrix.
> The attached example file includes data for two samples (GSM1 & GSM2), each
> with a 2x2 matrix.
> 

That file format looks easy to read. Dataloggers we use at work yield more
or less the same file structure (header followed by data sections between
"something-like-begin-data" and "something-like-end-data" lines) and we have
several Matlab scripts for reading those into a struct. 
The file header is parsed line by line into separate fields, the data
sections into dedicated data fields (often numeric arrays, sometimes cell
arrays).
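
As a rough illustration of that approach (not the scripts mentioned above), a
section reader could look like the sketch below. It assumes tab-separated
lines with the ID in the first field and numbers in the remaining fields, no
column-header row inside a section, and the same number of fields on every
row. The function name read_marked_sections and the field names "ids" and
"data" are made up for this example.

```octave
function sections = read_marked_sections (fname, begin_tag, end_tag)
  % Collect every block of lines found between begin_tag and end_tag into a
  % struct array with fields "ids" (cellstr) and "data" (numeric matrix).
  % All lines outside such blocks (e.g. other "!" comments) are ignored.
  sections = struct ("ids", {}, "data", {});
  fid = fopen (fname, "r");
  if (fid < 0)
    error ("read_marked_sections: cannot open %s", fname);
  end
  in_table = false;
  ids = {};
  rows = {};
  line = fgetl (fid);
  while (ischar (line))              % fgetl returns -1 at end of file
    txt = strtrim (line);            % tolerate leading/trailing blanks
    if (strcmp (txt, begin_tag))
      in_table = true;
      ids = {};
      rows = {};
    elseif (strcmp (txt, end_tag))
      if (in_table)
        sections(end+1) = struct ("ids", {ids}, "data", vertcat (rows{:}));
      end
      in_table = false;
    elseif (in_table && ~isempty (txt))
      fields = strsplit (txt, char (9));            % assumption: tab-separated
      ids{end+1, 1} = strrep (fields{1}, '"', '');  % drop quotes around the ID
      rows{end+1, 1} = str2double (fields(2:end));  % non-numeric text -> NaN
    end
    line = fgetl (fid);
  end
  fclose (fid);
end
```

Called with the file name and the two marker strings from the question, this
should return one struct element per data section. Since the matrix sizes are
known in advance, preallocating the arrays instead of growing them row by row
would probably be noticeably faster for 44K rows.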



> The complete txt file could not be loaded in excel or word. 
> 

What MS-Office version did you use?
As I wrote, LibreOffice Calc 3.4 and later versions should be able to read
very large files, although your computer's RAM may be a limiting factor.



> The other tables in the sample file, between "!platform_table_begin" and
> "!platform_table_end", hold info related to each ID/string. I can get this
> info from other files, so they are ignored for the time being.
> When I dealt with small txt files I usually opened them in Excel and saved
> the strings (IDs) as csv files and the numerical data as txt files. I had
> trouble comparing the strings with other string files if I saved the whole
> file as csv and used "csv2cell" to load the data.
> 

You should save only the individual data sections as .csv and read them with
csv2cell. The sections can then be collected into a struct, for example.
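
A minimal sketch of that idea, assuming the io package from Octave Forge is
installed (csv2cell is part of it), that each .csv holds exactly one data
section with the ID strings in column 1 and only numbers in the other
columns, and that no fields are empty. The function name
collect_csv_sections and the struct field names are invented for this
example.

```octave
function samples = collect_csv_sections (files)
  % files: cellstr with one CSV file name per exported data section.
  % Returns a struct array with one element per file.
  pkg load io    % csv2cell comes with the Octave Forge io package
  samples = struct ("name", {}, "ids", {}, "data", {});
  for k = 1:numel (files)
    c = csv2cell (files{k});              % cell array of strings and numbers
    [~, name] = fileparts (files{k});     % file name without dir/extension
    samples(end+1) = struct ("name", name, ...
                             "ids", {c(:,1)}, ...
                             "data", cell2mat (c(:,2:end)));
  end
end
```

Keeping the IDs as a cellstr and the values as a plain numeric matrix means
the IDs can be compared with strcmp or ismember, while the numbers stay
usable for arithmetic without further conversion.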



> I was not able to use "xlsread" with mixed string and numerical xls data
> initially, so I did not persist in using it.
> 

Remarkable... spreadsheets are made for just that. But they are usually
fairly inefficient as far as RAM usage is concerned (because of, among other
things, the formatting overhead).

Philip



--
View this message in context: 
http://octave.1599824.n4.nabble.com/How-to-load-data-files-with-mixed-str-numerical-data-with-headers-txt-gz-or-xml-tgz-tp4642969p4642997.html
Sent from the Octave - General mailing list archive at Nabble.com.

