From: Markus Bergholz
Subject: Re: xlsread in Octave 3.6.4
Date: Wed, 4 Sep 2013 02:54:07 +0200
<moved from help-octave to octave-maintainers ML>
Markus Bergholz wrote:
On Mon, Sep 2, 2013 at 11:38 AM, Markus Bergholz <address@hidden
<snip><snip, see thread on help-octave ML>
>>>> Markus Bergholz wrote
>>>> > I haven't followed this thread and its issue, but I've written an xlsxread
>>>> > function which doesn't need Java.
>>>> > But it's very, very rudimentary, works only on Linux and is a quick & dirty
>>>> > write-down.
>>>> > Furthermore, you have to remove the string-analysis part if your sheet
>>>> > doesn't contain strings.
>>>> > But maybe it helps someone else, or someone wants to improve it, or someone
>>>> > rewrites it in C/C++ as an oct file to make it even faster than Matlab (for me
>>>> > it's still faster than the Java stuff atm).
Good work Markus.
I've made a few quick and dirty changes, changed to the GPL licence and
committed the broken range part too.
https://github.com/markuman/xlsxread
It's now platform independent and, once again, faster than before
(~58 seconds). Now it's nearly twice as fast as Matlab (~110 seconds).
Enough time left to spend on ranges, strings etc. in the future.
Here comes version 0.6 - https://github.com/markuman/xlsxread
* strings and calculations are now replaced with NaN (without any speed
loss!)
* tested with an Excel 2007 and an Excel for Mac 2011 file (example files
are added)
* it now uses nested functions; this should make it easier to integrate
into octave-io
Ranges and empty columns still don't work!
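A minimal sketch of the "strings and calculations become NaN" idea, written in Python purely for illustration (this is not the xlsxread code itself). It assumes the usual SpreadsheetML layout, where a cell is a `c` element, its value sits in a `v` child, text cells carry a `t` attribute such as "s", and formula cells have an `f` child. The function name `cells_to_numbers` and the sample sheet are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical sample: one number, one shared string, one formula.
SAMPLE_SHEET = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1"><v>1.5</v></c>
      <c r="B1" t="s"><v>0</v></c>
      <c r="C1"><f>A1*2</f><v>3</v></c>
    </row>
  </sheetData>
</worksheet>"""


def cells_to_numbers(sheet_xml):
    """Map cell references to floats; text and formula cells become NaN."""
    root = ET.fromstring(sheet_xml)
    out = {}
    for c in root.iterfind(".//m:c", NS):
        ref = c.get("r")
        is_text = c.get("t") in ("s", "str", "inlineStr")
        has_formula = c.find("m:f", NS) is not None
        v = c.find("m:v", NS)
        if is_text or has_formula or v is None or v.text is None:
            out[ref] = math.nan
            continue
        try:
            out[ref] = float(v.text)
        except ValueError:
            # e.g. error cells whose value is not numeric
            out[ref] = math.nan
    return out


if __name__ == "__main__":
    print(cells_to_numbers(SAMPLE_SHEET))
```

Deciding per cell from the `t` attribute and the presence of `f` avoids a separate string-analysis pass, which may be one reason such a replacement costs little time.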
Anyway, sorry to come up with a few more potential gotchas:
- It would be interesting to know whether your code properly handles merged and hidden cells. I don't know what they look like in raw OOXML.
- Does OOXML have repeated-rows and repeated-columns "folding"?
E.g., ODS 1.2 has the table:number-rows-repeated and table:number-columns-repeated attributes.
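On the first question, a hedged sketch in Python (for illustration only): in SpreadsheetML, merged ranges are listed as `mergeCell` elements with a `ref` attribute inside `mergeCells`, hidden rows carry `hidden="1"` on the `row` element, and hidden columns carry it on a `col` element with `min`/`max` column numbers. The helper name `merged_and_hidden` and the sample sheet are hypothetical, and the sketch assumes every `row` has its `r` attribute, which the format does not strictly require.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical sample: columns B..C hidden, row 2 hidden, A1:B2 merged.
SAMPLE_LAYOUT = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <cols><col min="2" max="3" width="9" hidden="1"/></cols>
  <sheetData>
    <row r="1"><c r="A1"><v>1</v></c></row>
    <row r="2" hidden="1"><c r="A2"><v>2</v></c></row>
  </sheetData>
  <mergeCells count="1"><mergeCell ref="A1:B2"/></mergeCells>
</worksheet>"""


def _is_true(value):
    # The attribute is an XML boolean, so accept both spellings.
    return value in ("1", "true")


def merged_and_hidden(sheet_xml):
    """Return (merged ranges, hidden row numbers, hidden column spans)."""
    root = ET.fromstring(sheet_xml)
    merged = [mc.get("ref")
              for mc in root.iterfind("m:mergeCells/m:mergeCell", NS)]
    hidden_rows = [int(r.get("r"))
                   for r in root.iterfind("m:sheetData/m:row", NS)
                   if _is_true(r.get("hidden"))]
    hidden_cols = [(int(c.get("min")), int(c.get("max")))
                   for c in root.iterfind("m:cols/m:col", NS)
                   if _is_true(c.get("hidden"))]
    return merged, hidden_rows, hidden_cols


if __name__ == "__main__":
    print(merged_and_hidden(SAMPLE_LAYOUT))
```

A reader could use the merged ranges to decide whether to copy the top-left value across the range or leave the other cells empty; that policy is left open here.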
It would be really good to have a Java-free (and ActiveX-free) spreadsheet reading capability in Octave, even if only a basic one.
Sergei suggested a Perl-based solution; but Perl would still be a dependency, not all systems have Perl installed (e.g., Windows).
You've made a first try for OOXML; I have a basis for decoding ODS lying around, it doesn't work at all yet but might not need undue amounts of attention.
You made the vital piece: unzipping the spreadsheet file to disk.
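As an illustration of that step (a Python sketch, not the approach used in xlsxread): an .xlsx file is a zip archive, and worksheet parts normally live under `xl/worksheets/`. The sketch reads them straight from the archive in memory instead of unzipping to disk. The helper names and the tiny demo archive are hypothetical; the demo archive is not a valid workbook, it only mimics the member names.

```python
import io
import zipfile


def list_worksheets(xlsx_bytes):
    """Return the worksheet member names found in an .xlsx archive."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        return sorted(name for name in zf.namelist()
                      if name.startswith("xl/worksheets/")
                      and name.endswith(".xml"))


def read_member(xlsx_bytes, name):
    """Return one archive member as text, without touching the disk."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        return zf.read(name).decode("utf-8")


def make_demo_archive():
    """Build a tiny stand-in archive (NOT a valid workbook) for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
        zf.writestr("xl/sharedStrings.xml", "<sst/>")
    return buf.getvalue()


if __name__ == "__main__":
    data = make_demo_archive()
    for name in list_worksheets(data):
        print(name, len(read_member(data, name)))
```

For a real file, the bytes would come from something like `open(path, "rb").read()`; whether in-memory reading or unzipping to disk is faster for large sheets would have to be measured.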
For inclusion in the OF io package (at a later stage; first try to get your version robust and fail-safe) I'd suggest looking at how the various "interfaces" are built and called in the OF io package.
For follow-up I'd suggest moving this discussion to the maintainers list. I've swapped help-octave for octave-maintainers.
Oh BTW another idea (that I explored in 2009 but couldn't get to work at the time):
There is a binary (compiled) xmlread function; currently it is in the io package. Maybe with a proper "template" it could just read the worksheets into a struct in RAM, faster than regexp can decode them. The missing piece is the "template". (Sorry for my lack of XML proficiency & lingo.) Unfortunately that xmlread is very tersely, if not badly, documented.
However there are xml toolboxes around that could be gotten to work in Octave.
Philip