


From: Philip Nienhuis
Subject: [Octave-bug-tracker] [bug #53459] io package (2.4.10), function xlsread, results in unexpected error reading Excel (xlsx) file
Date: Sun, 25 Mar 2018 16:46:28 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0 SeaMonkey/2.48

Follow-up Comment #7, bug #53459 (project octave):

@Markus:
Good that you could help Werner (= anonymous?), thanks.

In the .xlsx archive I couldn't find any logic for connecting the various
chart- and worksheets to the actual sheets, nor any clues about their content,
other than the order in [Content_Types].xml.  xl/workbook.xml contains the
links but not the sheet types.
So I guess a dive into the ECMA specs is required to find out how to do it the
right way. AFAICS a large part of the logic had better be implemented in
__OCT_spshopen__.m

OTOH your patch may be good enough. Resolving the sheets through their r:id's
may work, but I'd like to be sure; in the past I've tried several approaches
and none of them proved robust.
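
For reference, a rough sketch of the r:id route, written in Python rather
than Octave and not taken from the io package or the patch. It assumes the
usual (transitional) OOXML layout, in which xl/_rels/workbook.xml.rels maps
each r:id to a target part and a relationship Type ending in "worksheet" or
"chartsheet"; the function name list_sheets is only illustrative, and files
using the strict namespaces would need different namespace URIs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs of transitional OOXML (assumption: not a "strict" file).
NS_MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS_DOCREL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
NS_PKGREL = "http://schemas.openxmlformats.org/package/2006/relationships"


def list_sheets(xlsx):
    """Return (name, kind, target) tuples in workbook order.

    `xlsx` is a file name or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx) as zf:
        workbook = ET.fromstring(zf.read("xl/workbook.xml"))
        rels = ET.fromstring(zf.read("xl/_rels/workbook.xml.rels"))

    # Map relationship Id -> (last path segment of Type, target part).
    by_id = {}
    for rel in rels.findall("{%s}Relationship" % NS_PKGREL):
        kind = rel.get("Type", "").rsplit("/", 1)[-1]
        by_id[rel.get("Id")] = (kind, rel.get("Target"))

    # Walk the <sheet> entries in workbook order and look up each r:id.
    sheets = []
    for sheet in workbook.iter("{%s}sheet" % NS_MAIN):
        rid = sheet.get("{%s}id" % NS_DOCREL)
        kind, target = by_id.get(rid, ("unknown", None))
        sheets.append((sheet.get("name"), kind, target))
    return sheets
```

Called on a workbook, this would return one tuple per sheet, so chartsheets
could be skipped by checking the second element. Whether that is robust
across files written by other applications is exactly what still needs
checking against the specs.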

Anyway I have no time until after Easter to have a good look at this. 


Note that xlsfinfo.m also chokes on the file.
And once again: care is needed when writing the files out properly, especially
when updating the various internal pointers.


@comment #6:
.ods is a very inefficient and non-linear file format. Parsing it with regexps
isn't doable, so it is parsed/read row by row, cell by cell, unlike .xlsx where
entire worksheets can be parsed with just one regexp. So little wonder that
.xlsx is so much faster.
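
To illustrate the "one regexp per worksheet" idea, here is a minimal sketch
in Python. It is not the regexp the io package actually uses. It assumes
cells are written as <c r="A1" ...><v>...</v></c> with the r attribute first,
and it only picks up numeric cells; shared strings, booleans and errors
(cells with a t="..." type other than "n") are skipped. The names CELL_RE and
numeric_cells are made up for this example.

```python
import re

# One pattern for a whole sheet: cell reference, remaining attributes,
# an optional formula element, then the cached value.
CELL_RE = re.compile(
    r'<c r="([A-Z]+[0-9]+)"([^>/]*)>(?:<f[^>]*>[^<]*</f>)?<v>([^<]*)</v>'
)


def numeric_cells(sheet_xml):
    """Return {cell reference: float value} for the numeric cells found."""
    cells = {}
    for ref, attrs, value in CELL_RE.findall(sheet_xml):
        if ' t="' in attrs and ' t="n"' not in attrs:
            continue  # strings, booleans, errors: not handled in this sketch
        cells[ref] = float(value)
    return cells
```

A single findall() pass over the sheet XML yields all matching cells at once,
which is the reason this style of parsing can be fast compared with walking
rows and cells one by one.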


[A little OT]:
During OctConf I talked to a few other developers and I got some support for
my opinion:
* There are multiple mature options for reading/writing spreadsheet files;
they need only a little glue to get them working in Octave. In fact all of the
reliable and supported ones have long been implemented in the io package.
What they have in common is that they all depend on a Java JRE;
* The only reason to invest in the OCT interface is that many core devs don't
want a dependency on Java;
* Which implies that much developer time has to be spent on reinventing the
wheel, i.e. parsing and digging around in the low-level innards of .xlsx and
.ods archives (gnumeric is relatively simple).

I sometimes think that the OCT interface had better be limited to simple .xlsx
and .ods files and more demanding users had better be referred to the
available Java-based interfaces.

None of which means that I'm against the OCT interface. But maintaining it
gets harder and harder.


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?53459>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
