Re: xlsread in Octave 3.6.4

octave-maintainers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: xlsread in Octave 3.6.4

From:	Markus
Subject:	Re: xlsread in Octave 3.6.4
Date:	Sun, 03 Nov 2013 11:54:15 +0100
User-agent:	Roundcube Webmail/0.9.5

Am 2013-11-02 23:43, schrieb Philip Nienhuis:

Markus wrote:

Am 2013-11-02 21:11, schrieb Philip Nienhuis:

Hi Markus,

Markus wrote:
<snip>

I've pushed everything to my github repo[1] again (just the relevant
files for io).
hopefully i don't broke it now (That is quite possible) because icopy
and paste a lot.
Furthermore i've added some nice benchmarks at the end of the
README.md[2]

cheers
markus


[1] https://github.com/markuman/xlsxread
[2] https://github.com/markuman/xlsxread#benchmark
It seems you didn't fix the latest sources that I committed butrather
patched a local source?
But never mind, I again reformatted to Octave coding standards(wasn'tmuch work) and re-added the xls.limits assignment needed to avoidcell
errors in xls2oct.
I also had to (again) fix undefined references in __OCT_xlsx2oct__.m:
L. 117, 126: refs to "@__colnr", should be to "@__col_str_to_number"-
do you still have __colnr.m in your source tree?


uh, maybe you're right. i develope with single file xlsxread.m and i
guess i made often copy & paste errors.


As regards your remark "setup is a nightmare": All true.

But I made a function to make it easier for users, did you catchthat?

(chk_spreadsheet_support.m); it checks all interfaces and adds all
required Java classes to the javaclasspath. All you have to do is
download them, put them together into one subdir and tell it to
chk_spreadsheet_support.m.


yes, i did it with chk_spreadsheet_support. refering to
https://wiki.archlinux.org/index.php/Octave#Reading_Microsoft_Excel_Spreadsheets

but xlsread failed as long as JExcelAPI .jar file was added to java
path. even when i force another interface.


Hey.... I thought I had fixed that bug a while ago... I'll check later.

<snip>

Before hitting "Send": I'm really impressed. Your xlsxread stuff is
really blazingly fast. testOOXML sheet #2 takes 2.5 secs with OCT,
12.2 with POI, 34+ with UNO (LibreOffice 4.1.2) (all on WinXP)


and matlab run out of memory (on linux 8GB ram). this is the most
impressiv part for me.


On my Win7 64b 8 GB RAM box, Octave & COM (=Excel 2013) takes just 6,5
secs to read example.xlsx. I can't test xlsxread (I have no
Octave-3.6.4 installed there yet); but UNO (LibreOffice 4.0) took a
leisurely 780 secs (15 mins). (Java) POI bailed out (a bug in old
io-1.2.2 I think).

xlsxread test would be realy interesting on the same PC (What CPU do youhave in this PC?).

next weeks i'll try implement to read just a range from xlsx files.


I figured that a temporary kludge could be to just read all data
(xlsxread is fast enough), and then just select the requested
rectangle from the full data matrix.

This way is the fastest for matlab on linux too and it could beinternaly managed the same way in xlsxread.But there is one way to do it even faster. A function which generate acorresponding regexp pattern. There are some python codes in the wwwdoing this. I start month ago such function (__getrange.m attached), butit failed (just valid for a range between A1 to Z9), is very worse andnot trivial (for me).

BYW I think reading strings would also be nice. I already had a brief
look on how to do that (it is a 2 or 3-step procedure as the strings
are kept in some other sheet). Vectorizing the procedure will be the
real challenge.

Step 1 is to identifiy the string position (It's one more regexp line iguess). Step 2 is reading all strings (the other xml file) and fill theoutput cell at the given position with the strings. I would do this withregexp again :)


The io package spreadsheet I/O stuff supports a SPSH_OPTS argument
(actually a struct with fields that can be true (1) or false (0)). For
strings we can add a new field "oct_get_strings" or so and in the
_OCT_xlsx2oct__.m file we can have some if-endif's that test it and if
"yes"/true also read the strings. That way we can keep the current
speed if only numeric data are requested.

I'll take a look into it in the next weeks. I have to spend more timefor university stuff at the moment.



cheers
markus

--
https://github.com/markuman
Jabber: address@hidden

__getrange.m
Description: Text document

[Prev in Thread]

Current Thread

[Next in Thread]

Re: xlsread in Octave 3.6.4, Philip Nienhuis, 2013/11/01
- Re: xlsread in Octave 3.6.4, Markus, 2013/11/01
- Re: xlsread in Octave 3.6.4, Philip Nienhuis, 2013/11/02
  - Re: xlsread in Octave 3.6.4, Markus, 2013/11/02
    - Re: xlsread in Octave 3.6.4, Philip Nienhuis, 2013/11/02
    - Re: xlsread in Octave 3.6.4, Markus <=
    - Re: xlsread in Octave 3.6.4, Philip Nienhuis, 2013/11/03
    - Re: xlsread in Octave 3.6.4, Markus, 2013/11/08

Prev by Date: Re: 'make check' errors
Next by Date: Re: 'make check' errors for Mac OS X
Previous by thread: Re: xlsread in Octave 3.6.4
Next by thread: Re: xlsread in Octave 3.6.4
Index(es):
- Date
- Thread