[Octave-bug-tracker] [bug #51871] loading '-ascii' format files is slow
From: |
Dan Sebald |
Subject: |
[Octave-bug-tracker] [bug #51871] loading '-ascii' format files is slow |
Date: |
Sat, 25 Nov 2017 15:04:32 -0500 (EST) |
User-agent: |
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0 |
Follow-up Comment #42, bug #51871 (project octave):
Attached are a few more incremental versions,
speed-up-load-ascii-v6.patch (uses istream is.get(), clear string)
speed-up-load-ascii-v6v2.patch (uses istream is.get(), overwrite string)
speed-up-load-ascii-v7.patch (uses istream is.read(,1))
The v6v2 version takes the same time as the v6 version, which confirms what
you said about the string clear() function being implemented fairly
efficiently. (I had actually done this comparison before your last post.)
Speed comparisons:
current octave: 3.7360, 3.7440
octave + speed-up-load-ascii-v5.patch: 1.3640
octave + speed-up-load-ascii-v6.patch: 2.1980
octave + speed-up-load-ascii-v7.patch: 3.1040
The above also agrees with your comment about the standard getline() being
optimal. Note above how version v5 is clearly much faster. No surprise, as
calling a routine get() for individual characters is going to have overhead.
But version v5 doesn't handle all EOL characters.
In version v7 I used read(,1) instead of get(). That is slower, and the
reason seems clear: although read(,1000) is much faster than calling get()
1000 times, read(,1) carries the extra overhead of a second argument (the
count) on every call, so it has to be slower than get().
So far, then, version v6 is the fastest version that still handles all EOL
conventions.
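
For readers without the patches at hand, the character-at-a-time idea can be
sketched roughly as follows. This is not the code from
speed-up-load-ascii-v6.patch; the names getline_any_eol and split_lines are
hypothetical, and the sketch only assumes that a line may end in "\n", "\r",
or "\r\n".

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not taken from the patches): read one line from `is`
// into `line`, accepting "\n", "\r", or "\r\n" as the line terminator.
std::istream& getline_any_eol (std::istream& is, std::string& line)
{
  line.clear ();
  char c;
  while (is.get (c))
    {
      if (c == '\n')
        return is;
      if (c == '\r')
        {
          // Swallow the '\n' of a "\r\n" pair, if present.
          if (is.peek () == '\n')
            is.get ();
          return is;
        }
      line.push_back (c);
    }
  // End of input: a final line without a terminator still counts as a line,
  // so drop failbit (eofbit stays set and the next call will fail).
  if (! line.empty ())
    is.clear (is.rdstate () & ~std::ios_base::failbit);
  return is;
}

// Hypothetical convenience wrapper used to exercise the helper.
std::vector<std::string> split_lines (const std::string& text)
{
  std::istringstream is (text);
  std::vector<std::string> lines;
  std::string line;
  while (getline_any_eol (is, line))
    lines.push_back (line);
  assert (is.eof ());
  return lines;
}
```

Every character costs one call to get(), which is the overhead discussed
above; the benefit is that mixed line endings in the same file are handled.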
I considered reading the data in bigger chunks and then searching for EOL
characters. That seems too clumsy, though, so I hesitate. I then thought of
going back to FILE * and lower-level C-like I/O, but that interferes far too
much with the other text/matrix/etc. code, which is based on istream objects.
The following reference suggests an option that seems much more
straightforward and efficient:
https://stackoverflow.com/questions/13995971/using-get-line-with-multiple-types-of-end-of-line-characters
The idea would be to create a "filter buffer stream" through which the
istream's buffer passes. That filter buffer stream would convert every 0x0A,
0x0D, and 0x0D-0x0A sequence to the native '\n' character; *then* we can use
getline() just as it is. That is,
istream is --> filter istream fs --> fs.getline()
That seems the most efficient and elegant solution, doesn't it? At least in
principle. I'm going to try coding that. If it doesn't work, then version v6
it is, I guess.
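
As a rough illustration of the filter idea (a sketch only, not the code I
intend to commit, and not the code from the Stack Overflow answer), one could
derive from std::streambuf and translate line endings while refilling the
buffer. The class name eol_filter_buf, the helper read_lines_filtered, and
the 4096-byte buffer size are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical filter: wraps another streambuf and presents every "\r",
// "\n", or "\r\n" in the source as a single '\n'.
class eol_filter_buf : public std::streambuf
{
public:
  static constexpr std::size_t buf_size = 4096;  // assumed chunk size

  explicit eol_filter_buf (std::streambuf *src) : m_src (src)
  {
    assert (src != nullptr);
  }

protected:
  int_type underflow () override
  {
    if (gptr () < egptr ())
      return traits_type::to_int_type (*gptr ());

    std::size_t out = 0;
    // Loop in case a chunk holds only the '\n' of a split "\r\n" pair.
    while (out == 0)
      {
        std::streamsize n = m_src->sgetn (m_buf, buf_size);
        if (n <= 0)
          return traits_type::eof ();
        // Translate in place; `out` never runs ahead of `i`.
        for (std::streamsize i = 0; i < n; ++i)
          {
            char c = m_buf[i];
            if (c == '\r')
              {
                m_buf[out++] = '\n';
                m_last_was_cr = true;   // remembered across chunks
              }
            else
              {
                if (c != '\n' || ! m_last_was_cr)
                  m_buf[out++] = c;     // skip the '\n' of "\r\n"
                m_last_was_cr = false;
              }
          }
      }
    setg (m_buf, m_buf, m_buf + out);
    return traits_type::to_int_type (*gptr ());
  }

private:
  std::streambuf *m_src;
  bool m_last_was_cr = false;
  char m_buf[buf_size];
};

// istream is --> filter istream fs --> getline on fs
std::vector<std::string> read_lines_filtered (std::istream& is)
{
  eol_filter_buf filter (is.rdbuf ());
  std::istream fs (&filter);
  std::vector<std::string> lines;
  std::string line;
  while (std::getline (fs, line))
    lines.push_back (line);
  return lines;
}
```

One caveat with this sketch: the filter reads ahead from the source in
chunks, so the position of the original istream no longer matches what has
been consumed through fs. Code that later reads from the original stream
directly would need to account for that.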
(file #42485, file #42486, file #42487)
_______________________________________________________
Additional Item Attachment:
File name: speed-up-load-ascii-v6.patch Size:11 KB
File name: speed-up-load-ascii-v6v2.patch Size:11 KB
File name: speed-up-load-ascii-v7.patch Size:11 KB
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?51871>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/