

From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #52116] Textscan filepointer bug is still present in Windows build
Date: Wed, 27 Sep 2017 05:52:21 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0

Follow-up Comment #10, bug #52116 (project octave):

I may have found the bug.  Please test the attached patch.

This very subtle change takes some thought.  As I mentioned, this delimited
stream has to "put back" data into the conventional input stream, i.e.,
readjust the pointer, when it is done and there is unused data in its local
buffer.  Currently that pointer is recorded from the input stream every time
some data is read from the input stream.  But that is not correct because the
input stream position is far ahead of the current delimited buffer's pointer,
on average.
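
To make the "put back" idea concrete, here is a minimal sketch (not the actual
delimited stream code).  It assumes a seekable std::istream; take_prefix and
its parameters are hypothetical names used only for illustration.  It reads a
chunk, hands back only the part that was used, and repositions the stream to
just past that part:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read up to `chunk` bytes, keep only the first `used`
// of them, and "put back" the rest by seeking the stream to start + used.
std::string take_prefix (std::istream& is, std::size_t chunk, std::size_t used)
{
  const std::streampos start = is.tellg ();   // position before the read
  assert (start != std::streampos (-1));      // stream must be seekable

  std::string buf (chunk, '\0');
  is.read (&buf[0], static_cast<std::streamsize> (chunk));
  const std::size_t got = static_cast<std::size_t> (is.gcount ());

  used = std::min (used, got);

  // A short read sets eofbit and failbit; clear them before seeking.
  is.clear ();
  is.seekg (start + static_cast<std::streamoff> (used));

  return buf.substr (0, used);
}
```

The key point is that the restore position is computed from where the read
started plus what was actually used, not from wherever the stream happens to
be after the read.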

As an example, let's say BUF_SIZE is N characters, something much larger
than the data expected to be read, e.g., 1024.  buf_in_file starts at 0, then
we read 8 bytes, then 6, 9 and 3.  The sequence of events would be


                   buf_in_file    iostream pos
initialize              0              0
refresh_buf             0              N
proc 8 bytes            8              N
refresh_buf             N              N+8
read 6 bytes            N+6            N+8
refresh_buf             N+8            N+8+6
read 9 bytes            N+8+9          N+8+6
refresh_buf             N+8+6          N+8+6+9
read 3 bytes            N+8+6+3        N+8+6+9
refresh_buf             N+8+6+9        N+8+6+9+3


At this point, we restore the stream position after having read only 26 bytes
(8+6+9+3).  But buf_in_file has a value of N+8+6+9, a value much greater than
26.  So that is where the error is.
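
The trace above can be reproduced with a small model.  This is only a sketch
of the bookkeeping as described in the table, not the Octave source; Trace,
simulate and file_len are made-up names, and the model assumes each refresh
pulls in exactly as many bytes as were just consumed:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the two columns in the trace.
struct Trace
{
  std::size_t buf_in_file;   // position the delimited stream would restore
  std::size_t stream_pos;    // position of the underlying input stream
};

Trace simulate (std::size_t buf_size, std::size_t file_len,
                const std::vector<std::size_t>& reads)
{
  Trace t {0, 0};
  bool at_eof = false;

  // Refill step: record the stream position, then pull in `want` bytes.
  auto refresh = [&] (std::size_t want)
  {
    if (at_eof)
      return;                        // nothing read, nothing recorded

    assert (t.stream_pos <= file_len);
    t.buf_in_file = t.stream_pos;    // the questionable bookkeeping
    const std::size_t got = std::min (want, file_len - t.stream_pos);
    t.stream_pos += got;
    if (got < want)
      at_eof = true;
  };

  refresh (buf_size);                // initial fill of the whole buffer
  for (std::size_t k : reads)
    {
      t.buf_in_file += k;            // consume k bytes from the buffer
      refresh (k);                   // top the buffer back up
    }

  return t;
}
```

With buf_size = 1024, a file much longer than that, and reads of 8, 6, 9 and 3
bytes, the model ends with buf_in_file at 1047 (N+8+6+9) rather than 26.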

Note that if we pick N larger than the amount of data in the file, we luck out
because the input stream is only read the first time and successive times
eof() tests true, i.e.,


                   buf_in_file    iostream pos
initialize              0              0
refresh_buf             0              filelen < N
proc 8 bytes            8              filelen
refresh_buf             8              filelen
read 6 bytes            8+6            filelen
refresh_buf             8+6            filelen
read 9 bytes            8+6+9          filelen
refresh_buf             8+6+9          filelen
read 3 bytes            8+6+9+3        filelen
refresh_buf             8+6+9+3        filelen
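
The "luck out" case relies on standard iostream behavior, which a short sketch
can show.  It assumes an in-memory std::istringstream stands in for the file;
FillResult and fill_twice are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what two consecutive buffer fills deliver.
struct FillResult
{
  std::size_t first_count;    // bytes delivered by the first read
  bool eof_after_first;       // eof() state after the first read
  std::size_t second_count;   // bytes delivered by the second read
};

FillResult fill_twice (const std::string& data, std::size_t buf_size)
{
  assert (buf_size > 0);

  std::istringstream is (data);
  std::vector<char> buf (buf_size);
  FillResult r {};

  // First fill: asks for buf_size bytes, gets at most data.size ().
  is.read (buf.data (), static_cast<std::streamsize> (buf.size ()));
  r.first_count = static_cast<std::size_t> (is.gcount ());
  r.eof_after_first = is.eof ();

  // Second fill: if the first one hit end of file, nothing more arrives.
  is.read (buf.data (), static_cast<std::streamsize> (buf.size ()));
  r.second_count = static_cast<std::size_t> (is.gcount ());

  return r;
}
```

When the buffer is larger than the data, the first read delivers everything
and sets eof, and later reads deliver nothing, so the stream position never
moves again.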


So, the change should fix the bug.  But there are probably still other bugs.
One thing is that because of the buffering of data there is an inherent
assumption that a float value in ASCII form will be less than 80 characters,
but someone could have text representations of numbers with more characters
even though they wouldn't fit into a double.  In order to handle such a
thing in buffered data, the low-level deciphering of floats needs to be done
at the same level.  That is, we can't run scanf on something like


[123.456789101112] [41516171819 321.4567545667567...


where the first bracketed portion fills a buffer of length N=16 and the rest
of the ASCII characters have yet to come in from the stream.  I'm not going to
concern myself with that though.  This delimited stream is much too complex to
experiment with.
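
A short sketch of why the split matters.  std::strtod stands in for scanf
here, parse_leading is a hypothetical helper, and the digits are taken from
the example above (assuming the default "C" locale):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse the number at the start of `text` and report
// how many characters the parser consumed.
double parse_leading (const std::string& text, std::size_t& used)
{
  const char* begin = text.c_str ();
  char* end = nullptr;
  const double value = std::strtod (begin, &end);
  assert (end != nullptr);
  used = static_cast<std::size_t> (end - begin);
  return value;
}
```

Parsing only the first 16 characters consumes the whole chunk and gives no
hint that more digits follow, while parsing the full 27-character token yields
a different double.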

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?52116>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



