From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #54661] textscan() continues from next line if line ends with delimiter
Date: Sun, 16 Sep 2018 18:12:51 -0400 (EDT)

Follow-up Comment #8, bug #54661 (project octave):

Yeah, there could be something going on with the buffer handling.  The code
does a lot of "attempt to process, then restore the previous state on failure".
For example, it automatically increments the line number, but then decrements
it back to where it was if EOLstop is specified.  I would have preferred that a
string buffer store the ASCII hunk between two delimiters and that the hunk
then be processed by asking: Is this segment empty?  Can a floating-point
number be extracted?  Is there character garbage left after the number
conversion?  Etc.
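
A rough sketch of that "buffer the hunk, then ask questions" idea, written in
Python only to keep it short.  It is not the Octave C++ source; the name
classify_segment and the regular expression are assumptions made for
illustration, and complex numbers and custom exponent characters are ignored.

```python
import re

# Leading real-valued literal such as "12.234e+2" or "-.5" (assumed grammar).
_NUM = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def classify_segment(segment):
    """Classify the text found between two delimiters.

    Returns (kind, value, leftover), where kind is "empty", "number",
    "trailing_garbage" or "not_numeric".
    """
    text = segment.strip()
    if not text:                       # Is this segment empty?
        return ("empty", None, "")
    match = _NUM.match(text)           # Can a floating-point number be extracted?
    if match is None:
        return ("not_numeric", None, text)
    leftover = text[match.end():]      # Is there garbage after the conversion?
    kind = "trailing_garbage" if leftover else "number"
    return (kind, float(match.group()), leftover)


if __name__ == "__main__":
    for hunk in ["", "12.234e+2", "34abc", "abc"]:
        print(repr(hunk), classify_segment(hunk))
```

Because the segment is held in a buffer, nothing has to be "put back" into the
stream when a conversion fails; the caller just looks at the returned kind.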

I've made some adjustments that seem to be for the better, but then one of the
BIST tests always seems to fail.  It's as if some of the test cases contradict
one another.  For example, this


%! str = "12.234e+2,34, \n12345.789-9876j,78\n,10|3";
%! c = textscan (str, "%10.2f %f", "delimiter", ",", "collectOutput", 1,
%!                    "expChars", "e|");
%! assert (c, {[1223, 34; 12345.79-9876j, 78; NaN, 10000]}, 1e-6);


treats the \n as whitespace, but I think that comes about simply because of
the "skip whitespace" routine/scan.  If instead one were to treat \n as a
delimiter, the third field would be empty.
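
To make the two readings concrete, here is a crude Python model (not Octave's
scanner; third_field is a hypothetical helper) that returns whatever follows
the second comma of the test string under each interpretation.

```python
def third_field(text, newline_is_whitespace, delimiter=","):
    """Return the text that follows the second delimiter, under two readings."""
    if newline_is_whitespace:
        # Newlines are skipped like blanks, so a field can run across a line end.
        fields = [f.strip() for f in text.split(delimiter)]
    else:
        # Newlines end the record; only blanks and tabs are skipped.
        first_line = text.splitlines()[0]
        fields = [f.strip(" \t") for f in first_line.split(delimiter)]
    return fields[2]


if __name__ == "__main__":
    s = "12.234e+2,34, \n12345.789-9876j,78\n,10|3"
    print(repr(third_field(s, newline_is_whitespace=True)))
    print(repr(third_field(s, newline_is_whitespace=False)))
```

With the newline skipped as whitespace, the third field picks up the number
from the next line; with the newline ending the record, it is an empty string.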

The Matlab documentation doesn't give any examples of poorly formatted lines.
It does specify, though, that '\n' and '\r' may be delimiters (and that by
default '\n', '\r' and "\r\n" are EOL symbols).  If one is allowed to specify
those characters as delimiters, I'd think that by default they are not
delimiters and shouldn't be in the list.

I'll keep looking at this a bit.  From a programming standpoint, it seems to me
that the code isn't keeping track of the presence of delimiters.  That would
be an important piece of information for deciding whether a field is empty or
has failed for another reason.  For example, the OP's example clearly should
have empty fields that get transformed to "EmptyValue".  If the example were
missing a comma and so had too few delimiters, then that would be an error.
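
A minimal Python sketch of that bookkeeping, assuming a fixed number of numeric
fields per line.  The function parse_row and its parameters are hypothetical
and only illustrate the distinction; this is not how textscan is implemented.

```python
def parse_row(line, n_fields, delimiter=",", empty_value=float("nan")):
    """Parse one line, using the delimiter count to tell empty from missing."""
    fields = line.split(delimiter)
    n_delims = len(fields) - 1
    if n_delims < n_fields - 1:
        # Too few delimiters: the fields are missing, not empty.
        raise ValueError(
            f"expected {n_fields - 1} delimiters, found {n_delims}")
    values = []
    for field in fields[:n_fields]:    # extra fields are ignored in this sketch
        text = field.strip()
        # A delimiter was seen but nothing lies between: substitute empty_value.
        # float() raises ValueError for text that is not a number.
        values.append(float(text) if text else empty_value)
    return values


if __name__ == "__main__":
    print(parse_row("1,,3", 3, empty_value=0.0))
    try:
        parse_row("1 3", 3)
    except ValueError as err:
        print("error:", err)
```

The first call fills the empty middle field with the substitute value, while
the second raises because the line has no commas at all.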

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?54661>
