From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #41579] textscan: file offset after partial read differs from Matlab
Date: Mon, 23 Oct 2017 14:18:58 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0

Follow-up Comment #11, bug #41579 (project octave):

I took a quick look at this issue.  I've found the line of code responsible
for advancing the file pointer past the EOL when, in the example, the first
group of string variables is read.  In the last hunk of the diff below, I put
an "#if 0" around the code in question:


diff --git a/libinterp/corefcn/oct-stream.cc b/libinterp/corefcn/oct-stream.cc
--- a/libinterp/corefcn/oct-stream.cc
+++ b/libinterp/corefcn/oct-stream.cc
@@ -1603,6 +1603,7 @@ namespace octave
     int len = out.length ();
     int used = 0;
     int ch;
+fprintf(stderr, "GETLINE\n");
     while ((ch = get_undelim ()) != delim
            && ch != std::istream::traits_type::eof ())
       {
@@ -2685,6 +2686,7 @@ namespace octave
               }
 
             row_idx(0) = row;
+fprintf(stderr, "CALL read_format_once\n");
             err = read_format_once (is, fmt_list, out, row_idx, done_after);
 
             if ((err & ~1) > 0 || ! is || (lines >= ntimes && ntimes > -1))
@@ -3505,12 +3507,14 @@ namespace octave
         elem = fmt_list.next ();
         char *pos = is.tellg ();
 
+#if 0
         // FIXME: these conversions "ignore delimiters".  Should they include
         // delimiters at the start of the conversion, or can those be skipped?
         if (elem->type != textscan_format_elt::literal_conversion
             // && elem->type != '[' && elem->type != '^' && elem->type != 'c'
             )
           skip_delim (is);
+#endif
 
         if (is.eof ())
           {


The skip_delim() call is made *after* the variable is read.  (And I believe
there are plenty of delimiter-skipping actions before reading a variable as
well.)  As far as I can tell, Matlab (ML) doesn't do anything after a read,
and disabling the code as above results in Octave behaving like ML *for this
example*.

Looking closely at skip_delim(),


         if (is_delim (c1) || c1 == eol1 || c1 == eol2)


the EOL characters are considered delimiters in this context.  I guess that is
fine.
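
To make the effect on the file offset concrete, here is a minimal standalone
C++ sketch.  It is not Octave code: is_delim_or_eol(), skip_delims() and
offset_after_word() are hypothetical helpers written only for this
illustration, and a std::istringstream stands in for the file.  It shows how
skipping delimiters (with EOL counted as a delimiter) after a read moves the
reported position past the newline.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the delimiter test: blanks, tabs and EOL
// characters all count as delimiters (this is NOT Octave's is_delim).
static bool is_delim_or_eol (int c)
{
  return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Consume consecutive delimiter/EOL characters, stopping at the first
// character that is not one of them.
static void skip_delims (std::istream& is)
{
  while (is_delim_or_eol (is.peek ()))
    is.get ();
}

// Read one whitespace-delimited word from TEXT, optionally skip the
// delimiters that follow it, and return the resulting stream offset.
static long offset_after_word (const std::string& text, bool skip_after)
{
  std::istringstream is (text);
  std::string word;
  is >> word;              // stops at the first whitespace after the word
  if (skip_after)
    skip_delims (is);      // also swallows the EOL, as discussed above
  return static_cast<long> (is.tellg ());
}
```

With the input "abc\ndef\n", offset_after_word (text, false) leaves the
position on the newline (offset 3), while offset_after_word (text, true)
leaves it at the start of the next line (offset 4).  That one-character
difference is the kind of offset discrepancy this report is about.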

I'm not so sure it is as simple as dropping the lines associated with the
FIXME above, because obviously they were put there and modified for some
reason, and I am wondering what breaks if they are dropped.  Also, I think we
need to cover all cases here and consider what happens when, say, the
textscan() command asks for fewer columns than exist in the file.  Say, for
example, there are five columns in the data file and textscan only asks to
read three of them:

v1 v2 v3 V4 V5

Do the fourth and fifth columns then get treated as the first and second
variables of the next line?  That is, once in data scanning mode (as opposed
to headerline skipping), is the file treated as if EOL characters were
irrelevant, just another whitespace or delimiter?  Consider adding Rik to the
list for his attention to detail.
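
To make the question concrete, here is a small standalone C++ sketch of one
possible answer, the one where EOL is just another whitespace.  It is only an
illustration and makes no claim about what Octave or Matlab actually do;
read_rows_ignoring_eol() is a hypothetical helper, not part of oct-stream.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Group whitespace-delimited tokens into rows of NCOLS fields each.
// EOL is treated as ordinary whitespace, so line boundaries in TEXT
// have no influence on where a row starts or ends.
static std::vector<std::vector<std::string>>
read_rows_ignoring_eol (const std::string& text, std::size_t ncols)
{
  std::istringstream is (text);
  std::vector<std::vector<std::string>> rows;
  std::vector<std::string> row;
  std::string tok;

  while (is >> tok)          // operator>> skips blanks and newlines alike
    {
      row.push_back (tok);
      if (row.size () == ncols)
        {
          rows.push_back (row);
          row.clear ();
        }
    }

  if (! row.empty ())        // keep a trailing partial row, if any
    rows.push_back (row);

  return rows;
}
```

Given the two-line input "v1 v2 v3 V4 V5\nw1 w2 w3 W4 W5\n" and ncols = 3,
this sketch returns four rows, the second being {"V4", "V5", "w1"}: the
unread fourth and fifth columns spill into the next row.  Whether textscan
should behave this way or discard the rest of the line is exactly the open
question.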

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?41579>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



