xtextscan [WAS: Re: strread.m]

octave-maintainers

From:	Philip Nienhuis
Subject:	xtextscan [WAS: Re: strread.m]
Date:	Thu, 04 Aug 2011 23:38:40 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6

John W. Eaton wrote:

On  3-Aug-2011, Philip Nienhuis wrote:

|>  I will probably try to write textscan in C++.  It's up to you whether
|>  you want to continue fixing problems in strread, but given the
|
| Do you have a time schedule in mind?
| That would help me make a better decision of what to do.

I started working on it yesterday.  So far I've only implemented the


Magnificent.

Are you planning to get it finished before Octave 3.4.3?

Just today I prepared a fix for bug #33876 along the lines I sketched yesterday... never mind.

part that decodes the format.  I'll try for at least some of the
conversions today.  Then I may need help in figuring out how to
properly return the variables that are read from the file.  Then we
will also need to handle the parameter/value options.


Whitespace and delimiter processing was a bit of sorting out.

There are also some "implicit" options, like presence of a trailing "\n" in the input stream.

Once you get the format string properly parsed I suppose it is fairly straightforward to match it to the input stream.

But just FYI, here is some ML r2007a behavior that I find peculiar:

Assume an input string '54321a'.

Applying a format string like '%f321a' it turns out that Matlab prefers to interpret it as '%f32', ignores the digits in the literal and also the trailing "a", yielding 54321 (class single).

If you do
  c = textscan ('54321a', '%f321a', 'returnonerror', 0)

it emerges that ML first parses the number as far as it can, rather than first analyzing the trailing literal to see where the numeric field is supposed to end.

To read the field as a double you'd need '%f 321a' (yielding 54321), or if you'd rather expect 54, use '%2f64321a'.
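To make the format-string side of this concrete, here is a minimal C++ sketch of a greedy format parser that reproduces the reading reported above: '%f321a' splits into '%f32' plus the literal '1a', while '%f 321a' splits into '%f' plus '321a'. It is only a model of the observed behavior, not Matlab's or Octave's actual code. The names FormatElement and parse_format are made up for illustration, only %f, %f32 and %f64 with an optional width are handled, and treating whitespace in the format as a plain separator is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One element of a parsed format (hypothetical type, for illustration only).
struct FormatElement {
  bool is_literal;   // true: `text` must be matched verbatim in the input
  std::string text;  // literal text, or conversion name ("f", "f32", "f64")
  int width;         // field width, 0 if none was given
};

// Greedy parser: after "%[width]f" it grabs a "32" or "64" suffix whenever
// one is present, even if the user meant those digits as literal text.
std::vector<FormatElement> parse_format(const std::string& fmt) {
  std::vector<FormatElement> out;
  std::size_t i = 0;
  while (i < fmt.size()) {
    unsigned char c = static_cast<unsigned char>(fmt[i]);
    if (std::isspace(c)) {  // assumption: whitespace only separates elements
      ++i;
      continue;
    }
    if (c == '%') {
      std::size_t j = i + 1;
      int width = 0;
      while (j < fmt.size() &&
             std::isdigit(static_cast<unsigned char>(fmt[j]))) {
        width = width * 10 + (fmt[j] - '0');
        ++j;
      }
      if (j < fmt.size() && fmt[j] == 'f') {
        ++j;
        std::string conv = "f";
        if (fmt.compare(j, 2, "32") == 0 || fmt.compare(j, 2, "64") == 0) {
          conv += fmt.substr(j, 2);  // the greedy step
          j += 2;
        }
        out.push_back({false, conv, width});
        i = j;
        continue;
      }
      // Unsupported conversion: fall through and treat '%' as literal text.
    }
    // Literal run: everything up to the next '%' or whitespace.
    std::size_t start = i;
    ++i;
    while (i < fmt.size() && fmt[i] != '%' &&
           !std::isspace(static_cast<unsigned char>(fmt[i]))) {
      ++i;
    }
    out.push_back({true, fmt.substr(start, i - start), 0});
  }
  return out;
}
```

With this sketch, parse_format ("%2f64321a") gives a width-2 'f64' conversion followed by the literal '321a', which matches the workaround mentioned above.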

Another one:
  c = textscan ('54321a', '%2f64') gives {54; 32; 1}

(Given field width is ignored for the last number which is reported as OK. "'returnonerror', 0" shows that ML complains about row 4, the "a")
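The {54; 32; 1} result is what you get if the field width is treated as an upper bound rather than an exact count. The following C++ sketch models just that, under loud assumptions: it only handles unsigned digit runs (no sign, decimal point or exponent), the names ScanResult and scan_fixed_width are hypothetical, and it says nothing about how Matlab actually implements this.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Result of a repeated fixed-width scan (hypothetical type, for illustration).
struct ScanResult {
  std::vector<double> values;  // numbers read before the scan stopped
  std::size_t stopped_at;      // index of the first unconsumed character
};

// Repeatedly read at most `width` digits per field. A field that has fewer
// digits than `width` is still accepted; the scan stops at the first
// character that cannot start a number.
ScanResult scan_fixed_width(const std::string& input, int width) {
  ScanResult r{{}, 0};
  std::size_t pos = 0;
  while (pos < input.size()) {
    std::size_t start = pos;
    double value = 0;
    while (pos < input.size() &&
           pos - start < static_cast<std::size_t>(width) &&
           std::isdigit(static_cast<unsigned char>(input[pos]))) {
      value = value * 10 + (input[pos] - '0');
      ++pos;
    }
    if (pos == start) break;  // nothing readable here: stop scanning
    r.values.push_back(value);
  }
  r.stopped_at = pos;
  return r;
}
```

Calling scan_fixed_width ("54321a", 2) yields the values 54, 32 and 1 and stops at index 5, the "a", which corresponds to the fourth row that ML complains about.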

I find this behavior (among other things, mixing up a literal if it starts with digits, and lax interpretation of user-specified field width) a bit inconsistent from a user point of view - of course from a programmer's POV it may just be obvious although I don't see it.

These examples do show that setting the returnonerror parameter to false is vital for understanding what ML does.


The point here:

I assume (that is, I hope) you have a clearer view of this than me, but IMO we should be wary of striving for ML compatibility so much that we wander into various degrees of bug-for-bug compatibility.

Or should we call it "surprise-for-surprise" compatibility?

The diffs below are what I have now.  You can do things like

   fid = fopen ("any-existing-file");
   xtextscan (fid, "any format here for testing")

and xtextscan will display the components of the format.


I can't comment as this is the Octave dialect of C++ :-)  (beyond me)
Thank you anyway.

Philip
