
Re: [Monotone-devel] basic_io inventory


From: Christian Ohler
Subject: Re: [Monotone-devel] basic_io inventory
Date: Fri, 11 May 2007 17:58:52 +0200

Thomas Moschny, 2007-04-30:

> Christian Ohler wrote:
>> I don't think this is the only valid way of looking at basic_io.
>> Stanzas are separated by an empty line; what's wrong with (or "not
>> stable" about) relying on that?  You're suggesting to discard this
>> information that is already available at the syntactic level just to
>> reconstruct it later at a semantic level.
>
> Because that information is redundant, and the definitive source is the semantic level, not the syntactic one. Graydon pointed out in the IRC session you mentioned (which is linked on the BasicIoFormalization wiki page) that the spaces and newlines are only there to make basic_io more readable for humans. Once introduced, however, the spacing needed to be standardized to keep hashing stable. In principle, basic_io would still carry the same information with most of the whitespace characters removed (of course, you would still need at least one whitespace character between a key token that has no string or id token following it and the next key token).

The blank lines that separate stanzas are redundant, but, for basic_io as generated by monotone, they are a documented feature. Thus, parsers can legitimately rely on them.

The main advantage of not relying on whitespace is some amount of robustness against whitespace munging. However, whitespace is still significant inside "", so changes in whitespace will, in general, change the semantics of the basic_io stream anyway.
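As a rough illustration of both points, here is a minimal Python sketch of a tokenizer that ignores whitespace between tokens but preserves it inside strings. The token grammar (lowercase symbol keys, "quoted strings" with backslash escapes, [hex ids]) and the names TOKEN and tokenize are assumptions made for this example; this is not monotone's or xmtn's code.

```python
import re

# Assumed token shapes (illustrative, not taken from a monotone spec):
# symbol keys, "strings" with backslash escapes, and [hex ids].
TOKEN = re.compile(r'''
    \s*(?:
        (?P<sym>[a-z_]+)
      | "(?P<str>(?:[^"\\]|\\.)*)"
      | \[(?P<hex>[0-9a-f]*)\]
    )
''', re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Return (kind, value) pairs, skipping whitespace between tokens."""
    tokens = []
    pos = 0
    end = len(text.rstrip())  # ignore whitespace after the last token
    while pos < end:
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "str":
            # Undo backslash escapes; whitespace inside the string is kept.
            value = re.sub(r'\\(.)', r'\1', value, flags=re.DOTALL)
        tokens.append((kind, value))
        pos = m.end()
    return tokens


pretty = 'add_file "a  b.txt"\n content [00ff]\n'
dense = 'add_file"a  b.txt"content[00ff]'
print(tokenize(pretty) == tokenize(dense))
```

Under these assumptions, the pretty-printed and the dense input produce the same token list, while changing the two spaces inside "a  b.txt" would produce a different string token.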

At this stage of xmtn's development, I even think it's desirable to generate an error message when the amount of whitespace does not match monotone's specification, since this may indicate a bug somewhere.

However, the main advantage of having xmtn's parser rely on the blank lines that separate stanzas is that it can build a representation of each stanza with no information about which type of data the basic_io stream represents. This allows a more convenient API. The data structure that xmtn's parser returns can be processed with a pattern-matching mechanism to identify the type of the stanza and extract data from it.
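The idea can be sketched in Python roughly as follows. xmtn itself is not written in Python, and the function names, the sample input, and the stanza shapes matched in describe are hypothetical. The sketch also assumes that strings contain no newlines, which real basic_io does not guarantee.

```python
import re

# Assumed value syntax: "string" with backslash escapes, or [hex id].
VALUE = re.compile(r'"((?:[^"\\]|\\.)*)"|\[([0-9a-f]*)\]')


def parse_line(line):
    """Parse one line into (key, [values]); leading padding is ignored."""
    key, _, rest = line.strip().partition(" ")
    values = []
    for m in VALUE.finditer(rest):
        if m.group(1) is not None:
            values.append(re.sub(r"\\(.)", r"\1", m.group(1)))
        else:
            values.append(m.group(2))
    return key, values


def parse_stanzas(text):
    """Split on blank lines only; no knowledge of the stanza types is needed."""
    return [
        [parse_line(line) for line in block.split("\n")]
        for block in text.strip("\n").split("\n\n")
    ]


def describe(stanza):
    """Identify a stanza by its key sequence, then extract its data."""
    keys = tuple(key for key, _ in stanza)
    if keys == ("add_file", "content"):
        return {"type": "add_file", "path": stanza[0][1][0], "id": stanza[1][1][0]}
    if keys == ("delete",):
        return {"type": "delete", "path": stanza[0][1][0]}
    return {"type": "unknown", "keys": keys}


# Illustrative input, not real monotone output.
SAMPLE = 'format_version "1"\n\nadd_file "a.txt"\n content [00ff]\n\ndelete "old.txt"\n'

for stanza in parse_stanzas(SAMPLE):
    print(describe(stanza))
```

The parser stays generic: only describe knows about particular stanza types, so supporting a new kind of basic_io stream means adding patterns rather than changing the parser.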


>> xmtn currently depends on stanzas being separated by an empty line as
>> well as on the order of lines within each stanza.
>
> And I guess it also depends on each stanza itself being subdivided by newlines.

Yes. Actually, xmtn even depends on the fact that there is no trailing whitespace at the ends of lines. Since there is no specification of the format, I decided to make the strongest possible assumptions about the format that would still permit all the output monotone generates in my test cases, and write a parser that exploits these assumptions wherever it can benefit from them.
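A strict layout check of this kind might look like the following Python sketch. The two rules it enforces (no trailing whitespace, exactly one blank line between stanzas) come from the assumptions described above; the function name check_layout and the problem messages are made up for the example, and strings that span several lines are not handled.

```python
def check_layout(text):
    """Return a list of (line_number, problem) for layout deviations."""
    problems = []
    previous_blank = True  # so a blank line at the very start is rejected too
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        blank = line.strip() == ""
        if blank and previous_blank:
            problems.append((number, "unexpected blank line"))
        previous_blank = blank
    return problems


good = 'a "1"\n\nb "2"\n'
bad = 'a "1" \n\n\nb "2"\n'
print(check_layout(good))
print(check_layout(bad))
```

A caller that prefers a hard failure can raise an error as soon as the returned list is non-empty, which surfaces possible bugs early instead of silently accepting munged input.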

Christian.




