help-bison

Re: Non-greedy wildcard possible? (Long)


From: Laurence Finston
Subject: Re: Non-greedy wildcard possible? (Long)
Date: Wed, 19 May 2004 00:04:32 +0200
User-agent: IMHO/0.98.3+G (Webmail for Roxen)

-------------------
> Laurence Finston wrote:
> 
> > I've found that it's worthwhile to use Flex when the actions perform
> > more complicated processing and when there's no need to perform the
> > kind of parsing Bison does.  I also found that simply putting together
> > tokens and returning them doesn't require the power of Flex.
> 
> Depends on what tokens are. If they are simply sequences of
> characters from some class, flex might be overkill. If the length of
> the sequence depends on the content (say, `=', `>' and `>=') or
> tokens have a more complicated structure (say, floating point
> notation), regexes can be handy. OTOH, there are cases (such as
> Magnus' perhaps) where regexes are not even sufficient ...
> 

I agree that regexps can be handy. I just don't need them for the scanner and
parser that I'm working on now.
As far as Magnus' problem is concerned, I've tried to explain why I think his
approach won't work.  In addition, the Flex manual states that the use of
trailing context in the regexps will slow the scanner down considerably. There
is no way around this; it is implicit in the nature of regular expression
matching.  Actually, my scanner handles both multi-character operators and
floating point notation. I don't take any credit for this, though, since I've
simply used the method Knuth developed for Metafont. 
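
To make the idea of a hand-written scanner concrete, here is a minimal sketch
in C of one way to recognize `=', `>', `>=', simple decimal numbers, names and
`;' without regular expressions.  It is not Metafont's actual algorithm and not
the code from my package; the token names and the function next_token() are
made up for illustration, and exponents in numbers are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds, for illustration only. */
enum token_kind {
    TOK_END, TOK_ASSIGN, TOK_GT, TOK_GE,
    TOK_NUMBER, TOK_NAME, TOK_SEMI, TOK_UNKNOWN
};

/* Scan one token of src starting at *pos and advance *pos past it. */
enum token_kind next_token(const char *src, size_t *pos)
{
    assert(src != NULL && pos != NULL);
    size_t i = *pos;
    enum token_kind kind;

    /* Skip white space between tokens. */
    while (isspace((unsigned char)src[i]))
        i++;

    unsigned char c = (unsigned char)src[i];

    if (c == '\0') {
        kind = TOK_END;
    } else if (c == '>') {
        /* Longest match: look at one more character before deciding. */
        i++;
        if (src[i] == '=') {
            i++;
            kind = TOK_GE;
        } else {
            kind = TOK_GT;
        }
    } else if (c == '=') {
        i++;
        kind = TOK_ASSIGN;
    } else if (c == ';') {
        i++;
        kind = TOK_SEMI;
    } else if (isdigit(c) || (c == '.' && isdigit((unsigned char)src[i + 1]))) {
        /* Digits, then an optional fraction part. */
        while (isdigit((unsigned char)src[i]))
            i++;
        if (src[i] == '.') {
            i++;
            while (isdigit((unsigned char)src[i]))
                i++;
        }
        kind = TOK_NUMBER;
    } else if (isalpha(c) || c == '_') {
        /* A name ends at the first character that cannot belong to it,
           and that character is left in place for the next call. */
        while (isalnum((unsigned char)src[i]) || src[i] == '_')
            i++;
        kind = TOK_NAME;
    } else {
        i++;
        kind = TOK_UNKNOWN;
    }

    *pos = i;
    return kind;
}
```

Called repeatedly on `x >= 3.14;' with *pos starting at 0, this should yield a
name, TOK_GE, a number, and a semi-colon, followed by TOK_END.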

> > The main reason I stopped using it was that it swallowed look-ahead
> > tokens.  I don't know why, and it seemed like too much trouble to find
> > out, so I'm not claiming that it's a bug in Flex.
> 
> I'm not sure exactly what you mean. flex has no concept of
> look-ahead tokens -- it just delivers tokens one by one. Do you mean
> look-ahead characters? Or do you mean that the contents of yytext
> get overwritten if you don't strdup() them? (A common problem, but
> not a bug.)
> 
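
For readers who haven't run into the yytext pitfall mentioned in the quoted
text, it can be sketched in plain C without Flex.  The buffer scan_buf and the
functions scan_into_buffer() and copy_lexeme() below are hypothetical
stand-ins, not part of Flex; they only imitate a scanner that reuses one text
buffer for every token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scanner's reused text buffer (like yytext). */
static char scan_buf[64];

/* Pretend to scan a token: overwrite the shared buffer and return it. */
const char *scan_into_buffer(const char *lexeme)
{
    assert(lexeme != NULL && strlen(lexeme) < sizeof scan_buf);
    strcpy(scan_buf, lexeme);
    return scan_buf;
}

/* Return a heap copy of s that the caller must free(),
   or NULL if memory could not be allocated. */
char *copy_lexeme(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

If a caller keeps only the pointer returned by scan_into_buffer("point") and
then scans "p", the old pointer now reads "p"; a copy made with copy_lexeme()
before the second scan still reads "point".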

If my input was `point p;' and my rule was
`<type> <variable> <semi-colon>', the scanner returned
`<type>' and `<variable>', but the semi-colon was lost.  If my input was
`point p ;', then it wasn't.  The first input also worked if the rule
was changed to `<type> <variable_with_semi-colon>'.  I fiddled quite a bit
with the regexps and I couldn't get it to work.  It may have to do with the
C++ scanner classes, and the problem may not have occurred if I had tried it
with the most recent beta-version of Flex, but I've decided not to use Flex
for this part of my package anymore (I do use it for a couple of other
things).

> 
> The grammar, and also the language, are still well-defined, but the
> set of possible parsings for some inputs may be ambiguous -- which
> matters, of course, for semantic values. In GLR, such cases can be
> resolved using `dprec'. (But that's not always necessary when using
> GLR. Sometimes GLR in effect just provides for larger look-ahead, so
> grammars that are unambiguous, but not LALR(k) can be parsed.)
> 

Thank you for your explanation.  Would you please tell me what the 
abbreviations "LR", "GLR", and "LALR" stand for?  I would have thought that it
wouldn't make sense to speak of a "grammar" apart from a "set of parsing
rules", i.e., that the terms would be equivalent. I would also have thought
that ambiguity in the rules would imply that there could be more than one path
through the rules for one or more inputs. If the result of parsing can be said
to be a "sentence" in the language defined by the syntactic rules and the
semantic actions, then the choice of path through the rules would affect both
the syntax and the semantics of the "sentence".  However, I'm not familiar
with the literature on the subject, so it's quite possible that this doesn't
fit in with "informed opinion".  I haven't _had_ to read up on it, although
I'm sure it would be interesting and worthwhile, because I find that Bison
works very well (and has a good manual, too).
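
To illustrate what I mean by the choice of path affecting the semantics, take
a rule set with the two alternatives `expr: expr '-' expr' and `expr: NUMBER'
(my own example, not one from this thread).  The input `8 - 3 - 2' can be
grouped in two ways.  The sketch below, with the made-up function names
eval_left() and eval_right(), just computes the value for each grouping; it is
not a parser.

```c
#include <assert.h>
#include <stddef.h>

/* Value when the operands are grouped to the left: ((a - b) - c) ... */
long eval_left(const long *v, size_t n)
{
    assert(v != NULL && n > 0);
    long acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc -= v[i];
    return acc;
}

/* Value when the operands are grouped to the right: a - (b - (c ...)) */
long eval_right(const long *v, size_t n)
{
    assert(v != NULL && n > 0);
    long acc = v[n - 1];
    /* Walk from the second-to-last operand back to the first. */
    for (size_t i = n - 1; i-- > 0; )
        acc = v[i] - acc;
    return acc;
}
```

For the operands 8, 3 and 2, eval_left() gives 3 and eval_right() gives 7, so
the same input has two different values depending on the path taken.  In Bison
this particular case is normally settled by declaring `%left '-''.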

Laurence