help-bison

Re: Finding out when a token is consumed


From: Frank Heckenbach
Subject: Re: Finding out when a token is consumed
Date: Sat, 10 May 2003 04:18:40 +0200

Hans Aberg wrote:

> At 14:45 +0200 2003/05/09, Frank Heckenbach wrote:
> >> The unusual is that we do not get to know if this language you are working
> >> with has a given grammar or whether it just has the ability to throw in
> >> directives anywhere, as you claim. In the latter case, the hope diminishes
> >> rapidly.
> >
> >The latter. Of course, as I said, one can also describe this by a
> >grammar, even a context-free one, but it seems an ugly way to do
> >it.
> 
> My own hunch is that one should try to work with respect to the grammar
> directly, and then working out from there trying to simplify whenever
> possible. Then you will see what will work and what will not. (See other
> suggestion below, though.) If one is using context switches, it will depend
> on the parsing algorithm when a lookahead is used or not, or so I think.

Both on the algorithm and the specific situation. LALR(1) needs one
look-ahead token in some states and none in others. GLR can use
arbitrarily many "look-ahead" tokens (if I may call them such -- I
mean the tokens read while the parser is split; since no semantic
actions are performed at this time, they appear like look-ahead to
the actions). So anything that depends on the presence or absence of
look-ahead won't work in general, I'm aware of this ...
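
To make the timing problem concrete, here is a rough sketch. It is a
toy hand-written loop, not Bison's generated parser, and every name in
it (lex, parse, needs_lookahead, the "hash" flag) is made up for
illustration. It assumes the lexer applies the directive as a side
effect at the moment it scans past the '#', and it records which
directive state the action for each operand gets to see.

```python
def lex(source, state):
    """Yield one-character tokens; '#' is a directive, not a token.

    The directive is applied as a side effect at the moment the lexer
    scans past it (an assumption made for this illustration).
    """
    for ch in source:
        if ch == "#":
            state["hash"] = True
        else:
            yield ch


def parse(source, needs_lookahead):
    """Parse NUMBER ('+' NUMBER)* and record the directive state that
    the action for each operand observes.

    needs_lookahead=True mimics a parser state that must read the next
    token before it can run the action; False mimics a state that can
    run the action without consulting a look-ahead token.
    """
    state = {"hash": False}
    tokens = lex(source, state)
    observed = []
    tok = next(tokens, None)
    while tok is not None:
        if tok == "+":
            tok = next(tokens, None)
            continue
        operand = tok
        if needs_lookahead:
            tok = next(tokens, None)   # the lexer may pass a '#' here
            observed.append((operand, state["hash"]))
        else:
            observed.append((operand, state["hash"]))
            tok = next(tokens, None)
    return observed


without_la = parse("a+b#+c", needs_lookahead=False)
with_la = parse("a+b#+c", needs_lookahead=True)
```

In this toy, the action for b sees the directive as unset when no
look-ahead is read, but as already set when the parser had to fetch
the following token first, although b precedes the '#' in the input.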

> One idea that strikes me with your example, though, is that you might stamp
> your lexemes with types in the lexer already. In your example, you had
>     a+b#+c
> where # is a directive that takes effect on lexemes after the #. You then
> stamp it say as semantics, so that the sequence that the lexer produces
> will be
>   lexeme  type/effect  semantics
>     a     NUMBER       value of a
>     +     "+"
>     b     NUMBER       value of b
>     #     set #
>     +     "+"
>     c     NUMBER       value of #c
> When your "+" action combines the value of a+b with the value of #c, it
> should recognize that it is a #c and not a c.
> 
> This should then work.

I think it would work in theory. But there are practical problems.
I'd have to attach the state of all directives (there are quite a
few of them) to each token, rather than just recording the (generally
rather rare) changes of directives. And I'd have to pass all this to
the code that does the real work (which is, for the most part, in
other source files called by the parser). All in all, I'd have to
pass a lot of information around all the time, which is both
inefficient and, again, extra code to write in many places ...
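
For reference, a minimal sketch of what such stamping could look like.
This is only an illustration under assumptions: the Token class, the
lex function and the single-character lexemes are hypothetical, and a
real lexer would carry many more directives than the single '#' used
here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator


@dataclass(frozen=True)
class Token:
    kind: str                    # "NUMBER" or "+"
    text: str
    directives: FrozenSet[str]   # directives active when it was lexed


def lex(source: str) -> Iterator[Token]:
    """Stamp every token with a snapshot of the active directives."""
    active: FrozenSet[str] = frozenset()
    for ch in source:
        if ch.isspace():
            continue
        if ch == "#":
            # The directive affects the lexemes that follow it.
            active = active | {"#"}
        elif ch == "+":
            yield Token("+", ch, active)
        elif ch.isalnum():
            yield Token("NUMBER", ch, active)
        else:
            raise ValueError(f"unexpected character: {ch!r}")


tokens = list(lex("a+b#+c"))
```

Since each token carries its own snapshot, the result no longer
depends on when the parser happens to read a token. The price is the
one described above: the snapshot travels with every token and has to
be handed on to whatever code does the real work.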

> > And since you mentioned
> >alternative parsing methods in your previous mail -- it will fail,
> >e.g., in a GLR parser.
> 
> I do not think of GLR and such in this context.

Not in this context (I didn't mean to imply this). I might decide to
use GLR for other reasons, and of course I'd prefer my directive
handling mechanism not to break then.

Frank

-- 
Frank Heckenbach, address@hidden
http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)
