help-bison
Re: Finding out when a token is consumed


From: Frank Heckenbach
Subject: Re: Finding out when a token is consumed
Date: Thu, 8 May 2003 20:38:44 +0200

Hans Aberg wrote:

> At 14:20 +0200 2003/05/01, Frank Heckenbach wrote:
> >> We try to guess what you want.
> >
> >Sorry if my descriptions are so unclear. Basically, I'd like to do
> >what I did in my original example, but preferably without (ab)using
> >YYLLOC_DEFAULT, that's all.
> >
> >> If you want to attach a code snippet to each action without having to write
> >> it out explicitly into each action, here are some suggestions:
> >>
> >> - The Bison file tests/actions.at has some code
> >>   %printer { fprintf (yyout, "%d from %d", $$, @$.first_line); }
> >>     input line thing 'x'
> >
> >%printer doesn't seem to be documented yet, but from what I've found
> >(http://mail.gnu.org/archive/html/bison-patches/2002-06/msg00066.html,
> >e.g.), it seems to be like YYPRINT (though token-specific). So it
> >seems to be more for debugging and messages than for what I need.
> 
> So here the problem shows with you not describing what and why you want as
> much as those that might help you need.
> 
> Essentially, you say that you need this esoteric property X, and somehow
> draw the conclusion that Bison should be designed somehow to support it.
> 
> It is frequent in this group that when the original property X has been
> properly described, a suitable proper remedy can be found. One is unlikely
> to get Bison to support esoteric properties, because often they turn out to
> be wrong from the point of view of proper language design.

The actual case is quite complicated, so I'll try to give a strongly
simplified example. Let's assume arithmetic expressions, say only
containing `+', `*' and parentheses.

But there can also be "directives" (here denoted just `#', while in
the actual case there can be several kinds of directives) which have
some global effects, e.g., change the number of fractional digits
used in computations or whatever. The exact effect is immaterial
here; what's important is that they can occur at any point in the
input and should affect all computations done after this point (and
of course none that happen before this point). (A computation is
"done" at the point where the last token used in it appears.)

As a first attempt, I tried to handle them during lexing. But that's
too early when bison reads look-ahead tokens. E.g., when the input
is `1+2#+3', bison will read the second `+' before it can do the
first addition, so the directive will affect the first addition
which it shouldn't do.
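
The look-ahead problem described above can be reproduced without Bison. What follows is a minimal hand-written sketch in C, not the author's lexer or grammar: the names `next_token`, `directives_seen` and `sum_with_lookahead` are made up for illustration, operands are assumed to be single digits, and only `+' is handled. The loop reads one token of look-ahead before each addition, as a parser must do here to rule out a following `*'.

```c
#include <assert.h>

static const char *input;     /* remaining input text */
static int directives_seen;   /* global state that the lexer changes */

/* Hypothetical lexer: handles '#' directives immediately, at lexing time,
   then returns the next token character ('\0' at end of input). */
static char next_token(void)
{
    while (*input == '#') {
        directives_seen++;
        input++;
    }
    return *input ? *input++ : '\0';
}

/* Sum single digits such as "1+2#+3", reading one token of look-ahead
   before each addition is performed.  seen_at[i] records the value of
   directives_seen at the moment the i-th addition ran. */
int sum_with_lookahead(const char *text, int *seen_at, int max_adds)
{
    int adds = 0;
    int acc;
    char look;

    assert(text != 0 && seen_at != 0);
    input = text;
    directives_seen = 0;

    acc = next_token() - '0';           /* first operand */
    look = next_token();                /* look-ahead */
    while (look == '+') {
        int rhs = next_token() - '0';   /* right operand */
        look = next_token();            /* look-ahead is read BEFORE the addition */
        acc += rhs;                     /* the addition happens only now */
        if (adds < max_adds)
            seen_at[adds] = directives_seen;
        adds++;
    }
    return acc;
}
```

For the input `1+2#+3', the first addition already sees one directive in this sketch, because the lexer skipped over `#' while fetching the second `+' as look-ahead.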

Then I've tried to make directives into terminals, and modify the
grammar to accept directives before each terminal (see t1.y). But
apart from cluttering up the grammar, it causes 2 R/R conflicts
(probably many more in a real example). I understand these
conflicts, and I can't see any easy way to avoid them.

When I tried to accept the directives *after* each terminal (t2.y),
the conflicts disappeared, but the result was wrong. Again with
`1+2#+3', the directive would "stick" to the `2' and be handled
before the first addition.

The next thing I tried was to attach to each token a counter of the
preceding directives (t3.y). (In a real example, it would
also have to keep track of the kinds and contents of directives, but
that's no problem.) The lexer generates this counter and the grammar
actions handle it for each terminal in a production.

This works (except for directives at the very end of the input, but
that's easy to fix, of course), but it's quite clumsy. Not only does
every rule need to check the directives for each terminal it
contains, but YYSTYPE is also "blown up", and consequently all regular
accesses to the semantic values need to have `.val' inserted.
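
The counter idea can be illustrated with a hand-written sketch in C. This is not the contents of t3.y, which is not reproduced in this message: the type `SemVal`, its field `ndirectives`, and the functions below are hypothetical, operands are assumed to be single digits, and only `+' is handled. The lexer only counts directives; each addition then takes the count attached to the last token it used.

```c
#include <assert.h>

/* Hypothetical "blown up" semantic value: the number itself plus the
   count of directives that preceded the token it came from. */
typedef struct {
    int val;
    int ndirectives;
} SemVal;

static const char *input;      /* remaining input text */
static int directive_count;    /* directives lexed so far */

/* Hypothetical lexer: counts '#' directives without applying them and
   stamps the running count onto the token it returns. */
static char next_token(SemVal *out)
{
    char c;

    while (*input == '#') {
        directive_count++;
        input++;
    }
    c = *input;
    if (c != '\0')
        input++;
    out->val = (c >= '0' && c <= '9') ? c - '0' : 0;
    out->ndirectives = directive_count;
    return c;
}

/* Sum single digits with one token of look-ahead.  in_effect[i] records
   how many directives apply to the i-th addition, taken from the last
   token used in it rather than from global lexer state. */
int sum_with_counters(const char *text, int *in_effect, int max_adds)
{
    SemVal acc, rhs, look;
    int adds = 0;
    char kind;

    assert(text != 0 && in_effect != 0);
    input = text;
    directive_count = 0;

    next_token(&acc);                       /* first operand */
    kind = next_token(&look);               /* look-ahead */
    while (kind == '+') {
        next_token(&rhs);                   /* right operand */
        kind = next_token(&look);           /* look-ahead, read before adding */
        acc.val += rhs.val;                 /* every access now needs .val */
        acc.ndirectives = rhs.ndirectives;  /* count of the last token used */
        if (adds < max_adds)
            in_effect[adds] = acc.ndirectives;
        adds++;
    }
    return acc.val;
}
```

With `1+2#+3', the first addition gets a count of 0 and the second a count of 1, even though the look-ahead `+' was already lexed before the first addition ran. The extra field and the `.val' accesses are the clumsiness mentioned above.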

So, to avoid this clumsiness, I looked for something that was
"parallel" to the semantic values, both in data types and in
(default) actions. Locations seem to do just that (YYLTYPE and
YYLLOC_DEFAULT), although they're of course not meant for this
purpose.

Such a solution seems to work too (t4.y) and it's much more readable
-- normal rules are not affected at all. So I guess I'll use this
solution, unless there is a better way that avoids the abuse of
locations (or I can convince you to implement one, but I don't think
so ...).
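
How a location type and YYLLOC_DEFAULT might carry such a counter can be sketched in C. This is not the contents of t4.y, which is not reproduced in this message, and the author's actual definitions may differ. YYLTYPE, YYLLOC_DEFAULT and YYRHSLOC are real Bison names; `DirLoc`, `applied`, `apply_directives_up_to` and `simulate_reduction` are hypothetical, and `simulate_reduction` merely stands in for the generated parser so that the macro can be exercised on its own.

```c
#include <assert.h>

/* Hypothetical location type: instead of line and column numbers it
   carries the number of directives preceding the last token of a span. */
typedef struct {
    int ndirectives;
} DirLoc;
#define YYLTYPE DirLoc

/* YYRHSLOC normally comes from the Bison skeleton; it is defined here
   only so that this sketch compiles by itself. */
#ifndef YYRHSLOC
# define YYRHSLOC(Rhs, K) ((Rhs)[K])
#endif

static int applied;   /* how many directives have taken effect so far */

/* Hypothetical helper: make every directive up to number n take effect. */
static void apply_directives_up_to(int n)
{
    while (applied < n)
        applied++;    /* a real parser would perform the directive here */
}

/* Default location action, used by the parser on each reduction: take
   the count of the last right-hand-side symbol and apply what is pending. */
#define YYLLOC_DEFAULT(Current, Rhs, N)                          \
    do {                                                         \
        (Current).ndirectives = YYRHSLOC(Rhs, N).ndirectives;    \
        apply_directives_up_to((Current).ndirectives);           \
    } while (0)

/* Stand-in for a reduction of n symbols whose directive counts are given
   in counts[0..n-1]; returns how many directives are in effect afterwards. */
int simulate_reduction(const int *counts, int n)
{
    YYLTYPE rhs[8];   /* rhs[1..n], indexed the way Bison indexes them */
    YYLTYPE cur;
    int i;

    assert(counts != 0 && n >= 1 && n < 8);
    rhs[0].ndirectives = 0;
    for (i = 1; i <= n; i++)
        rhs[i].ndirectives = counts[i - 1];
    YYLLOC_DEFAULT(cur, rhs, n);
    return applied;
}
```

In a real grammar file, definitions like these would go in the prologue, locations would have to be enabled, and the lexer would set the count in `yylloc' for each token. The semantic values and the ordinary rule actions stay untouched, which is the readability gain described above.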

BTW, t4.y is a little different from the example in my original mail
(t4.y seems to be a little easier and should also work with GLR
parsers where there can be more than one token of "look-ahead"
during the split phase) but the basic idea of abusing locations is
the same.

Frank

-- 
Frank Heckenbach, address@hidden
http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)

Attachments: t1.y, t2.y, t3.y, t4.y

