From: Frank Heckenbach
Subject: Re: Finding out when a token is consumed
Date: Thu, 8 May 2003 20:38:44 +0200

Hans Aberg wrote:

> At 14:20 +0200 2003/05/01, Frank Heckenbach wrote:
> >> We try to guess what you want.
> >
> >Sorry if my descriptions are so unclear. Basically, I'd like to do
> >what I did in my original example, but preferably without (ab)using
> >YYLLOC_DEFAULT, that's all.
> >
> >> If you want to attach a code snippet to each action without having to write
> >> it out explicitly into each action, here are some suggestions:
> >>
> >> - The Bison file tests/actions.at has some code
> >> %printer { fprintf (yyout, "%d from %d", $$, @$.first_line); }
> >> input line thing 'x'
> >
> >%printer doesn't seem to be documented yet, but from what I've found
> >(http://mail.gnu.org/archive/html/bison-patches/2002-06/msg00066.html,
> >e.g.), it seems to be like YYPRINT (though token-specific). So it
> >seems to be more for debugging and messages than for what I need.
>
> So here the problem shows with you not describing what and why you want as
> much as those that might help you need.
>
> Essentially, you say that you need this esoteric property X, and somehow
> draw the conclusion that Bison should be designed somehow to support it.
>
> It is frequent in this group that when the original property X has been
> properly described, a suitable proper remedy can be found. One is unlikely
> to get Bison to support esoteric properties, because often they turn out to
> wrong from the point of view of proper language design.

The actual case is quite complicated, so I'll try to give a strongly
simplified example. Let's assume arithmetic expressions, say only
containing `+', `*' and parentheses. But there can also be
"directives" (here denoted just `#', while in the actual case there
can be several kinds of directives) which have some global effects,
e.g., change the number of fractional digits used in computations or
whatever.
The exact effect is immaterial here; what's important is that they
can occur at any point in the input and should affect all
computations done after this point (and of course none that happen
before this point). ("Done" in the sense of where the last token used
in that computation appears.)

As a first attempt, I tried to handle them during lexing. But that's
too early when bison reads look-ahead tokens. E.g., when the input is
`1+2#+3', bison will read the second `+' before it can do the first
addition, so the directive will affect the first addition, which it
shouldn't.

Then I tried to make directives into terminals, and modify the
grammar to accept directives before each terminal (see t1.y). But
apart from cluttering up the grammar, it causes 2 R/R conflicts
(probably many more in a real example). I understand these conflicts,
and I can't see any easy way to avoid them.

When I tried to accept the directives *after* each terminal (t2.y),
the conflicts disappeared, but the result was wrong. Again with
`1+2#+3', the directive would "stick" to the `2' and be handled
before the first addition.

The next thing I tried was to stick a counter to each token that
counts the preceding directives (t3.y). (In a real example, it would
also have to keep track of the kinds and contents of directives, but
that's no problem.) The lexer generates this counter and the grammar
actions handle it for each terminal in a production. This works
(except for directives at the very end of the input, but that's easy
to fix, of course), but it's quite clumsy. Not only will every rule
need to check the directives for each terminal it contains, but
YYSTYPE is also "blown up", and consequently all regular accesses to
the semantic values need to have `.val' inserted.

So, to avoid this clumsiness, I looked for something that was
"parallel" to the semantic values, both in data types and in
(default) actions.
Locations seem to do just that (YYLTYPE and YYLLOC_DEFAULT), although
they're of course not meant for this purpose. Such a solution seems
to work too (t4.y) and it's much more readable -- normal rules are
not affected at all.

So I guess I'll use this solution, unless there is a better way that
avoids the abuse of locations (or I can convince you to implement
one, but I don't think so ...).

BTW, t4.y is a little different from the example in my original mail
(t4.y seems to be a little easier and should also work with GLR
parsers where there can be more than one token of "look-ahead" during
the split phase) but the basic idea of abusing locations is the same.

Frank

-- 
Frank Heckenbach, address@hidden
http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)
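
The look-ahead problem and the per-token counter described in the
message can be illustrated with a small stand-alone C sketch. This is
not the content of t3.y or t4.y (the attachments are not reproduced
here), and it is a hand-written loop rather than Bison's parsing
algorithm; it only assumes, as the message describes, that the parser
has already fetched one look-ahead token when it performs an addition.
The names Token, Lexer, next_token and parse_sum are hypothetical, and
the input is restricted to single digits, `+' and `#'.

```c
#include <assert.h>
#include <stddef.h>

/* A token tagged with the number of '#' directives that precede it
   (hypothetical layout, in the spirit of the per-token counter). */
typedef struct {
    char kind;       /* 'n' for a number, '+', or '\0' at end of input */
    int value;       /* digit value when kind == 'n' */
    int directives;  /* directives seen before this token */
} Token;

typedef struct {
    const char *input;
    size_t pos;
    int directives_seen;  /* global state, updated at lexing time */
} Lexer;

/* Skip directives (counting them), then return the next token. */
static Token next_token(Lexer *lx)
{
    while (lx->input[lx->pos] == '#') {
        lx->directives_seen++;
        lx->pos++;
    }
    Token t = { '\0', 0, lx->directives_seen };
    char c = lx->input[lx->pos];
    if (c == '\0')
        return t;
    lx->pos++;
    if (c >= '0' && c <= '9') {
        t.kind = 'n';
        t.value = c - '0';
    } else {
        assert(c == '+');  /* the sketch accepts nothing else */
        t.kind = '+';
    }
    return t;
}

/* Parse  n ('+' n)*  with one look-ahead token. For each addition,
   record the directive count two ways:
     global_view[i]: the lexer's global counter when the addition runs
     tagged_view[i]: the counter attached to the addition's last token
   Returns the number of additions performed. */
static int parse_sum(const char *input, int *global_view,
                     int *tagged_view, int max_adds)
{
    Lexer lx = { input, 0, 0 };
    Token look = next_token(&lx);
    assert(look.kind == 'n');
    look = next_token(&lx);
    int adds = 0;
    while (look.kind == '+') {
        Token operand = next_token(&lx);
        assert(operand.kind == 'n');
        look = next_token(&lx);  /* look-ahead is read before the addition */
        if (adds < max_adds) {
            global_view[adds] = lx.directives_seen;
            tagged_view[adds] = operand.directives;
        }
        adds++;
    }
    return adds;
}
```

Calling parse_sum("1+2#+3", g, t, 4) performs two additions. For the
first one, g[0] is 1 (the directive has already been counted because
the second `+' was read as look-ahead) while t[0] is 0, which is the
behaviour the message asks for. For the second addition both views
report 1.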
Attachments: t1.y, t2.y, t3.y, t4.y (binary data)