Re: [PATCH] Named symbol references

From: Joel E. Denny
Subject: Re: [PATCH] Named symbol references
Date: Wed, 11 Feb 2009 21:29:04 -0500 (EST)

Hi Alex,

On Wed, 11 Feb 2009, Alex Rozenman wrote:

> Please review the new version of the change. I would like to describe some
> points / issues I needed to address. Please see it as an invitation to
> continue our discussion.

I don't have time at the moment to review the patch, but I'll offer my 
thoughts on some of the issues you've raised.  It would be great if others 
would jump in and offer their opinions as well.

> 1. Syntax
> Joel's proposed "bracket" syntax is implemented. It looks like the
> following:
>          exp: exp[left] + exp[right] { $$ = $left + $right; } ;
> I decided, for now, not to allow named references for a rule's LHS, because (1)
> '$$' is already very short, clear and has stable semantics, whatever the
> symbol order in the RHS; (2) in mid-rule actions, '$$' has different
> semantics, and it is currently not possible to assign to the actual '$$'. Should
> we decide to remove this limitation by enabling symbolic names for the LHS?
> Please advise.

Your requirement that $$ and @$ always be used for the LHS somehow feels 
cleaner to me at the moment.  I think I can train my brain to ignore the 
LHS symbol name when I see the $exp or $term in your examples:

  exp : exp { $$ = $exp; }[left] + exp[right] { $$ = $<type>left + $right; }

  term: term '*' fact { $$ = $term * $fact; } ;

Does this bother anyone?

> 2. Lexical definition.
> For now, I defined the symbolic reference as a non-empty ID, allowing
> exactly the same character set as for symbol names. This is because the symbol
> names themselves play the role of symbolic references when '[]' is
> omitted. Now, we have a minor problem with dots (I faced it when trying to
> compile my existing big grammars for Verilog and VHDL). In semantic action
> code we may have: $name.field, where '.field' is a reference to a field of
> a %union component, which is a struct. If dots are allowed in symbolic names,
> the whole name is eaten by the ID. The obvious solution here is
> "($name).field". I just needed to change it in my grammars with a regexp.

Ugh.  This may come up frequently with locations.  For example, 
@left.first_line seems right, but it's wrong.
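
To make the dot problem concrete, here is a minimal Python sketch of a greedy reference scanner. It is not Bison's scanner; the `REF` pattern and the `scan_refs` helper are hypothetical, and the character set (letters, digits, underscores, periods) is an assumption based on the description above.

```python
import re

# Assumption: a reference name uses the same characters as a symbol name,
# taken here to be letters, digits, underscores, and periods.
REF = re.compile(r"[$@]([A-Za-z_.][A-Za-z0-9_.]*)")

def scan_refs(action):
    """Return the names a greedy scanner would extract from action code."""
    return [m.group(1) for m in REF.finditer(action)]

print(scan_refs("$$ = $name.field;"))         # ['name.field']
print(scan_refs("$$ = ($name).field;"))       # ['name']
print(scan_refs("line = @left.first_line;"))  # ['left.first_line']
```

The parenthesized form works only because ")" ends the name before the dot is reached.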

This makes me wonder if Bison should require brackets after $ and @ as in 
the following:

  exp: exp[left] + exp[right] { $$ = $[left] + $[right]; } ; 

If we say that the brackets following $ and @ are not optional, there 
should never be confusion.  For example, the user would have to write 
$[] and @[left].first_line, which I think are clear.  Bison would 
complain that $ and @left.first_line have syntax errors immediately 
following the $ and @.
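
As a rough illustration of the mandatory-bracket idea, the Python sketch below accepts only bracketed names and rejects a bare name right after $ or @. The patterns and the `scan_bracketed` helper are hypothetical, the name character set is an assumption, and typed references such as $<type>name are ignored for brevity.

```python
import re

# Assumption: names use letters, digits, underscores, and periods.
NAME = r"[A-Za-z_.][A-Za-z0-9_.]*"
BRACKETED = re.compile(r"[$@]\[(" + NAME + r")\]")
BARE = re.compile(r"[$@]" + NAME)

def scan_bracketed(action):
    """Return bracketed reference names; reject any unbracketed name."""
    bare = BARE.search(action)
    if bare:
        raise SyntaxError(f"brackets required after $ or @: {bare.group(0)}")
    return [m.group(1) for m in BRACKETED.finditer(action)]

print(scan_bracketed("$$ = $[left] + $[right];"))      # ['left', 'right']
print(scan_bracketed("line = @[left].first_line;"))    # ['left']
try:
    scan_bracketed("line = @left.first_line;")
except SyntaxError as err:
    print("error:", err)
```

With the closing bracket delimiting the name, a following ".first_line" can never be absorbed into it.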

On the other hand, this bracket notation is perhaps slightly more verbose.  
Maybe your way is ok considering the following observation.  Let's say a 
user writes $name.field and means that field is a member of $name.  
Because this user has clearly forgotten how Bison treats ".", he probably 
has not chosen name.field as the name of any symbol.  Thus, rather than 
quietly misunderstanding the user's intentions, Bison would complain that 
it doesn't recognize $name.field.  Seeing this complaint, the user would 
have an opportunity to correct his mistake.  If we want to be really 
helpful, Bison could even detect a "." in an unrecognized name and remind 
the user of this issue.

I'm not sure which of those approaches is better.

> 3. Default symbolic names and scope.
> When the '[]' part is omitted, the symbol name itself can be used
> symbolically:
>         term: term '*' fact { $$ = $term * $fact; } ;
> Currently all the names of RHS (all symbol names + all defined symbolic
> names) are considered to be a single visibility scope. $name reference is
> considered to be valid, iff it's found exactly once among all these
> identifiers. Otherwise, "not found" or "ambiguous" error messages are
> generated. It is quite possible to make another decision and consider, for
> example, that when a symbolic reference is defined (symbol [name]), the
> symbol itself is hidden, and even when an empty symbol reference is defined
> (symbol [], not allowed currently), the symbol name is hidden from the
> scope. My personal opinion here is that strict, simple, and clear rules are
> the best.

So, if I write:

  pair: item[first] item { $$ = new_pair($first, $item); }

Bison reports that $item is ambiguous?  That seems ok because $item is 
slightly confusing here.  However, what about the following?

  lhs: rhs[r] { $$ = $rhs; }

It seems like Bison shouldn't permit $rhs here.  The user has promised to 
call it $r instead.  Can we just add another check after the ambiguity 
check passes?
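
The lookup rule and the extra check can be sketched in Python as follows. This is only a model of the behavior discussed here, not code from the patch: the RHS is represented as hypothetical (symbol, explicit_name) pairs, and `resolve`, `RefError`, and `reject_renamed` are made-up names.

```python
class RefError(Exception):
    """Raised when a named reference cannot be resolved."""

def resolve(rhs, ref, reject_renamed=False):
    """Return the 1-based RHS position that ref denotes.

    rhs is a list of (symbol, explicit_name) pairs; explicit_name is None
    when no [name] was given. All symbol names and explicit names share
    one scope, and ref must match exactly one position.
    """
    hits = [(pos, name) for pos, (symbol, name) in enumerate(rhs, start=1)
            if ref in (symbol, name)]
    if not hits:
        raise RefError(f"${ref} not found")
    if len(hits) > 1:
        raise RefError(f"${ref} is ambiguous")
    pos, name = hits[0]
    # Proposed extra check, run only after the ambiguity check passes:
    # a symbol that was given an explicit name is no longer reachable
    # through its own symbol name.
    if reject_renamed and name is not None and name != ref:
        raise RefError(f"${ref} was renamed to ${name}")
    return pos

pair = [("item", "first"), ("item", None)]   # pair: item[first] item
print(resolve(pair, "first"))                # 1
print(resolve([("rhs", "r")], "rhs"))        # 1 under the current rule
```

With `reject_renamed=True`, the second call raises `RefError` instead, and `resolve(pair, "item")` raises the ambiguity error in either mode.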

> 4. Symbolic names for mid-rule action.
> When '[]' syntax is used after mid-rule action, the semantic value of the
> action can be accessed symbolically. For example:
>       exp : exp { $$ = $exp; }[left] + exp[right] { $$ = $<type>left + $right; }
> Please note that the first "$exp" is unambiguous, because the second "exp" is
> not visible at this point.

It seems logical to me... except I have to remind myself that $exp does 
not reference the LHS.
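
The visibility rule for mid-rule actions can be modeled with a short Python sketch. The `RULE` list and the `positions` helper are hypothetical; the mid-rule action is represented as an item with no symbol name, and the "+" token is written as '+' purely for the model.

```python
# Hypothetical model of:
#   exp : exp { ... }[left] '+' exp[right] { ... }
# as (symbol, explicit_name) pairs; a mid-rule action has no symbol name.
RULE = [("exp", None), (None, "left"), ("'+'", None), ("exp", "right")]

def positions(rule, ref, action_pos):
    """Return 1-based positions matching ref among the items that lie to
    the left of the action sitting at position action_pos."""
    visible = rule[:action_pos - 1]
    return [pos for pos, (symbol, name) in enumerate(visible, start=1)
            if ref in (symbol, name)]

print(positions(RULE, "exp", 2))    # [1]: mid-rule action sees one exp
print(positions(RULE, "exp", 5))    # [1, 4]: ambiguous in the final action
print(positions(RULE, "left", 5))   # [2]: value of the mid-rule action
```

In this model, $exp is usable only inside the mid-rule action; the final action has to use $left and $right.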

> I also fixed a bug where a broken error message was generated in the
> following case:
>     LHS: RHS { x = $<type>N; }
> where N is greater than len(RHS). The error message was cut off in the middle (like
> "integer out of bounds: $<type") and the integer itself was not shown.
> Should I submit a standalone patch for this?

If it's not too much trouble, please do submit separately.
