help-bison

Re: Combining tokens


From: Hans Aberg
Subject: Re: Combining tokens
Date: Mon, 15 Mar 2010 12:01:48 +0100

On 15 Mar 2010, at 10:57, Søren Andersen wrote:

> Consider a language with all the normal expressions - you can add, subtract, multiply, etc. Now, you'd like for the user to be able to define his own operators - for instance, '+?' or something like that. In order to help with ambiguities, you decide these user defined operators must be at least 2 "elements" long (I'm specifically NOT using the word "tokens" here for reasons to become clear).
> So, you'll allow '++' and '-+', etc.
>
> Now, the problem is that this still ends in shift / reduce conflicts - mainly because if you write this naturally:
>
> UserOp = PossOp PossOp*;
> PossOp = '+' | '-' | '*' | ....;
>
> The parser will look for a succession of tokens - you can write '-' '+'. But, this is exactly what results in conflicts - obviously, with just 1 token of lookahead, this will go wrong. What I really want is for my specification to specify *a single token* rather than a series of tokens, which is the exact opposite from what you usually want to happen.
>
> You could generate the possible tokens up to a certain length automatically:
> '++', '+-', '+*', ...
> but this would be very large, and you can (obviously) only do it up to a certain length.

The typical thing would be to let the lexer recognize valid tokens: it reads the longest run of operator characters and returns it as a single operator token. Then one can let the .y file recognize the operators and put them, together with the values, on a stack, which is then sorted out by a function in the actions, computing the value using operator precedences. This way the number of valid tokens can even be unlimited.
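To make the idea concrete, here is a minimal sketch in Python rather than Bison/C. It is only an illustration under assumptions: the operator character set, the precedence table, and the made-up user operator '+?' are all hypothetical, and the two-stack evaluation is one common way to do the sorting, not necessarily what any particular .y file does.

```python
import re

# Assumed set of characters that may appear in an operator (hypothetical).
OP_CHARS = "+-*/?<>=!"
# A token is either an integer or the longest run of operator characters.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([" + re.escape(OP_CHARS) + r"]+))")

# Hypothetical table: name -> (precedence, associativity, function).
# In a real language, user definitions would add entries at run time.
OPERATORS = {
    "+": (6, "left", lambda a, b: a + b),
    "-": (6, "left", lambda a, b: a - b),
    "*": (7, "left", lambda a, b: a * b),
    "+?": (5, "left", lambda a, b: max(a, b)),  # made-up user operator
}

def tokenize(text):
    """Split text into ("NUM", int) and ("OP", str) tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        if m.group(1) is not None:
            tokens.append(("NUM", int(m.group(1))))
        else:
            # The whole run, e.g. "+?", is ONE token, never '+' then '?'.
            tokens.append(("OP", m.group(2)))
        pos = m.end()
    return tokens

def apply_top(values, ops):
    """Pop one operator and two values, push the result."""
    name = ops.pop()
    right = values.pop()
    left = values.pop()
    values.append(OPERATORS[name][2](left, right))

def evaluate(text):
    """Evaluate a flat 'value op value op ...' sequence using two stacks."""
    values, ops = [], []
    expect_operand = True
    for kind, val in tokenize(text):
        if kind == "NUM":
            if not expect_operand:
                raise SyntaxError("operator expected before %r" % val)
            values.append(val)
            expect_operand = False
        else:
            if expect_operand:
                raise SyntaxError("operand expected before %r" % val)
            if val not in OPERATORS:
                raise SyntaxError("unknown operator %r" % val)
            prec, assoc, _ = OPERATORS[val]
            # Reduce while the stacked operator binds at least as tightly.
            while ops:
                top_prec = OPERATORS[ops[-1]][0]
                if top_prec > prec or (top_prec == prec and assoc == "left"):
                    apply_top(values, ops)
                else:
                    break
            ops.append(val)
            expect_operand = True
    if expect_operand:
        raise SyntaxError("expression is empty or ends with an operator")
    while ops:
        apply_top(values, ops)
    return values[0]
```

With this approach the grammar only ever sees a generic "operator" token, so it needs no rule per operator and has no conflict to resolve; for example, evaluate("2 * 3 +? 10 - 1") groups as (2 * 3) +? (10 - 1). One consequence of reading the longest run is that "1+-2" lexes as the single operator '+-', so a binary operator followed by a prefix minus needs a space in between.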

You might check out the Haskell interpreter Hugs <http://haskell.org/hugs/>, which has a .y file and a handwritten lexer in the file input.c. Perhaps the lexer is handwritten to handle the layout syntax. Also look into the file Prelude.hs to see how precedences are set. Haskell just admits about ten levels, which is a bit too limited.
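For reference, Haskell sets precedences with fixity declarations such as "infixl 6 +, -" or "infixr 0 $", with levels 0 through 9. A rough Python sketch of reading declarations in that style into a precedence table could look like the following; the function name parse_fixity and the table layout are hypothetical, and this is not how Hugs itself implements it.

```python
# Keyword -> associativity, following Haskell's three fixity keywords.
ASSOC = {"infixl": "left", "infixr": "right", "infix": "none"}

def parse_fixity(lines):
    """Build {operator: (level, associativity)} from lines like 'infixl 6 +, -'."""
    table = {}
    for line in lines:
        parts = line.replace(",", " ").split()
        if not parts:
            continue  # skip blank lines
        if len(parts) < 3 or parts[0] not in ASSOC:
            raise ValueError("bad fixity declaration: %r" % line)
        level = int(parts[1])
        # Haskell only admits levels 0..9; a custom language could widen this.
        if not 0 <= level <= 9:
            raise ValueError("precedence level out of range: %d" % level)
        for name in parts[2:]:
            table[name] = (level, ASSOC[parts[0]])
    return table
```

For example, parse_fixity(["infixl 6 +, -", "infixl 5 +?"]) gives '+' and '-' level 6 and the user operator '+?' level 5, all left-associative.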

  Hans





