Consider a language with all the normal expressions - you can add,
subtract, multiply, etc.
Now, you'd like users to be able to define their own operators,
for instance '+?' or something like that.
To reduce ambiguity, you decide that these user-defined operators
must be at least 2 "elements" long (I'm deliberately NOT using the
word "tokens" here, for reasons that will become clear).
So, you'll allow '++' and '-+', etc.
The problem is that this still results in shift/reduce conflicts,
mainly because the natural way to write it is:
UserOp = PossOp PossOp PossOp*;
PossOp = '+' | '-' | '*' | ....;
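The parser then looks for a succession of separate tokens, so you can
even write '-' '+' with a space in between. But this is exactly what
results in conflicts: with just 1 token of lookahead, the parser
cannot always tell whether a '-' is a complete built-in operator or
the first element of a user-defined one.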
What I really want is for my specification to describe *a single
token* rather than a series of tokens, which is the exact opposite
of what you usually want.
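
To make "a single token" concrete, here is a rough sketch of the idea
at the lexer level, written in Python rather than in the grammar
notation above. Everything in it is an assumption made for
illustration: the operator alphabet OP_CHARS (the real list is elided
above), the token names, and the tokenize function. It only shows
that a run of two or more operator characters can be recognised as
one token before the parser ever sees it.

```python
import re

# Hypothetical operator alphabet; the full list in the post is elided.
OP_CHARS = "+-*/"
_CLS = re.escape(OP_CHARS)

# Alternatives are tried in order, so USER_OP (2+ operator characters)
# is attempted before the single-character built-in OP.
TOKEN_RE = re.compile(
    rf"(?P<USER_OP>[{_CLS}]{{2,}})"
    rf"|(?P<OP>[{_CLS}])"
    r"|(?P<NUM>\d+)"
    r"|(?P<ID>[A-Za-z_]\w*)"
    r"|(?P<WS>\s+)"
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("a -+ b"))   # '-+' arrives as one USER_OP token
print(tokenize("a - + b"))  # '-' and '+' arrive as two separate OP tokens
```

The price of this approach is that adjacent built-in operators now
need whitespace: something like 'a*-b' would be read as the single
user operator '*-', not as '*' followed by a unary '-'.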
You could generate the possible tokens up to a certain length
automatically:
'++', '+-', '+*', ...
but the resulting list would be very large, and it can (obviously)
only cover operators up to some fixed length.
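
To give a feel for how fast that list grows, here is a small Python
sketch that generates it. The four-character alphabet and the
function name operators_up_to are assumptions for illustration only;
with k operator characters there are k^n operators of length n, so
the total up to length N is k^2 + k^3 + ... + k^N.

```python
from itertools import product

# Hypothetical operator alphabet; the full list in the post is elided.
OP_CHARS = "+-*/"


def operators_up_to(max_len, chars=OP_CHARS):
    """Return every operator string of length 2..max_len over chars."""
    ops = []
    for n in range(2, max_len + 1):
        ops.extend("".join(p) for p in product(chars, repeat=n))
    return ops


print(operators_up_to(2)[:3])   # ['++', '+-', '+*']
for max_len in (2, 3, 4):
    print(max_len, len(operators_up_to(max_len)))  # 16, 80, 336
```

Even with only four characters the count is already 336 at length 4,
and a realistic operator alphabet would be considerably larger.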