bug-gnu-utils

Re: RFC: enum instead of #define for tokens


From: Akim Demaille
Subject: Re: RFC: enum instead of #define for tokens
Date: 05 Apr 2002 10:32:21 +0200
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp)

| On Thu, 2002-04-04 at 05:19, Akim Demaille wrote:
| > IMHO, my position is not nice wrt people who are abusing the system.
| > The example of Unicode demonstrates how bad it was to let chars be
| > tokens.  That default is very C specific, I really doubt that in other
| > languages, such an atrocity remains in their native Yaccs.  But I
| > confess I don't know.
| 
| Unicode really doesn't disagree with having tokens that happen to be in
| U+0000 to U+00FF. It doesn't disagree with having some simple way of
| representing those, either. We could very well call the single-quotes in
| %token '=' the Unicode big-endian low-octet token creator operator. The
| acronym is quite interesting ;-)
| 
| Things like %token '=', %token ',', etc. are really useful for
| programming languages. I agree that if people want some weird
| native-language equals sign (is there such a thing?) or one of the nice
| mathematical symbols, they should have their scanner deal with that.
| 
| I find rules like:
|       assignment:  lvalue '=' rvalue;
| to be pretty clear and terse, while:
|       assignment:  lvalue EQUALS rvalue;
| to be lacking at least in the terseness value. I also fear people would
| start using EQ or EQA. Same with other common ones; wonder what type of
| butchery will be done of 'left parenthesis', 'asterisk', and
| 'multiplication sign'?

Sorry, but that is only a matter of appearance, and appearance is not
what I am referring to.  I am referring to the actual tokens being
used.  I share your point of view, which is why I have snippets like
this in some of my parsers:

| %token COMMA  ","
| %token COLON  ":"
| %token SEMI   ";"
| %token LPAREN "("
| %token RPAREN ")"
| %token LBRACK "["
| %token RBRACK "]"
| %token LBRACE "{"
| %token RBRACE "}"
| %token DOT    "."
| %token PLUS   "+"
| %token MINUS  "-"
| %token TIMES  "*"
| %token DIVIDE "/"
| %token EQ     "="
| %token NE     "<>"
| %token LT     "<"
| %token LE     "<="
| %token GT     ">"
| %token GE     ">="
| %token AND    "&"
| %token OR     "|"
| %token ASSIGN ":="
| %token ARRAY  "array"
| %token IF     "if"
| %token THEN   "then"
| %token ELSE   "else"
| %token WHILE  "while"
| %token FOR    "for"
| %token TO     "to"
| %token DO     "do"
| %token LET    "let"
| %token IN     "in"
| %token END    "end"
| %token OF     "of"
| %token BREAK  "break"
| %token NIL    "nil"
| %token FUNCTION "function"
| %token VAR    "var"
| %token TYPE   "type"
| %token ESCAPING "/* escaping */"
| 
| 
| %nonassoc "=" "<>" "<" "<=" ">" ">="
| %left "+" "-"
| %left "*" "/"
| %left UMINUS
| 
| //>>
| %start program
| 
| %%
| /* Arithmetics. */
| exp:
|   exp "="  exp { $$ = new OpExp (@$, *$1, OpExp::eq, *$3); }
| | exp "<>" exp { $$ = new OpExp (@$, *$1, OpExp::ne, *$3); }
| | exp "<"  exp { $$ = new OpExp (@$, *$1, OpExp::lt, *$3); }
| | exp "<=" exp { $$ = new OpExp (@$, *$1, OpExp::le, *$3); }
| | exp ">"  exp { $$ = new OpExp (@$, *$1, OpExp::gt, *$3); }
| | exp ">=" exp { $$ = new OpExp (@$, *$1, OpExp::ge, *$3); }
| | exp "+"  exp { $$ = new OpExp (@$, *$1, OpExp::plus, *$3); }
| | exp "-"  exp { $$ = new OpExp (@$, *$1, OpExp::minus, *$3); }
| | exp "*"  exp { $$ = new OpExp (@$, *$1, OpExp::mul, *$3); }
| | exp "/"  exp { $$ = new OpExp (@$, *$1, OpExp::div, *$3); }
| /* `-E' is translated as `0 - E'. */
| | "-" exp %prec UMINUS
|             { $$ = new OpExp (@$, *new IntExp(@1, 0), OpExp::minus, *$2); }
| | "(" exp ")" { $$ = $2; }
| ;

As you can see, I'm not manipulating char-tokens.  There are only
tokens.
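
To illustrate the scanner side of this, here is a minimal sketch of a
hand-written scanner that only ever hands named tokens to the parser,
never raw chars.  Everything in it is an assumption made for
illustration: the enum `Token`, the function `scan`, and the numbering
starting at 258 are hypothetical and are not taken from the parser
above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, named after a few of the %token
// declarations above.  The values are kept above 255 (an illustrative
// choice) so that no token number coincides with a char code.
enum Token {
  END_OF_INPUT = 0,
  EQ = 258, NE, LT, LE, GT, GE,
  PLUS, MINUS, TIMES, DIVIDE,
  LPAREN, RPAREN,
  UNKNOWN
};

// Turn the input into a sequence of named tokens.  Two-character
// operators ("<>", "<=", ">=") are tried before their one-character
// prefixes, so "<=" yields LE rather than LT followed by EQ.
std::vector<Token> scan(const std::string& in) {
  std::vector<Token> out;
  std::size_t i = 0;
  while (i < in.size()) {
    const char c = in[i];
    const char next = (i + 1 < in.size()) ? in[i + 1] : '\0';
    if (c == ' ' || c == '\t' || c == '\n') {  // skip blanks
      ++i;
      continue;
    }
    Token t = UNKNOWN;
    std::size_t len = 1;
    switch (c) {
      case '=': t = EQ; break;
      case '<':
        if (next == '>')      { t = NE; len = 2; }
        else if (next == '=') { t = LE; len = 2; }
        else                  { t = LT; }
        break;
      case '>':
        if (next == '=') { t = GE; len = 2; }
        else             { t = GT; }
        break;
      case '+': t = PLUS;   break;
      case '-': t = MINUS;  break;
      case '*': t = TIMES;  break;
      case '/': t = DIVIDE; break;
      case '(': t = LPAREN; break;
      case ')': t = RPAREN; break;
      default:  t = UNKNOWN; break;  // anything else is not handled here
    }
    out.push_back(t);
    i += len;
  }
  out.push_back(END_OF_INPUT);
  return out;
}
```

With such a scanner the grammar can still spell the operators as "="
or "<=" through the string aliases, while the parser itself only sees
EQ, LE, and so on.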


