Re: [PATCH for Dlang support] d: change the return value of yylex() from


From: Adela Vais
Subject: Re: [PATCH for Dlang support] d: change the return value of yylex() from int to TokenKind
Date: Wed, 16 Sep 2020 21:47:41 +0300

Hello,

On Sat, 12 Sep 2020 at 18:18, Akim Demaille <akim@lrde.epita.fr> wrote:

> Hi Adela,
>
> > On 11 Sep 2020, at 15:09, Adela Vais <adela.vais99@gmail.com> wrote:
> >
> > * data/skeletons/lalr1.d: Change return value.
> > * examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
>
> This commit fails the test suite too, since the test suite is
> still using int for yylex's return value.
>
> As a matter of fact, the TODO had a question:
>
>   ** Change the return value of yylex
>   Historically people were allowed to return any int from the scanner (which
>   is convenient and allows `return '+'` from the scanner).  Akim tends to see
>   this as an error, we should restrict the return values to TokenKind (not to
>   be confused with SymbolKind).
>
>   In the case of D, without the history, we have the choice to support or not
>   `int`.  If we want to _keep_ `int`, is there a way, say via introspection,
>   to support both signatures of yylex?  If we don't keep `int`, just move to
>   TokenKind.
>
> I was really curious to know if D's introspection made it possible to
> support both signatures.  If it can't, or if we consider returning an
> int is not right, then, sure, your commit (once the test suite issue
> addressed) is the right path.
>

D doesn't support this.
Our reasoning was that an int return value can be unclear to someone
unfamiliar with the program, whereas TokenKind says more about what is
expected. It is also good practice in D to choose the most specific type
possible when writing an interface, so TokenKind is the better choice here.
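
To illustrate the point about the more specific type, here is a minimal
sketch. The enum below is only a hypothetical stand-in for the generated
TokenKind (its members and values are assumptions made for the example),
and OneTokenLexer is an invented name, not something from the skeleton.

```d
// Hypothetical stand-in for the TokenKind enum that Bison generates;
// the member names and values are illustrative only.
enum TokenKind { YYEOF, YYerror, YYUNDEF, NUM, PLUS }

interface Lexer
{
  // The signature alone tells the reader what the scanner must produce.
  TokenKind yylex ();
}

// Invented example scanner: yields one PLUS token, then end of input.
class OneTokenLexer : Lexer
{
  private bool done = false;

  TokenKind yylex ()
  {
    if (done)
      return TokenKind.YYEOF;
    done = true;
    return TokenKind.PLUS;
  }
}

// D does not implicitly convert a char or an int to a named enum type,
// so `return '+';` would be rejected in a function returning TokenKind.
static assert (!__traits (compiles, { TokenKind t = '+'; }));
static assert (!__traits (compiles, { TokenKind t = 43; }));

void main ()
{
  auto lexer = new OneTokenLexer;
  assert (lexer.yylex () == TokenKind.PLUS);
  assert (lexer.yylex () == TokenKind.YYEOF);
}
```

With int as the return type, both rejected assignments would be accepted,
which is the looseness the change removes.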


> In case you're wondering where the test suite's yylex is coming from,
>
> m4_define([AT_CALC_YYLEX(d)],...
>
> in tests/calc.at.
>

I modified the failing test files. My understanding is that tokens declared
with single quotes do not create TokenKind entries; please correct me if I
am wrong. Because of these single-quote declarations, I separated the Bison
declarations for D from the ones for C/C++, since these tests relied heavily
on the int return value. I changed the expected output of the error tests
for the same reason.
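
On the scanner side, this means each punctuation character has to be
translated into a named token kind before it is returned. A minimal
standalone sketch follows, again with a hypothetical subset of TokenKind
and an invented helper name (toTokenKind); the actual change in the test
suite is in the patch below.

```d
// Hypothetical subset of the generated TokenKind enum; values are
// illustrative only.
enum TokenKind { YYEOF, YYerror, YYUNDEF, NUM, PLUS, MINUS, STAR, SLASH, EOL }

// Invented helper: translate a punctuation character into its named
// token kind, falling back to YYUNDEF for anything unknown.
TokenKind toTokenKind (dchar c)
{
  switch (c)
  {
    case '+':  return TokenKind.PLUS;
    case '-':  return TokenKind.MINUS;
    case '*':  return TokenKind.STAR;
    case '/':  return TokenKind.SLASH;
    case '\n': return TokenKind.EOL;
    default:   return TokenKind.YYUNDEF;
  }
}

void main ()
{
  assert (toTokenKind ('+') == TokenKind.PLUS);
  assert (toTokenKind ('\n') == TokenKind.EOL);
  assert (toTokenKind ('?') == TokenKind.YYUNDEF);
}
```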

Here is the patch:

d: change the return value of yylex() from int to TokenKind

* data/skeletons/lalr1.d: Change the return value.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
* tests/calc.at, tests/scanner.at: Adjust.
---
 data/skeletons/lalr1.d   |   4 +-
 examples/d/calc/calc.y   |   2 +-
 examples/d/simple/calc.y |   2 +-
 tests/calc.at            | 136 ++++++++++++++++++++++++++++++---------
 tests/scanner.at         |   2 +-
 5 files changed, 112 insertions(+), 34 deletions(-)

diff --git a/data/skeletons/lalr1.d b/data/skeletons/lalr1.d
index e879dabf..6d1bdfdd 100644
--- a/data/skeletons/lalr1.d
+++ b/data/skeletons/lalr1.d
@@ -68,7 +68,7 @@ public interface Lexer
    * to the next token and prepares to return the semantic value
    * ]b4_locations_if([and beginning/ending positions ])[of the token.
    * @@return the token identifier corresponding to the next token. */
-  int yylex ();
+  TokenKind yylex ();

   /**
    * Entry point for error reporting.  Emits an error
@@ -272,7 +272,7 @@ b4_user_union_members
       yyDebugStream.writeln (s);
   }
 ]])[
-  private final int yylex () {
+  private final TokenKind yylex () {
     return yylexer.yylex ();
   }

diff --git a/examples/d/calc/calc.y b/examples/d/calc/calc.y
index 2ad1227a..9fea82cd 100644
--- a/examples/d/calc/calc.y
+++ b/examples/d/calc/calc.y
@@ -114,7 +114,7 @@ class CalcLexer(R) : Lexer
     return semanticVal_;
   }

-  int yylex ()
+  TokenKind yylex ()
   {
     import std.uni : isWhite, isNumber;

diff --git a/examples/d/simple/calc.y b/examples/d/simple/calc.y
index 917eb131..0f441431 100644
--- a/examples/d/simple/calc.y
+++ b/examples/d/simple/calc.y
@@ -109,7 +109,7 @@ class CalcLexer(R) : Lexer
     return semanticVal_;
   }

-  int yylex ()
+  TokenKind yylex ()
   {
     import std.uni : isWhite, isNumber;

diff --git a/tests/calc.at b/tests/calc.at
index b95b4845..12ecdfa7 100644
--- a/tests/calc.at
+++ b/tests/calc.at
@@ -299,7 +299,7 @@ class CalcLexer(R) : Lexer
     return res;
   }

-  int yylex ()
+  TokenKind yylex ()
   {]AT_LOCATION_IF([[
     location.begin = location.end;]])[

@@ -342,7 +342,20 @@ class CalcLexer(R) : Lexer
         return TokenKind.YYerror;
       }

-    return c;
+    switch (c)
+    {
+      case '+':  return TokenKind.PLUS;
+      case '-':  return TokenKind.MINUS;
+      case '*':  return TokenKind.STAR;
+      case '/':  return TokenKind.SLASH;
+      case '(':  return TokenKind.LPAR;
+      case ')':  return TokenKind.RPAR;
+      case '\n': return TokenKind.EOL;
+      case '=':  return TokenKind.EQUAL;
+      case '^':  return TokenKind.POW;
+      case '!':  return TokenKind.NOT;
+      default:   return TokenKind.YYUNDEF;
+    }
   }
 }
 ]])
@@ -550,7 +563,7 @@ void location_print (FILE *o, Span s);
 %token CALC_EOF 0 ]AT_TOKEN_TRANSLATE_IF([_("end of input")], ["end of input"])[
 %token <ival> NUM   "number"
 %type  <ival> exp
-
+]AT_LANG_MATCH([c\|c++], [[
 %nonassoc '='   /* comparison          */
 %left '-' '+'
 %left '*' '/'
@@ -592,9 +605,7 @@ exp:
         char buf[1024];
         snprintf (buf, sizeof buf, "calc: error: %d != %d", $1, $3);
         ]AT_GLR_IF([[yyparser.]])[error (]AT_LOCATION_IF([[@$, ]])[buf);
-      }]],
-      [d], [[
-      yyerror (]AT_LOCATION_IF([[@$, ]])[format ("calc: error: %d != %d", $1, $3));]])[
+      }]])[
     $$ = $1;
   }
 | exp '+' exp        { $$ = $1 + $3; }
@@ -604,10 +615,60 @@ exp:
 | '-' exp  %prec NEG { $$ = -$2; }
 | exp '^' exp        { $$ = power ($1, $3); }
 | '(' exp ')'        { $$ = $2; }
-| '(' error ')'      { $$ = 1111; ]AT_D_IF([], [yyerrok;])[ }
-| '!'                { $$ = 0; ]AT_D_IF([return YYERROR], [YYERROR])[; }
-| '-' error          { $$ = 0; ]AT_D_IF([return YYERROR], [YYERROR])[; }
+| '(' error ')'      { $$ = 1111; yyerrok; }
+| '!'                { $$ = 0; YYERROR; }
+| '-' error          { $$ = 0; YYERROR; }
+;
+]])[]AT_LANG_MATCH([d], [[
+%token PLUS   "+"
+       MINUS  "-"
+       STAR   "*"
+       SLASH  "/"
+       LPAR   "("
+       RPAR   ")"
+       EQUAL  "="
+       POW    "^"
+       NOT    "!"
+       EOL    "\n"
+
+%nonassoc "="   /* comparison          */
+%left "-" "+"
+%left "*" "/"
+%precedence NEG /* negation--unary minus */
+%right "^"      /* exponentiation        */
+
+/* Grammar follows */
+%%
+input:
+  line
+| input line         { ]AT_PARAM_IF([++*count; ++global_count;])[ }
+;
+
+line:
+  EOL
+| exp EOL           { ]AT_PARAM_IF([*result = global_result = $1;])[ }
+;
+
+exp:
+  NUM
+| exp "=" exp
+  {
+    if ($1 != $3)
+      yyerror (]AT_LOCATION_IF([[@$, ]])[format ("calc: error: %d != %d", $1, $3));
+    $$ = $1;
+  }
+| exp "+" exp        { $$ = $1 + $3; }
+| exp "-" exp        { $$ = $1 - $3; }
+| exp "*" exp        { $$ = $1 * $3; }
+| exp "/" exp        { $$ = $1 / $3; }
+| "-" exp  %prec NEG { $$ = -$2; }
+| exp "^" exp        { $$ = power ($1, $3); }
+| "(" exp ")"        { $$ = $2; }
+| "(" error ")"      { $$ = 1111; }
+| "!"                { $$ = 0; return YYERROR; }
+| "-" error          { $$ = 0; return YYERROR; }
 ;
+]])[
 %%

 int
@@ -952,32 +1013,39 @@ _AT_CHECK_CALC([$1],
 _AT_CHECK_CALC_ERROR([$1], [1], [1 2],
                      [[final: 0 0 1]],
                      [15],
-                     [AT_JAVA_IF([1.3-1.4], [1.3])[: syntax error on token
[number] (expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] ['\n'])]])
+                     [AT_JAVA_IF([1.3-1.4], [1.3])AT_D_IF([[: syntax error
on token [number] (expected: [=] [-] [+] [*] [/] [^] [\n])]],
+                                                          [[: syntax error
on token [number] (expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^']
['\n'])]])])
 _AT_CHECK_CALC_ERROR([$1], [1], [1//2],
                      [[final: 0 0 1]],
                      [20],
-                     [AT_JAVA_IF([1.3-1.4], [1.3])[: syntax error on token
['/'] (expected: [number] ['-'] ['('] ['!'])]])
+                     [AT_JAVA_IF([1.3-1.4], [1.3])AT_D_IF([[: syntax error
on token [/] (expected: [number] [-] [(] [!])]],
+                                                          [[: syntax error
on token ['/'] (expected: [number] ['-'] ['('] ['!'])]])])
 _AT_CHECK_CALC_ERROR([$1], [1], [error],
                      [[final: 0 0 1]],
                      [5],
-                     [AT_JAVA_IF([1.1-1.2], [1.1])[: syntax error on token
[invalid token] (expected: [number] ['-'] ['\n'] ['('] ['!'])]])
+                     [AT_JAVA_IF([1.1-1.2], [1.1])AT_D_IF([[: syntax error
on token [invalid token] (expected: [number] [-] [\n] [(] [!])]],
+                                                          [[: syntax error
on token [invalid token] (expected: [number] ['-'] ['\n'] ['('] ['!'])]])])
 _AT_CHECK_CALC_ERROR([$1], [1], [1 = 2 = 3],
                      [[final: 0 0 1]],
                      [30],
                      [AT_LAC_IF(
-                       [AT_JAVA_IF([1.7-1.8], [1.7])[: syntax error on
token ['='] (expected: ['-'] ['+'] ['*'] ['/'] ['^'] ['\n'])]],
-                       [AT_JAVA_IF([1.7-1.8], [1.7])[: syntax error on
token ['='] (expected: ['-'] ['+'] ['*'] ['/'] ['^'])]])])
+                       [AT_JAVA_IF([1.7-1.8], [1.7])AT_D_IF([[: syntax
error on token [=] (expected: [-] [+] [*] [/] [^] [\n])]],
+                                                            [[: syntax
error on token ['='] (expected: ['-'] ['+'] ['*'] ['/'] ['^'] ['\n'])]])],
+                       [AT_JAVA_IF([1.7-1.8], [1.7])AT_D_IF([[: syntax
error on token [=] (expected: [-] [+] [*] [/] [^])]],
+                                                            [[: syntax
error on token ['='] (expected: ['-'] ['+'] ['*'] ['/'] ['^'])]])])])
 _AT_CHECK_CALC_ERROR([$1], [1],
                      [
 +1],
                      [[final: 0 0 1]],
                      [20],
-                     [AT_JAVA_IF([2.1-2.2], [2.1])[: syntax error on token
['+'] (expected: ]AT_TOKEN_TRANSLATE_IF([[[end of file]]], [[[end of
input]]])[ [number] ['-'] ['\n'] ['('] ['!'])]])
+                     [AT_JAVA_IF([2.1-2.2], [2.1])AT_D_IF([[: syntax error
on token [+] (expected: ]AT_TOKEN_TRANSLATE_IF([[[end of file]]], [[[end of
input]]])[ [number] [-] [\n] [(] [!])]],
+                                                          [[: syntax error
on token ['+'] (expected: ]AT_TOKEN_TRANSLATE_IF([[[end of file]]], [[[end
of input]]])[ [number] ['-'] ['\n'] ['('] ['!'])]])])
 # Exercise error messages with EOF: work on an empty file.
 _AT_CHECK_CALC_ERROR([$1], [1], [/dev/null],
                      [[final: 0 0 1]],
                      [4],
-                     [[1.1: syntax error on token
]AT_TOKEN_TRANSLATE_IF([[[end of file]]], [[[end of input]]])[ (expected:
[number] ['-'] ['\n'] ['('] ['!'])]])
+                     [[1.1: syntax error on token
]AT_TOKEN_TRANSLATE_IF([[[end of file]]], [[[end of input]]])AT_D_IF([[
(expected: [number] [-] [\n] [(] [!])]],
+
                                            [[ (expected: [number] ['-']
['\n'] ['('] ['!'])]])])

 # Exercise the error token: without it, we die at the first error,
 # hence be sure to
@@ -999,35 +1067,45 @@ _AT_CHECK_CALC_ERROR([$1], [0],
                      [() + (1 + 1 + 1 +) + (* * *) + (1 * 2 * *) = 1],
                      [[final: 4444 0 5]],
                      [250],
-[AT_JAVA_IF([1.2-1.3], [1.2])[: syntax error on token [')'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.18-1.19], [1.18])[: syntax error on token [')'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.23-1.24], [1.23])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.41-1.42], [1.41])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.1-1.47], [1.1-46])[: calc: error: 4444 != 1]])
+[AT_JAVA_IF([1.2-1.3], [1.2])AT_D_IF([[: syntax error on token [)]
(expected: [number] [-] [(] [!])
+]],                                  [[: syntax error on token [')']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.18-1.19], [1.18])AT_D_IF([[: syntax error on token [)]
(expected: [number] [-] [(] [!])
+]],                                       [[: syntax error on token [')']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.23-1.24], [1.23])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])
+]],                                       [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.41-1.42], [1.41])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])
+]],                                       [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.1-1.47], [1.1-46])[: calc: error: 4444 != 1]])

 # The same, but this time exercising explicitly triggered syntax errors.
 # POSIX says the lookahead causing the error should not be discarded.
 _AT_CHECK_CALC_ERROR([$1], [0], [(!) + (1 2) = 1],
                      [[final: 2222 0 2]],
                      [102],
-[AT_JAVA_IF([1.10-1.11], [1.10])[: syntax error on token [number]
(expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] [')'])
-]AT_JAVA_IF([1.1-1.16], [1.1-15])[: calc: error: 2222 != 1]])
+[AT_JAVA_IF([1.10-1.11], [1.10])AT_D_IF([[: syntax error on token [number]
(expected: [=] [-] [+] [*] [/] [^] [)])
+]],                                     [[: syntax error on token [number]
(expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] [')'])
+]])AT_JAVA_IF([1.1-1.16], [1.1-15])[: calc: error: 2222 != 1]])

 _AT_CHECK_CALC_ERROR([$1], [0], [(- *) + (1 2) = 1],
                      [[final: 2222 0 3]],
                      [113],
-[AT_JAVA_IF([1.4-1.5], [1.4])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.12-1.13], [1.12])[: syntax error on token [number]
(expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] [')'])
-]AT_JAVA_IF([1.1-1.18], [1.1-17])[: calc: error: 2222 != 1]])
+[AT_JAVA_IF([1.4-1.5], [1.4])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])
+]],                                  [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.12-1.13], [1.12])AT_D_IF([[: syntax error on token
[number] (expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] [')'])
+]],                                  [[: syntax error on token [number]
(expected: ['='] ['-'] ['+'] ['*'] ['/'] ['^'] [')'])
+]])AT_JAVA_IF([1.1-1.18], [1.1-17])[: calc: error: 2222 != 1]])

 # Check that yyerrok works properly: second error is not reported,
 # third and fourth are.  Parse status is successful.
 _AT_CHECK_CALC_ERROR([$1], [0], [(* *) + (*) + (*)],
                      [[final: 3333 0 3]],
                      [113],
-[AT_JAVA_IF([1.2-1.3], [1.2])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.10-1.11], [1.10])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])
-]AT_JAVA_IF([1.16-1.17], [1.16])[: syntax error on token ['*'] (expected:
[number] ['-'] ['('] ['!'])]])
+[AT_JAVA_IF([1.2-1.3], [1.2])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])
+]],                                  [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.10-1.11], [1.10])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])
+]],                                       [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])
+]])AT_JAVA_IF([1.16-1.17], [1.16])AT_D_IF([[: syntax error on token [*]
(expected: [number] [-] [(] [!])]],
+                                          [[: syntax error on token ['*']
(expected: [number] ['-'] ['('] ['!'])]])])


 # YYerror.
diff --git a/tests/scanner.at b/tests/scanner.at
index 5ad18729..2ec2cd78 100644
--- a/tests/scanner.at
+++ b/tests/scanner.at
@@ -121,7 +121,7 @@ class YYLexer(R) : Lexer
     return semanticVal_;
   }

-  int yylex ()
+  TokenKind yylex ()
   {
     import std.uni : isNumber;
     // Handle EOF.
-- 
2.17.1


Thank you!
Adela


>
> Cheers!

