bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for Dlang support] d: change the return value of yylex() from


From: Adela Vais
Subject: Re: [PATCH for Dlang support] d: change the return value of yylex() from int to TokenKind
Date: Mon, 28 Sep 2020 14:42:40 +0300

Hello Akim,

În sâm., 26 sept. 2020 la 14:43, Akim Demaille <akim@lrde.epita.fr> a scris:

> Hi Adela,
>
> > Le 16 sept. 2020 à 20:47, Adela Vais <adela.vais99@gmail.com> a écrit :
> >
> > Hello,
> >
> > Here is the patch:
> >
> > d: change the return value of yylex() from int to TokenKind
> >
> > * data/skeletons/lalr1.d: Change the return value.
> > * examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
> > * tests/calc.at, tests/scanner.at: Adjust.
>
> I have modified your patch to make it, I believe, a bit more maintainable.
> I find it very hard to find the right balance between having a simple
> test suite, and having easy means to factor the handling of similar cases.
> I think here we have reached a point where it makes sense to split D's
> calc.y from that of C and C++.
>

Yes, it does look way better now, without all the duplicate (D and
everything else) code for error messages.
Thank you!


>
> Note that now that the user is forced to use token kinds we can/should
> enable
> api.token.raw in D.  It will save us from one useless conversion.
>

I will do this. I also started working on LAC and I almost finished
updating the documentation. I found a few bugs or missing features that
were documented as existing and I will address them.


>
> There's something fishy happening on the CI: the examples are failing.
> It is not a recent change in the skeleton, it is a change in the toolchain.
> We now get errors from address sanitizer.  Have a look at
> https://travis-ci.org/github/akimd/bison/jobs/729867686 for instance.
>

Is there something I can do about this error, or it's outside of my reach?
I tried to replicate it, but I get nothing when I run
ldc2 -fsanitize=address calc.d
I found online only ldc examples for -fsanitize=address for compiling Dlang.
Can you please give me the command the tests use (if they are using
something else)?


>
> Cheers!
>
> commit f296669c0f23af63716050593865475471054941
> Author: Adela Vais <adela.vais99@gmail.com>
> Date:   Sat Sep 26 07:12:42 2020 +0200
>
>     d: change the return value of yylex from int to TokenKind
>
>     * data/skeletons/lalr1.d: Change the return value.
>     * examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
>     * tests/scanner.at: Adjust.
>     * tests/calc.at (_AT_DATA_CALC_Y(d)): New, extracted from...
>     (_AT_DATA_CALC_Y(c)): here.
>     The two grammars have been sufficiently different to be separated.
>     Still trying to be them together results in a maintenance burden.  For
>     the same reason, instead of specifying the results for D and for the
>     rest, compute the expected results with D from the regular case.
>
> diff --git a/TODO b/TODO
> index c692b9f9..8cb1a309 100644
> --- a/TODO
> +++ b/TODO
> @@ -249,21 +249,14 @@ are.  Keep the same variable names.  If you change
> the wording in one place,
>  do it in the others too.  In other words: make sure to keep the
>  maintenance *simple* by avoiding any gratuitous difference.
>
> -** Change the return value of yylex
> -Historically people were allowed to return any int from the scanner (which
> -is convenient and allows `return '+'` from the scanner).  Akim tends to
> see
> -this as an error, we should restrict the return values to TokenKind (not
> to
> -be confused with SymbolKind).
> -
> -In the case of D, without the history, we have the choice to support or
> not
> -`int`.  If we want to _keep_ `int`, is there a way, say via introspection,
> -to support both signatures of yylex?  If we don't keep `int`, just move to
> -TokenKind.
> -
>  ** Documentation
>  Write documentation about D support in doc/bison.texi.  Imitate the Java
>  documentation.  You should be more succinct IMHO.
>
> +** yyerrok
> +It appears that neither Java nor D support yyerrok currently.  It does not
> +need to be named this way...
> +
>  ** Complete Symbols
>  The current interface from the scanner to the parser is somewhat clumsy:
> the
>  token kind is returned by yylex, but the value and location are stored in
> diff --git a/data/skeletons/lalr1.d b/data/skeletons/lalr1.d
> index e879dabf..6d1bdfdd 100644
> --- a/data/skeletons/lalr1.d
> +++ b/data/skeletons/lalr1.d
> @@ -68,7 +68,7 @@ public interface Lexer
>     * to the next token and prepares to return the semantic value
>     * ]b4_locations_if([and beginning/ending positions ])[of the token.
>     * @@return the token identifier corresponding to the next token. */
> -  int yylex ();
> +  TokenKind yylex ();
>
>    /**
>     * Entry point for error reporting.  Emits an error
> @@ -272,7 +272,7 @@ b4_user_union_members
>        yyDebugStream.writeln (s);
>    }
>  ]])[
> -  private final int yylex () {
> +  private final TokenKind yylex () {
>      return yylexer.yylex ();
>    }
>
> diff --git a/examples/d/calc/calc.y b/examples/d/calc/calc.y
> index 2ad1227a..9fea82cd 100644
> --- a/examples/d/calc/calc.y
> +++ b/examples/d/calc/calc.y
> @@ -114,7 +114,7 @@ class CalcLexer(R) : Lexer
>      return semanticVal_;
>    }
>
> -  int yylex ()
> +  TokenKind yylex ()
>    {
>      import std.uni : isWhite, isNumber;
>
> diff --git a/examples/d/simple/calc.y b/examples/d/simple/calc.y
> index 917eb131..0f441431 100644
> --- a/examples/d/simple/calc.y
> +++ b/examples/d/simple/calc.y
> @@ -109,7 +109,7 @@ class CalcLexer(R) : Lexer
>      return semanticVal_;
>    }
>
> -  int yylex ()
> +  TokenKind yylex ()
>    {
>      import std.uni : isWhite, isNumber;
>
> diff --git a/tests/calc.at b/tests/calc.at
> index f1033517..e017e5e4 100644
> --- a/tests/calc.at
> +++ b/tests/calc.at
> @@ -299,7 +299,7 @@ class CalcLexer(R) : Lexer
>      return res;
>    }
>
> -  int yylex ()
> +  TokenKind yylex ()
>    {]AT_LOCATION_IF([[
>      location.begin = location.end;]])[
>
> @@ -342,7 +342,20 @@ class CalcLexer(R) : Lexer
>          return TokenKind.YYerror;
>        }
>
> -    return c;
> +    switch (c)
> +    {
> +      case '+':  return TokenKind.PLUS;
> +      case '-':  return TokenKind.MINUS;
> +      case '*':  return TokenKind.STAR;
> +      case '/':  return TokenKind.SLASH;
> +      case '(':  return TokenKind.LPAR;
> +      case ')':  return TokenKind.RPAR;
> +      case '\n': return TokenKind.EOL;
> +      case '=':  return TokenKind.EQUAL;
> +      case '^':  return TokenKind.POW;
> +      case '!':  return TokenKind.NOT;
> +      default:   return TokenKind.YYUNDEF;
> +    }
>    }
>  }
>  ]])
> @@ -444,13 +457,6 @@ m4_define([_AT_DATA_CALC_Y(c)],
>  [AT_DATA_GRAMMAR([calc.y],
>  [[/* Infix notation calculator--calc */
>  ]$4[
> -]AT_LANG_MATCH(
> -[d], [[
> -%code imports {
> -  alias semantic_value = int;
> -}
> -]],
> -[c\|c++], [[
>  %code requires
>  {
>  ]AT_LOCATION_TYPE_SPAN_IF([[
> @@ -489,7 +495,6 @@ void location_print (FILE *o, Span s);
>    /* Exercise pre-prologue dependency to %union.  */
>    typedef int semantic_value;
>  }
> -]])[
>
>  /* Exercise %union. */
>  %union
> @@ -592,9 +597,7 @@ exp:
>          char buf[1024];
>          snprintf (buf, sizeof buf, "error: %d != %d", $1, $3);
>          ]AT_GLR_IF([[yyparser.]])[error (]AT_LOCATION_IF([[@$, ]])[buf);
> -      }]],
> -      [d], [[
> -      yyerror (]AT_LOCATION_IF([[@$, ]])[format ("error: %d != %d", $1,
> $3));]])[
> +      }]])[
>      $$ = $1;
>    }
>  | exp '+' exp        { $$ = $1 + $3; }
> @@ -617,18 +620,16 @@ exp:
>        [c++], [[
>        {
>          ]AT_GLR_IF([[yyparser.]])[error (]AT_LOCATION_IF([[@3,
> ]])["error: null divisor");
> -      }]],
> -      [d], [[
> -      yyerror (]AT_LOCATION_IF([[@3, ]])["error: null divisor");]])[
> +      }]])[
>      else
>        $$ = $1 / $3;
>    }
>  | '-' exp  %prec NEG { $$ = -$2; }
>  | exp '^' exp        { $$ = power ($1, $3); }
>  | '(' exp ')'        { $$ = $2; }
> -| '(' error ')'      { $$ = 1111; ]AT_D_IF([], [yyerrok;])[ }
> -| '!'                { $$ = 0; ]AT_D_IF([return YYERROR], [YYERROR])[; }
> -| '-' error          { $$ = 0; ]AT_D_IF([return YYERROR], [YYERROR])[; }
> +| '(' error ')'      { $$ = 1111; yyerrok; }
> +| '!'                { $$ = 0; YYERROR; }
> +| '-' error          { $$ = 0; YYERROR; }
>  ;
>  %%
>
> @@ -682,11 +683,100 @@ AT_DATA_SOURCE([[calc-main.]AT_LANG_EXT],
>
>  ]AT_CALC_MAIN])
>  ])
> -])# _AT_DATA_CALC_Y
> +])# _AT_DATA_CALC_Y(c)
>
>
>  m4_copy([_AT_DATA_CALC_Y(c)], [_AT_DATA_CALC_Y(c++)])
> -m4_copy([_AT_DATA_CALC_Y(c)], [_AT_DATA_CALC_Y(d)])
> +
> +m4_define([_AT_DATA_CALC_Y(d)],
> +[AT_DATA_GRAMMAR([calc.y],
> +[[/* Infix notation calculator--calc */
> +]$4[
> +%code imports {
> +  alias semantic_value = int;
> +}
> +/* Exercise %union. */
> +%union
> +{
> +  semantic_value ival;
> +};
> +%printer { fprintf (yyo, "%d", $$); } <ival>;
> +
> +/* Bison Declarations */
> +%token CALC_EOF 0 ]AT_TOKEN_TRANSLATE_IF([_("end of input")], ["end of
> input"])[
> +%token <ival> NUM   "number"
> +%type  <ival> exp
> +
> +%token PLUS   "+"
> +       MINUS  "-"
> +       STAR   "*"
> +       SLASH  "/"
> +       LPAR   "("
> +       RPAR   ")"
> +       EQUAL  "="
> +       POW    "^"
> +       NOT    "!"
> +       EOL    "\n"
> +
> +%nonassoc "="   /* comparison          */
> +%left "-" "+"
> +%left "*" "/"
> +%precedence NEG /* negation--unary minus */
> +%right "^"      /* exponentiation        */
> +
> +/* Grammar follows */
> +%%
> +input:
> +  line
> +| input line         { ]AT_PARAM_IF([++*count; ++global_count;])[ }
> +;
> +
> +line:
> +  EOL
> +| exp EOL            { ]AT_PARAM_IF([*result = global_result = $1;],
> [AT_D_IF([], [USE ($1);])])[ }
> +;
> +
> +exp:
> +  NUM
> +| exp "=" exp
> +  {
> +    if ($1 != $3)
> +      yyerror (]AT_LOCATION_IF([[@$, ]])[format ("error: %d != %d", $1,
> $3));
> +    $$ = $1;
> +  }
> +| exp "+" exp        { $$ = $1 + $3; }
> +| exp "-" exp        { $$ = $1 - $3; }
> +| exp "*" exp        { $$ = $1 * $3; }
> +| exp "/" exp
> +  {
> +    if ($3 == 0)
> +      yyerror (]AT_LOCATION_IF([[@3, ]])["error: null divisor");
> +    else
> +      $$ = $1 / $3;
> +  }
> +| "-" exp  %prec NEG { $$ = -$2; }
> +| exp "^" exp        { $$ = power ($1, $3); }
> +| "(" exp ")"        { $$ = $2; }
> +| "(" error ")"      { $$ = 1111; ]AT_D_IF([], [yyerrok;])[ }
> +| "!"                { $$ = 0; return YYERROR; }
> +| "-" error          { $$ = 0; return YYERROR; }
> +;
> +%%
> +
> +int
> +power (int base, int exponent)
> +{
> +  int res = 1;
> +  assert (0 <= exponent);
> +  for (/* Niente */; exponent; --exponent)
> +    res *= base;
> +  return res;
> +}
> +
> +]AT_YYERROR_DEFINE[
> +]AT_CALC_YYLEX
> +AT_CALC_MAIN])
> +])# _AT_DATA_CALC_Y(d)
>
>  m4_define([_AT_DATA_CALC_Y(java)],
>  [AT_DATA_GRAMMAR([Calc.y],
> @@ -883,7 +973,10 @@ AT_PERL_REQUIRE([[-pi -e 'use strict;
>    s{syntax error on token \[(.*?)\] \(expected: (.*)\)}
>    {
>      my $unexp = $][1;
> -    my @exps = $][2 =~ /\[(.*?)\]/g;
> +    my @exps = $][2 =~ /\[(.*?)\]/g;]AT_D_IF([[
> +    # In the case of D, there are no single quotes around the symbols.
> +    $unexp =~ s/'"'(.)'"'/$][1/g;
> +    s/'"'(.)'"'/$][1/g for @exps;]])[
>      ($][#exps && $][#exps < 4)
>      ? "syntax error, unexpected $unexp, expecting @{[join(\" or \",
> @exps)]}"
>      : "syntax error, unexpected $unexp";
> diff --git a/tests/scanner.at b/tests/scanner.at
> index 5ad18729..2ec2cd78 100644
> --- a/tests/scanner.at
> +++ b/tests/scanner.at
> @@ -121,7 +121,7 @@ class YYLexer(R) : Lexer
>      return semanticVal_;
>    }
>
> -  int yylex ()
> +  TokenKind yylex ()
>    {
>      import std.uni : isNumber;
>      // Handle EOF.
>
>
>


reply via email to

[Prev in Thread] Current Thread [Next in Thread]