[PATCH 4/4] dogfooding: use YYERRCODE in our scanner

bison-patches

From:	Akim Demaille
Subject:	[PATCH 4/4] dogfooding: use YYERRCODE in our scanner
Date:	Tue, 28 Apr 2020 07:20:06 +0200

* src/scan-gram.l: Use it.
* tests/input.at: Adjust.
---
 NEWS            | 13 +++++++++++++
 src/scan-gram.l | 42 ++++++++++++++++++++++++++++++------------
 tests/input.at  |  1 -
 3 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index d676d6d0..528d0f22 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,19 @@ GNU Bison NEWS
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** New features
+
+*** Returning the error token
+
+  When the scanner returns an invalid token or the undefined token
+  (YYUNDEF), the parser generates an error message and enters error
+  recovery.  Because of that error message, most scanners that find lexical
+  errors generate an error message, and then ignore the invalid input
+  without entering the error-recovery.
+
+  The scanners may now return YYERRCODE, the error token, to enter the
+  error-recovery mode without triggering an additional error message.  See
+  the bistromathic for an example.
 
 * Noteworthy changes in release 3.5.90 (2020-04-18) [beta]
 
diff --git a/src/scan-gram.l b/src/scan-gram.l
index 0a6b3c04..601cc7a7 100644
--- a/src/scan-gram.l
+++ b/src/scan-gram.l
@@ -307,6 +307,7 @@ eqopt    ({sp}=)?
 
   "%"{id} {
     complain (loc, complaint, _("invalid directive: %s"), quote (yytext));
+    return GRAM_ERRCODE;
   }
 
   ":"                     return COLON;
@@ -328,6 +329,7 @@ eqopt    ({sp}=)?
      accept "1FOO" as "1 FOO".  */
   {int}{id} {
     complain (loc, complaint, _("invalid identifier: %s"), quote (yytext));
+    return GRAM_ERRCODE;
   }
 
   /* Characters.  */
@@ -382,6 +384,7 @@ eqopt    ({sp}=)?
     complain (loc, complaint, "%s: %s",
               ngettext ("invalid character", "invalid characters", yyleng),
               quote_mem (yytext, yyleng));
+    return GRAM_ERRCODE;
   }
 
   <<EOF>> {
@@ -398,7 +401,11 @@ eqopt    ({sp}=)?
 
 <SC_ESCAPED_CHARACTER,SC_ESCAPED_STRING,SC_ESCAPED_TSTRING,SC_TAG>
 {
-  \0        complain (loc, complaint, _("invalid null character"));
+  \0         {
+    complain (loc, complaint, _("invalid null character"));
+    STRING_FREE;
+    return GRAM_ERRCODE;
+  }
 }
 
 
@@ -454,6 +461,7 @@ eqopt    ({sp}=)?
         complain (loc, complaint,
                   _("unexpected identifier in bracketed name: %s"),
                   quote (yytext));
+        return GRAM_ERRCODE;
       }
     else
       {
@@ -474,7 +482,10 @@ eqopt    ({sp}=)?
           }
       }
     else
-      complain (loc, complaint, _("an identifier expected"));
+      {
+        complain (loc, complaint, _("an identifier expected"));
+        return GRAM_ERRCODE;
+      }
   }
 
   [^\].A-Za-z0-9_/ \f\r\n\t\v]+|. {
@@ -482,6 +493,7 @@ eqopt    ({sp}=)?
               ngettext ("invalid character in bracketed name",
                         "invalid characters in bracketed name", yyleng),
               quote_mem (yytext, yyleng));
+    return GRAM_ERRCODE;
   }
 
   <<EOF>> {
@@ -580,21 +592,27 @@ eqopt    ({sp}=)?
 {
   "'" {
     STRING_FINISH;
+    BEGIN INITIAL;
     loc->start = token_start;
     val->CHAR = last_string[0];
 
     if (last_string[0] == '\0')
-    {
-      complain (loc, complaint, _("empty character literal"));
-      /* '\0' seems dangerous even if we are about to complain.  */
-      val->CHAR = '\'';
-    }
+      {
+        complain (loc, complaint, _("empty character literal"));
+        STRING_FREE;
+        return GRAM_ERRCODE;
+      }
     else if (last_string[1] != '\0')
-      complain (loc, complaint,
-                _("extra characters in character literal"));
-    STRING_FREE;
-    BEGIN INITIAL;
-    return CHAR;
+      {
+        complain (loc, complaint, _("extra characters in character literal"));
+        STRING_FREE;
+        return GRAM_ERRCODE;
+      }
+    else
+      {
+        STRING_FREE;
+        return CHAR;
+      }
   }
   {eol}     unexpected_newline (token_start, "'");
   <<EOF>>   unexpected_eof (token_start, "'");
diff --git a/tests/input.at b/tests/input.at
index 97bd4e08..c459ecd5 100644
--- a/tests/input.at
+++ b/tests/input.at
@@ -102,7 +102,6 @@ input.y:6.1-17: error: invalid directive: '%a-does-not-exist'
 input.y:7.1: error: invalid character: '%'
 input.y:7.2: error: invalid character: '-'
 input.y:8.1-9.0: error: missing '%}' at end of file
-input.y:8.1-9.0: error: unexpected %{...%}
 ]])
 
 AT_CLEANUP
-- 
2.26.2
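
The NEWS entry in this patch describes a scanner that reports a lexical error itself and then returns the error token, so that the parser enters error recovery without printing a second message. The following is a minimal, standalone sketch of that scanner-side pattern only. It does not use a Bison-generated parser: the `enum token` codes, the `scan` function and the `diagnostics` counter are all hypothetical names invented for illustration, and `TOK_ERROR` merely stands in for YYERRCODE (which the patch spells GRAM_ERRCODE in src/scan-gram.l).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical token codes for this sketch.  TOK_ERROR plays the role
   of the error token (YYERRCODE in the NEWS entry).  */
enum token { TOK_EOF = 0, TOK_ERROR, TOK_NUM, TOK_PLUS };

/* Number of diagnostics emitted by the scanner itself.  */
static int diagnostics = 0;

/* Scan one token from *INPUT and advance *INPUT past it.  On invalid
   input, report the error once and return TOK_ERROR instead of
   silently skipping the offending character.  */
static enum token
scan (const char **input)
{
  assert (input && *input);
  const char *cp = *input;
  enum token res;

  /* Skip blanks.  */
  while (*cp == ' ')
    ++cp;

  if (*cp == '\0')
    res = TOK_EOF;
  else if (isdigit ((unsigned char) *cp))
    {
      while (isdigit ((unsigned char) *cp))
        ++cp;
      res = TOK_NUM;
    }
  else if (*cp == '+')
    {
      ++cp;
      res = TOK_PLUS;
    }
  else
    {
      /* The scanner owns the error message...  */
      fprintf (stderr, "error: invalid character: '%c'\n", *cp);
      ++diagnostics;
      ++cp;
      /* ...and hands the error token to its caller.  */
      res = TOK_ERROR;
    }

  *input = cp;
  return res;
}
```

Under these assumptions, scanning the string "1 + $ 2" yields TOK_NUM, TOK_PLUS, TOK_ERROR, TOK_NUM, TOK_EOF and emits exactly one diagnostic, for the '$'. In a real grammar the parser receiving the error token would then start error recovery, as the NEWS entry explains.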

Current Thread

[PATCH 0/4] dogfooding: use YYERRCODE in bison, Akim Demaille, 2020/04/28
- [PATCH 4/4] dogfooding: use YYERRCODE in our scanner, Akim Demaille <=
- [PATCH 3/4] scanner: avoid spurious errors about empty character literals, Akim Demaille, 2020/04/28
- [PATCH 2/4] scanner: bad character literals are errors, Akim Demaille, 2020/04/28
- [PATCH 1/4] regen, Akim Demaille, 2020/04/28
