bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 7/7] yacc.c: push: don't clear the parser state when accepting/re


From: Akim Demaille
Subject: [PATCH 7/7] yacc.c: push: don't clear the parser state when accepting/rejecting
Date: Sun, 28 Jun 2020 16:52:20 +0200

Currently when a push parser finishes its parsing (i.e., it did not
return YYPUSH_MORE), it also clears its state.  It is therefore
impossible to see if it had parse errors.

In the context of autocompletion, because error recovery might have
fired, the parser is actually already in a different state.  For
instance on `(1 + + <TAB>` in the bistromathic, because there's a
`exp: "(" error ")"` recovery rule, `1 + +` tokens have already been
popped, replaced by `error`, and autocompletions think we are ready
for the closing ")".  So here, we would like to see if there was a
syntax error, yet yynerrs was cleared.

In the case of a successful parse, we still have a problem: if error
recovery succeeded, we won't know it, since, again, yynerrs is
clearer.

It seems much more natural to leave the parser state available for
analysis when there is a failure.  And to provide an explicit means to
reinitialize a parser state for future parses.

Note that it also makes it easier to reuse the previously enlarged
stacks from one parse to another.

* data/skeletons/yacc.c (yypstate_clear): Make it public, non static.
(yypstate_new): Set up the stacks in their initial
configurations (setting their bottom to the stack array), and
use yypstate_clear to reset them (moving their top to their bottom).
(yypstate_delete): Adjust.
(yypush_parse): Stop clearing the parser instance when parsing is
finished.
* doc/bison.texi (yypstate_clear): Document.
Make it clear it must be called before parsing again.
* tests/torture.at: Adjust.

* examples/c/bistromathic/parse.y (expected_tokens): Do not propose
autocompletion when there are parse errors.
* examples/c/bistromathic/bistromathic.test: Check that case.
---
 NEWS                                      | 17 ++++++++
 TODO                                      | 14 -------
 data/skeletons/yacc.c                     | 48 ++++++++++++-----------
 doc/bison.texi                            | 19 ++++++++-
 examples/c/bistromathic/bistromathic.test | 20 +++++++---
 examples/c/bistromathic/parse.y           | 13 ++++--
 tests/torture.at                          |  3 +-
 7 files changed, 85 insertions(+), 49 deletions(-)

diff --git a/NEWS b/NEWS
index 60587f7f..bf1a2e82 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,23 @@ GNU Bison NEWS
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Backward incompatible changes
+
+*** Push parsers no longer clear their state when parsing is finished
+
+  TL;DR: call yypstate_clear before calling yypush_parse in loops.
+
+  Previously push-parsers cleared their state when parsing was finished (on
+  success and on failure).  This made it impossible to check if there were
+  parse errors, since yynerrs was also reset.  This can be especially
+  troublesome when used in autocompletion, since a parser with error
+  recovery would suggest (irrelevant) expected tokens even if there were
+  failure.
+
+  Now the parser state can be examined when parsing is finished.  However,
+  before reusing a parser instance, its state must be reset with
+  `yypstate_clear (ps)`.
+
 ** New features
 
 *** File prefix mapping
diff --git a/TODO b/TODO
index f8799078..0a22e34e 100644
--- a/TODO
+++ b/TODO
@@ -37,8 +37,6 @@ Unless we play it dumb (little structure).
 examples about conflicts in the reports.
 
 ** Bistromathic
-- Hitting tab on a line with a syntax error is ugly
-
 - How about not evaluating incomplete lines when the text is not finished
   (as shells do).
 
@@ -434,11 +432,6 @@ undocumented ''features''.  Maybe an empty action ought to 
be
 presented too.  Shall we try to make a single grammar with all these
 features, or should we have several very small grammars?
 
-** --report=conflict-path
-Provide better assistance for understanding the conflicts by providing
-a sample text exhibiting the (LALR) ambiguity.  See the paper from
-DeRemer and Penello: they already provide the algorithm.
-
 ** Statically check for potential ambiguities in GLR grammars
 See <http://www.lsv.fr/~schmitz/pub/expamb.pdf> for an approach.
 An Experimental Ambiguity Detection Tool ∗ Sylvain Schmitz
@@ -534,13 +527,6 @@ XML output for GNU Bison and gcc
 XML output for GNU Bison
    http://yaxx.sourceforge.net/
 
-** Counterexample generation
-https://lists.gnu.org/archive/html/bug-bison/2016-06/msg00000.html
-http://www.cs.cornell.edu/andru/papers/cupex/
-
-Andrew Myers and Vincent Imbimbo are working on this item, see
-https://github.com/akimd/bison/issues/12
-
 * Coding system independence
 Paul notes:
 
diff --git a/data/skeletons/yacc.c b/data/skeletons/yacc.c
index c78a5a78..136e9ed4 100644
--- a/data/skeletons/yacc.c
+++ b/data/skeletons/yacc.c
@@ -269,6 +269,7 @@ int ]b4_prefix[push_parse (]b4_prefix[pstate 
*ps]b4_pure_if([[,
                   int pushed_char, ]b4_api_PREFIX[STYPE const 
*pushed_val]b4_locations_if([[, ]b4_api_PREFIX[LTYPE 
*pushed_loc]])])b4_user_formals[);
 ]b4_pull_if([[int ]b4_prefix[pull_parse (]b4_prefix[pstate 
*ps]b4_user_formals[);]])[
 ]b4_prefix[pstate *]b4_prefix[pstate_new (void);
+void ]b4_prefix[pstate_clear (]b4_prefix[pstate *ps);
 void ]b4_prefix[pstate_delete (]b4_prefix[pstate *ps);
 ]])
 
@@ -1457,25 +1458,21 @@ yypull_parse (yypstate *yyps]b4_user_formals[)
 ]b4_parse_state_variable_macros([b4_pstate_macro_define])[
 
 /* Initialize the parser data structure.  */
-static void
+void
 yypstate_clear (yypstate *yyps)
 {
   yynerrs = 0;
   yystate = 0;
   yyerrstatus = 0;
 
-  yystacksize = YYINITDEPTH;
-  yyssp = yyss = yyssa;
-  yyvsp = yyvs = yyvsa;]b4_locations_if([[
-  yylsp = yyls = yylsa;]])[]b4_lac_if([[
+  yyssp = yyss;
+  yyvsp = yyvs;]b4_locations_if([[
+  yylsp = yyls;]])[
 
-  yyes = yyesa;
-  yyes_capacity = ]b4_percent_define_get([[parse.lac.es-capacity-initial]])[;
-  if (YYMAXDEPTH < yyes_capacity)
-    yyes_capacity = YYMAXDEPTH;]])[
   /* Initialize the state stack, in case yypcontext_expected_tokens is
      called before the first call to yyparse. */
   *yyssp = 0;
+  yyps->yynew = 1;
 }
 
 /* Initialize the parser data structure.  */
@@ -1487,9 +1484,16 @@ yypstate_new (void)
     return YY_NULLPTR;]])[
   yyps = YY_CAST (yypstate *, malloc (sizeof *yyps));
   if (!yyps)
-    return YY_NULLPTR;
-  yyps->yynew = 1;]b4_pure_if([], [[
+    return YY_NULLPTR;]b4_pure_if([], [[
   yypstate_allocated = 1;]])[
+  yystacksize = YYINITDEPTH;
+  yyss = yyssa;
+  yyvs = yyvsa;]b4_locations_if([[
+  yyls = yylsa;]])[]b4_lac_if([[
+  yyes = yyesa;
+  yyes_capacity = ]b4_percent_define_get([[parse.lac.es-capacity-initial]])[;
+  if (YYMAXDEPTH < yyes_capacity)
+    yyes_capacity = YYMAXDEPTH;]])[
   yypstate_clear (yyps);
   return yyps;
 }
@@ -1502,10 +1506,10 @@ yypstate_delete (yypstate *yyps)
 #ifndef yyoverflow
       /* If the stack was reallocated but the parse did not complete, then the
          stack still needs to be freed.  */
-      if (!yyps->yynew && yyss != yyssa)
+      if (yyss != yyssa)
         YYSTACK_FREE (yyss);
 #endif]b4_lac_if([[
-      if (!yyps->yynew && yyes != yyesa)
+      if (yyes != yyesa)
         YYSTACK_FREE (yyes);]])[
       free (yyps);]b4_pure_if([], [[
       yypstate_allocated = 0;]])[
@@ -2061,22 +2065,20 @@ yyreturn:
       yydestruct ("Cleanup: popping",
                   YY_ACCESSING_SYMBOL (+*yyssp), yyvsp]b4_locations_if([, 
yylsp])[]b4_user_args[);
       YYPOPSTACK (1);
-    }
-#ifndef yyoverflow
-  if (yyss != yyssa)
-    YYSTACK_FREE (yyss);
-#endif]b4_lac_if([[
-  if (yyes != yyesa)
-    YYSTACK_FREE (yyes);]])b4_push_if([[
-  yypstate_clear (yyps);
-  yyps->yynew = 1;
+    }]b4_push_if([[
   goto yypushreturn;
 
 
 /*-------------------------.
 | yypushreturn -- return.  |
 `-------------------------*/
-yypushreturn:]])[
+yypushreturn:]], [[
+#ifndef yyoverflow
+  if (yyss != yyssa)
+    YYSTACK_FREE (yyss);
+#endif]b4_lac_if([[
+  if (yyes != yyesa)
+    YYSTACK_FREE (yyes);]])])[
 ]b4_parse_error_bmatch([detailed\|verbose],
 [[  if (yymsg != yymsgbuf)
     YYSTACK_FREE (yymsg);]])[
diff --git a/doc/bison.texi b/doc/bison.texi
index 678b6efe..19c61110 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -7135,6 +7135,18 @@ Reclaim the memory associated with a parser instance.  
After this call, you
 should no longer attempt to use the parser instance.
 @end deftypefun
 
+@findex yypstate_clear
+@deftypefun void yypstate_clear (@code{yypstate *}@var{yyps})
+@anchor{yypstate_clear}
+Reset a parser instance to parse a new input sequence.  Calling
+@code{yypstate_clear} during an unfinished parse (i.e., the last invocation
+returned @code{YYPUSH_MORE}) might result in memory leaks: the parser stacks
+are not emptied.
+
+It is safe to call @code{yypstate_clear} ``uselessly'' (e.g., before the
+very first run).
+@end deftypefun
+
 @findex yypush_parse
 You call the function @code{yypush_parse} to parse a single token.  This
 function is available if either the @samp{%define api.push-pull push} or
@@ -7146,8 +7158,13 @@ The value returned by @code{yypush_parse} is the same as 
for @code{yyparse}
 with the following exception: it returns @code{YYPUSH_MORE} if more input is
 required to finish parsing the grammar.
 
+After @code{yypush_parse} returned, the instance may be consulted.  For
+instance check @code{yynerrs} to see whether there were (possibly recovered)
+syntax errors.
+
 After @code{yypush_parse} returns a status other than @code{YYPUSH_MORE},
-the parser instance @code{yyps} may be reused for a new parse.
+the parser instance @code{yyps} may be reused for a new parse after calling
+@code{yypstate_clear}.
 @end deftypefun
 
 The fact that the parser state is reusable even after an error simplifies
diff --git a/examples/c/bistromathic/bistromathic.test 
b/examples/c/bistromathic/bistromathic.test
index 7301fb8b..f10cb69c 100755
--- a/examples/c/bistromathic/bistromathic.test
+++ b/examples/c/bistromathic/bistromathic.test
@@ -307,16 +307,24 @@ end of file  exit         exp          ''
 > ''
 err: '
 
-# Check that completion when there is an error prints valid locations.
+# Check that completion prints valid locations even when there is an error.
+sed -e 's/\\t/ /g' >input <<EOF
+1++\t
+EOF
+run -n 0 '> 1++ ''
+> ''
+err: 1.1: syntax error: expected - or ( or number or function or variable 
before +
+err: 1.3: syntax error: expected - or ( or number or function or variable 
before +
+'
+
+# And even when the error was recovered from.
 sed -e 's/\\t/ /g' >input <<EOF
 (1++2) + 3 +\t\t
 EOF
-run -n 0 '> (1++2) + 3 +
-(       -       atan    cos     exp     ln      number  sin     sqrt
-> (1++2) + 3 +
+run -n 0 '> (1++2) + 3 +  ''
 > 
 err: 1.1: syntax error: expected - or ( or number or function or variable 
before +
-err: 1.1: syntax error: expected - or ( or number or function or variable 
before +
+err: 1.1: syntax error: expected - or ( or number or function or variable 
before +
 err: 1.4: syntax error: expected - or ( or number or function or variable 
before +
-err: 1.13: syntax error: expected - or ( or number or function or variable 
before end of file
+err: 1.15: syntax error: expected - or ( or number or function or variable 
before end of file
 '
diff --git a/examples/c/bistromathic/parse.y b/examples/c/bistromathic/parse.y
index 97b60bc9..6d86c77f 100644
--- a/examples/c/bistromathic/parse.y
+++ b/examples/c/bistromathic/parse.y
@@ -456,10 +456,15 @@ expected_tokens (const char *input,
     status = yypush_parse (ps, token, &lval, &lloc);
   } while (status == YYPUSH_MORE);
 
-  // Then query for the accepted tokens at this point.
-  int res = yypstate_expected_tokens (ps, tokens, ntokens);
-  if (res < 0)
-    abort ();
+  int res = 0;
+  // If there were parse errors, don't propose completions.
+  if (!ps->yynerrs)
+    {
+      // Then query for the accepted tokens at this point.
+      res = yypstate_expected_tokens (ps, tokens, ntokens);
+      if (res < 0)
+        abort ();
+    }
   yypstate_delete (ps);
   return res;
 }
diff --git a/tests/torture.at b/tests/torture.at
index 3dbbed5e..0cd594eb 100644
--- a/tests/torture.at
+++ b/tests/torture.at
@@ -459,7 +459,8 @@ main (int argc, const char **argv)
   for (count = 0; count < 2; ++count)
     {
       int new_status;
-      yylval = yylval_init;
+      yylval = yylval_init;]m4_bmatch([$2], [api.push-pull both], [[
+      yypstate_clear (ps);]])[
       new_status = ]m4_bmatch([$2], [api.push-pull both],
                               [[yypull_parse (ps)]],
                               [[yyparse ()]])[;
-- 
2.27.0




reply via email to

[Prev in Thread] Current Thread [Next in Thread]