bison-patches
Re: Fix %error-verbose for conflicts resolved by %nonassoc.


From: Joel E. Denny
Subject: Re: Fix %error-verbose for conflicts resolved by %nonassoc.
Date: Wed, 26 Aug 2009 18:21:36 -0400 (EDT)
User-agent: Alpine 1.00 (DEB 882 2007-12-20)

On Wed, 26 Aug 2009, Akim Demaille wrote:

> > AT_PARSER_CHECK([./input '0<0'])
> > -# FIXME: This is an actual bug, but a new one, in the sense that
> > -# no one has ever spotted it!  The messages are *wrong*: there should
> > -# be nothing there, it should be expected eof.
> > AT_PARSER_CHECK([./input '0<0<0'], [1], [],
> > -         [syntax error, unexpected '<', expecting '<' or '>'
> > +         [syntax error, unexpected '<'
> > ])
> > 
> > AT_PARSER_CHECK([./input '0>0'])
> > AT_PARSER_CHECK([./input '0>0>0'], [1], [],
> > -         [syntax error, unexpected '>', expecting '<' or '>'
> > +         [syntax error, unexpected '>'
> > ])
> > 
> > AT_PARSER_CHECK([./input '0<0>0'], [1], [],
> > -         [syntax error, unexpected '>', expecting '<' or '>'
> > +         [syntax error, unexpected '>'
> > ])
> 
> I don't understand why it does not refer to eof, as was noted in the comment.

Thanks for mentioning that.  I meant to address it, but I fell down a 
different rabbit hole and forgot.

%error-verbose cannot be completely trusted without canonical LR.  There 
are two problems: default reductions, which lose lookaheads, and state 
merging, which loses and adds lookaheads.

Normally the parser cannot detect a syntax error while in a state with a 
default reduction, because the reduction is performed for every lookahead 
that has no other action, so all lookaheads look valid even if they 
aren't.  After performing the default reduction, the parser enters a new 
state where some lookaheads that were valid in the previous state might no 
longer be valid.  Thus, when the syntax error is finally detected, those 
lookaheads are missing from the message.

Your grammar has a special exception: it has an explicit syntax error in a 
state with a default reduction.  Thus, that particular syntax error is 
possible to detect without performing the default reduction.  
Nevertheless, the list of valid lookaheads for the default reduction is 
not recorded, so they aren't reported in the syntax error message.  $end 
is one such valid lookahead.
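
For readers without tests/conflicts.at at hand, here is a minimal sketch 
of the kind of grammar this test exercises.  The thread does not quote 
input.y, so the rules, the one-character scanner, and the helper names 
below are assumptions for illustration, not the actual test file.  The 
%nonassoc declaration is what turns the conflict after "expr '<' expr" 
into an explicit syntax error on a second '<' or '>', in a state whose 
only other action is the default reduction.

```bison
/* Sketch only: assumed reconstruction, not the real tests/conflicts.at input. */
%error-verbose

%{
#include <stdio.h>

static const char *input;              /* remaining characters to scan */
static int yylex (void);
static void yyerror (const char *msg);
%}

/* '<' and '>' do not associate, so "0<0<0" is an explicit syntax error. */
%nonassoc '<' '>'

%%

expr: expr '<' expr
    | expr '>' expr
    | '0'
    ;

%%

/* Each input character is its own token; the terminating NUL is $end. */
static int
yylex (void)
{
  return *input ? *input++ : 0;
}

static void
yyerror (const char *msg)
{
  fprintf (stderr, "%s\n", msg);
}

int
main (int argc, char *argv[])
{
  input = argc > 1 ? argv[1] : "";
  return yyparse ();                   /* 0 on success, 1 on syntax error */
}
```

Built with something like "bison -o input.c input.y" and a C compiler, 
this can be run as "./input '0<0<0'" to compare the message against the 
expectations in the quoted test.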

We can stop the loss of lookaheads due to default reductions by disabling 
default reductions for inconsistent states:

  %define lr.default-reductions "consistent"

A consistent state has one action.  If that action is shift, the one valid 
lookahead is known.  If that action is reduce, it can become a default 
reduction.  That drops the lookahead set, but that's fine.  Because the 
reduction is the only action in this state, it will always be performed, 
and the valid lookaheads for this left context are always the same as (or, 
due to state merging, a subset of) the lookaheads for the state that 
follows the subsequent goto.  They have to be the same because what can 
come next cannot be restricted by choosing the only possible action.  No 
valid lookaheads are lost.

I pushed the patch below to extend your test case in order to show the 
above solution.

Though it does not affect your test case, we still have not solved the 
state merging problem.  That is, when lookahead sets are merged for 
different left contexts, some invalid lookaheads might now look valid and 
thus might be reported as expected tokens.  Moreover, as with default 
reductions, invalid reductions might be performed, leading to a restricted 
set of lookaheads.  Writing an LALR grammar does not prevent state merging 
and thus does not solve this problem in general, and neither does using 
IELR.  Canonical LR is required:

  %define lr.type "canonical LR"

A side effect of that declaration is that default reductions are disabled 
completely by default, so you no longer need the previous declaration.  
However, if you always want to construct as much of the semantic left 
context as possible before reading the next token, you should enable 
default reductions in consistent states by combining the above two 
declarations.
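
Put together in the declarations section of a grammar file, that 
combination would look roughly like the fragment below.  The spellings 
are the ones used in this thread for the development version under 
discussion; treat them as an assumption for any other Bison release and 
check its documentation first.

```bison
/* Canonical LR: no state merging, so no spurious expected tokens. */
%define lr.type "canonical LR"

/* Re-enable default reductions, but only in consistent states, so the
   parser still reduces as far as it can before reading the next token. */
%define lr.default-reductions "consistent"
```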

(To be clear, even though IELR cannot fix %error-verbose, it is still 
worthwhile during development when an accurate %error-verbose is 
ultimately needed.  I've found that the number of conflicts that must be 
debugged in canonical LR parser tables tends to be an order of magnitude 
larger than in LALR and IELR.  So, if you don't mind canonical LR's size 
in your final product in order to get perfect lookahead sets, debug your 
conflicts with IELR, and then switch to canonical LR.)

I really need to rewrite the lr.default-reductions and lr.type 
documentation.  A cross-reference from %error-verbose would be nice.  
I'll try to get to that before the 2.5 release.

> Maybe that would also explain my incorrect
> messages in the nearby thread about semantic error messages.

I skimmed that, and I would guess that you need canonical LR.  I haven't 
explored your grammar yet though.

From d1cc31c5f04b81a3620fa291020ce23490f3f9e7 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Wed, 26 Aug 2009 14:15:53 -0400
Subject: [PATCH] tests: show a use of %define lr.default-reductions "consistent"

* tests/conflicts.at (%nonassoc and eof): Extend to test that it
prevents the omission of expected tokens for %error-verbose.
---
 ChangeLog          |    6 ++++++
 tests/conflicts.at |   23 +++++++++++++++++++++++
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 40df1d4..44f27ff 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2009-08-26  Joel E. Denny  <address@hidden>
+
+       tests: show a use of %define lr.default-reductions "consistent"
+       * tests/conflicts.at (%nonassoc and eof): Extend to test that it
+       prevents the omission of expected tokens for %error-verbose.
+
 2009-08-26  Akim Demaille  <address@hidden>
 
        tests: portability fix.
diff --git a/tests/conflicts.at b/tests/conflicts.at
index f2f7861..26ec08d 100644
--- a/tests/conflicts.at
+++ b/tests/conflicts.at
@@ -112,6 +112,29 @@ AT_PARSER_CHECK([./input '0<0>0'], [1], [],
          [syntax error, unexpected '>'
 ])
 
+# We must disable default reductions in inconsistent states in order to
+# have an explicit list of all expected tokens.  (However, unless we use
+# canonical LR, lookahead sets are merged for different left contexts,
+# so it is still possible to have extra incorrect tokens in the expected
+# list.  That just doesn't happen to be a problem for this test case.)
+
+AT_BISON_CHECK([-Dlr.default-reductions=consistent -o input.c input.y])
+AT_COMPILE([input])
+
+AT_PARSER_CHECK([./input '0<0'])
+AT_PARSER_CHECK([./input '0<0<0'], [1], [],
+         [syntax error, unexpected '<', expecting $end
+])
+
+AT_PARSER_CHECK([./input '0>0'])
+AT_PARSER_CHECK([./input '0>0>0'], [1], [],
+         [syntax error, unexpected '>', expecting $end
+])
+
+AT_PARSER_CHECK([./input '0<0>0'], [1], [],
+         [syntax error, unexpected '>', expecting $end
+])
+
 AT_CLEANUP
 
 
-- 
1.5.4.3




