[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: LR(1) paser generator based on Pager's algorithm

From: Xin Chen
Subject: Re: LR(1) paser generator based on Pager's algorithm
Date: Sun, 01 Apr 2007 17:16:33 -1000

Hi Joel,

Thanks for sharing your idea and keen observation. I'm not in a position to 
give advice, but below is what I thought about your example grammar. I have 
attached a graph showing the canonical LR(1) parsing machine, where state 5 has 
a shift/reduce conflict on 'a'.

> > Any S/R conflict in a LALR(1) parser will also be present in the
> > corresponding LR(1) parser.
> Yes, but it might affect viable prefixes it did not in the LR(1) 
> parser.
> I should have included an example before:
>  S: 'a' A 'a'
>   | 'b' A 'b'
>   ;
>  A: 'a' 'a'
>   | 'a'
>   ;
> For canonical LR(1), there is a S/R conflict at the dot in this 
> valid 
> input:
>  aa.a
> but not at the dot in this valid input:
>  ba.ab
> For LALR(1), the associated states are merged so that both inputs 
> have the 
> conflict.
> If you then add this:
>  %left 'a'
> both the LR(1) and LALR(1) parsers perform a reduce at the dot in 
> the 
> first input.  The LR(1) parser performs a shift at the dot in the 
> second 
> input, but the LALR(1) parser still performs a reduce.  Thus, the 
> LR(1) 
> parser accepts the second input, but the LALR(1) parser eventually 
> reports 
> a syntax error.

I don't know if this prefix by adding "%left a" is viable, because the 
canonical LR(1) parsing machine does not accept all strings in the given 
grammar with the prefix. All strings accepted by this grammar are: aaaa, aaa, 
baab, bab. With the prefix, the parsing machine uses "reduce" at state 5 on 'a' 
as shown in the graph. If you have a try, you'll find out it accepts the other 
three strings but not "aaaa". For "aaaa" you're stuck at state 
8, the lookahead is "a", but state 8 has no action for symbol "a". On the other 
hand, if you always choose "shift" on 'a' at state 5, you will not be able to 
accept the string "aaa". You'll be stuck in state 9, with lookahead "$end", but 
there is no action on symbol "$end". 

I think the reason for this is that the grammar is not a LR(1) grammar, so the 
canonical LR(1) algorithm cannot handle it. To handle this grammar, we need to 
use GLR algorithm, to fork at state 5 and try the two possibilities.

I had a similar experience with shift/shift conflict. I came up with this 
S : A '+' A ; A : T '+' T | T ; T : 'r' ;
which after removing unit productions will generate shift/shift conflict. Later 
I found this was not a LR(1) grammar, because the canonical LR(1) parsing 
machine does not accept all strings in it (r+r, r+r+r, r+r+r+r). But back then, 
I went to Dr. Pager with this doubt. The answer I got was that, an LR(1) 
grammar is one which produces a parsing machine without conflicts when the 
original LR(1) algorithm is applied, and given such conflict-f
ree original parsing machine, the algorithms for combining compatible states 
and removing unit productions will generate conflict-free end parsing machine. 
I had that shift/shift conflict because my grammar was not a LR(1) grammar. 

So for your grammar, I personally think the reason is that it's not LR(1), the 
solution is to use GLR algorithm.

As for Dr. Pager's algorithm, I guess it can be patched like this: in grammars 
where precedence rule is used to solve shift/reduce conflict, combine two 
states iff they are weakly compatble AND none has a shift/reduce conflict. But 
it's hard to say, since once a grammar has conflicts under the original LR(1) 
algorithm, it's not LR(1). For example, in your grammar, if we only want to 
accept "baab", then this helps. But this does not help if 
we want to accept "aaaa", only GLR helps. Anyway, I guess use of precedence 
rules is not a guaranteed practice. 

These are just the things came to my mind for now. Please correct me if 
anything was wrong. 


Attachment: conf.JPG
Description: JPEG image

reply via email to

[Prev in Thread] Current Thread [Next in Thread]