[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Improvement in R/R Conflict resolution

From: Jot Dot
Subject: Improvement in R/R Conflict resolution
Date: Tue, 15 Dec 2020 11:00:24 -0700 (MST)

I have read the current R/R resolution method on the page:

I have an idea for an improvement in R/R conflict resolution.
First I will point out an issue, then a proposed solution.

As my work with lexical analysers and parsers continue, I am experiencing
S/R and R/R conflicts in the grammar I am writing. Naturally, one must strive
to eliminate them, but I noticed some simply can not be done easily with
grammar rules (parser only).

Take a classic C++ problem: int x(y * z);

Is that an int x being initialised with y*z or is that a forward declaration
of a function x that takes a pointer to a type y and then returns an int?

The only real way to resolve that is to understand the context surrounding it.

This occurs in my parser code due to this part:
    | ...
    | storage_type '(' expr ')'
    | storage_type '(' declaration ')'
    | ...

There is an R/R conflict since:
'expr' can be IDENTIFIER * IDENTIFIER (multiply)
and 'declaration' can be the same IDENTIFIER * IDENTIFIER.
This makes sense, since the choice depends on whether the lhs identifier
is a type or the name of a variable.

Doing some googling, I have found the solution (or one of many solutions?)
is to tie into the lexer and have it check the context for IDENTIFIER
so that the lexer can discern the difference between a simple identifier
or a typedef'd type; returning one token or another.
(ie: return IDENTIFIER or say, TYPE_IDENT).
We then change the parser grammar accordingly.

This is great for a single pass compiler. What if we want to check for
syntactic correctness only at that point but push context sensitive issues
into a second pass? Somehow we should be able to discern this ambiguity,
mark it, and move on.


As per: 
Section 5.6 Reduce/Reduce Conflicts, Paragraph 6

quote: "Bison resolves a reduce/reduce conflict by choosing to use the rule
that appears first in the grammar, but it is very risky to rely on this."


Instead of having Bison pick the first one, we should allow the user to
specify an alternate.

In the example shown above:
    | ...
    | storage_type '(' expr ')'
    | storage_type '(' declaration ')'
    | ...

Would it be possible to flag an alternate rule for the conflict?
A proposal would be to introduce a directive to choose the reduction for
a R/R conflict.

    | ...
    | storage_type '(' expr ')'
    | storage_type '(' declaration ')'
    | %RR storage_type '(' IDENTIFIER '*' IDENTIFIER ')' // For ambiguous 
    | ...

In what I am doing, this is exactly what I want.

If this makes sense, then it would not be as hard to introduce as one
might think. The %RR line is, in essence, a deliberate introduction
of a the very R/R conflict we want to resolve. The detection of this
is already in Bison. Instead of picking the first conflict, we choose
the one marked.

Note that this is a deliberate injection of a conflict. If there was
no conflict, this would introduce one. This can be easily detected
and marked as a warning and probably useless.

Disclaimer: I haven't looked at Bison code so this 'easy' mod may not
be so easy. I'm simply saying, from the outside looking in, it seems
the logic is there to possibly introduce this without too much pain.

I know it is a far cry from "knowing about the concepts of lexers and
parsers" to actually using them (as I am finding out.) But I do
believe this is a better solution than "choosing the first rule".

The above proposal would simplify R/R resolution by purposely allowing
the R/R resolution to pick a user supplied solution rather than just
"anything". The lexer can be simplified in these cases since it does not
have to be forced into context sensitive issues unless the user
deliberately wants to.

Thanks for taking the time reading.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]