bug-bison

Re: symbol type issue with unused non-terminals


From: Scheidler, Balázs
Subject: Re: symbol type issue with unused non-terminals
Date: Fri, 1 Feb 2019 13:27:36 +0100

On Fri, Feb 1, 2019 at 6:56 AM Akim Demaille <address@hidden> wrote:

> Hi Balázs!
>
> > On 31 Jan 2019, at 19:15, Scheidler, Balázs <address@hidden> wrote:
> >
> > Hi,
> >
> > We, in the syslog-ng project (https://github.com/balabit/syslog-ng), have a
> > bison grammar file that contains a number of unused non-terminals. The
> > reasons for this are complicated; I can explain them if needed.
>
> Yes, out of curiosity I'd be happy to know why it makes sense for you
> (there's no plan to refuse grammars with useless symbols!).
>

The reason is that we use bison to parse portions of our configuration
file, but the configuration language is extended with plugins that are
loaded at runtime.

The solution is:

   - we have a main grammar that supports the basic configuration language
   and a way to trigger on-demand loading of plugins
   - each plugin also has its own bison grammar that "includes" rules from
   the main grammar file


For this reason:

   - there are rules in the main grammar that are only used by plugins,
   which leaves unused rules when compiling the main grammar
   - when we include these rules into the plugin grammar file, not all of
   our included rules will be used by a specific plugin.

This means that both the main grammar and the plugin grammars will have
some unused rules.

We use a homegrown python script that grabs the reusable rules and adds
them to the plugin grammar during compilation.
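
To give an idea of what such a merge step can look like, here is a minimal sketch. It is not the actual syslog-ng script: the marker comments, the placeholder, and the rule and token names (string, yesno, LL_STRING, KW_YES, KW_NO, KW_FOO) are all hypothetical and only serve the illustration.

```python
# Hypothetical markers; the real script may delimit reusable rules differently.
BEGIN = "/* INCLUDE_RULES_BEGIN */"
END = "/* INCLUDE_RULES_END */"
PLACEHOLDER = "/* INCLUDE_RULES */"


def extract_reusable_rules(main_grammar: str) -> str:
    """Return the text between the BEGIN and END markers of the main grammar."""
    start = main_grammar.index(BEGIN) + len(BEGIN)
    end = main_grammar.index(END, start)
    return main_grammar[start:end].strip()


def merge_into_plugin(plugin_grammar: str, rules: str) -> str:
    """Replace the placeholder in the plugin grammar with the shared rules."""
    if PLACEHOLDER not in plugin_grammar:
        raise ValueError("plugin grammar has no include placeholder")
    return plugin_grammar.replace(PLACEHOLDER, rules, 1)


# Toy main grammar: the marked rules are only there for plugins to reuse.
MAIN = """%%
start : stmts ;
/* INCLUDE_RULES_BEGIN */
string : LL_STRING ;
yesno : KW_YES | KW_NO ;
/* INCLUDE_RULES_END */
%%
"""

# Toy plugin grammar: it uses `string` but never `yesno`.
PLUGIN = """%%
plugin_option : KW_FOO '(' string ')' ;
/* INCLUDE_RULES */
%%
"""

merged = merge_into_plugin(PLUGIN, extract_reusable_rules(MAIN))
print(merged)
```

In this toy case the merged plugin grammar contains yesno without ever referencing it, which is the kind of unused non-terminal both grammars end up with.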

I am happy to elaborate, if more information is needed.


> > [...]
> > Based on my debugging I've found this root cause:
> >
> >   - rules are parsed as part of the grammar, and get an associated symbol
> >   number
> >   - the RHS of rules reference terminal and non-terminal symbols using a
> >   symbol number. These are resolved at grammar read time and the symbol
> >   number is generated into the output eventually making it to m4.
> >   - at this point reduce_grammar() runs and removes the unused
> >   non-terminal rules, causing symbols to be renumbered.
> >   - this makes an effort to update all symbol number references, however
> >   the RHS of rules is not updated.
> >   - RHS references to "old" numbers that are higher than the new
> >   maximum cause those ugly m4 errors that you see above
> >   - At the same time, an RHS reference can silently point to the wrong
> >   symbol if that symbol got renumbered. This is a different
> >   manifestation of the same bug: dollar actions (e.g. $1, $2, etc.)
> >   start to use an invalid <tag> to reference the value in YYSTYPE.
>
> Thank you for the careful analysis!  Yes, you pinpointed the issue.
>
> For the record, something that is very useful to debug such issues is the
> --trace option.  In the present case, --trace=muscle would reveal the
> generated symbol numbers. Comparing 3.2 and 3.3 is instructive.
>

I used --trace=muscles while trying to understand what bison does. I was
also reading its code, which I've found pretty easy to read btw.
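
As a rough illustration of the renumbering problem from the quoted analysis, here is a toy model. It is not bison's code: it only has non-terminals, the symbol names are made up, and the update_rhs switch is a hypothetical knob that stands in for whether RHS references get remapped after the reduction.

```python
def reachable(rules, start=0):
    """Collect the symbol numbers reachable from the start symbol."""
    seen, todo = set(), [start]
    while todo:
        sym = todo.pop()
        if sym in seen:
            continue
        seen.add(sym)
        for lhs, rhs in rules:
            if lhs == sym:
                todo.extend(rhs)
    return seen


def reduce_grammar(symbols, rules, update_rhs):
    """Drop unreachable symbols and renumber the rest.

    With update_rhs=False the LHS numbers are remapped but the RHS keeps
    the old numbers, mimicking the behaviour described in the analysis.
    """
    keep = sorted(reachable(rules))
    old_to_new = {old: new for new, old in enumerate(keep)}
    new_symbols = [symbols[old] for old in keep]
    new_rules = []
    for lhs, rhs in rules:
        if lhs not in old_to_new:
            continue  # rule of a removed non-terminal
        if update_rhs:
            rhs = [old_to_new[s] for s in rhs]
        new_rules.append((old_to_new[lhs], list(rhs)))
    return new_symbols, new_rules


def describe(symbols, rules):
    """Render rules by name, flagging numbers past the end of the table."""
    def name(n):
        return symbols[n] if n < len(symbols) else f"<invalid #{n}>"
    return [f"{name(lhs)}: {' '.join(name(s) for s in rhs)}".rstrip()
            for lhs, rhs in rules]


# Symbol number = list index; "unused" is never reached from "start".
symbols = ["start", "unused", "value", "other"]
rules = [(0, [2, 3]), (1, [2]), (2, []), (3, [])]

fixed = describe(*reduce_grammar(symbols, rules, update_rhs=True))
buggy = describe(*reduce_grammar(symbols, rules, update_rhs=False))
print(fixed[0])  # start: value other
print(buggy[0])  # start: other <invalid #3>
```

The stale rule shows both symptoms at once: the first RHS entry now names a different symbol (and so would pick up a different <tag>), and the second one is past the new maximum.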


>
> > This was triggered in our code-base because macOS brew updated to bison
> > 3.3.1 recently. If at all possible, it would be great if this problem did
> > not spread too far (e.g. Debian). bison 3.2 still seems to work properly.
>
> I'll try to address this asap and release the fix immediately.
> Sorry about this issue.
>
> Our test suite is already quite big, but I regularly discover missing
> cases...
>
>
It's an uphill battle, but still a useful one. I find that tests
(especially fast ones) give me a lot of confidence when cutting
releases :)

Bazsi

