Re: symbol type issue with unused non-terminals

From: Scheidler, Balázs
Subject: Re: symbol type issue with unused non-terminals
Date: Fri, 1 Feb 2019 13:27:36 +0100

On Fri, Feb 1, 2019 at 6:56 AM Akim Demaille <address@hidden> wrote:

> Hi Balázs!
> > On 31 Jan 2019, at 19:15, Scheidler, Balázs <address@hidden> wrote:
> >
> > Hi,
> >
> > We, in the syslog-ng project (https://github.com/balabit/syslog-ng), have a
> > bison grammar file that contains a number of unused non-terminals. The
> > reasons for this are complicated; I can explain them if needed.
> Yes, out of curiosity I'd be happy to know why it makes sense for you
> (there's no plan to refuse grammars with useless symbols!).

The reason is that we use bison to parse portions of our configuration
file, but the configuration language is extended with plugins that are
loaded at runtime.

The solution is:

   - we have a main grammar that supports the basic configuration language
   and a way to trigger on-demand loading of plugins
   - the plugin also has a bison grammar that "includes" rules from the
   main grammar file

For this reason:

   - there are rules in the main config that are only used by plugins,
   which cause unused rules when compiling the main grammar
   - when we include these rules into the plugin grammar file, not all of
   our included rules will be used by a specific plugin.

This means that both the main grammar and the plugin grammars will have
some unused rules.

We use a homegrown python script that grabs the reusable rules and adds
them to the plugin grammar during compilation.
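
To make the mechanism concrete, here is a minimal sketch of such a merge
step. It is not our actual script: the marker comments, the function names
and the sample grammars are all hypothetical, chosen only for illustration.

```python
# Hypothetical marker comments; the real script's conventions may differ.
START_MARK = "/* START_RULES */"
END_MARK = "/* END_RULES */"
INCLUDE_MARK = "/* INCLUDE_RULES */"


def extract_shared_rules(main_grammar: str) -> str:
    """Return the text between START_MARK and END_MARK in the main grammar."""
    start = main_grammar.index(START_MARK) + len(START_MARK)
    end = main_grammar.index(END_MARK, start)
    return main_grammar[start:end].strip("\n")


def merge_grammar(main_grammar: str, plugin_grammar: str) -> str:
    """Replace the include placeholder in the plugin grammar with the shared rules."""
    if INCLUDE_MARK not in plugin_grammar:
        raise ValueError("plugin grammar has no include placeholder")
    shared = extract_shared_rules(main_grammar)
    return plugin_grammar.replace(INCLUDE_MARK, shared, 1)


# Made-up sample grammars, reduced to the rules section.
MAIN = """%%
start: stmt ;
/* START_RULES */
string: LL_STRING ;
number: LL_NUMBER ;
/* END_RULES */
%%
"""

PLUGIN = """%%
start: KW_OPT '(' string ')' ;
/* INCLUDE_RULES */
%%
"""

merged = merge_grammar(MAIN, PLUGIN)
print(merged)
```

In this toy case the merged plugin grammar uses "string" but never
"number", so "number" ends up as an unused non-terminal, which is exactly
the situation described above.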

I am happy to elaborate, if more information is needed.

> > [...]
> > Based on my debugging I've found this root cause:
> >
> >   - rules are parsed as part of the grammar, and get an associated symbol
> >   number
> >   - the RHS of rules references terminal and non-terminal symbols by
> >   symbol number. These numbers are resolved at grammar read time and are
> >   written to the output, eventually making it to m4.
> >   - at this point reduce_grammar() runs; it removes the unused
> >   non-terminal rules, causing symbols to be renumbered.
> >   - it makes an effort to update all symbol number references, but the
> >   RHS of rules is not updated.
> >   - RHS entries that still hold "old" numbers higher than the new
> >   maximum cause those ugly m4 errors that you see above
> >   - in the same situation an RHS entry can also silently reference the
> >   wrong symbol if it got renumbered. This is a different manifestation
> >   of the same bug, where dollar actions (e.g. $1, $2, etc.) start to use
> >   an invalid <tag> to reference the value in YYSTYPE.
> Thank you for the careful analysis!  Yes, you pinpointed the issue.
> For the record, something that is very useful to debug such issues is the
> --trace option.  In the present case, --trace=muscles would reveal the
> generated symbol numbers. Comparing 3.2 and 3.3 is instructive.

I used --trace=muscles while trying to understand what bison does. I was
also reading its code, which I've found pretty easy to read btw.
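
For readers of the archive, the two manifestations from my analysis can be
reproduced with a toy model. This is only a sketch under simplifying
assumptions: the symbol table is a plain list, and reduce_symbols() and
lookup() are made-up helpers, not bison's real data structures or its
reduce_grammar() implementation.

```python
def reduce_symbols(symbols, useful):
    """Drop useless symbols; return the kept list and an old->new number map."""
    kept = [s for s in symbols if s in useful]
    remap = {symbols.index(s): new for new, s in enumerate(kept)}
    return kept, remap


def lookup(numbers, table):
    """Resolve symbol numbers to names; None marks an out-of-range number."""
    return [table[n] if n < len(table) else None for n in numbers]


# Symbol numbers are assigned while the grammar is read.
symbols = ["start", "unused_a", "string", "unused_b", "number"]

# The rule "start: string number" stores its RHS as symbol numbers.
rhs = [symbols.index("string"), symbols.index("number")]  # old numbers 2 and 4

# Reduction removes the unused symbols and renumbers the rest.
kept, remap = reduce_symbols(symbols, {"start", "string", "number"})

stale = lookup(rhs, kept)                      # RHS was never remapped
fixed = lookup([remap[n] for n in rhs], kept)  # RHS remapped as well

print("stale:", stale)
print("fixed:", fixed)
```

With the stale numbers, the first RHS entry resolves to the wrong symbol
and the second one is out of range; once the RHS goes through the same
remapping as everything else, both resolve correctly.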

> > This was triggered in our code-base because macOS brew updated to bison
> > 3.3.1 recently. If at all possible, it would be great if this problem did
> > not spread too far (e.g. Debian). bison 3.2 still seems to work properly.
> I'll try to address this asap and release the fix immediately.
> Sorry about this issue.
> Our test suite is already quite big, but I regularly discover missing
> cases...
It's an uphill battle, but still a useful one. I find that tests
(especially if they are fast) give me a lot of confidence when cutting
releases :)

