Joel E. Denny
Sun, 30 Sep 2007 20:57:40 -0400 (EDT)
On Sun, 30 Sep 2007, Akim Demaille wrote:
> > > > <rule number="6">
> > > > <lhs>useless</lhs>
> > > > <rhs>
> > > > <symbol class="terminal">STR</symbol>
> > > > </rhs>
> > > > </rule>
> > >
> > > Should we really repeat the rules then? Its number suffices:
> > > the grammar is defined elsewhere.
> > Useless rules are eliminated from the grammar, so this is not a
> > repetition.
> Really? I thought we dumped the exact grammar before simplifications.
> I checked by reading http://www.gnu.org.ua/~polak/hack/bison/xhtml/
> which shows rule 3 both in the grammar section, and in the useless
calc.xml.html shows an example of a useless rule, and there is no
repetition. errors.xml.html shows rule 3 under "Rules never reduced"
because of conflicts. I agree that the repetition there should be
> > > BTW, maybe the grammar should
> > > be defined first, and then the rest of the information. The
> > > order should be chosen to please tools, not humans.
> > Would that change help tools significantly? If not, I figure we might as
> > well please the humans.
I meant this to be about output order. I agree that repetition should be
eliminated, and I hadn't previously appreciated how much repetition there
> I kind of disagree here. The XML output should probably not be
> designed to mock the *output. In fact, I might even suggest that
> the grammar section itself annotate unused rules instead
> of using separate sections.
> Similarly, IMHO, there should be only one section about terminals
> including both their description (name, number, "string" etc.),
> whether they are unused, useless etc. I don't see the point of
> separating them at the XML level.
> That would solve your "reduction" section naming problem, as
> there would be no such section :)
I agree with all those comments.
What about something like the following? Under grammar, we could have
rules/useless, rules/never-used, nonterminals/useless, and
terminals/unused. For example, for nonterminals:
<nonterminal type="9" symbol="$accept">
<nonterminal type="10" symbol="exp">
<nonterminal symbol="useless1" />
<nonterminal symbol="useless2" />
I have a few other issues here.
Do you think we need to summarize where nonterminals and terminals appear
as shown above? The XSLT can compute that from the grammar.
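A minimal sketch of that computation, written in Python rather than XSLT so it can be run standalone. The `<grammar>`/`<rules>` wrapper and the sample rules are assumptions for illustration; only the `<rule>`, `<lhs>`, `<rhs>`, and `<symbol>` shapes follow the rule example quoted at the top of this message.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical grammar fragment: the wrapper elements and the rules
# themselves are made up for this sketch.
GRAMMAR = """
<grammar>
  <rules>
    <rule number="0">
      <lhs>$accept</lhs>
      <rhs>
        <symbol class="nonterminal">exp</symbol>
        <symbol class="terminal">$end</symbol>
      </rhs>
    </rule>
    <rule number="1">
      <lhs>exp</lhs>
      <rhs>
        <symbol class="terminal">NUM</symbol>
      </rhs>
    </rule>
  </rules>
</grammar>
"""


def symbol_occurrences(xml_text):
    """Map each symbol name to the rule numbers where it appears.

    Each entry has an "lhs" list and an "rhs" list of rule numbers.
    """
    root = ET.fromstring(xml_text)
    uses = defaultdict(lambda: {"lhs": [], "rhs": []})
    for rule in root.iter("rule"):
        number = int(rule.get("number"))
        uses[rule.findtext("lhs")]["lhs"].append(number)
        for sym in rule.findall("rhs/symbol"):
            # Record each rule once even if the symbol repeats in its RHS.
            if number not in uses[sym.text]["rhs"]:
                uses[sym.text]["rhs"].append(number)
    return dict(uses)
```

With this in place, the per-symbol summary never has to be stored in the XML: `symbol_occurrences(GRAMMAR)["exp"]` reports rule 1 on the left-hand side and rule 0 on the right-hand side.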
It seems to me that the "symbol" attribute above ought to be "name", and
"type" ought to be "number".
In the automaton, instead of:
we could have:
<item rule-number="0" marker="2" kernel="true" />
That could significantly reduce the size of the XML, and maybe then test
case "153: torture.at:139 Big triangle" wouldn't kill xsltproc on my
system. Maybe, but I haven't investigated the trouble there thoroughly.
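To illustrate how a consumer could still recover the full dotted item from the compact form, here is a rough Python sketch. It assumes that `marker` counts the right-hand-side symbols to the left of the dot; the `RULES` table is hypothetical and stands in for whatever the grammar section provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (rule number -> (lhs, rhs symbols)). In practice
# it would be built once from the grammar section of the same XML file.
RULES = {0: ("$accept", ["exp", "$end"])}


def render_item(item_xml, rules):
    """Expand a compact <item> element into a dotted-rule string.

    Assumes "marker" is the number of RHS symbols before the dot.
    """
    item = ET.fromstring(item_xml)
    lhs, rhs = rules[int(item.get("rule-number"))]
    marker = int(item.get("marker"))
    dotted = rhs[:marker] + ["."] + rhs[marker:]
    return f"{lhs}: {' '.join(dotted)}"
```

The rule text is stored once in the grammar, and each state only carries two small integers per item, which is where the size reduction would come from.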
In grammar/rules, instead of:
we could have:
<symbol-ref number="9" />
<symbol-ref number="10" />
<symbol-ref number="0" />
Or is this going too far? I think it's cleaner.
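The cost of that cleanliness is one lookup per reference on the consumer side. A small Python sketch of the lookup, assuming a hypothetical `<symbols>` table with `number` and `name` attributes and a hypothetical rule layout (neither is settled in this thread):

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <symbols> table, the attribute names, and the
# placement of <symbol-ref> inside <lhs>/<rhs> are assumptions.
DOC = """
<grammar>
  <symbols>
    <symbol number="0" name="$end" />
    <symbol number="9" name="$accept" />
    <symbol number="10" name="exp" />
  </symbols>
  <rules>
    <rule number="0">
      <lhs><symbol-ref number="9" /></lhs>
      <rhs>
        <symbol-ref number="10" />
        <symbol-ref number="0" />
      </rhs>
    </rule>
  </rules>
</grammar>
"""


def resolve_rules(xml_text):
    """Return {rule number: (lhs name, [rhs names])} with refs resolved."""
    root = ET.fromstring(xml_text)
    # Build the number -> name table once; every reference is then a lookup.
    names = {s.get("number"): s.get("name") for s in root.iter("symbol")}
    resolved = {}
    for rule in root.iter("rule"):
        lhs = names[rule.find("lhs/symbol-ref").get("number")]
        rhs = [names[ref.get("number")]
               for ref in rule.findall("rhs/symbol-ref")]
        resolved[int(rule.get("number"))] = (lhs, rhs)
    return resolved
```

Each symbol name then appears exactly once in the file, so renaming or annotating a symbol touches a single element.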
> Similarly, I'm not sure there should be a section about the
> conflicts, since the states already provide this information.
> I can be wrong, but I'd feel better if the XML file was
> without redundancy, even if that requires a bit more work
> from the XSLT tools. Work that I guess can be factored with
> an XSLT library tailored to our XML format (I'm using words
> I understand, but which I never practiced for real, so I
> might suggest stupid things here :).
I agree with all this.
> On the contrary, as a human, I do want to have a table of
> contents of the conflicts. And I'm *first* interested in
> the problems of my grammar, and then by the grammar itself.
> While tools would certainly prefer to see the grammar
> defined, and then facts about it.
In my experience, order isn't so important to XSLT. Maybe it would be for
SAX in some cases, but I've used SAX very little, so maybe I don't know.
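A small Python sketch of the difference, using made-up element names (`<report>`, `<rule>`, `<useless>`) that are not part of any proposed format. A tree-based consumer, like XSLT, sees the whole document and gives the same answer for either order; a naive single-pass SAX handler that resolves references as it meets them cannot resolve a reference that precedes its definition.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Two hypothetical orderings of the same information.
GRAMMAR_FIRST = ('<report><rule number="6" lhs="useless"/>'
                 '<useless rule-number="6"/></report>')
FACTS_FIRST = ('<report><useless rule-number="6"/>'
               '<rule number="6" lhs="useless"/></report>')


def useless_lhs_tree(xml_text):
    """Tree-based lookup: the whole document is available, order is moot."""
    root = ET.fromstring(xml_text)
    lhs_by_number = {r.get("number"): r.get("lhs") for r in root.iter("rule")}
    return [lhs_by_number[u.get("rule-number")] for u in root.iter("useless")]


class StreamingResolver(xml.sax.ContentHandler):
    """Single-pass handler that resolves references as it encounters them."""

    def __init__(self):
        super().__init__()
        self.lhs_by_number = {}
        self.resolved = []
        self.pending = []  # references seen before their rule was defined

    def startElement(self, name, attrs):
        if name == "rule":
            self.lhs_by_number[attrs["number"]] = attrs["lhs"]
        elif name == "useless":
            number = attrs["rule-number"]
            if number in self.lhs_by_number:
                self.resolved.append(self.lhs_by_number[number])
            else:
                self.pending.append(number)


def useless_lhs_stream(xml_text):
    """Run the streaming handler; return (resolved names, pending numbers)."""
    handler = StreamingResolver()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.resolved, handler.pending
```

A streaming consumer can of course buffer the pending references and resolve them at the end of the document, so this only shows that definition-first order saves it that bookkeeping.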