Re: Multiple, different, %merge of a token sequence

help-bison

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Multiple, different, %merge of a token sequence

From:	Derek M Jones
Subject:	Re: Multiple, different, %merge of a token sequence
Date:	Wed, 03 May 2006 01:19:04 +0100
User-agent:	Thunderbird 1.5 (Windows/20051201)

Joel,

Thanks for the prompt reply.

I've now given your code a much closer look. This behavior is anothermanifestation of a complex Bison GLR problem that I've known about for awhile and am still working on. Unfortunately, I don't believe I'vepreviously posted about it. It touches on many areas of the GLRimplementation, and I just haven't found the time yet to properly describewhat I know.


Ok.  It would be useful to add a warning to the documentation that
this kind of thing sometimes happens (at least for the moment).

Unfortunately, it seems like the most obvious solution will worsen theparser's time complexity. I'm exploring a more sophisticated solution,but I'm afraid it'll be a while before I return to it.


I can parse + other not very complicated stuff all the .c files in
the gcc/postgresql/openafs/openmotif/netscape (original C version)
source trees in under 30 cpu seconds.  Having this increased to 60
seconds would not be a problem.

If your %merge functions eliminate subtrees, it appears that you areright: this bug actually does increase subtree sharing for your exampleinput because it delays elimination until higher up.


For me this is not really a problem because it appears to be rare and I
can ignore the leaked memory caused by not doing the free's. A bigger
potential problem is that the high level merge has to do some analysis
to handle the fact that the expected disambiguation did not happen in
a lower level production.  Since my project involves measuring C
source characteristics a small amount of 'random' error can be handled
(ie, I don't worry about doing a correct disambiguation).  See
www.knosof.co.uk/cbook/cbook.html

However, nodesharing is a general problem in GLR parsers that can't always be solved bywriting such %merge functions. Specifically, if any subtrees (such astokens) exist on a stack before a split and appear on the RHS ofreductions during the split, those subtrees will be shared by both stacksand you will have to be careful not to doubly free them when the stacksare merged.


Would I be correct to say this subtree sharing would not occur if
the grammar was left recursive?

My usual solution is node reference counting. Until this bug is fixed, itlooks like you will need that too. I suspect however that you'lleventually need reference counting regardless of this bug. You may evenneed it now for tokens.


My quick fix was to not free up any parse trees inside the merge
functions.

--
Derek M. Jones                              tel: +44 (0) 1252 520 667
Knowledge Software Ltd                      mailto:address@hidden
Applications Standards Conformance Testing    http://www.knosof.co.uk

[Prev in Thread]

Current Thread

[Next in Thread]

Multiple, different, %merge of a token sequence, Derek M Jones, 2006/05/01
- Re: Multiple, different, %merge of a token sequence, Joel E. Denny, 2006/05/01
  - Re: Multiple, different, %merge of a token sequence, Derek M Jones, 2006/05/02
- Re: Multiple, different, %merge of a token sequence, Joel E. Denny, 2006/05/02
  - Re: Multiple, different, %merge of a token sequence, Derek M Jones <=
    - Re: Multiple, different, %merge of a token sequence, Joel E. Denny, 2006/05/06
  - Re: Multiple, different, %merge of a token sequence, Paul Eggert, 2006/05/03
    - Re: Multiple, different, %merge of a token sequence, Joel E. Denny, 2006/05/06

Prev by Date: Re: Multiple, different, %merge of a token sequence
Next by Date: RE: Conditional generation of Grammar rules
Previous by thread: Re: Multiple, different, %merge of a token sequence
Next by thread: Re: Multiple, different, %merge of a token sequence
Index(es):
- Date
- Thread