Re: RFC: enum instead of #define for tokens

From: Akim Demaille
Subject: Re: RFC: enum instead of #define for tokens
Date: 04 Apr 2002 12:11:45 +0200
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp)

>>>>> "Paul" == Paul Eggert <address@hidden> writes:

>> From: Akim Demaille <address@hidden> Date: 03 Apr 2002 19:13:05
>> +0200
>> I have seen patches from Jim Blandy to enable macros debugging in
>> gdb.

Paul> Didn't know that.  That reduces the priority of using enums for
Paul> bison.

Yep (modulo what Miles reported).

>> At home, I'm working on moving the engine from using shorts
>> everywhere as indices into arrays to using actual pointers.

Paul> Doesn't this grow the table size by a factor of 4 on 64-bit
Paul> hosts?  For typical parsers it wouldn't matter too much, but for
Paul> parsers with large tables this could be a big hit.  In
Paul> particular, I worry that it might hurt performance for
Paul> dynamically linked modules, since the dynamic linker might have
Paul> to relocate all those pointers individually when the module is
Paul> loaded.  I recall that Ulrich Drepper went through the GNU C
Paul> library recently, replacing many pointers with integers,
Paul> precisely to improve performance this way.

Oh man, don't tell me this :(  I view these changes as really needed
for Bison.  There are so many different uses of short, that it becomes
quite unreadable.  It also results in many bizarre indirections via
arrays in several different places.

Also, under some conditions (which are pretty rare, I agree), symbols
and rules have to be renumbered.  This means that you need to walk
through all the arrays, renumbering from the old number, to the new
number.  Using pointers, I no longer have to do that: you just change
the member `number' of symbols/rules, and you're done.
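
To make the renumbering argument concrete, here is a minimal sketch of
the two styles.  The `symbol' record and both function names are
hypothetical and much simpler than anything in Bison; they only
illustrate the difference described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified symbol record; not Bison's actual layout. */
typedef struct symbol
{
  const char *tag;
  int number;
} symbol;

/* Index style: every table that stores symbol numbers has to be
   walked, replacing the old number with the new one.  */
static void
renumber_indices (short *table, size_t n, short old_num, short new_num)
{
  assert (table != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (table[i] == old_num)
      table[i] = new_num;
}

/* Pointer style: the tables keep pointing at the same record, so a
   single store updates what every user of the symbol sees.  */
static void
renumber_symbol (symbol *sym, int new_num)
{
  assert (sym != NULL);
  sym->number = new_num;
}
```

In the index style, renumber_indices has to be called once per table
that mentions symbol numbers; in the pointer style no table is touched
at all.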

The motivations for this change are (i) making the maintenance of Bison
easier, thanks to a minimum of type checking service from the
compiler, and (ii) making extensions of Bison much easier: you don't
need to know all the arrays that exist in there to recover the
values associated with this or that guy: you have the guy, so you have
all you need about it.
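
As an illustration of point (ii), the sketch below contrasts looking
up attributes through parallel arrays with reading them from a record.
All names (sym_tag, sym_prec, the `symbol' layout) are made up for the
example and are not claimed to match Bison's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parallel-array style: each attribute lives in its own table,
   indexed by symbol number (array names are hypothetical).  */
static const char *const sym_tag[] = { "NUM", "'+'", "exp" };
static const short sym_prec[] = { 0, 1, 0 };

/* Pointer style: one record carries every attribute of a symbol
   (layout is hypothetical).  */
typedef struct symbol
{
  const char *tag;
  short prec;
} symbol;

static const symbol plus = { "'+'", 1 };

/* With an index, the caller must know which arrays exist and
   consult each of them separately.  */
static const char *
tag_by_index (int n)
{
  assert (0 <= n && n < (int) (sizeof sym_tag / sizeof sym_tag[0]));
  return sym_tag[n];
}

static short
prec_by_index (int n)
{
  assert (0 <= n && n < (int) (sizeof sym_prec / sizeof sym_prec[0]));
  return sym_prec[n];
}

/* With a pointer, the record itself is all that is needed.  */
static short
prec_by_pointer (const symbol *s)
{
  assert (s != NULL);
  return s->prec;
}
```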

Paul> With this in mind I would like to retain the option of using
Paul> integers in these tables, though the integers may need to be
Paul> wider than they currently are.  They could be narrower, too, to
Paul> save space.

We already know that we have to escape from short in most places, so
we are talking about int vs pointer, hence a factor of 2.  Can this
really be a problem?  Do we have to forget about natural C
programming, heavily based on pointers, to move to indices in arrays :(
This is really bad news to me.
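
To put numbers on the size question, the following sketch computes how
much a table grows when its entries change from integers to pointers.
The helper names are hypothetical.  On a typical LP64 host (2-byte
short, 4-byte int, 8-byte pointers) it gives the factor of 4 relative
to short and the factor of 2 relative to int; other platforms may
differ.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a table of N entries of ENTRY_SIZE bytes each.  */
static size_t
table_bytes (size_t n, size_t entry_size)
{
  assert (entry_size > 0);
  return n * entry_size;
}

/* How many times larger a table of pointers is than a table of
   integer indices that are INDEX_SIZE bytes wide.  */
static size_t
growth_factor (size_t index_size)
{
  assert (index_size > 0);
  return sizeof (void *) / index_size;
}
```

For instance, growth_factor (sizeof (short)) describes the move from
the current short tables, and growth_factor (sizeof (int)) the move
from tables that have already been widened to int.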

I really want to handle my guys, not artificial numbers, and I want
the compiler to type check what I do.  Even more: I want the type to
tell me what I'm manipulating; using only shorts makes it a nightmare
to find the meaning.  And typedefing shorts is not a solution, as the
compiler does not help.
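
A small sketch of why typedefs do not help: two typedefs of short are
the same type as far as the compiler is concerned, whereas pointers to
distinct struct types are not interchangeable.  The type and function
names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Two typedefs of short: different names, but one and the same type.  */
typedef short symbol_number;
typedef short rule_number;

/* Meant to receive a rule number, but any short is accepted.  */
static int
rule_from_number (rule_number r)
{
  return r;
}

/* Distinct struct types: the compiler can tell them apart.  */
typedef struct symbol { int number; } symbol;
typedef struct rule { int number; } rule;

static int
rule_from_pointer (const rule *r)
{
  assert (r != NULL);
  return r->number;
}

/* Given `symbol sym;', the call `rule_from_pointer (&sym)' passes an
   incompatible pointer type, which the compiler must diagnose.
   Given `symbol_number s;', the call `rule_from_number (s)' compiles
   without any complaint.  */
```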

>> My position, probably not very nice, is that this should not
>> happen.  Passing chars (wchars) as tokens is wrong.

Paul> I tend to agree, but I also think we're stuck with it, as POSIX
Paul> requires it and it's extremely common practice.  

POSIX is probably not referring to Unicode anyway.  And IIRC, POSIX
mandates 257 as first symbol number, so if we move to Unicode
char-tokens, we are no longer POSIX compliant.  Well, that's my
understanding, but I'm ready to be corrected.
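
For reference, the convention under discussion can be sketched as
follows: single-character tokens are returned as their own character
code, and named tokens take numbers from 257 upward.  The enum, the
token names and the function are hypothetical, and the 257 base is
simply the value recalled above, not checked against POSIX.

```c
#include <assert.h>
#include <ctype.h>

/* Named tokens, numbered from 257 so that they cannot collide with
   single-byte character codes (token names are hypothetical).  */
enum token_kind
{
  NUM = 257,
  IDENT
};

/* Map one input byte to a token number: digits and letters become
   named tokens, anything else is passed through as a literal
   character token whose number is its own code.  */
static int
token_for_char (int c)
{
  assert (0 <= c && c <= 255);
  if (isdigit (c))
    return NUM;
  if (isalpha (c) || c == '_')
    return IDENT;
  return c;
}
```

Character tokens only stay clear of the named ones as long as their
codes remain below 257, which is why wide characters do not fit this
scheme as is.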

Paul> At best we can warn in the documentation that it doesn't work if
Paul> you change encodings between the Bison run and the cc+runtime
Paul> runs.

Or we should find a means not to output the characters as
shorts/integers, but as the characters themselves.

>> I think only the latest was not installed: the one that removes the
>> ability of growing the stack when %union is not used.

Paul> OK, you're talking about

Paul> http://mail.gnu.org/pipermail/bison-patches/2002-March/000760.html

Paul> I started to do this, and even checked in white-space changes to
Paul> bison-1_29-branch to make the merge go easier, but ran into one
Paul> little problem.  Should YYLTYPE_IS_TRIVIAL and
Paul> YYSTYPE_IS_TRIVIAL be m4 thingamabobs in the main branch?  I'm a
Paul> bit new to the m4ization of Bison so it's not obvious to me how
Paul> it should be done.

I'd say so.  Make it a muscle, and use it in the skeleton.
