Re: Finding out when a token is consumed


From: Hans Aberg
Subject: Re: Finding out when a token is consumed
Date: Fri, 9 May 2003 20:36:42 +0200

At 07:06 -0600 2003/05/09, David Fletcher wrote:
>>>>>> "HA" == Hans Aberg <address@hidden> writes:
>
>HA> At 02:26 +0200 2003/05/09, Frank Heckenbach wrote:
>
>HA> We still do not get to know say whether this is a language given
>HA> to you, or whether the examples you have are just your own
>HA> experimenting with designing a grammar.
>
>When it comes down to it, does it really matter?

My guess is that if somebody is asking about it, they deem it
important. Are you saying that you know better than others what they
should think?

>HA> The difference here is that these are some localized, well-defined
>HA> features which can be handled by special tricks. You are asking
>HA> for something else: fuzzy directives that can be thrown in
>HA> anywhere.
>
>From what I can tell, these don't appear to be "fuzzy."  I admit I
>haven't been following this too closely, but I don't see the evidence
>for your claim, Hans.

If you know what is relevant without following the details, and know
that those details are sharp rather than fuzzy, then why not give the
answer to the problem yourself?

>>> In a theoretical sense, it might be cleaner to put all such things
>>> in a proper grammar (even if it causes some conflicts and might
>>> even require a more general and less efficient parsing method). In
>>> practice, it's probably better to use an efficient LALR(1) parser
>>> for the most part, and do the ugly bits outside of the parser. My
>>> question was only about how to interface it to the parser in the
>>> least painful way ...
>
>HA> It is difficult to build such an interface, especially when
>HA> there is no way to identify the interfacing segments.
>
>I understand what you're saying, but I don't completely agree.  Here
>are a few approaches that I've taken in the past, and all have their
>strong and weak points, but all of the approaches have worked for me.
>Choosing a particular approach depends on a variety of factors:

>       - You can modify your grammar to handle special constructs in
>         a direct fashion.  As you have noted, this can "bloat" the
>         grammar if you're not careful.  Sometimes, I've created
>         special grammar rules (leaf rules) that only process tokens
>         arriving from the lexer, and the rest of the grammar doesn't
>         deal with tokens arriving from the lexer at all.  With care,
>         this approach can simplify situations like the one you
>         describe (a rough sketch of such a leaf rule appears after
>         this list).
>
>       - You can modify your lexer to handle the language directives,
>         but doing this completely in the lexer can be tricky because
>         you have to know how they tie in vis-à-vis the grammar.  It
>         sounds like the directives get "pushed up" the syntax tree?
>         Even so, without knowing all of the details it appears that
>         you might be able to perform some special processing to pull
>         this off.  I'd have to look at your examples more closely...
>
>       - Sometimes I've created intermediate code that sits between
>         the lexer and parser.  That is, yacc calls MySpecialLex()
>         instead of yylex(), and MySpecialLex() will call yylex()
>         when needed.  But MySpecialLex() maintains its own state and
>         might do special processing at certain times (this wrapper
>         is also sketched below).  Doing this can keep the grammar
>         much simpler.  The result is still efficient in operation,
>         and easier to support than the alternative (modifying the
>         grammar).  I've used this approach a number of times, and
>         the deciding factor comes down to the coding complexity and
>         resultant support cost(s).
>
>       - It may make sense to create an intermediate representation
>         from your parser, with the directives attached to this
>         representation.  A post-processing step can then "apply"
>         these directives in the correct fashion (also sketched
>         below).  I've done this to good effect in the past.  If your
>         language is complex enough, this may be worth considering.
>
>       - Finally, you might consider altering the parser to insert
>         special code to do what you need.  This is... uhm... painful
>         with bison and getting harder all the time.
>
>         byacc might be simpler to work with.  If you don't mind
>         switching to other tools and grammars, there is a plethora
>         of parser generators.  Some are quite interesting (e.g.,
>         Elkhound) and well-proven (e.g., ANTLR).  Perhaps a more
>         flexible recursive-descent parser might suffice?  For
>         example, the latest g++ parser is no longer a yacc-based
>         parser, but a hand-written r.d. parser.  I know that there
>         are r.d. parser generators available, and some appear to be
>         quite good.  But, it may be more work to switch from the
>         (quite dated) yacc syntax to something else, so modifying
>         the yacc grammar might be the way to go.  It sounds like
>         you've already done this using the "approved interface," but
>         this interface appears to be lacking.
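
For concreteness, the leaf-rule idea might look something like the
following. This is only a rough sketch; the token and rule names are
invented here, since no actual grammar has been shown:

    %{
    #include <stdio.h>
    int yylex(void);    /* some real lexer, linked in separately */
    void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
    %}

    %token NUMBER DIRECTIVE

    %%

    /* The bulk of the grammar deals only in nonterminals... */
    expr   : number
           | expr '+' number
           ;

    /* ...while leaf rules are the only places raw lexer tokens
       appear.  A directive arriving in front of a number is
       absorbed here, once, instead of in many places throughout
       the grammar. */
    number : NUMBER
           | DIRECTIVE number   { /* apply the directive */ }
           ;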
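And the MySpecialLex() wrapper could be set up roughly like this,
again with every name invented for the example. The #define only
redirects the parser's calls; after the #undef, the wrapper itself
calls the real yylex():

    %{
    #include <stdio.h>

    int yylex(void);              /* the real lexer, e.g. from flex */
    int MySpecialLex(void);       /* what the parser will call      */
    #define yylex MySpecialLex    /* redirect the parser's calls    */

    void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
    %}

    /* DIRECTIVE is declared so the lexer and the wrapper can share
       the token code, but no grammar rule ever mentions it. */
    %token NUMBER DIRECTIVE

    %%

    input : /* empty */
          | input NUMBER
          ;

    %%

    #undef yylex                  /* below this point, yylex() is
                                     the real lexer again           */

    int MySpecialLex(void)
    {
        int tok;

        /* Filter directives out of the token stream, handling them
           here instead of bloating the grammar with rules for them.
           The wrapper can keep whatever state it needs. */
        while ((tok = yylex()) == DIRECTIVE)
            ;                     /* record or apply it here */
        return tok;
    }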
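The intermediate-representation variant could be sketched as a plain
tree walk; once more, all of the names below are made up:

    #include <stddef.h>

    /* Each node carries the directives that appeared near it.  The
       parser only attaches them; a later pass interprets them. */
    struct node {
        int          kind;
        struct node *child[2];
        const char  *directive[8];  /* attached, not yet applied */
        int          ndirectives;
    };

    /* Post-processing: walk the finished tree and "apply" the
       attached directives, with the whole tree available for
       context. */
    void apply_directives(struct node *n)
    {
        int i;

        if (n == NULL)
            return;
        for (i = 0; i < n->ndirectives; i++)
            ;                       /* interpret n->directive[i] */
        apply_directives(n->child[0]);
        apply_directives(n->child[1]);
    }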

We are told that there is no specific grammar given for those extra
directives, and that they can be thrown in just anywhere. So how could
those methods work?

>HA> If you want to, say, implement operators with dynamic precedences,
>HA> then this can be done by writing Bison rules that save the
>HA> expression components on a stack, and then writing a special
>HA> routine that sorts out how they should be combined. So here one is
>HA> saving the semantics and sorting it out later. If you want to make
>HA> use of lexer context switches, you must make sure they do not
>HA> clash with parser lookaheads.
>
>This sounds rather complicated and prone to failure.  I realize that
>certain languages are implemented this way, but my experience is that
>this leads to hard-to-maintain code.  Most people don't take this into
>account, even though maintenance hovers around 70% of the cost of a
>software system.  If the code is to survive, whether open source or
>not, it needs to be easy to maintain.

If you want to describe how you would implement, say, Prolog dynamic
operators with a large number of precedence levels integrated into a
Bison-generated parser, you are welcome to.

I do not see why the method would result in hard-to-maintain code -- at
least not the version I did.
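
For concreteness, here is a minimal sketch of that "save the
components, sort out the combination later" step. It assumes the Bison
actions have already flattened an expression into parallel arrays of
terms, operators, and the precedences looked up dynamically at parse
time; every name in it is invented for the example:

    #include <stdio.h>

    #define MAXT 64

    /* What the Bison actions would have collected: term[i] is
       separated from term[i+1] by operator op[i], whose dynamic
       precedence is prec[i] (higher binds tighter). */
    struct flat {
        double term[MAXT];
        int    op[MAXT];
        int    prec[MAXT];
        int    n;                          /* number of terms */
    };

    static double apply(int op, double a, double b)
    {
        return op == '+' ? a + b : op == '*' ? a * b : 0.0;
    }

    /* Precedence climbing over the flat list: combine every operator
       whose precedence is at least `min`, treating all operators as
       left-associative. */
    static double combine(const struct flat *f, int *i, int min)
    {
        double lhs = f->term[*i];
        while (*i < f->n - 1 && f->prec[*i] >= min) {
            int op = f->op[*i], p = f->prec[*i];
            ++*i;
            lhs = apply(op, lhs, combine(f, i, p + 1));
        }
        return lhs;
    }

    int main(void)
    {
        /* 1 + 2 * 3 with precedences +:1 and *:2, as a Prolog
           operator table might define them. */
        struct flat f = { {1, 2, 3}, {'+', '*'}, {1, 2}, 3 };
        int i = 0;
        printf("%g\n", combine(&f, &i, 0));    /* prints 7 */
        return 0;
    }

With genuinely dynamic operators, prec[] is simply filled in from the
operator table in force as the expression is parsed; the combining
routine itself never changes.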

>HA> ...suggest that you are designing your own language where
>HA> directives can be thrown in just anywhere. This is poor language
>HA> design, because you do not get semantics attached to the parse
>HA> tree.
>
>It doesn't sound to me like the directives are thrown in willy-nilly.
>Instead, it sounds like this gentleman is looking for more powerful
>ways to handle directives that are potentially well-defined, so that
>these directives can be processed in one place instead of in myriad
>places in the grammar.  He already has a solution that works; it just
>uses a macro that wasn't intended for this purpose.  If this macro
>works, why not use it?  Perhaps the bison maintainers should consider
>expanding on the macro?

He said that they could be thrown in just anywhere, with no reference
to a grammar. Then it is going to be hard, because of the lookahead
problem, just as he indicated.

>From where I sit, this is a question of the best way to code for
>simplicity, reduced maintenance, and elegance more than anything
>else.

You mean, if one first only gets a working version? Personally, I
prefer accurate, structured software that works over code that aims
for simplicity, reduced maintenance, and elegance above all else but
has a lot of bugs in it.

  Hans Aberg
