grammatica-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Grammatica-users] Question about grammatica


From: Per Cederberg
Subject: Re: [Grammatica-users] Question about grammatica
Date: Tue, 23 Nov 2004 19:23:38 +0100

Hi James,

In general, the Grammatica parse tree is ill suited for performing
multi-pass analysis the way that you suggest. As you noted, the 
analyze() methods recreate the parse tree, in effect optimized for
manipulating the parse tree structure in each analysis pass. Also,
the node values cannot be passed from one pass to another, as the 
previous tree is ignored upon analysis except for it's structure
and tokens.

So, how to work around this? Well, the "Grammatica" way is to
regenerate the node values bottom-up in each analysis pass. A good
example of this is the Grammatica grammar parser itself. It creates
the tokens and production names in the first pass. Only in the
second pass are the right-hand side of the productions handled.
The result of the second pass is a null parse tree, as all new
nodes created are eventually discarded. In order to run the second 
analysis pass, the tokens and production names must have been
stored away somewhere outside the parse tree (in this case in a
Grammar object). Have a look at the source code in
src/java/net/percederberg/grammatica/SecondPassAnalyzer.java if
you'd like to see all the hairy details.

The Analyzer API is not based on the Visitor design pattern as the
idea is that the analysis shall be possible to perform *during* the
parsing (much like yacc does). Instead, it is a parse-tree builder
API:

  enterX()
      Called just after creating the parse tree node and before
      parsing (or analyzing) any child nodes. Can be used if some
      structure must be created before analyzing the child nodes.
      By default this method does nothing.

  childX()
      Called just after a child node has been parsed and analyzed.
      NOTE: If the child node exitY() method returned null, this
      method will be given a null child. By default this method
      adds the child node to this node.

  exitX()
      Called just after finishing parsing this node. By returning
      null from this method, this node will not be included in the
      generated parse tree. This is useful for pruning the parse
      tree during the analysis. By default this method just returns
      the current node.

So, as you see, given this analysis API it is not easy to create a
sensible visitor implementation. And that is the reason that the
one that exists in the analyze() methods creates a new parse tree
instead of attempting to reuse the old one.

In general, Grammatica is designed like this as it is not viable to
to create full parse trees at all times. In particular when parsing
very large files. One might consider adding a separate Visitor API
to the generated code, making it easier to perform multi-pass
analysis on the same parse tree, but that does not exist today.

I don't know if this helps you in any way, but it's hard to be more
specific without more details... :-)

Cheers,

/Per

On tue, 2004-11-23 at 16:49 +0000, MANSION, James, FM wrote:
> Hi
>  
> First, apologies about the standard mail gateway added stuff at the
> bottom.  I can't turn it off. :-(
>  
> Now - I'm trying to use grammatica analysis in two phases.  I'm using
> the Java API, version 1.4.
>  
> In the first, the value I return at values(0) is a type code, which I
> use in a sematic analysis pass.
> Then, I perform an interpretive pass (and actually, I may repeat this)
> in which I evaluate my parse tree having set up mappings for
> 'external' data.
>  
> One of the side effects of my semantic analysis is that I save node
> references away for various bits of the parse tree, and for literal
> values I save them in a HashMap which is keyed by the Node.
>  
> I find that unfortunately the rewriting of the Production tree in an
> analyzer works against me because of coutrse the hashCode changes as a
> result of the tree rebuild.  In my case I can work around this because
> in practice I always have a terminal Token against which I can key the
> result.
>  
> When I found that exitProd returned a Node, I had assumed that if I
> wanted to do rewriting and restructuring, then that is where I would
> do it, and that rather than visit re-adding children, the driver would
> replace the children in-place in the child vector and then
> compress-out any nulls.  I would hope that a 'null' analysis for an
> interpreter pass would not perform any tree rewriting but would just
> visit nodes. :-(
>  
> I'n in two minds whether to attempt this, or have another method
> similar to analyze(Node,ParserLogException) that just visits the
> structure, without implicitly clearing values from any nodes, and
> ignoring returned Node references so that the structure is unchanged.
>  
> How do others handle this?
> 
> James Mansion
> 
> +44 20 7648 3972






reply via email to

[Prev in Thread] Current Thread [Next in Thread]