[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: how to use parsing expressing grammar
From: |
Mike Mattie |
Subject: |
Re: how to use parsing expressing grammar |
Date: |
Wed, 4 Mar 2009 22:55:54 -0800 |
On Wed, 04 Mar 2009 09:35:00 +0900
Miles Bader <miles@gnu.org> wrote:
> W Dan Meyer <specuu@gmail.com> writes:
> > Are we talking about memoising PEG (e.g Packrat) in elisp? ...
> > Packrat parser is the best option out there currently because it
> > recognises much complex grammar then LALR(1), and it has no
> > ambiguities with expressing these grammars.
>
> You also might be interested in Roberto Ierusalimschy's paper on the
> implemenation of LPEG, which is a PEG implementation for Lua:
>
> http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
>
>
> Note that LPEG does _not_ use the packrat algorithm, as apparently it
> presents some serious practical problems for common uses of parsing
> tools:
>
> In 2002, Ford proposed Packrat [5], an adaptation of the
> original algorithm that uses lazy evaluation to avoid that
> inefficiency.
>
> Even with this improvement, however, the space complexity of the
> algorithm is still linear on the subject size (with a somewhat big
> constant), even in the best case. As its author himself recognizes,
> this makes the algorithm not befitting for parsing “large amounts
> of relatively flat” data ([5], p. 57). However, unlike parsing tools,
> regular-expression tools aim exactly at large amounts of relatively
> flat data.
>
> To avoid these difficulties, we did not use the Packrat
> algorithm for LPEG. To implement LPEG we created a virtual parsing
> machine, not unlike Knuth’s parsing machine [15], where each pattern
> is represented as a program for the machine. The program is somewhat
> similar to a recursive-descendent parser (with limited
> backtracking) for the pattern, but it uses an explicit stack instead
> of recursion.
>
>
> The general LPEG page is here:
>
> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
>
>
> -Miles
>
It seems there is a convergence :) I implemented the same thing in my
parser compiler, with a different intent.The PEG operators decompose
down to an internal instruction set translated by the core into
elisp functions.
The VM is a co-routine of two functions. A compiler which generates
code from a valid instruction set, and a "union" function which
determines the validity of a given instruction sequence. The union
essentially attempts to pack as many instructions as possible into
a matching function. When adding an instruction produces an invalid set
it kicks the set to the compile function and starts a fresh matching
function.
My goal was to allow the creation of custom operators for the ugly parts
of practical grammars that can be mixed with standard operators
in a consistent fashion. If the compiler is flexible enough then maybe
the AST can move inside the parser so the user does not have to re-invent
the wheel from the match hooks -> AST.
It's probably getting a bit to parser specific for emacs-help, but if you
want to compare notes feel free to e-mail me.
Cheers,
Mike Mattie