guile-devel

Re: PEG Patches


From: Michael Lucy
Subject: Re: PEG Patches
Date: Mon, 28 Mar 2011 17:17:09 -0500

A variant on the second option would be first defining
peg-string-compile to just throw an error, then redefining it later to
actually compile the string.  That seems a little less hackish, at
least to me.
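
A minimal sketch of this variant, assuming a toy compiler over plain data rather than syntax objects. The names toy-string-compile and toy-compile are hypothetical stand-ins, not code from the patches, and set! is only one way to do the later redefinition:

```scheme
;; Toy sketch only: hypothetical names, plain data instead of syntax objects.

;; Early in the load sequence: a placeholder that only signals an error.
(define (toy-string-compile str accum)
  (error "string PEGs are not available yet:" str))

;; A caller written against the placeholder binding.
(define (toy-compile pattern accum)
  (if (string? pattern)
      (toy-string-compile pattern accum)
      (list 'sexp pattern)))

;; Later, once a string parser exists: swap in the real procedure.
;; Here the "real" one just tags its input instead of parsing it.
(set! toy-string-compile
      (lambda (str accum) (list 'from-string str)))
```

Because toy-compile looks the binding up when it is called, it picks up the replacement without being rewritten itself.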

A fifth option would be to make peg-sexp-compile take an optional
argument FUN-RECUR that it will call instead of recursing into itself
(so in your example FUN-RECUR would be peg-extended-compile).  This
involves more rewriting than the other options to pass the optional
argument around, but it's pretty clean and would allow users to write
other parsing layers on top of peg-sexp-compile should they wish
(achieving similar results to the fourth option).
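
A rough sketch of how the optional argument could thread through the recursion, again on a toy compiler over plain data. The toy-* procedures and the tagged-list output are assumptions made for illustration; the real peg-sexp-compile works on syntax objects and has many more cases:

```scheme
;; Toy sketch only: hypothetical names, plain data instead of syntax objects.
(use-modules (ice-9 match))

;; FUN-RECUR is called on sub-patterns instead of recursing directly.
;; It defaults to toy-sexp-compile itself.
(define* (toy-sexp-compile pattern accum
                           #:optional (fun-recur toy-sexp-compile))
  (match pattern
    (('and subs ...)
     (cons 'seq (map (lambda (p) (fun-recur p accum)) subs)))
    (('or subs ...)
     (cons 'alt (map (lambda (p) (fun-recur p accum)) subs)))
    ((? string? s) (list 'literal s))
    ((? symbol? s) (list 'nonterminal s))))

;; Stand-in for peg-string-compile; the real one parses the string.
(define (toy-string-compile str accum)
  (list 'from-string str))

;; The extended layer handles (peg "...") and passes itself down as
;; FUN-RECUR, so nested string PEGs also come back through this layer.
(define (toy-extended-compile pattern accum)
  (match pattern
    (('peg (? string? str)) (toy-string-compile str accum))
    (_ (toy-sexp-compile pattern accum toy-extended-compile))))
```

With this arrangement, (toy-extended-compile '(and "a" (or (peg "b*") c)) 'all) reaches the string case even though the (peg "b*") form is nested two levels down.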

On Mon, Mar 28, 2011 at 3:44 PM, Noah Lavine <address@hidden> wrote:
> Hi,
>
>> I think the solution is to confront the circularity directly.  It exists
>> because the PEG s-exp grammar also deals with the string grammar, which
>> needs an already-built PEG parser.
>>
>> Let's break it instead into layers without cycles: removing the string
>> grammar from the s-exp code generator.  If we want a layer with both, we
>> build it on top of the two lower layers.
>>
>> What do you think?
>
> I've been working on that. The attached two patches break the
> circularity. The code still isn't organized brilliantly, but after
> applying these I think we would only want pretty minor cleanups before
> merging PEG into the main branch.
>
> However, there's an interesting issue which I am not sure how to
> confront. Here it is:
>
> Currently, peg-sexp-compile is defined as a big case statement:
>
> (define (peg-sexp-compile pattern accum)
>  (syntax-case pattern (....)
>    <lots of cases here>))
>
> What these patches do is take out the case for embedded PEG strings,
> so the case statement has one fewer case. Then they add a new function
> peg-extended-compile, defined by
>
> (define (peg-extended-compile pattern accum)
>  (syntax-case pattern (peg)
>    ((peg str)
>     (string? (syntax->datum #'str))
>     (peg-string-compile #'str (if (eq? accum 'all) 'body accum)))
>    (else (peg-sexp-compile pattern accum))))
>
> peg-string-compile takes a string, parses it, and then calls
> peg-sexp-compile on the result, so this is noncircular.
>
> Unfortunately, this sacrifices a feature. The trouble is that the
> cases in peg-sexp-compile call peg-sexp-compile on parts of
> themselves, because PEG expressions are recursive. Those inner PEG
> expressions can never contain embedded string PEGs with this
> definition, because those calls never go through peg-extended-compile.
>
> I see a few options:
>  - say that string PEGs can only occur at the top level of a PEG
> expression. The peg module has never been released, so no one uses
> this feature now anyway.
>  - instead of defining a new function peg-extended-compile, redefine
> peg-sexp-compile via set! once we have string PEGs.
>  - write peg-extended-compile as its own big case statement, basically
> duplicating peg-sexp-compile.
>  - adopt some interface that allows people to extend the cases in
> peg-sexp-compile. We would start with just s-expression PEGs, then use
> this interface to add string PEGs later in the load sequence.
>
> The second and third options seem hackish to me. The third option is
> especially bad because I think some of the calls to peg-sexp-compile
> are in helper functions that peg-sexp-compile calls, so we might have
> to duplicate most of codegen.scm to make this work.
>
> The fourth option seems elegant, but I'm not sure what a good
> interface for that is. Is there anything in Guile now that can
> idiomatically be used for an extensible list of cases? It seems almost
> like something GOOPS would do, but not quite. I am also a bit
> concerned about the fourth option because it could become an interface
> that is only ever used once, and might just add unnecessary
> complexity.
>
> I think the first option is the best one for now, because it doesn't
> require much work and it would allow a smooth transition if we ever
> enable non-top-level PEG strings in the future. What do other people
> think?
>
> Noah
>


