Re: [Grammatica-users] Easier lexer generation.
From: Per Cederberg
Subject: Re: [Grammatica-users] Easier lexer generation.
Date: Tue, 28 Jun 2005 23:56:41 +0200
Matti,
What you are suggesting is actually quite similar to an old
future improvement suggestion for Grammatica:
https://savannah.nongnu.org/bugs/?func=detailitem&item_id=3599
That is interesting for more purposes though, allowing the
creation of very readable and compact grammars. Why aren't
grammars modularized and abstracted the same way software is?
I guess it would be possible to do already today, but one sure
needs some glue handy. The Parser class requires something that
behaves like a Tokenizer (which is unfortunately not an
interface), so one has to implement the public API of that
class. Also, as a parser outputs a parse tree, and not a stream
of tokens, one would have to create a special Analyzer subclass
that converts certain (but obviously not all) productions into
Tokens. Not that tricky maybe, but perhaps not all that
beautiful either.
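
To make the glue a bit more concrete, here is a rough sketch of the
second half (turning selected productions of a parse tree into a flat
token stream). Note that SketchToken, SketchNode and TreeTokenSource
are hypothetical stand-ins invented for this illustration; they are
not the real Grammatica Token, Node, Analyzer or Tokenizer classes,
whose actual API one would have to mirror instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a token; NOT Grammatica's Token class.
final class SketchToken {
    final String type;
    final String text;

    SketchToken(String type, String text) {
        this.type = type;
        this.text = text;
    }

    @Override
    public String toString() {
        return type + "(" + text + ")";
    }
}

// Hypothetical stand-in for a parse tree node; NOT Grammatica's Node class.
final class SketchNode {
    final String name;               // production name, or "CHAR" for a leaf
    final String text;               // leaf text, null for productions
    final List<SketchNode> children;

    SketchNode(String name, String text, SketchNode... children) {
        this.name = name;
        this.text = text;
        this.children = Arrays.asList(children);
    }

    static SketchNode leaf(String text) {
        return new SketchNode("CHAR", text);
    }

    // Concatenates the text of all leaves below this node.
    String flatten() {
        if (children.isEmpty()) {
            return text == null ? "" : text;
        }
        StringBuilder sb = new StringBuilder();
        for (SketchNode child : children) {
            sb.append(child.flatten());
        }
        return sb.toString();
    }
}

// Plays the role of the "special Analyzer subclass": it walks the tree
// built from the lexer grammar and turns certain productions into tokens.
final class TreeTokenSource {
    private final Iterator<SketchToken> tokens;

    TreeTokenSource(SketchNode root, Set<String> tokenProductions) {
        List<SketchToken> out = new ArrayList<>();
        collect(root, tokenProductions, out);
        this.tokens = out.iterator();
    }

    private static void collect(SketchNode node, Set<String> tokenProductions,
                                List<SketchToken> out) {
        if (tokenProductions.contains(node.name)) {
            // This production becomes one token; do not descend further.
            out.add(new SketchToken(node.name, node.flatten()));
            return;
        }
        // Any other production (whitespace, say) yields no token of its own.
        for (SketchNode child : node.children) {
            collect(child, tokenProductions, out);
        }
    }

    // Tokenizer-like entry point: the next token, or null at end of input.
    SketchToken next() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}
```

The second parser would then pull tokens from next() instead of from a
regexp-based tokenizer. The sketch ignores line and column tracking and
error reporting, which a real replacement would also have to provide.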
Cheers,
/Per
On Tue, 2005-06-28 at 12:00 +0300, Matti Katila wrote:
> On Tue, 28 Jun 2005, Per Cederberg wrote:
> > Well, the Grammatica lexer is essentially identical to most other
> > tools. It provides support for defining tokens via regexps (and
> > plain strings). If you want to use EBNF, you can just write those
> > "tokens" as productions instead.
>
> This would make the division into lexer and parser much harder. For
> example, a Python lexer which marks_ indent and dedent is easy to do after
> lexing - we may call that a post-lexer. Post-lexing is much easier to do
> after long literals_ and comments have already been recognized.
>
> _marks =: http://docs.python.org/ref/indentation.html
> _literals =: http://docs.python.org/ref/strings.html
>
> Of course, I don't know yet since I haven't tried it out. I could create
> foolexer.grammar and fooparser.grammar for grammar foo and use some sort
> of wrapping with Grammatica to use lexer.grammar as the lexer where I want
> to use EBNF in the lexer grammar.
>
> > The general problem with grammars where tokens are specified in
> > EBNF is efficiency and ambiguities.
>
> Efficiency sounds like premature optimization, which should be considered a no-no :)
>
> > That is not a problem in a language specification document where
> > readability is the highest concern.
>
> I think readability is a high concern in every case :) Well, I think I know
> what you mean.
>
>
> -Matti