emacs-devel

Re: Handling extensions of programming languages


From: Stephen Leake
Subject: Re: Handling extensions of programming languages
Date: Tue, 30 Mar 2021 11:41:11 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (windows-nt)

haj@posteo.de (Harald Jörg) writes:

>> For indentation, it's fundamentally harder (for the same reason that
>> combining two LALR grammars doesn't necessarily give you an LALR
>> grammar), so it will have to be done in a somewhat ad-hoc way.
>
> Indeed.  Indentation needs more "context".

The GNU ELPA package 'wisi' provides a way to declare indentation as
actions in the grammar; that provides all the context needed.

The wisi parsers also have excellent error correction, so the grammar
actions operate on a complete syntax tree (or fail utterly when the
input is really bad).
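
To make "indentation as grammar actions" concrete, here is a toy sketch
in Python. It is not wisi's grammar syntax or API; the two-rule grammar,
the INDENT offset and the function name are all invented for
illustration, and unlike wisi it has no error correction, so it simply
fails on incomplete input.

```python
# Toy illustration only: this is NOT wisi's grammar syntax or API.
# Grammar:  block -> "begin" item* "end"
#           item  -> block | WORD
# The "action" attached to block indents each item by INDENT relative
# to the line holding "begin", and aligns "end" with "begin".

INDENT = 3  # assumed offset, chosen arbitrarily


def indent_lines(lines):
    """Return the indentation column for each line (one token per line)."""
    tokens = [line.strip() for line in lines]
    indents = [0] * len(tokens)
    pos = 0

    def parse_item(column):
        nonlocal pos
        if tokens[pos] == "begin":
            parse_block(column)
        else:
            indents[pos] = column      # plain statement
            pos += 1

    def parse_block(column):
        nonlocal pos
        indents[pos] = column          # the "begin" line
        pos += 1
        while pos < len(tokens) and tokens[pos] != "end":
            parse_item(column + INDENT)    # action: indent the body
        if pos >= len(tokens):
            # No error correction here: incomplete input just fails.
            raise SyntaxError("missing 'end'")
        indents[pos] = column          # "end" aligns with "begin"
        pos += 1

    while pos < len(tokens):
        parse_item(0)
    return indents
```

For the six lines begin, x, begin, y, end, end this returns
[0, 3, 3, 6, 3, 0]. The action never inspects surrounding text; the
enclosing production supplies the context.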

I have not tried to use wisi for Perl; it works for Ada and Java.

This does not address your issue of extending a language with new
syntax; as far as wisi is concerned, that is a new language, and needs
an entirely new grammar file. This is true for any LR parser. It may
not be true for a packrat parser, although the base parser would have
to provide hooks in each nonterminal parsing routine.
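
A minimal sketch of what such hooks could look like, in Python. This is
a hypothetical toy recognizer, not any existing parser; the class, the
method names and the "say" keyword are made up for the example. Each
nonterminal keeps its alternatives in a table, and the hook is simply
the ability to add an alternative at run time.

```python
# Hypothetical sketch: a tiny packrat (memoizing PEG) recognizer whose
# nonterminals keep their alternatives in a mutable table, so an
# extension can add alternatives without regenerating the parser.
# Left-recursive rules are not supported.

class Packrat:
    def __init__(self):
        self.rules = {}   # nonterminal name -> list of alternatives
        self.memo = {}    # (nonterminal, position) -> end position or None

    def add_alternative(self, name, sequence):
        """Hook: register another alternative for nonterminal `name`."""
        self.rules.setdefault(name, []).append(list(sequence))

    def parse(self, name, tokens, pos):
        """Return the position after matching `name` at `pos`, or None."""
        key = (name, pos)
        if key not in self.memo:
            self.memo[key] = self._try(name, tokens, pos)
        return self.memo[key]

    def _try(self, name, tokens, pos):
        for alt in self.rules[name]:       # ordered choice
            p = pos
            for sym in alt:
                if sym in self.rules:      # nonterminal
                    p = self.parse(sym, tokens, p)
                elif p < len(tokens) and tokens[p] == sym:
                    p += 1                 # terminal matched
                else:
                    p = None
                if p is None:
                    break
            else:
                return p
        return None

    def recognize(self, start, tokens):
        """True if `start` matches the whole token list."""
        self.memo.clear()   # memo is only valid for one input and rule set
        return self.parse(start, tokens, 0) == len(tokens)


def base_parser():
    """Base language:  stmt -> print expr ;  expr -> NUM + expr | NUM"""
    p = Packrat()
    p.add_alternative("stmt", ["print", "expr"])
    p.add_alternative("expr", ["NUM", "+", "expr"])
    p.add_alternative("expr", ["NUM"])
    return p
```

With this, base_parser() rejects the tokens say NUM, and accepts them
after add_alternative("stmt", ["say", "expr"]). Nothing comparable works
for an LR parser, because its tables are computed from the whole
grammar.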

In wisi, it might be possible to extend the grammar file syntax with
something like:

#base_grammar <grammar file>

but it would still generate separate parsers for the base and extended
languages.
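
As a rough sketch of that idea, in Python: the directive does not exist
in wisi today, and the one-production-per-line file format, the file
names and the function below are assumptions made up for the example.
The point is only that the extended grammar is the base productions
plus the new ones, and that each resulting grammar would still go
through the parser generator on its own.

```python
# Hypothetical: neither this directive nor this file format exists in
# wisi. Each "file" is text with lines of the form
#     name : sym sym ...
# and an optional line
#     #base_grammar <file name>

def load_grammar(name, files):
    """Return {nonterminal: [alternatives]}, with base productions merged in.

    `files` maps file names to their text, standing in for the file system.
    """
    grammar = {}
    for line in files[name].splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#base_grammar"):
            base = line.split(None, 1)[1]
            # Pull in every production of the base grammar.
            for lhs, alts in load_grammar(base, files).items():
                grammar.setdefault(lhs, []).extend(alts)
            continue
        lhs, rhs = line.split(":", 1)
        grammar.setdefault(lhs.strip(), []).append(rhs.split())
    return grammar


# Example input; both file names are invented.
FILES = {
    "base.wy": "stmt : print expr\nexpr : NUM\n",
    "extended.wy": "#base_grammar base.wy\nstmt : say expr\n",
}
```

Calling load_grammar("base.wy", FILES) and
load_grammar("extended.wy", FILES) yields two distinct grammars; the
second contains everything in the first plus the extra stmt
alternative.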

As long as the extended language is a superset of the base language, it
mostly doesn't hurt to always use the extended language parser. The
ada-mode parser implements a language that is an extension of standard
Ada 2012; that reduces conflicts and simplifies specifying indentation.

One downside of using an extended parser: it will not report syntax
errors for extended syntax in a file that is not supposed to contain
any. For ada-mode this is not a significant problem; the extensions
allow things that no Ada programmer would write even by mistake, and the
real compiler catches them soon enough.

> And as for indentation...  I'd say the code in both modes needs to catch
> up with current perl before we consider extensions.  Maybe they could
> share functions or regular expressions how to find the beginning of a
> function, or how to identify closing braces which terminate a statement:
> The specification for this logic comes from Perl and should be the same
> for both modes.

The reason I started the wisi package and WisiToken parser generator was
to migrate ada-mode away from ad-hoc code to grammar-based code, to
support Ada 2012. To work well, the parser needs to be error-correcting.
SMIE is inherently more error-tolerant than an LR parser without error
correction, but I doubt it's good enough for indentation.

-- 
-- Stephe


