emacs-devel

Re: Questions about tree-sitter


From: Lynn Winebarger
Subject: Re: Questions about tree-sitter
Date: Wed, 6 Sep 2023 12:11:24 -0400

On Wed, Aug 30, 2023 at 3:03 AM Yuan Fu <casouri@gmail.com> wrote:
> > On Aug 29, 2023, at 2:26 PM, Augustin Chéneau (BTuin) <btuin@mailo.com> wrote:
> > I have a few questions about tree-sitter.
> >
> > I'm currently developing a grammar for GNU Bison alongside a tree-sitter
> > major mode, it's a work in progress.  The grammar is here:
> > <https://gitlab.com/btuin2/tree-sitter-bison>, still incomplete but so
> > far able to parse simple files, and the major mode prototype is
> > attached to this message.
> >
> > So, the questions:
> >
> > 1. Is there a way to reload a grammar?
> >
> > Emacs is pretty nice as a playground for testing grammars, but once a
> > grammar is loaded, it won't be loaded again until Emacs restarts (as far
> > as I know).
> > Is it possible to reload a grammar after modifying it?
>
> No, and it’s probably not easy to implement either, since unloading the
> grammar would require Emacs to purge/invalidate all the nodes, queries,
> and parsers using that grammar.

From reviewing some generated "parser.c" files and some of the
available documentation, it appears that parser.c basically defines
two things: a lexing function that follows a fixed protocol for
consuming and producing a standard lexer state data structure, and an
LR(1) parser table suitable for GLR parsing (i.e. one that allows
ambiguous actions).  These, together with the definitions of the
tokens and grammar symbols, are bundled into a language structure
that is passed to the tree-sitter library.
LALR(1) tables are essentially simplified/compressed LR(1) tables, and
Emacs already has code to calculate such tables directly in elisp.
Therefore, given functionality to translate elisp data into the raw C
structures, we should be able to create language data structures
dynamically and pass them to the tree-sitter library to create a
parser.
We would also need a table-driven lexer framework in place of the
generated lexer in the C file to avoid going through a C compiler
entirely.
The other novel features of tree-sitter parsers appear to be
implemented in the parser runtime, not in the table calculation.
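
To make "table driven" concrete, here is a rough sketch of a lexer
whose behaviour lives entirely in data rather than in generated code.
It is written in Python only for brevity, and it is not tree-sitter's
actual lexer protocol or data layout: the names TRANSITIONS, ACCEPTING,
char_class and tokenize are all made up for illustration, and the token
set is an arbitrary toy one.

```python
# Hypothetical sketch: a lexer driven entirely by data tables.
# None of these names come from tree-sitter; they only illustrate the idea.

def char_class(ch):
    """Map a character to a column of the transition table."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isspace():
        return "space"
    return ch  # punctuation stands for itself

# DFA transition table: state -> {character class -> next state}.
TRANSITIONS = {
    "start": {"digit": "number", "alpha": "ident", "space": "blank",
              ":": "punct", ";": "punct", "|": "punct"},
    "number": {"digit": "number"},
    "ident": {"alpha": "ident", "digit": "ident"},
    "blank": {"space": "blank"},
    "punct": {},
}

# Accepting states and the token kind each one yields (None = skip it).
ACCEPTING = {"number": "NUMBER", "ident": "IDENT",
             "blank": None, "punct": "PUNCT"}

def tokenize(text, transitions=TRANSITIONS, accepting=ACCEPTING):
    """Longest-match tokenizer that consults only the tables passed in."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = "start"
        i = pos
        last_accept = None  # (end position, state) of the longest match
        while i < len(text):
            nxt = transitions[state].get(char_class(text[i]))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in accepting:
                last_accept = (i, state)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        end, acc_state = last_accept
        kind = accepting[acc_state]
        if kind is not None:
            tokens.append((kind, text[pos:end]))
        pos = end
    return tokens
```

Because tokenize takes the tables as ordinary arguments, "reloading" a
modified grammar in this toy model is just a matter of calling it again
with new tables; nothing has to be compiled or unloaded.  Whether the
real lexer protocol can be driven this way is exactly the open question
above.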

I've implemented LALR(1) parser generators two or three times in the
last couple of decades, so this might be a fun project for me while I
am unambiguously able to contribute to GNU Emacs.

Regards,
Lynn


