emacs-devel

Re: Tree-sitter maturity


From: Daniel Colascione
Subject: Re: Tree-sitter maturity
Date: Sun, 29 Dec 2024 09:59:38 -0500
User-agent: K-9 Mail for Android


On December 29, 2024 5:21:21 AM EST, Yuan Fu <casouri@gmail.com> wrote:
>
>
>> On Dec 29, 2024, at 1:14 AM, Daniel Colascione <dancol@dancol.org> wrote:
>> 
>> 
>> 
>> On December 29, 2024 3:59:50 AM EST, Yuan Fu <casouri@gmail.com> wrote:
>>> 
>>> 
>>>> On Dec 29, 2024, at 12:41 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>>>> 
>>>>> Date: Sun, 29 Dec 2024 03:01:44 -0500
>>>>> From: Daniel Colascione <dancol@dancol.org>
>>>>> 
>>>>>>> Enforcing this policy will just mean that Emacs doesn't support *at 
>>>>>>> all* some languages out of the box and will put even more wind in the 
>>>>>>> sails of soft forks like Doom. Tree sitter language descriptions are 
>>>>>>> free software. There's no reason not to rely on them.
>>>>>> 
>>>>>> We started with this concept of adding tree-sitter based modes to
>>>>>> auto-mode-alist by default, but found that people who don't have the
>>>>>> grammar installed didn't appreciate seeing the warnings about the
>>>>>> missing grammars.  So Emacs 29 made these modes optional, activated
>>>>>> only by an explicit user action.  Emacs 30 still does that.
>>>>>> 
>>>>>> We are currently discussing how to improve this (see the thread Re:
>>>>>> Turning on/off tree-sitter modes, which seems to have stalled lately).
>>>>>> But until the grammar libraries are ubiquitous, and we can rely on
>>>>>> them being present on most systems, I think we will still need some
>>>>>> user say-so before enabling tree-sitter based modes.
>>>>> 
>>>>> Wouldn't vendoring the grammars, and maybe even tree sitter itself, 
>>>>> silence the complaints about the warnings? Tree sitter is pure 
>>>>> algorithmic code. It doesn't have any particular platform dependencies. 
>>>>> Why not simplify the whole system and make it a mandatory (and optionally 
>>>>> bundled) dependency so that the whole cognitive load of having to consider 
>>>>> non-TS environments is just deleted?
>>>> 
>>>> First, the tree-sitter library itself is optional, so Emacs could be
>>>> built without it.  Or are you suggesting to import the library as well
>>>> into Emacs?  If we don't import the library, making it a mandatory
>>>> dependency is not TRT, IMO, because some users don't need the modes
>>>> supported by tree-sitter, so forcing them to install the library that
>>>> is not really useful to them is not right.  We never do anything like
>>>> that with any other external libraries.  GMP is special, but even for
>>>> it we added our own "mini-gmp".
>>>> 
>>>> Next, importing the grammar libraries into Emacs is not a simple
>>>> matter, either.  Their sources are in JavaScript, so if we want to let
>>>> users produce modified grammars (as we do with everything we have in
>>>> the release tarballs), they will need to have Node.js etc. installed,
>>>> which will become a prerequisite.  And there are other complications,
>>>> like the need to sync regularly with their upstream repositories.
>>>> Moreover, there's no precedent for doing this, if you exclude lwlib
>>>> and oldXMenu (which are different, since they are not developed
>>>> outside Emacs).
>>>> 
>>>> So I, for one, am not very happy to add this to our maintenance
>>>> burden.  It might make things easier for some (but see below), but it
>>>> doesn't come for free.
>>>> 
>>>> I also don't understand the fuss, really.  Compiling a grammar library
>>>> after cloning the repository takes seconds, so why do we have to do
>>>> all this on behalf of the users if the users can do it so easily, even
>>>> if distros don't?  E.g., I have on my system almost 70 grammar
>>>> libraries, which I regularly update and build with a small number of
>>>> simple Makefiles -- how hard can that be for anyone who is interested
>>>> in these modes?  Why does it have to be _our_ responsibility, any more
>>>> than, say, Grep or Findutils -- which are also heavily used by Emacs?
>>>> Or even the image libraries?  Why shouldn't this be the job of the
>>>> distros?  The upstream project doesn't have to think about packaging;
>>>> that's the job of the distros.
>>> 
>>> Also, distros are picking up on packaging tree-sitter grammars, so I’m 
>>> hopeful of a future where Emacs is packaged with tree-sitter grammars for 
>>> the builtin major modes. And AFAIK packagers very much dislike editors 
>>> bundling tree-sitter library and grammars themselves.
>>> 
>>> Bundling grammars has another complication: tree-sitter 
>>> library-grammar compatibility. If we were to bundle grammars, we would also 
>>> have to bundle the tree-sitter library, or risk encountering a system-provided 
>>> tree-sitter library that’s incompatible with the bundled grammars. And 
>>> bundling the tree-sitter library is obviously undesirable.
>>> 
>>> Yuan
>> 
>> The grammars don't make any backwards compatibility guarantees. There have 
>> been multiple Emacs bugs arising from grammars unilaterally changing 
>> terminal names and such.
>
>They don’t have to be backward compatible. Any change to the grammar is 
>probably backward-incompatible simply due to the nature of grammars. The 
>situation will be better once package managers start packaging grammars with 
>Emacs. I’m also working on making Emacs explicitly state what versions of 
>grammars are compatible, so people can choose the right grammar version to 
>install.
>
>> ISTM the only way to guarantee compatibility is to vendor the whole stack.
>
>Vendoring the whole stack is one way, yes, but why do you think letting 
>package managers package Emacs with the right grammars wouldn’t work? 

I'm thinking of single-versioning issues. If there's a Debian package for, e.g. 
the C++ grammar, and different grammar versions are incompatible, and two 
programs want that grammar, each at a different version, there's no package 
system that will make both programs happy.
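
To make the single-versioning problem concrete, here is a minimal sketch in Python. The program names and version numbers are made up for illustration and are not taken from any real distro or grammar release; it only assumes that the package system installs exactly one version of a given grammar package system-wide.

```python
# Hypothetical illustration: program names and version numbers are invented.
def satisfiable(requirements):
    """Return the grammar versions acceptable to every consumer, assuming
    only one version of the grammar package can be installed at a time."""
    acceptable = None
    for versions in requirements.values():
        # Intersect each consumer's acceptable versions with the running set.
        acceptable = set(versions) if acceptable is None else acceptable & set(versions)
    return acceptable or set()

# Two programs that each work with only one (different) C++ grammar version.
requirements = {
    "emacs": {"0.22"},
    "other-editor": {"0.23"},
}

common = satisfiable(requirements)
print(common or "no single installable version satisfies both programs")
```

If the acceptable sets overlapped, the package could simply ship a version from the overlap; the problem arises only when incompatible grammar changes make the sets disjoint.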

Just vendoring the stack keeps things simple, reproducible, and compatible. The 
grammars are small enough that I don't see much of a downside in vendoring them 
either.


