emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree-sitter introduction documentation


From: Yuan Fu
Subject: Re: Tree-sitter introduction documentation
Date: Fri, 30 Dec 2022 16:03:42 -0800


> On Dec 30, 2022, at 7:31 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Fri, 30 Dec 2022 03:06:37 -0800
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>> Eli Zaretskii <eliz@gnu.org>,
>> Dmitry Gutov <dgutov@yandex.ru>,
>> theophilusx@gmail.com,
>> emacs-devel@gnu.org
>> 
>>> I have asked the question before, but freedom or not, the above is a
>>> nuisance to run for every language.  If the process is as automatic as
>>> the above example demonstrates, shouldn't Emacs have a command to take a
>>> grammar and compile+install it?  I guess this could be more complicated
>>> if the grammar is generated using a custom tool-chain for each language
>>> (or is it always Javascript?), but nothing impossible.
>> 
>> Though the magic of programming, such command now exists: 
>> treesit-install-language-grammar. It needs recipes to work, though. The 
>> recipe would involve https://github.com, which I guess is probably too 
>> heretical to include in Emacs source, so I left the recipes empty. I tested 
>> the install command with these recipes:
>> 
>> (setq treesit-language-source-alist
>>      '((python "https://github.com/tree-sitter/tree-sitter-python.git";)
>>        (typescript 
>> "https://github.com/tree-sitter/tree-sitter-typescript.git";
>>                    "typescript/src" "typescript")))
> 
> Thanks.  I did some minor fixes to the doc strings, but this command
> still "needs work"(TM).  See my comments below:
> 
>  This command requires Git, a C compiler and (sometimes) a C++ compiler,
>  and the linker to be installed and on PATH.  It also requires that the
>  recipe for LANG exists in `treesit-language-source-alist'.
> 
> I don't think treesit-language-source-alist is a good idea, especially
> if we don't intend populating it, at least not as a user-facing
> feature.  Instead, the command should ask the user for the relevant
> values, and offer recording the values on some file that would be read
> next time the user wants to install an updated library.

I consider this as a fallback method for installing language grammars. Because 
distress might not end up bundle language grammar for us, and even if they do, 
they can’t cover every grammar so some user would end up needing to install 
some grammar by themselves. If we don’t include this feature, someone will 
definitely write something like this and make it a third-party package (indeed, 
someone already has). So we might have it in Emacs and do it right.

This is the use case that I had in mind when writing this function: some major 
mode xxx-mode requires language grammar for xxx, so it has the following 
instruction in its readme:

Add installation recipe of tree-sitter-xxx to your config, and run 
treesit-install-language-grammar:

(add-to-list 'treesit-language-source-alist
             '(xxx "https://github.com/xxx/tree-sitter-xxx.git";))

> 
>  OUT-DIR is the directory to put the compiled library file, it
>  defaults to ~/.emacs.d/tree-sitter.
> 
> I don't understand what "defaults" means here, since OUT-DIR is not an
> optional argument of treesit--install-language-grammar-1.

Ah yes, fixed.

> 
>  (let* ((lang (symbol-name lang))
>         (default-directory "/tmp")
> 
> A literal "/tmp" is not portable and un-Emacsy; please use
> temporary-file-directory instead.
> 
>         (soext (pcase system-type
>                  ('darwin "dylib")
>                  ((or 'ms-dos 'cywin 'windows-nt) "dll")
> 
> MS-DOS doesn't use DLL files.  Please use dynamic-library-suffixes
> instead, it's already set up correctly.  And the code should be ready
> for that variable having a nil value.

Fixed those, thanks.

> 
>          (message "Cloning repository")
>          ;; git clone xxx --depth 1 --quiet workdir
>          (treesit--call-process-signal
>           "git" nil t nil "clone" url "--depth" "1" "--quiet"
>           workdir)
> 
> Why "--depth 1"?  This should be a defcustom, and the default should
> be to clone the full repository, IMO.  Also, what about updating the
> library when it is already installed, and the Git repository already
> exists for it?  Or are we going to clone anew each time and them
> remove the repository? that could make its cloning be slow in some
> cases.

Since the purpose of this command is to install the grammar, why would we want 
a full clone? For an “average user”, all they need is the library. If they 
wants to hack on the grammar, it makes more sense to install the toolchain and 
clone the repository themselves. And yes, this command clone anew each time and 
removes the repository. 

> 
>          ;; cp "${grammardir}"/grammar.js "${sourcedir}"
>          (copy-file (concat grammar-dir "/grammar.js")
>                     (concat source-dir "/grammar.js"))
> 
> Why is this part needed?  In any case, please don't use concat to
> produce file names, use expand-file-name instead.  Also, we should
> call copy-file with 4th argument non-nil, I think.

To be honest I don’t remember, it is in build.sh so I copied it verbatim. I’ll 
see what it’s for. (But I kept it for now.)

> 
>          (treesit--call-process-signal
>           cc nil t nil "-fPIC" "-c" "-I." "parser.c")
> 
> I wonder why we don't use 'compile' here.  That would show the
> compiler output without any extra efforts.

I wanted to keep it simple, synchronous, and quiet, and didn’t thought much 
about it.

> 
>          ;; Copy out.
>          (copy-file lib-name (concat out-dir "/") t)
> 
> See above: don't use concat here.
> 
> This command should also be mentioned in NEWS, where we describe how
> to install the grammar libraries.

I’ll do that if we decide this function is desirable and good.

> Bottom line: I think we need first to discuss how we want such a
> facility to work, and only then implement it.

I agree. I was worried about the feature freeze thing :-)

Yuan




reply via email to

[Prev in Thread] Current Thread [Next in Thread]