

From: Dmitry Gutov
Subject: Re: Subprojects in project.el (Was: Eglot, project.el, and python virtual environments)
Date: Fri, 25 Nov 2022 01:38:08 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 25/11/22 00:46, Tim Cross wrote:

João Távora <joaotavora@gmail.com> writes:

On Thu, Nov 24, 2022 at 3:01 AM Dmitry Gutov <dgutov@yandex.ru> wrote:

  I'm imagining that traversing a directory tree with an arbitrary
  predicate is going to be slow. If the predicate is limited somehow (e.g.
  to a list of "markers" as base file name, or at least wildcards), 'git
  ls-files' can probably handle this, with certain but bounded cost.

I've seen references to the superior performance of git ls-files a
couple of times in this thread, which has me a little confused.

There has been a lot of discussion in other threads about the importance
of not relying on, and not basing development around, an assumption
about which VCS is being used. For example, I would expect project.el to
be completely neutral with respect to the VCS used in a project.

That's exactly the case where we can optimize: when a project is backed by Git/Hg.

So how is git ls-files at all relevant to the performance
characteristics of identifying files in a project?

Not files, though. Subprojects. Meaning, listing all (direct and indirect) subdirectories which satisfy a particular predicate. If the predicate is simple (the presence of a particular project marker: a file name or wildcard), the whole listing can be fetched with one shell command, like:

git ls-files -co -- ":(glob)**/Makefile" ":(glob)**/package.json"

(which will traverse the directory tree for you, and will also take advantage of Git's index/cache).
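As a concrete sketch (assuming a Git checkout, with Makefile/package.json as purely illustrative markers), the listing can be turned into a set of candidate subproject roots; --exclude-standard skips ignored files, and dirname maps each marker file to its directory:

```shell
# List candidate subproject roots by marker file, using Git's index.
# The ":(glob)" pathspec magic lets "**/NAME" match at any depth.
git ls-files -co --exclude-standard -- ':(glob)**/Makefile' ':(glob)**/package.json' \
  | xargs -n1 dirname | sort -u
```

The top-level project itself shows up as ".", so a caller would filter that out, along with any candidates that fail further checks.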

If the predicate is arbitrary (i.e. implemented in Lisp), the story becomes harder: the traversal has to happen directory by directory, outside of Git.

I also wonder if some of the performance concerns may be premature. I've
seen references to poor performance in projects with 400k or even 100k
files. What is the expected/acceptable performance for projects of that
size? How common are projects of that size? When considering
performance, are we not better off focusing on the common case, leaving
the extremes until we have a known problem to focus on?

OT1H, large projects are relatively rare. OT2H, having a need for subprojects seems to be correlated with working on large projects.

What is the common case, in your experience, and how is it better solved? Globally customizing a list of "markers", or customizing a list of subprojects for every "parent" project?
