Re: A project-files implementation for Git projects

From: Tassilo Horn
Subject: Re: A project-files implementation for Git projects
Date: Mon, 23 Sep 2019 09:42:50 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Dmitry Gutov <address@hidden> writes:

Hi Dmitry,

>> No, it doesn't slow down the listing (in comparison to just hg status
>> --all).  However, my test hg repo is not extraordinarily large (~4000
>> files).
> If performance is C*N where N is the number of files, then we can
> compare the times to complete on any medium-sized repo, as long as the
> list of ignores is significant (though they don't need to match
> anything).

I'll see if I can get hold of a larger repo and report back.  With my
test repo with ~4000 files, it took around 0.25 seconds regardless of
whether zero or ten --exclude patterns were given.

> Anyway, if the perf looks good to you we could push the improvement
> first and then deal with negative reports.


>>> BTW, can Hg support extra whitelist entries as well?
>> "hg status --all" prints everything including ignored files.  An
>> --exclude filters the output so that matching files are not listed.
>> --include also restricts the output, so that only files matched by
>> such an include pattern are listed.
> What about the hgignore contents? Does EXTRA-IGNORES in the Hg
> implementation actually mean ALL-IGNORES, i.e. will we need to pass
> the whole ignores list there?

No, "hg status --all" prints files with their status, e.g.,

$ hg status --all
? unregistered.txt
I ignored.o
C .hgignore
C committed.txt

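Right now, we don't collect files marked as "I"gnored, so the .hgignore
contents are already honored and only the extra ignores need to be
passed.  As soon as you add extra ignores, matching files are filtered
out of the output altogether: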

$ hg status --all --exclude '*.o'
? unregistered.txt
C .hgignore
C committed.txt
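
To make the filtering above concrete, here is a rough Python sketch (not
the project.el code, which is Emacs Lisp).  It builds the command line
and drops the "I" entries from the output.  The function names are made
up for illustration; only "hg status --all" and "--exclude" are taken
from the transcripts above.

```python
def hg_status_command(extra_ignores=()):
    """Build the argv for listing files, one --exclude per extra ignore."""
    cmd = ["hg", "status", "--all"]
    for pattern in extra_ignores:
        cmd += ["--exclude", pattern]
    return cmd


def collect_files(status_output):
    """Return file names from `hg status --all` output.

    Each line is a one-character status, a space, and the file name.
    Only entries marked "I" (ignored) are skipped, as described above.
    """
    files = []
    for line in status_output.splitlines():
        if len(line) < 3:
            continue  # not a "<status> <name>" line
        status, name = line[0], line[2:]
        if status != "I":
            files.append(name)
    return files
```

Feeding the first transcript to collect_files would keep
unregistered.txt, .hgignore and committed.txt and drop ignored.o.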

> I've been toying with an implementation for Git which uses negative
> pathspecs to specify all ignores (including the whitelist), instead of
> modifying the ignores list. Performance-wise, it looks good enough, so
> it seems my intuition was wrong. We could hit maximum command line
> length this way, though this didn't happen with Emacs's gitignore,
> which is not small. I wonder how much of a concern that would be.
> The actual implementation wasn't saved on disk and got eaten by a
> reboot, but I can show it later if you like.

Sure, then I can check if that's doable with at least hg.
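
For reference while waiting for the real thing, this is how I picture
the negative-pathspec approach, as a Python sketch.  It is my guess, not
Dmitry's lost implementation: the function names, the choice of
"git ls-files" flags and the size estimate are all assumptions, and
whitelist handling is left out.  The second function is a crude check
for the command-line length concern.

```python
import os


def git_ls_files_command(ignores):
    """Build a `git ls-files` argv with one negative pathspec per ignore."""
    # "." is a positive pathspec to exclude from; some Git versions
    # reject a command line that has only negative pathspecs.
    cmd = ["git", "ls-files", "-z", "--cached", "--others", "--", "."]
    cmd += [":(exclude)" + pattern for pattern in ignores]
    return cmd


def fits_command_line(cmd, limit=None):
    """Roughly estimate whether cmd stays under the argument size limit.

    The environment also counts against the real limit, so this is only
    an approximation.  os.sysconf is available on Unix only.
    """
    if limit is None:
        limit = os.sysconf("SC_ARG_MAX")
    # Each argument is passed as a NUL-terminated string.
    size = sum(len(os.fsencode(arg)) + 1 for arg in cmd)
    return size < limit
```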

>> Ok, I see.  So that would be this and it seems like now we have the
>> same semantics as with the hg version:
> Very good. Support for rooted entries and whitelist can be easily
> added here.
> There's a caveat, though: negative pathspecs have only been added
> AFAICT in Git 1.9. Whereas CentOS Stable is on Git 1.8.3 currently.
> So we'll have to handle it somehow, e.g. use a fallback for that
> version.

IMHO, the fallback is just to use the existing "find" version, no?
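
Deciding when to fall back could look roughly like the following Python
sketch.  The 1.9 threshold is the one you quote above (AFAICT, so not
verified here), and the function names are hypothetical.

```python
import re


def parse_git_version(text):
    """Extract the leading dotted number from `git --version` output."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def supports_negative_pathspecs(version_text, minimum=(1, 9)):
    """True if the reported Git version is at least `minimum`."""
    version = parse_git_version(version_text)
    return version is not None and version >= minimum
```

With "git version 1.8.3.1" this returns False, so the caller would use
the "find" based listing; with "git version 2.23.0" it returns True.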

>> A quick look at bzr suggests there's just a way to restrict
>> positively, i.e., like --include with hg.
> That makes me more inclined to just hardcode two implementations (one
> for Git and another for Hg) inside project.el. At least as the first
> version of this feature.

I have no clear preference but as my main concern is better performance
with our Git repo at work, I won't object.

>>> Yeah, I wonder if we should treat this as a VC operation. On the
>>> other hand, the fallback implementation could just as well use
>>> 'find'.
>> Right now, it uses `vc-file-tree-walk'...
> Shouldn't somebody reimplement it on top of 'find'?

I don't know.  It would surely be faster but there might be systems
without 'find'.
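
One way around that would be to probe for 'find' and only walk the tree
ourselves when it is missing.  A Python sketch of the idea follows (the
Emacs side would be Lisp, and the function names are made up).  Note
that on Windows a different program named "find" exists, so a real check
would need to be more careful than this.

```python
import os
import shutil
import subprocess


def walk_files(root):
    """Pure-Python fallback: list files below root with os.walk."""
    return sorted(
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    )


def list_files(root):
    """Use the external `find` when available, else walk the tree."""
    find = shutil.which("find")
    if find is None:
        return walk_files(root)
    # -print0 keeps file names with newlines intact.  Unlike os.walk,
    # "-type f" does not report symlinks.
    out = subprocess.run(
        [find, root, "-type", "f", "-print0"],
        check=True,
        stdout=subprocess.PIPE,
    ).stdout
    return sorted(os.fsdecode(path) for path in out.split(b"\0") if path)
```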

