bug-gnu-emacs

bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time


From: Gregory Heytings
Subject: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Fri, 24 Sep 2021 16:24:50 +0000


>> IMO, the one and only case where a specialized tool beats ripgrep (or just plain grep) is when you just want the place(s) where the identifier is defined.

> No, a specialized tool that uses a DB will scale much better than any tool which searches the filesystem. _And_ it will be more accurate (if used correctly).


Sorry, but I simply don't believe this. At least not for general regex searches. I'd be interested to see some numbers to support your viewpoint.

>> That's not correct, mkid only supports a limited number of programming languages. And it's not even precise: rg O_CREAT on the Emacs trunk for example returns 45 matches, gid O_CREAT returns 33 matches.

> I'm sorry, but this has NIH written all over it. Am I right guessing that you aren't an active user of ID Utils, and perhaps didn't even know about it before I mentioned it?


You are wrong; of course I knew about ID Utils. I tried and compared it with ripgrep a few years ago, and concluded that it's a (far) less useful tool, at least for my purposes, for the reasons I mentioned: it works with a database that must be updated, which is slow, and it is not faster than ripgrep.


> More to the point: are you saying that a tool that returns more matches is necessarily better?


The purpose of project-find-regexp is to find all matches.


> Look closer at those matches which gid "missed", and you will see why it didn't show them to you.


I looked closely (before you asked), and no, I don't see why some matches are not included. For example, it returns

lib/tempname.c:212: __GT_FILE: create the file using open(O_CREAT|O_EXCL)

but not

lib/tempname.h:47: GT_FILE: create a large file using open(O_CREAT|O_EXCL)

and it returns

lib/open.c:99: /* Fail if one of O_CREAT, O_WRONLY, O_RDWR is specified and the filename

but not

lisp/gnus/nnmaildir.el:387: ;; If Emacs had O_CREAT|O_EXCL, we could return number-open here.


> Oh, and if ripgrep finds only 45 matches, then something is wrong with it, because there are actually no less than 119 literal matches for O_CREAT in the tree (not counting many binary files that also match). So by this measure, ripgrep is also not the right tool for the job.


No, there are exactly 45 matches of "O_CREAT" on a fresh clone of the trunk.

>> Five seconds to scan the whole Emacs trunk is IMO not fast enough (ripgrep does it in < 0.2 seconds).

> Those 5 sec are invested only when needed, while the time it takes Grep/ripgrep to scan the files is invested every search. Do this enough times, and you paid too much time.


Did you read the numbers I mentioned earlier? rg O_CREAT is as fast as gid O_CREAT. And this is without regexes; rg O.*CREAT is three times faster than gid O.*CREAT.
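
To make the amortization argument concrete, here is a minimal sketch of the break-even arithmetic. The 5 s build time and the 0.2 s scan time (used here as an upper bound) are the figures quoted above; the function name and the 0.05 s lookup time in the second call are hypothetical, chosen only for illustration.

```python
import math


def break_even_searches(index_build_s, scan_s, lookup_s):
    """Return how many searches it takes for one index build to pay for itself.

    Returns None when an index lookup is not faster than a full scan,
    because the build cost is then never recovered.
    """
    saving_per_search = scan_s - lookup_s
    if saving_per_search <= 0:
        return None
    return math.ceil(index_build_s / saving_per_search)


# Figures from this thread: lookup is as fast as the scan, so no break-even.
print(break_even_searches(5.0, 0.2, 0.2))   # None
# Hypothetical lookup time of 0.05 s, for illustration only.
print(break_even_searches(5.0, 0.2, 0.05))  # 34
```

With the timings reported here the saving per search is zero, so the build cost is never recovered; an index only pays off if lookups are measurably faster than a scan.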

>> And without incremental updates, updating the database would be necessary before each invocation of gid, because what users expect when they search for something are accurate results corresponding to the current state of the project, not results from, say, an hour ago.

> It is very easy to trigger a new mkid run from a file-notification that watches the project tree, so that it runs in the background without the user noticing when needed. Puff! the problem's gone.


Your "very easy" solution is still, IMO, an unnecessary complexity, with little (if any) benefit.
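
For reference, the bookkeeping under discussion can be sketched as follows. This is not the file-notification mechanism proposed above but a simpler polling stand-in that compares modification times; the function name, the skipped directory, and the idea of passing the database path as index_path are all assumptions made for the sketch.

```python
import os


def index_is_stale(root, index_path, skip_dirs=(".git",)):
    """Return True if any file under root is newer than the index file.

    A missing index counts as stale. The index file itself is ignored,
    since it usually lives inside the tree it describes.
    """
    try:
        index_mtime = os.stat(index_path).st_mtime
    except OSError:
        return True
    index_abs = os.path.abspath(index_path)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that should not trigger a rebuild.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.abspath(path) == index_abs:
                continue
            try:
                if os.stat(path).st_mtime > index_mtime:
                    return True
            except OSError:
                # The file vanished between listing and stat; ignore it.
                continue
    return False
```

A caller would rebuild the database whenever this returns True. Note that the check itself walks the whole tree, so it is extra work that a direct search does not need.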




