|
From: | Dmitry Gutov |
Subject: | bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient |
Date: | Thu, 26 Jan 2023 01:21:08 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2 |
On 25/01/2023 14:49, Eli Zaretskii wrote:
Cc: 60953@debbugs.gnu.org Date: Wed, 25 Jan 2023 05:48:13 +0200 From: Dmitry Gutov <dgutov@yandex.ru>We can probably match the regexp in-place, just limit the match to the range of the node.That's what I tried to do in the patch attached to the first message: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60953#5 But the effect on performance was surprisingly hard to notice. It also broke the actual highlighting, but that's probably because the regexp uses anchors \` and \', which don't really work for fast_looking_at calls inside a buffer.The condition for a match in that patch is not correct, AFAIU: if (val >= 0) return true; else return false; It should be "if (val > 0)" instead, since fast_looking_at returns the number of characters that matched (unlike fast_string_match in the original code, which returns the _index_ of the match).
Thank you. Unfortunately, the performance improvement from this patch is still fairly negligible.
Even though I got the highlighting to work -- by removing the \` and \' anchors from ruby-ts--builtin-methods (reducing the precision a little, but that's not important for the benchmark).
Switching to using :pred with function (like I did in commit d94dc606a0934) which still uses buffer-substring inside is significantly faster.
Also, fast_string_match is capable of succeeding if the match begins not at the first character, whereas fast_looking_at does an anchored match. Do we expect the text to match from its beginning in this case? If not, I think the replacement didn't do what the original code does, and you should have used search_buffer or maybe search_buffer_re instead.
I suppose one could use a non-anchored regexp with :match, but that's not the case with the regexp I'm using currently.
Anyway, that's only going to be important if we find something that I missed here with this patch. Because otherwise the major bottleneck is somewhere else.
If we do end up using it and try to get it to 100% correctness, I suppose a combination of narrow-to-region (so that the \` and \' anchors work) with re-search-forward can do the trick.
Although I've tried using that combination inside ruby-ts--builtin-method-p (to avoid the buffer-substring call), and it wasn't much of an improvement in performance either.
[Prev in Thread] | Current Thread | [Next in Thread] |