bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60953: The :match predicate with large regexp in tree-sitter font-lo


From: Dmitry Gutov
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Thu, 26 Jan 2023 01:21:08 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 25/01/2023 14:49, Eli Zaretskii wrote:
Cc: 60953@debbugs.gnu.org
Date: Wed, 25 Jan 2023 05:48:13 +0200
From: Dmitry Gutov <dgutov@yandex.ru>

We can probably match the regexp in-place, just limit the match to the range of 
the node.

That's what I tried to do in the patch attached to the first message:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60953#5

But the effect on performance was surprisingly hard to notice. It also
broke the actual highlighting, but that's probably because the regexp
uses anchors \` and \', which don't really work for fast_looking_at
calls inside a buffer.

The condition for a match in that patch is not correct, AFAIU:

    if (val >= 0)
      return true;
    else
      return false;

It should be "if (val > 0)" instead, since fast_looking_at returns the
number of characters that matched (unlike fast_string_match in the
original code, which returns the _index_ of the match).

Thank you. Unfortunately, the performance improvement from this patch is still fairly negligible.

Even though I got the highlighting to work -- by removing the \` and \' anchors from ruby-ts--builtin-methods (reducing the precision a little, but that's not important for the benchmark).

Switching to using :pred with function (like I did in commit d94dc606a0934) which still uses buffer-substring inside is significantly faster.

Also, fast_string_match is capable of succeeding if the match begins
not at the first character, whereas fast_looking_at does an anchored
match.  Do we expect the text to match from its beginning in this
case?  If not, I think the replacement didn't do what the original
code does, and you should have used search_buffer or maybe
search_buffer_re instead.

I suppose one could use a non-anchored regexp with :match, but that's not the case with the regexp I'm using currently.

Anyway, that's only going to be important if we find something that I missed here with this patch. Because otherwise the major bottleneck is somewhere else.

If we do end up using it and try to get it to 100% correctness, I suppose a combination of narrow-to-region (so that the \` and \' anchors work) with re-search-forward can do the trick.

Although I've tried using that combination inside ruby-ts--builtin-method-p (to avoid the buffer-substring call), and it wasn't much of an improvement in performance either.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]