bug-gnu-emacs

From: Dmitry Gutov
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Mon, 30 Jan 2023 21:01:02 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 30/01/2023 20:42, Eli Zaretskii wrote:
Date: Mon, 30 Jan 2023 20:20:46 +0200
Cc: casouri@gmail.com, 60953@debbugs.gnu.org
From: Dmitry Gutov <dgutov@yandex.ru>

On 30/01/2023 19:49, Eli Zaretskii wrote:
Date: Mon, 30 Jan 2023 19:15:07 +0200
Cc: casouri@gmail.com, 60953@debbugs.gnu.org
From: Dmitry Gutov <dgutov@yandex.ru>

fast_looking_at already does an anchored match, so I'm not sure I
follow.  I don't even understand why you need the \` part, when the
match will either always start from the first position or fail.

The regexp might include the anchors, or it might not.

It might also use a different anchor like ^ or $ or \b.

OK, but it always goes only forward, so narrowing to the beginning
shouldn't be necessary.  Right?

Are you saying that fast_looking_at ("\\`", ...) will always succeed?

And fast_looking_at ("^", ...), etc.

For example, for "^", if you hint that it must look back to make sure
there's a newline there, then your narrowing will also prevent it from
doing that, right?

fast_looking_at ("^", ...) succeeds inside a narrowing because it always succeeds at BOB. Even though there are no physical newlines before BOB.
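
As a rough analogy only (this uses the POSIX regex API, not Emacs's regex engine or fast_looking_at), the sketch below shows the same effect: when a matcher is handed text that starts partway through a larger string, "^" matches at the start of what it was given, even though no newline precedes that position. The helper name matches_at_offset is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATTERN matches at the very start
   of TEXT + OFFSET, else 0.  EFLAGS is passed through to regexec
   (0, or REG_NOTBOL to say "the start of this string is not a
   beginning of line").  */
int
matches_at_offset (const char *pattern, const char *text,
                   size_t offset, int eflags)
{
  regex_t re;
  regmatch_t m;
  int rc;

  rc = regcomp (&re, pattern, REG_EXTENDED);
  assert (rc == 0);             /* the pattern must compile */

  /* The matcher only sees the text from OFFSET onward, much as a
     narrowed buffer hides everything before its beginning.  */
  rc = regexec (&re, text + offset, 1, &m, eflags);
  regfree (&re);

  return rc == 0 && m.rm_so == 0;
}
```

With the text "foobar" and offset 3, the pattern "^bar" matches when eflags is 0, although the preceding character is 'o' rather than a newline. Passing REG_NOTBOL makes the same call fail, which is roughly the behavior one would get if the matcher could look back past the restriction.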

One possible alternative, I suppose, would be to create a raw pointer to
a part of the buffer text and call re_search directly specifying the
known length of the node in bytes. If buffer text is one contiguous
region in memory, that is.

It isn't, though: there's the gap.  Which is why doing this is not
recommended; instead, use something like search_buffer_re, which
already handles this complication for you.  (Except that
search_buffer_re is a static function, so only code in search.c can
use it.  So you'd need to make it non-static.)
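
To illustrate why the gap matters for the raw-pointer idea, here is a toy gap buffer in C. It is a sketch under simplifying assumptions, not Emacs's actual buffer representation; every name in it (toy_gap_buffer, toy_sample, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy gap buffer: text lives in data[0 .. gap_start) and
   data[gap_end .. size); the bytes in between are unused.  */
struct toy_gap_buffer
{
  const char *data;
  size_t gap_start, gap_end, size;
};

/* Sample buffer holding the text "foobar" with a 4-byte gap
   (shown as dots) after "foo".  */
struct toy_gap_buffer
toy_sample (void)
{
  static const char storage[] = "foo....bar";
  struct toy_gap_buffer b = { storage, 3, 7, 10 };
  return b;
}

/* Return 1 if the logical range [pos, pos + len) is one contiguous
   run of memory, i.e. it lies wholly before or wholly after the gap.  */
int
toy_range_is_contiguous (struct toy_gap_buffer b, size_t pos, size_t len)
{
  return pos + len <= b.gap_start || pos >= b.gap_start;
}

/* Gap-aware copy of LEN bytes starting at logical position POS.  */
void
toy_copy_text (struct toy_gap_buffer b, size_t pos, size_t len, char *out)
{
  size_t gap_len = b.gap_end - b.gap_start;
  size_t before = 0;

  assert (pos + len <= b.size - gap_len);
  if (pos < b.gap_start)
    {
      before = b.gap_start - pos;
      if (before > len)
        before = len;
      memcpy (out, b.data + pos, before);
    }
  /* Whatever remains is stored after the gap.  */
  if (len > before)
    memcpy (out + before, b.data + pos + before + gap_len, len - before);
}

/* Return 1 if reading LEN bytes through a raw pointer at POS gives
   the same bytes as the gap-aware copy.  */
int
toy_raw_read_is_correct (struct toy_gap_buffer b, size_t pos, size_t len)
{
  char expected[64];
  size_t gap_len = b.gap_end - b.gap_start;
  size_t phys = pos < b.gap_start ? pos : pos + gap_len;

  assert (len <= sizeof expected);
  toy_copy_text (b, pos, len, expected);
  return memcmp (b.data + phys, expected, len) == 0;
}
```

In the sample buffer, the logical range covering "ooba" (position 1, length 4) straddles the gap, so a raw pointer read picks up gap bytes instead of "ba". A range that lies entirely on one side of the gap, such as "bar", can be read directly.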

Interesting. Does search_buffer_re match the \` anchor at POS and \' at
LIM? IOW, does it treat the rest of the buffer as non-existing? Or could it?

That is the low-level subroutine called by re-search-forward, so you
know the answers already, I think?  IOW, that function behaves exactly
like re-search-forward in those situations.

So, I suppose not?

But that doesn't answer the question "Could it?".




