bug-gnu-emacs

From: Dmitry Gutov
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Mon, 30 Jan 2023 02:49:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 26/01/2023 23:26, Dmitry Gutov wrote:
(But I thought you concluded that GC alone cannot explain the
difference in performance?)
I'm inclined to think the difference is related to copying of the regexp
string, but whether the time is spent in actually copying it, or
scanning its copies for garbage later, it was harder to say. Seems like
it's the latter, though.
If we can avoid the copying, I think it's desirable in any case.  They
are constant regexps, aren't they?

Yes, but how?

Memoization is one possible step, but then we only avoid re-creating the predicate structures for each match. We still send a pretty large query and, apparently, get it back..? Might be some copying involved there.
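
A minimal sketch of the memoization idea in Emacs Lisp, for illustration only. It is not the code from the attached patches: the names my-predicate-cache, my-make-match-predicate and my-memoized-match-predicate are hypothetical, and the "expensive" step is only a stand-in for building the real predicate structure.

```elisp
;; Hypothetical cache: maps a regexp string to an already-built predicate.
;; The `equal' test lets two strings with the same contents share an entry.
(defvar my-predicate-cache (make-hash-table :test #'equal)
  "Illustrative cache from regexp strings to predicate functions.")

(defun my-make-match-predicate (regexp)
  "Return a function that tests whether its argument matches REGEXP.
This stands in for whatever structure is costly to re-create per match."
  (apply-partially #'string-match-p regexp))

(defun my-memoized-match-predicate (regexp)
  "Return the cached predicate for REGEXP, building it on first use."
  (or (gethash regexp my-predicate-cache)
      (puthash regexp (my-make-match-predicate regexp)
               my-predicate-cache)))
```

Calling my-memoized-match-predicate twice with the same regexp returns the same object, so the structure is built once. As noted above, this only saves the re-creation; it does nothing about the size of the query itself.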

TBH the moderate success the memoization patch shows has me stumped.

Okay, I have cleaned up both experiments that I had. And when combined, they make the :match approach a little faster than the :pred one.

I'm still not sure why the difference is so small, given that the :pred one has Lisp funcalls and extra allocation, and :match does not.

Still, if nobody has any better ideas, I suggest we install both of these changes now. They are attached in separate patches.

memoize_vector.diff improves the performance of both cases. For :pred, it's roughly 10%; for :match, it's more.

treesit_predicate_match.diff improves the performance of the latter, though only a little: maybe 3-4%.

Code review welcome.

Is applying (and undoing) the narrowing this way legal enough? Or should I go through some error handlers, or ensure blocks, etc?

Speaking of perf, the profile looks like this now (very similar to what it was before the added rule):

  17.25%  emacs  libtree-sitter.so.0.0       [.] ts_tree_cursor_current_status
  10.93%  emacs  libtree-sitter.so.0.0       [.] ts_tree_cursor_goto_next_sibling
   9.89%  emacs  libtree-sitter.so.0.0       [.] ts_tree_cursor_goto_first_child
   9.01%  emacs  emacs                       [.] process_mark_stack
   4.80%  emacs  libtree-sitter.so.0.0       [.] ts_node_start_point
   3.84%  emacs  emacs                       [.] re_match_2_internal
   3.82%  emacs  libtree-sitter.so.0.0       [.] ts_tree_cursor_parent_node
   3.06%  emacs  libtree-sitter.so.0.0       [.] ts_language_symbol_metadata

Attachment: memoize_vector.diff
Description: Text Data

Attachment: treesit_predicate_match.diff
Description: Text Data

