bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60953: The :match predicate with large regexp in tree-sitter font-lo


From: Eli Zaretskii
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Thu, 26 Jan 2023 20:24:11 +0200

> Date: Thu, 26 Jan 2023 19:15:51 +0200
> Cc: 60953@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> 
> On 26/01/2023 10:10, Eli Zaretskii wrote:
> > Perhaps Dmitry could present comparison of profiles from perf which
> > would allow us to understand the reason(s)?
> 
> I believe I did that in the second message in this thread: 
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60953#8
> 
> To quote the specific profiles, it's
> 
>    15.30%  emacs         libtree-sitter.so.0.0       [.]
> ts_tree_cursor_current_status
>    14.92%  emacs         emacs                       [.] process_mark_stack
>     9.75%  emacs         libtree-sitter.so.0.0       [.]
> ts_tree_cursor_goto_next_sibling
>     8.90%  emacs         libtree-sitter.so.0.0       [.]
> ts_tree_cursor_goto_first_child
>     3.87%  emacs         libtree-sitter.so.0.0       [.] ts_node_start_point
> 
> for :pred vs.
> 
>    23.72%  emacs         emacs                    [.] process_mark_stack
>    12.33%  emacs         libtree-sitter.so.0.0    [.]
> ts_tree_cursor_current_status
>     7.96%  emacs         libtree-sitter.so.0.0    [.]
> ts_tree_cursor_goto_next_sibling
>     7.38%  emacs         libtree-sitter.so.0.0    [.]
> ts_tree_cursor_goto_first_child
>     3.37%  emacs         libtree-sitter.so.0.0    [.] ts_node_start_point
> 
> for :match.
> 
> And to continue the quote:
> 
>    Here's a significant jump in GC time which is almost the same as the
>    difference in runtime. And all of it is spent marking?
> 
>    I suppose if the problem is allocation of a large string (many times
>    over), the GC could be spending a lot of time scanning through the
>    memory. Could this be avoided by passing some substitute handle to TS,
>    instead of the full string? E.g. some kind of reference to it in the
>    regexp cache.

If you are saying that GC is responsible, then running the benchmark
with gc-cons-threshold set to most-positive-fixnum should produce a
more interesting profile and perhaps a more interesting comparison.
(But I thought you concluded that GC alone cannot explain the
difference in performance?)

Otherwise, the profiles are too similar to support any conclusions,
and the fact that process_mark_stack is in a prominent place doesn't
help.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]