bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60953: The :match predicate with large regexp in tree-sitter font-lo


From: Yuan Fu
Subject: bug#60953: The :match predicate with large regexp in tree-sitter font-lock seems inefficient
Date: Tue, 24 Jan 2023 19:13:29 -0800


> On Jan 23, 2023, at 8:04 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> Cc-ing Yuan, just in case.
> 
> On 20/01/2023 05:53, Dmitry Gutov wrote:
>> In my benchmarking -- using this form in 
>> test/lisp/progmodes/ruby-mode-resources/ruby.rb after enabling ruby-ts-mode:
>>   (benchmark-run 1000 (progn (font-lock-mode -1) (font-lock-mode 1) (let 
>> (treesit--font-lock-fast-mode) (font-lock-ensure))))
>> the rule added to its font-lock in commit d66ac5285f7
>>    :language language
>>    :feature 'builtin-functions
>>    `((((identifier) @font-lock-builtin-face)
>>       (:match ,ruby-ts--builtin-methods
>>        @font-lock-builtin-face)))
>> ...seems to have made it 50% slower.
>> The profile looked like this:
>>   9454  84%                   - font-lock-fontify-region
>>   9454  84%                    - font-lock-default-fontify-region
>>   8862  79%                     - font-lock-fontify-syntactically-region
>>   8702  78%                      - treesit-font-lock-fontify-region
>>    128   1%                         treesit-fontify-with-override
>>    123   1%                         facep
>>     84   0% treesit--children-covering-range-recurse
>>     60   0%                       + ruby-ts--comment-font-lock
>>      4   0%                       + font-lock-unfontify-region
>>    568   5%                     + font-lock-fontify-keywords-region
>>     16   0%                     + font-lock-unfontify-region
>> So there's nothing on the Lisp level to look at.
> 
> I've done some perf recordings now. It seems most/all of the difference comes 
> down to garbage collection. Or more concretely, time spent inside 
> process_mark_stack.
> 
> Without the added query benchmark reports:
> 
> (10.13723333 49 1.141649534999999)
> 
> And the perf top5 is:
> 
>  17.26%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_current_status
>  10.83%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_next_sibling
>  10.18%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_first_child
>   8.37%  emacs         emacs                       [.] process_mark_stack
>   4.63%  emacs         libtree-sitter.so.0.0       [.] ts_node_start_point
> 
> With this simple query that colors everything:
> 
>   :language language
>   :feature 'builtin-function
>   `((((identifier) @font-lock-builtin-face)))
> 
> I get:
> 
> (11.993968995 82 1.9326509270000045)
> 
> Note the jump in runtime that's larger than the jump in GC.
> 
>  17.26%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_current_status
>  10.83%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_next_sibling
>  10.18%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_first_child
>   8.37%  emacs         emacs                       [.] process_mark_stack
>   4.63%  emacs         libtree-sitter.so.0.0       [.] ts_node_start_point
> 
> The current query looks like this:
> 
>   :language language
>   :feature 'builtin-function
>   `((((identifier) @font-lock-builtin-face)
>      (:pred ruby-ts--builtin-method-p @font-lock-builtin-face)))
> 
> Benchmarking:
> 
> (12.493614359 107 2.558609025999999)
> 
>  15.30%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_current_status
>  14.92%  emacs         emacs                       [.] process_mark_stack
>   9.75%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_next_sibling
>   8.90%  emacs         libtree-sitter.so.0.0       [.] 
> ts_tree_cursor_goto_first_child
>   3.87%  emacs         libtree-sitter.so.0.0       [.] ts_node_start_point
> 
> Here we get the same jump in runtime as in GC. Even though this rule ends up 
> coloring much fewer (almost none) nodes in the current buffer. I interpret 
> the results like this:
> 
> - The jump in runtime of the previous query was probably related to the 
> number of nodes needed to be processed, but not with the resulting 
> highlighting, even though every identifier in the buffer ends up being 
> colored.
> 
> - The GC overhead created by the predicates is non-negligible.
> 
> And the original query that I tried:
> 
>   :language language
>   :feature 'builtin-function
>   `((((identifier) @font-lock-builtin-face)
>      (:match ,ruby-ts--builtin-methods @font-lock-builtin-face)))
> 
> Benchmarking:
> 
> (16.433451865000002 249 5.908674810000001)
> 
>  23.72%  emacs         emacs                    [.] process_mark_stack
>  12.33%  emacs         libtree-sitter.so.0.0    [.] 
> ts_tree_cursor_current_status
>   7.96%  emacs         libtree-sitter.so.0.0    [.] 
> ts_tree_cursor_goto_next_sibling
>   7.38%  emacs         libtree-sitter.so.0.0    [.] 
> ts_tree_cursor_goto_first_child
>   3.37%  emacs         libtree-sitter.so.0.0    [.] ts_node_start_point
> 
> Here's a significant jump in GC time which is almost the same as the 
> difference in runtime. And all of it is spent marking?
> 
> I suppose if the problem is allocation of a large string (many times over), 
> the GC could be spending a lot of time scanning through the memory. Could 
> this be avoided by passing some substitute handle to TS, instead of the full 
> string? E.g. some kind of reference to it in the regexp cache.
> 

FYI the predicates are not processed by tree-sitter, but by us. For example, 
the #equal predicate is handled by treesit_predicate_equal. For #match, right 
now we create a string with Fbuffer_substring and pass it to fast_string_match, 
so it definitely causes a lot of gc’s, as you observed. We can probably match 
the regexp in-place, just limit the match to the range of the node.

Yuan






reply via email to

[Prev in Thread] Current Thread [Next in Thread]