bug-gnu-emacs

bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode


From: Dmitry Gutov
Subject: bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode
Date: Tue, 10 Jan 2023 16:10:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 10/01/2023 10:10, Juri Linkov wrote:
After more rules were added recently to ruby-ts--font-lock-settings,
font-lock became slow even on very small files.  Some measurements:

If you saw a particular commit that made things slower, did you try
reverting it? What was the performance after?

No particular commit, just adding more rules degrades performance
gradually.

But I don't think I added that many rules recently. No more than a quarter anyway.

M-: (benchmark-run 1000 (progn (font-lock-mode -1) (font-lock-mode 1) (font-lock-ensure)))
M-x ruby-mode
(1.3564674989999999 0 0.0)
M-x ruby-ts-mode
(8.349582391999999 2 6.489918534000001)

I have tried this scenario (which, to be frank, is pretty artificial, given
that fontification is usually performed in chunks, not over the whole
buffer).

Perhaps the results depend on a particular file. The ones I have tried
(ruby.rb and ruby-after-operator-indent.rb) show only 2x difference (or
less). The difference was in favor of ruby-mode, but given the difference
in approaches I wouldn't be surprised if ruby-ts-mode incurs a fixed
overhead somewhere.

On test/lisp/progmodes/ruby-mode-resources/ruby.rb I see these numbers:

ruby-mode
(8.701560543000001 95 1.045961102)

ruby-ts-mode
(34.653148898000005 1464 16.904981779)

Interesting. It's 12s vs 36s for me, as I've retested now.

This is not a problem when files are visited infrequently, but
becomes a problem for diff-syntax fontification that wants to
highlight simultaneously many files from git logs.
So a temporary measure would be not to enable ruby-ts-mode
in internal buffers:

Is it common to try to highlight 1000 or even 100 files in one diff?

100 is rare, but tens is pretty common, so this problem affects
only this specific case.

So it's a 0.8-3s delay in those cases? That's not ideal.
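As a rough check of the delay range mentioned above, the quoted ruby-ts-mode timings can be scaled from 1000 repetitions down to a single fontification and then up to one diff. This is only a sketch: the helper name `my/diff-delay` is hypothetical, and the figure of 100 files per diff is an assumption taken from the "rare" case discussed above.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/diff-delay (total-seconds repetitions files)
  "Estimate the seconds needed to fontify FILES buffers.
TOTAL-SECONDS is the elapsed time measured for REPETITIONS
fontifications of one buffer."
  (* files (/ (float total-seconds) repetitions)))

;; First ruby-ts-mode measurement quoted above, assuming 100 files.
(my/diff-delay 8.349582391999999 1000 100)

;; ruby-ts-mode on ruby.rb, assuming 100 files.
(my/diff-delay 34.653148898000005 1000 100)
```

With these inputs the estimates come out to roughly 0.83 s and 3.5 s; for a diff touching only tens of files they shrink proportionally.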

(add-hook 'find-file-hook
          (lambda ()
            (when (and (eq major-mode 'ruby-mode)
                       ;; Only when not internal as from diff-syntax
                       (not (string-prefix-p " " (buffer-name))))
              (ruby-ts-mode))))

Have you tried similar tests with other -ts- modes? Ones with complex
font-lock rules in particular.

I tried with c-ts-mode, and it's very fast.

Just how fast is it? The number of font-lock features it has is comparable (though a little smaller).

I've tried the same benchmark for it in admin/alloc-colors.c, and it comes out to

  (3.2004193190000003 30 0.9609690980000067)

Which seems comparable.

Not sure how to directly test the modes against each other, but if I enable ruby-ts-mode in the same file, the benchmark comes to 1s.

Or if I enable c-ts-mode in ruby.rb -- 16s.

I've tried commenting out different rules in ruby-ts--font-lock-settings,
but none of them seem to have a particularly outsized impact. Performance
seems, roughly, inversely proportional to the number of separate
"features".

Indeed, this is what I see - no particular rule, only their number
affects performance.

And if all ts modes turn out to have this problem, perhaps the place to
improve this is inside some common code.

I noticed that while most library files are small, e.g.
libtree-sitter-c.so is 401,528 bytes,
libtree-sitter-ruby.so is 2,130,616 bytes
that means that it has more complex logic
that might explain its performance.

ruby is indeed one of the larger ones. Among the ones I have compiled here, only cpp exceeds it: 2.29 MB vs 2.12 MB.

But testing admin/alloc-colors.c with c++-ts-mode vs c-ts-mode gives very similar performance, so it's unlikely that the complexity of the grammar is directly responsible.

In this case, when nothing could be done to improve performance,
please close this request.

Perhaps Yuan has some further ideas. There are some strong oddities here:

- Some time into debugging and repeating the benchmark again and again, I get the "Pure Lisp storage overflowed" message. Just once per Emacs session. It doesn't seem to change much, so it might be unimportant.

- The profiler output looks like this:

  18050  75%                    - font-lock-fontify-syntactically-region
  15686  65%                     - treesit-font-lock-fontify-region
   3738  15%                        treesit--children-covering-range-recurse
    188   0%                        treesit-fontify-with-override

- When running the benchmark for the first time in a buffer (such as ruby.rb), the variable treesit--font-lock-fast-mode is usually changed to t. In one Emacs session, after I changed it to nil and re-ran the benchmark, the variable stayed nil, and the benchmark ran much faster (like 10s vs 36s).

In the next session, after I restarted Emacs, that didn't happen: it always stayed at t, even if I reset it to nil between runs. But if I comment out the block in treesit-font-lock-fontify-region that uses it

    ;; (when treesit--font-lock-fast-mode
    ;;   (setq nodes (treesit--children-covering-range-recurse
    ;;                (car nodes) start end (* 4 jit-lock-chunk-size))))

and evaluate the defun, the benchmark runs much faster again: 11s.

(But then I brought it all back, and re-ran the tests, and the variable stayed nil that time around; to sum up: the way it's turned on is unstable.)

Should treesit--font-lock-fast-mode be locally bound inside that function, so that it's reset between chunks? Or maybe the condition for its enabling should be tweaked? E.g. I don't think there are any particularly large or deep nodes in ruby.rb's parse tree. It's a very shallow file.
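To make the instability described above easier to observe, the benchmark can be wrapped so that it also records the value of treesit--font-lock-fast-mode before and after the run. This is only a sketch: the helper name `my/font-lock-benchmark` is hypothetical, and since the variable is internal to treesit.el it is looked up defensively in case it is missing or renamed.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of Emacs.
(defun my/font-lock-benchmark (&optional repetitions)
  "Refontify the current buffer REPETITIONS times (default 1000).
Return a plist holding the `benchmark-run' result and the value of
`treesit--font-lock-fast-mode' before and after the run."
  (let* ((n (or repetitions 1000))
         (fast-mode-value
          (lambda ()
            ;; Internal variable: guard against it being unbound.
            (if (boundp 'treesit--font-lock-fast-mode)
                (symbol-value 'treesit--font-lock-fast-mode)
              'unbound)))
         (before (funcall fast-mode-value))
         ;; Same forms as the benchmark used earlier in this thread.
         (timing (benchmark-run n
                   (font-lock-mode -1)
                   (font-lock-mode 1)
                   (font-lock-ensure))))
    (list :timing timing
          :fast-mode-before before
          :fast-mode-after (funcall fast-mode-value))))
```

Evaluating (my/font-lock-benchmark) with M-: in the ruby.rb buffer, once per mode and again after resetting the variable, would show whether a slow run coincides with the variable flipping to t.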
