
bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode


From: Yuan Fu
Subject: bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode
Date: Sun, 29 Jan 2023 15:23:34 -0800


> On Jan 29, 2023, at 3:07 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
> Hi Yuan,
> 
> On 29/01/2023 10:25, Yuan Fu wrote:
> 
>>>>> So if previously it happened once somehow during a certain scenario, now 
>>>>> I have to repeat the same scenario 4 times, and the condition is met.
>>>> I was hoping that the scenario only happen once, oh well :-) I’ll
>>>> change the decision based on analyzing the tree’s dimension: too
>>>> deep or too wide activates the fast mode. Let’s see how it works.
>>> 
>>> Thank you, let me know when it's time to test again.
>> Sorry for the delay. Now treesit-font-lock-fontify-region uses
>> treesit-subtree-stat to determine whether to enable the "fast mode". Now
>> it should be impossible to activate the fast mode on moderately sized
>> buffers.
> 
> Thank you, it seems to work just fine in my scenario. And 
> treesit-subtree-stat makes sense.
> 
> I have a few more questions about the current strategy, though.
> 
> IIUC, we only do the treesit--font-lock-fast-mode test once in 
> treesit-font-lock-fontify-region, and then use the detected value for the 
> whole later life of the buffer. Is that right?
> 
> What if the buffer didn't originally have the problematic error nodes we are 
> guarding from, and then later the user wrote enough code to have at least one 
> of them? If they didn't close Emacs, or revert the buffer, our logic still 
> wouldn't use the "fast mode", would it?
> 
> Or vice versa: if the buffer started out with error nodes, and consequently, 
> "fast mode", but then the user has edited it so that those error nodes 
> disappeared, shouldn't the buffer stop using the "fast mode"?
> 
> From my measurements, in ruby-mode, at least treesit-subtree-stat is 20-40x 
> faster than refontifying the whole buffer. So one possible strategy would be 
> to repeat the test every time. I'm not sure it's fast enough in the "problem" 
> buffers, though, and I don't have any to test.
> 
> In those I did test, though, it takes ~1 ms.
> 
> But we could repeat the test only once every couple of seconds and/or after 
> the buffer has changed again. That would hopefully make it a non-bottleneck 
> in all cases.

I should mention this in the comments, but the fast mode is only for very rare 
cases, where the file is mechanically generated and has some peculiarities that 
cause tree-sitter to work poorly. If the file is hand-written and “normal”, 
even huge files like xdisp.c are well below the bar. Therefore I don’t think 
“crossing the line” will realistically happen when editing source files.

Here are the stats of two “problematic files”, named packet and dec_mask, 
compared to xdisp.c:

;;           max-depth max-width count
;; cut-off   100       4000
;; packet   (98159     46581 1895137)
;; dec_mask (3         64301 283995)
;; xdisp.c  (29        985   218971)

I’d say that any regular source file, even a mechanically generated one, wouldn’t 
go beyond ~50 levels in depth, and hand-written files should never have a node 
that has 4000+ direct children in the parse tree.
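
A minimal sketch of the cut-off check against the numbers above. It is not 
the actual code in treesit.el; the my/ names are hypothetical, and the only 
assumption is that the stats come as a (MAX-DEPTH MAX-WIDTH COUNT) list, the 
shape treesit-subtree-stat returns:

```elisp
;; Hypothetical names for illustration; the cut-off values are the
;; ones from the table above.
(defconst my/fast-mode-max-depth 100
  "Depth above which a parse tree counts as too deep.")

(defconst my/fast-mode-max-width 4000
  "Direct-children count above which a parse tree counts as too wide.")

(defun my/fast-mode-needed-p (stat)
  "Return non-nil if STAT crosses either cut-off.
STAT is a list (MAX-DEPTH MAX-WIDTH COUNT)."
  (let ((max-depth (car stat))
        (max-width (cadr stat)))
    (or (> max-depth my/fast-mode-max-depth)
        (> max-width my/fast-mode-max-width))))

(my/fast-mode-needed-p '(98159 46581 1895137)) ; packet   => t
(my/fast-mode-needed-p '(3 64301 283995))      ; dec_mask => t
(my/fast-mode-needed-p '(29 985 218971))       ; xdisp.c  => nil
```

In a buffer that has a tree-sitter parser, the argument would presumably be 
(treesit-subtree-stat (treesit-buffer-root-node)). Note that dec_mask crosses 
the line on width alone, which is why both dimensions are checked.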

Yuan



