bug#60691: 29.0.60; Slow tree-sitter font-lock in ruby-ts-mode
From: Yuan Fu
Date: Sun, 29 Jan 2023 15:23:34 -0800
> On Jan 29, 2023, at 3:07 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:
>
> Hi Yuan,
>
> On 29/01/2023 10:25, Yuan Fu wrote:
>
>>>>> So if previously it happened once somehow during a certain scenario, now
>>>>> I have to repeat the same scenario 4 times, and the condition is met.
>>>> I was hoping that the scenario only happens once, oh well :-) I’ll
>>>> change the decision based on analyzing the tree’s dimension: too
>>>> deep or too wide activates the fast mode. Let’s see how it works.
>>>
>>> Thank you, let me know when it's time to test again.
>> Sorry for the delay. Now treesit-font-lock-fontify-region uses
>> treesit-subtree-stat to determine whether to enable the "fast mode". Now
>> it should be impossible to activate the fast mode on moderately sized
>> buffers.
>
> Thank you, it seems to work just fine in my scenario. And
> treesit-subtree-stat makes sense.
>
> I have a few more questions about the current strategy, though.
>
> IIUC, we only do the treesit--font-lock-fast-mode test once in
> treesit-font-lock-fontify-region, and then use the detected value for the
> whole later life of the buffer. Is that right?
>
> What if the buffer didn't originally have the problematic error nodes we are
> guarding from, and then later the user wrote enough code to have at least one
> of them? If they didn't close Emacs, or revert the buffer, our logic still
> wouldn't use the "fast mode", would it?
>
> Or vice versa: if the buffer started out with error nodes, and consequently,
> "fast mode", but then the user has edited it so that those error nodes
> disappeared, shouldn't the buffer stop using the "fast mode"?
>
> From my measurements, in ruby-mode, at least treesit-subtree-stat is 20-40x
> faster than refontifying the whole buffer. So one possible strategy would be
> to repeat the test every time. I'm not sure it's fast enough in the "problem"
> buffers, though, and I don't have any to test.
>
> In those I did test, though, it takes ~1 ms.
>
> But we could repeat the test only once every couple of seconds and/or after
> the buffer has changed again. That would hopefully make it a non-bottleneck
> in all cases.
I should mention this in the comments, but the fast mode is only for very rare
cases, where the file is mechanically generated and has some peculiarities that
cause tree-sitter to work poorly. If the file is hand-written and “normal”,
even huge files like xdisp.c are well below the bar. Therefore I don’t think
“crossing the line” will realistically happen when editing source files.

Here are the stats of two “problematic files”, named packet and dec_mask,
compared to xdisp.c:
;;            max-depth  max-width  count
;; cut-off    100        4000
;; packet     (98159     46581      1895137)
;; dec_mask   (3         64301      283995)
;; xdisp.c    (29        985        218971)
I’d say that any regular source file, even mechanically generated, wouldn’t go
beyond ~50 levels in depth, and hand-written files should never have a node that
has 4000+ direct children in the parse tree.
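
To make the cut-off concrete, here is a minimal Emacs Lisp sketch of the check applied to the three stat lists above. The function and constant names are hypothetical and chosen only for illustration, and the strict ">" comparison is an assumption; only the numbers come from the table. It is not the code in treesit.el.

```elisp
;; Hypothetical names; only the cut-off numbers come from the table above.
(defconst my-fast-mode-depth-cutoff 100
  "Assumed cut-off for the maximum depth of the parse tree.")

(defconst my-fast-mode-width-cutoff 4000
  "Assumed cut-off for the largest number of direct children of one node.")

(defun my-stat-exceeds-cutoff-p (stat)
  "Return non-nil if STAT is past either cut-off.
STAT is a list (MAX-DEPTH MAX-WIDTH COUNT), the shape shown in the table."
  (let ((max-depth (nth 0 stat))
        (max-width (nth 1 stat)))
    ;; Strict comparison is an assumption; the count is not consulted.
    (or (> max-depth my-fast-mode-depth-cutoff)
        (> max-width my-fast-mode-width-cutoff))))

(my-stat-exceeds-cutoff-p '(98159 46581 1895137)) ; packet: too deep and too wide
(my-stat-exceeds-cutoff-p '(3 64301 283995))      ; dec_mask: shallow but too wide
(my-stat-exceeds-cutoff-p '(29 985 218971))       ; xdisp.c: within both cut-offs
```

In a real buffer the stat list would presumably come from calling treesit-subtree-stat on the root node of the parse tree. Note that dec_mask is only 3 levels deep, so checking depth alone would miss it; that is why both dimensions are compared.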
Yuan