bug-gnu-emacs

bug#56682: Fix the long lines font locking related slowdowns


From: Dmitry Gutov
Subject: bug#56682: Fix the long lines font locking related slowdowns
Date: Sun, 7 Aug 2022 01:58:06 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 06.08.2022 10:28, Eli Zaretskii wrote:
>> Date: Sat, 6 Aug 2022 01:38:05 +0300
>> Cc: 56682@debbugs.gnu.org, Eli Zaretskii <eliz@gnu.org>,
>>   Stefan Monnier <monnier@iro.umontreal.ca>
>> From: Dmitry Gutov <dgutov@yandex.ru>
>>
>>> So what you prefer IIUC would be to call fontification-functions with a
>>> locked narrowing to 1 MB if point is before that threshold, and to not
>>> call fontification-functions at all after that threshold?  That might be
>>> another doable approach.
>>
>> If we have to support huge files with max responsiveness, then that
>> would be my preference, yes.
>>
>> I don't see the point of using a "locked" narrowing for this, though.
>> Maybe not even a narrowing at all: just avoid calling
>> fontification-functions with START > value_of(large_file_fontification_max).

> I don't understand why this would be a good idea.  First, if we are
> able to fontify the first 1MB of a file, why won't we be able to
> fontify the 2nd, 3rd, ... Nth 1MB in a similarly fast fashion?

Fontifying a window-ful of buffer near the end of 2N takes ~2x as long as doing so near the end of 1N; near the end of 3N, ~3x as long; and so on.

> Do you
> assume that fontification must go to BOB, but will never go forward
> towards EOB?

That's my experience, yes. Language parsers either don't need to look forward at all, or only need to look forward a little. And especially since we regularly have to deal with unfinished code (and code is most often unfinished from the tail end, not from the front), our syntax-propertize and font-lock matchers have to account for that.

> Why would we make such an assumption, especially with
> long lines, where EOB could also be EOL?

That's where syntax-wholeline-max comes into play: it won't extend the current region by more than 10000 additional characters.
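For anyone following along, this cap is an ordinary variable and can be tuned; a small configuration sketch (the value 2000 is just an example, not a recommendation):

```elisp
;; Configuration sketch: `syntax-wholeline-max' caps how far
;; whole-line syntax scanning may extend the scanned region.
;; Lowering it bounds the per-redisplay work on extremely long lines,
;; at the cost of less accurate highlighting.
(setq syntax-wholeline-max 2000)   ; stricter than the 10000 default
```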

Even if that fails to work somehow, we could code up font-lock to apply narrowing past a certain point, to definitely limit the possible overhead. But I'd rather we not do that without a concrete problem case.
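A rough sketch of what that fallback could look like, purely illustrative (the threshold variable, the advice, and the 10000-character context window are all invented here, not an existing API):

```elisp
;; Illustrative sketch only: restrict the context font-lock can see
;; once the region being fontified lies past a chosen threshold.
;; `my-fontification-max' is a hypothetical variable.
(defvar my-fontification-max (* 1024 1024)
  "Buffer position beyond which fontification context is restricted.")

(defun my-font-lock-limit-region (orig-fun beg end &rest args)
  "Around-advice for `font-lock-fontify-region' narrowing past the threshold."
  (if (<= end my-fontification-max)
      (apply orig-fun beg end args)
    (save-restriction
      ;; Only let fontification see a window of text around the
      ;; region, so matchers can't scan all the way back to BOB.
      (narrow-to-region (max (point-min) (- beg 10000))
                        (min (point-max) (+ end 10000)))
      (apply orig-fun beg end args))))

(advice-add 'font-lock-fontify-region :around #'my-font-lock-limit-region)
```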

> Next, why assume that your personal preferences in this case will be
> shared by everyone else?  I, for one, will find it very strange to
> have font-lock suddenly disappear at an arbitrary point.

Is it more strange than syntax highlighting starting to go crazy around the same arbitrary point?

I'm assuming that people will quickly understand that very big files are treated in a special way, and I seem to recall that certain other editors use a similar approach (switch off some features past a certain file length).

> If we don't
> want to make the assumption it will be good for everyone, your
> proposal should be one optional behavior, but not the only one.

I'm happy for this to be customizable, and for everyone to be able to choose the tradeoffs they want. For that to be possible, this choice needs to happen in Lisp. But I've made my arguments for what seems like a good default behavior: both reasonable-looking and fast in (hopefully) all significant cases.

The important part is for this to be customizable separately from long-line-threshold, because these are different issues and the slowdowns happen around different amounts of text.
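The knob being discussed might take a shape like this; the name comes from Eli's message above, and no such option actually exists in Emacs:

```elisp
;; Hypothetical user option, sketched from the name used earlier in
;; this thread; it is not an existing Emacs variable.
(defcustom large-file-fontification-max (* 1024 1024)
  "Buffer position beyond which `fontification-functions' are not called.
When nil, fontify everywhere regardless of buffer size."
  :type '(choice (const :tag "No limit" nil) integer)
  :group 'font-lock)
```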

> And finally, there's no way to restrict fontification-functions from
> accessing the buffer beyond the first 1MB, even if we want to, without
> something like the locked narrowing.

Font-lock rules really shouldn't call 'widen'; we've talked about this before (in the mmm-mode threads), then went through the built-in major modes which did so, and fixed that.

The sole (big) exception is CC Mode, but its internals aren't compatible with the current forced narrowing either, and Gregory said it's not important in the context of long lines or large files anyway.

So as long as major modes behave, we won't need the locked narrowing in this area. And hopefully won't need narrowing at all.

>>> emacs -Q
>>> M-: (setq long-line-threshold nil syntax-wholeline-max most-positive-fixnum) RET
>>> C-x C-f dictionary.json RET y ;; takes 160 seconds
>>> C-e ;; takes 200 seconds
>>>
>>> emacs -Q
>>> M-: (setq long-line-threshold nil) RET
>>> C-x C-f dictionary.json RET y ;; immediate
>>> C-e ;; not finished after 1200 seconds (20 minutes), I killed Emacs

>> I get what you're saying, but the approach does seem "right" to me
>> because it works, letting me view and edit dictionary.json with good
>> performance and behavior.

> On a Core i9 machine and with an optimized build, I presume?  And with
> "good" being your subjective judgment that places more importance on
> fontification than on responsiveness?  That's not necessarily a good
> data point on which to base our decisions regarding the default
> behavior.

On a slower machine, you'll just need to find a correspondingly smaller file to see the same combination: a line long enough for redisplay to stutter with (setq long-line-threshold nil), but a file not big enough for font-lock to really become the bottleneck.

Even with my "big and beefy" i9, I've been seeing redisplay slowdowns with lines containing 1000s of chars. Very noticeable around 40K.

Whereas font-lock can remain fast with files up to 100x that size. And even then remain bearable so long as the user doesn't jump around too much, editing the file at BOB then going to EOB, time and time again.

>> Either way, I believe the change is at the right level of abstraction,
>> and if it has bugs, they should be solvable without major redesign.
>
> Alas, Mr. ShouldBe is not available for this project, and probably
> won't be any time soon.

No need to mock my request for a proper reproduction scenario.




