emacs-devel

Re: emacs rendering comparisson between emacs23 and emacs26.3


From: Alan Mackenzie
Subject: Re: emacs rendering comparisson between emacs23 and emacs26.3
Date: Mon, 20 Apr 2020 21:19:41 +0000

Hello, Dmitry.

On Sun, Apr 19, 2020 at 21:12:57 +0300, Dmitry Gutov wrote:
> On 19.04.2020 20:12, Alan Mackenzie wrote:

[ .... ]

> >>> Inserting a C++ raw string opener does typically necessitate a full
> >>> scan (a search for a matching closer), but that would also be the
> >>> case using syntax-propertize.

> >> Not really. It would just mark the opener as a string opener (maybe with
> >> some extra text property), and that's that.

> > You don't know whether it's an unterminated raw string (the usual case)
> > until you've scanned for a potential closing delimiter.

> Is C++ syntax so ambiguous? Can R"( mean something else?

C++ syntax is ambiguous in several places, but not here.  R"foo(,
assuming it's not itself inside a literal, means exactly one thing.  But
you don't know whether it starts an unterminated string unless you search
for a closing delimiter.
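
To illustrate the scan, here is a minimal Python sketch. It is not CC
Mode's code (that is Emacs Lisp); the function name and the simplified
delimiter rule are assumptions made for the example.  Given the position
of an R"delim( opener, the only way to learn whether the string is
terminated is to search forward for )delim":

```python
import re

# Opener: R"delim( where delim is at most 16 characters.  Excluding
# whitespace, parentheses, backslash and quote is a simplification of
# the real C++ rule, assumed here for illustration.
RAW_OPENER = re.compile(r'R"([^\s()\\"]{0,16})\(')

def find_raw_string_closer(text, pos):
    """Return the index just past the closing delimiter of the raw
    string whose opener starts at text[pos], or None if unterminated."""
    m = RAW_OPENER.match(text, pos)
    if m is None:
        raise ValueError("no raw string opener at pos")
    closer = ")" + m.group(1) + '"'
    # Linear search; for an unterminated string it runs to end of text.
    end = text.find(closer, m.end())
    return None if end < 0 else end + len(closer)
```

Note that a lone " or ) inside the string does not end it; only the
full )delim" sequence does, which is why the search cannot stop early.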

> > This affects the font locking.  (An unterminated opening delimiter
> > gets font-lock-warning-face, a terminated one doesn't.)

> If everything after R"( is fontified as a string, it serves as a
> "warning" of sorts as well.

That means putting syntax-table properties on every " after the R"foo(,
otherwise that string would only extend to the first such ".

But it wouldn't work well, since it wouldn't tell the user whether her
string was already closed.

> > This is the sort of feature which I'm not willing to sacrifice.

> Is it worth a full buffer scan every time you write a new raw string 
> literal?

Yes, definitely.

> >> Then font-lock would fontify the following text as string contents
> >> (until the end of the window or a little bit farther). Then you type
> >> the closer, it only has to scan a little far back (it'll call
> >> syntax-ppss to find the string opener), the closer is propertized as
> >> appropriate, and that's that. No full buffer scans at any step.

> >> I recall that fontifying the rest of the buffer as text after a simple
> >> string opener could be a sore topic for you, but raw strings should be
> >> rare enough (aren't they?), or if they are not, fontification logic
> >> could opt to do something different, while syntax-table properties will
> >> be applied the "correct" way.

> > I'm not sure what you mean by "as text".

> Sorry, I meant "as string".

OK.

Same procedure for a simple string - if it's a terminated string the "
gets font-lock-string-face, if it's not it gets font-lock-warning-face.

> > I've no reason to think raw strings are at all rare.  I've had
> > several bug reports for them.  I'm not sure what you mean by
> > "fontification logic ... something different" - do you mean in the
> > raw string case?

> I mean that if a raw string is unterminated, the default behavior should 
> be to fontify the rest of the buffer as string. But then again, you 
> could choose some different highlighting in font-lock rules.

The current strategy is to fontify the unterminated R"foo( with
warning-face, and let the devil deal with the rest of the string (i.e. no
attempt is made to apply syntax-table properties).  The first portion of
the raw string will indeed get string-face.

As soon as the closing delimiter is typed, the warning-face is removed
from the opener and syntax-table text properties applied throughout the
string.  The entire string then gets string-face.
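
As a rough model of that strategy (a Python sketch only; the function
name and the span representation are hypothetical, and it ignores the
partial string-face on the first portion of an unterminated string):

```python
def raw_string_faces(text, opener_start, delim):
    """Return (start, end, face) spans for the raw string whose opener
    R"delim( starts at opener_start.  Illustrative model only."""
    opener_end = opener_start + len('R"' + delim + "(")
    closer = ")" + delim + '"'
    close_at = text.find(closer, opener_end)
    if close_at < 0:
        # Unterminated: flag just the opener, leave the rest alone.
        return [(opener_start, opener_end, "font-lock-warning-face")]
    # Terminated: the whole literal, opener to closer, is a string.
    return [(opener_start, close_at + len(closer), "font-lock-string-face")]
```

Typing the final " of the closer flips the result from the first case
to the second, which is the visible signal that the string is closed.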

[ .... ]

> >> Yes, I think before-change-functions should become empty. Or much
> >> emptier.

> > It can't become empty.  after-change-functions is fine for dealing
> > with insertions, but can't do much after a deletion.  Consider the
> > case where you're in a string and all you know is that 5 characters
> > have been deleted.  Those characters might have been )foo", so after
> > checking the beginning of the string starts off with R"bar(, you've
> > then got to scan a long way forward looking for )bar".  Effectively
> > every deletion within a string would involve scanning to the end of
> > that string.

> This is an example of extra complexity you have to retain to implement 
> the above feature the way you do.

It will become more complex and slower if information from
before-change-functions is ignored or discarded.  The alternative is,
after each deletion, to scan forward checking that the terminating
delimiter still exists.  This is slower and more complicated than
checking in b-c-f whether it's about to be removed.
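
The difference in cost can be sketched in Python (hypothetical helper
names, and positions as plain integer offsets; this is an illustration
of the argument, not how CC Mode is written).  Before the change, the
closer's position is still valid, so an interval test suffices.  After
the change, the deleted text is gone and the string must be rescanned:

```python
def closer_survives_deletion(closer_start, closer_end, del_start, del_end):
    """Before-change style check.  The deletion [del_start, del_end)
    has not happened yet and the closer occupies
    [closer_start, closer_end).  Constant time: an overlap test."""
    return del_end <= closer_start or del_start >= closer_end

def closer_still_present(text_after, contents_start, delim):
    """After-change style check.  Only the post-deletion text is
    available, so search forward from the start of the string's
    contents for )delim".  Linear in the length of the string."""
    return text_after.find(")" + delim + '"', contents_start) >= 0
```

For a long raw string, the second function runs on every deletion
inside it, whereas the first never looks at the buffer text at all.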

> It's probably also an example of how before/after-change-functions 
> essentially duplicate the knowledge of language syntax. I'm guessing 
> here, but to make it work like that, you need to have multiple functions 
> "understand" the raw string syntax.

b/a-c-f implement the language syntax.  It's one of the places the
language is codified.  The mechanism is in several functions, yes.  If
you're interested, go into cc-engine.el and search for "raw string".

> Whereas with syntax-propertize-function, that knowledge is concentrated 
> in one place (maybe two, if font-lock rules do something unusual). This 
> way, the code is simplified.

No, it gets complicated, assuming no loss of functionality.  A given
amount of functionality would get squashed into a smaller place.  The
current implementation (of C++ raw strings) is optimised such that normal
insertion and deletion don't cause the syntax-table properties on the entire
string to be modified.  That requires details of the buffer before the
insertions and deletions.

[ .... ]

> >>> Maybe, but with a slowdown.  More of these properties will get erased
> >>> than needed (with nested template forms), and they will all need to get
> >>> put back again.

> >> We won't really know until we can measure the result.

> > What's the point in investing all the effort to make the change, when
> > there's not even a prediction of a speed up?

> In principle, the speed-up will come from:

> - Deferred execution (where several buffer changes can be handled 
> together and not right away),

I've never been wholly convinced by laziness.  Sooner or later these
changes need to be handled, and delaying them is not going to accelerate
them.

> - No parsing the buffer contents much farther than the current window, 
> in most cases. Which can speed up the majority of user actions. The 
> exceptions will remain slower, but that is often a good tradeoff.

This will involve loss of functionality, as already noted.  And bugs;
whilst typing in normal text, CC Mode has to search backwards for a safe
place, otherwise context fontification can mess things up.  This is an
area where optimisation would be useful.

> > And I'm not sure where the proof of the syntax-propertize mechanism
> > being helpful is.  Has anybody but its originator positively chosen
> > to use it, whilst being aware of the alternatives?

> The alternatives being reinventing the relevant logic from zero in each 
> major mode? And writing syntax caching logic each time?

Or writing and using a better framework.

The question remains: has anybody other than Stefan M. freely chosen to
use syntax-ppss and syntax-propertize-function, whilst being aware of
their disadvantages and of alternatives?

Remember that for an extended period of time syntax-ppss didn't work
properly, and even now it doesn't do the right thing in narrowed buffers,
at least for a programming mode such as CC Mode.  The syntax-propertize
mechanism erases syntax-table properties in a manner not under the control
of the major mode, which means the major mode needs to implement
workarounds (which are liable to be slow).

> > To become usable for CC Mode, it would need to provide something on
> > before-change-functions to complement what's on a-c-f, and it would need
> > to provide some control to the major mode over which syntax-table
> > properties get erased.

> Not something I can comment on.

Hmmm.

-- 
Alan Mackenzie (Nuremberg, Germany).
