Re: /srv/bzr/emacs/trunk r101338: * lisp/emacs-lisp/syntax.el (syntax-pp

From: Stefan Monnier
Subject: Re: /srv/bzr/emacs/trunk r101338: * lisp/emacs-lisp/syntax.el (syntax-ppss): More sanity check to catch
Date: Wed, 12 Feb 2014 09:23:18 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

> I don't fully understand the explanation, but the logic "if syntax-beginning
> equals point, go to previous syntax-beginning" could've been handled in the
> specific syntax-beginning-function instead.

Could be.

>> Right, but that largely defeats the purpose of syntax-ppss (which is to
>> use caching to speed up (parse-partial-sexp (point-min) (point))).
> The optimization is still used if `syntax-ppss' is called several times
> during the syntax-propertization or fontification of one region.

In 99% of the cases, syntax-ppss is only called once during font-lock.
So, while in theory, yes, we might still get some benefits, in practice
we don't.

>> One option is to have a hook that takes a (POS . PPSS) pair, which
>> syntax-ppss intends to use as a starting point for parsing, and return
>> a new such pair to use instead, where the returned position should
>> always be >= POS.
> Sounds fine to me.  As long as the hook is called at the same point
> `syntax-ppss' is called at,

Yes, the intention would be to call this hook just before calling
(parse-partial-sexp POS (point) nil nil PPSS).
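
A minimal sketch of that calling convention, for illustration only.  The
names `my-ppss-start-function' and `my-ppss-from' are hypothetical and do
not exist in syntax.el; the only real call is `parse-partial-sexp'.  It
assumes the hook must return a position between POS and point.

```elisp
;; Hypothetical names: nothing here is part of syntax.el.
(defvar my-ppss-start-function nil
  "Hypothetical hook, called with a pair of the form (POS . PPSS).
It returns a pair of the same form, whose position must lie
between POS and point.")

(defun my-ppss-from (start)
  "Return the parse state at point, parsing from START.
START is a pair of the form (POS . PPSS); the hook may replace it."
  (let* ((adjusted (if my-ppss-start-function
                       (funcall my-ppss-start-function start)
                     start))
         (pos (car adjusted))
         (ppss (cdr adjusted)))
    ;; The hook may only move the starting position forward.
    (unless (and (<= (car start) pos) (<= pos (point)))
      (error "Hook moved the start to %s, outside [%s, %s]"
             pos (car start) (point)))
    ;; The hook runs just before this call, as described above.
    (parse-partial-sexp pos (point) nil nil ppss)))
```

A multi-mode package could then bind the hook to a function that returns
the start of the current chunk together with a fresh (nil) state.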

> we can check whether POS is in the same region,
> look for nested submode regions between POS and point, and either discard
> the passed PPSS if the current subregion begins after POS, or manually
> `parse-partial-sexp' each piece of the current subregion (or of the primary
> mode region, if we're there) between POS and some position closer to point.

Right, the hook could find the nearest boundary and if it's after POS,
return that boundary together with the PPSS to use for it.

> We could parse the buffer till point itself, though. It wouldn't be harder
> coding-wise (we'll do `parse-partial-sexp's anyway), and that way the hook
> could be more flexible. Then the meaning of the hook would be "here's the
> last saved position and value, what will be the value at point?".

I think I'd still prefer syntax-ppss to do the call to
parse-partial-sexp (e.g. if the region to parse is very large,
syntax-ppss parses it in several chunks, storing the intermediate state
in the cache).  Also syntax-ppss would probably want to add the
(BOUNDARY . PPSS) returned by the hook to its cache.
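
To illustrate the chunked parsing, here is a rough sketch.  It is not the
actual syntax-ppss code: `my-parse-in-chunks' and its CHUNK-SIZE argument
are hypothetical, and the real cache layout may differ.  It only shows that
the state returned by `parse-partial-sexp' can be fed back in as the old
state, so intermediate positions can be recorded along the way.

```elisp
;; Hypothetical helper: illustrates chunked parsing, not syntax.el itself.
(defun my-parse-in-chunks (from to state chunk-size)
  "Parse from FROM to TO starting in STATE, CHUNK-SIZE chars at a time.
Return a pair whose car is the final state and whose cdr is a list
of intermediate entries of the form (POS . PPSS), newest first."
  (let ((cache nil))
    (while (< from to)
      (let ((next (min to (+ from chunk-size))))
        ;; Resume parsing from the state reached at the previous chunk end.
        (setq state (parse-partial-sexp from next nil nil state))
        (setq from next)
        ;; Record intermediate positions only, not the final one.
        (when (< from to)
          (push (cons from state) cache))))
    (cons state cache)))
```

A pair returned by the hook could be pushed onto the same kind of list, so
that later calls can restart from the boundary instead of from POS.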

Of course, another issue is "which syntax-table and syntax-propertize
function to use".  How does mmm-mode handle that currently?

>> This way, syntax-ppss could make full use of its cache, but mmm-mode
>> could tell it about chunk boundaries (and decide what state to use at
>> the beginning of a boundary).
>> The main problem I see with this approach is that this hook would be
>> called maybe too many times, so we'd want to improve the "fast path"
>> (i.e. the first branch in syntax-ppss which tries to use
>> syntax-ppss-last) so it can know when calling this new hook is unneeded.
> Maybe we want that, but scanning the buffer for overlays should still be a)
> proportional to the distance between bounds, b) faster than
> `parse-partial-sexp', so at worst in mmm-mode the new scheme will just be
> slower than plain `syntax-ppss' by some constant ratio, on average.

In the fast path, parse-partial-sexp may often end up parsing just
a handful (or even 0) of characters, which can be much less than the
distance to the nearest boundary.  It's important for this case to be
fast so that we can use syntax-ppss liberally, knowing it's very cheap
because we just called it nearby recently.
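
As a simplified picture of that fast path (a sketch, not the real code):
`my-fast-last' and `my-fast-ppss' are hypothetical names standing in for
syntax-ppss-last and the first branch of syntax-ppss.  The sketch ignores
buffer modifications and narrowing, which the real function has to handle.

```elisp
;; Hypothetical one-entry cache, loosely modelled on the fast path.
(defvar-local my-fast-last nil
  "Pair of the form (POS . PPSS) from the previous call, or nil.")

(defun my-fast-ppss ()
  "Return the parse state at point, reusing the previous result if possible."
  (let* ((here (point))
         ;; Fast path: the previous result is at or before point, so only
         ;; the text between it and point needs to be parsed.
         (start (if (and my-fast-last (<= (car my-fast-last) here))
                    my-fast-last
                  (cons (point-min) nil)))
         (state (parse-partial-sexp (car start) here nil nil (cdr start))))
    (setq my-fast-last (cons here state))
    state))
```

When two calls are a few characters apart, the second one parses only those
few characters, which is why any extra work on this path is noticeable.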

But maybe you're right that the performance impact might not be
that important.

> We call `syntax-ppss', happily report to it that the value at point (or some
> position near it) can be used until point + 400. Then move a few chars lower
> and delete the rest of the given region. NEXT-BOUNDARY becomes stale, and
> calling `syntax-ppss' from the region below can return a wrong value.

That's OK.  syntax-ppss already has an after-change-function to
flush its various caches.
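
The flushing idea can be sketched as follows.  This is an assumption-laden
illustration, not the code in syntax.el: `my-flush-cache' and
`my-flush-cache-after' are hypothetical names, and the cache is assumed to
be a list sorted by decreasing position.

```elisp
;; Hypothetical cache and flush function, for illustration only.
(defvar-local my-flush-cache nil
  "List of entries of the form (POS . PPSS), sorted by decreasing POS.")

(defun my-flush-cache-after (beg &rest _ignored)
  "Drop every cached entry whose position is after BEG.
A change hook passes BEG first; the remaining arguments are ignored."
  ;; Entries at or before BEG depend only on unchanged text, so keep them.
  (while (and my-flush-cache (> (car (car my-flush-cache)) beg))
    (pop my-flush-cache)))

;; Possible buffer-local installation (not executed here):
;; (add-hook 'after-change-functions #'my-flush-cache-after nil t)
```

A stale NEXT-BOUNDARY could be discarded by the same function, since it
also describes text after the start of the change.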
