bug#31290: Fundamental bugs in syntax-propertize

From: Alan Mackenzie
Subject: bug#31290: Fundamental bugs in syntax-propertize
Date: Fri, 27 Apr 2018 21:08:59 +0000

Hello, Emacs.

There are fundamental bugs in syntax-propertize and
syntax-propertize-function.  The doc string of the latter states:

    The specified function may call `syntax-ppss' on any position before
    END, ....

This is untrue.  In fact, syntax-ppss can safely be called only on
positions up to syntax-propertize--done.  Beyond that point the
syntax-table properties haven't yet been applied, so calling syntax-ppss
is, in general, going to give a false result.

At least that would be true if syntax-propertize--done hadn't been
prematurely and spuriously increased, crudely to prevent an infinite
recursion, falsely indicating to the syntax-ppss infrastructure that the
syntax-table properties have already been applied to the region (BEGIN
END).

    .... but it should not call `syntax-ppss-flush-cache', ....

Why not?  Because syntax-ppss-flush-cache resets syntax-propertize--done
to its true value, so that the wrongly permitted syntax-ppss calls at a
later position then cause a recursive loop.

    .... which means that it should not call `syntax-ppss' on some
    position and later modify the buffer on some earlier position.

This is a bad restriction, because sometimes syntax-table properties can
only be correctly determined by examining the syntax of later buffer
positions.  An example of this is giving the string-fence syntax-table
text property to an unbalanced opening string quote, but not to correctly
matched quotes.


The plain fact is that (syntax-ppss pos) calls (syntax-propertize pos),
so syntax-propertize cannot itself use syntax-ppss because of the
recursive loop thus created.
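
The dependency can be observed with a small sketch.  The names
my-propertize-calls and my-count-propertize-calls are hypothetical
helpers, not part of Emacs; the sketch assumes a fresh temporary buffer
in which nothing has been propertized yet, and it only records that the
buffer's syntax-propertize-function runs during a syntax-ppss call.  It
does not try to provoke the recursion itself.

```elisp
;; Hypothetical helpers, not part of Emacs: count how often a
;; `syntax-propertize-function' runs during one `syntax-ppss' call.
(defvar my-propertize-calls 0
  "Number of times the recording function below has been called.")

(defun my-count-propertize-calls (text)
  "Insert TEXT into a temporary buffer and call `syntax-ppss' at its end.
Return how many times the buffer's `syntax-propertize-function' ran."
  (with-temp-buffer
    (insert text)
    (setq my-propertize-calls 0)
    ;; A do-nothing propertize function that merely counts its calls.
    (setq-local syntax-propertize-function
                (lambda (_start _end)
                  (setq my-propertize-calls (1+ my-propertize-calls))))
    ;; `syntax-ppss' first asks `syntax-propertize' to cover this position.
    (syntax-ppss (point-max))
    my-propertize-calls))
```

Evaluating (my-count-propertize-calls "foo \"bar\" baz") should return a
non-zero count, i.e. the propertize function was entered from within
syntax-ppss.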


Proposed solutions:

1. Major modes' syntax-propertize-function's are somehow given read
access to syntax-propertize--done, and may call syntax-ppss up to that
point only.  syntax-propertize--done is updated only after the
syntax-table properties have been applied.  Or....

2. syntax-propertize-function's are banned from using syntax-ppss, the
documentation instead directing them to use parse-partial-sexp directly.

In either solution, the restriction on using syntax-ppss-flush-cache
would no longer be necessary, and there would be no restriction on
setting syntax-table text properties at an earlier position than the one
currently being analysed.

I think solution 2 is the better one.
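
As a rough illustration of solution 2, here is a minimal sketch of a
syntax-propertize-function which uses parse-partial-sexp directly and
never calls syntax-ppss.  The function name
my-propertize-unbalanced-quotes is hypothetical.  The sketch makes
strong simplifying assumptions: strings may not span lines, each line is
parsed in isolation with the buffer's current syntax table, and
comments, escaped newlines and existing syntax-table properties are not
handled specially.  It is not how any existing major mode does this.

```elisp
(defun my-propertize-unbalanced-quotes (start end)
  "Hypothetical `syntax-propertize-function' for the region START..END.
Give string-fence syntax to a string quote left open at end of line, and
to the newline that follows, leaving correctly matched quotes alone."
  (let ((fence (string-to-syntax "|")))   ; generic string delimiter
    (save-excursion
      (goto-char start)
      (beginning-of-line)
      (while (< (point) end)
        (let* ((eol (line-end-position))
               ;; Parse this line only; no call to `syntax-ppss'.
               (state (parse-partial-sexp (point) eol)))
          ;; (nth 3 state) is non-nil when EOL is inside a string;
          ;; (nth 8 state) is the position of that string's opener.
          (when (nth 3 state)
            (put-text-property (nth 8 state) (1+ (nth 8 state))
                               'syntax-table fence)
            ;; Close the unbalanced string at the newline, if there is one.
            (when (< eol (point-max))
              (put-text-property eol (1+ eol) 'syntax-table fence))))
        (forward-line 1)))))
```

A mode would install this with (setq-local syntax-propertize-function
#'my-propertize-unbalanced-quotes).  Since the function looks ahead to
the end of each line itself, it does not depend on the value of
syntax-propertize--done.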

Alan Mackenzie (Nuremberg, Germany).

