[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: CC Mode and electric-pair "problem".

From: Alan Mackenzie
Subject: Re: CC Mode and electric-pair "problem".
Date: Mon, 18 Jun 2018 15:42:27 +0000
User-agent: Mutt/1.9.4 (2018-02-28)

Hello, João.

On Mon, Jun 18, 2018 at 14:24:39 +0100, João Távora wrote:
> Alan Mackenzie <address@hidden> writes:

> > Yes.  But it is the master branch, where not everything can be expected
> > to work all the time.  I think the main thing is, we're _going_ to fix
> > this bug.

> Well, I respectfully and totally disagree.  The reason we have automated
> tests in Hydra is to catch unintentional breakage, not intentional
> breakage.

This breakage is unintentional, and we're working out how to fix it.

> For development temporarily unhampered by tests, I think a separate
> branch is a much better alternative.  It's a very easy thing to do in
> git (and in your case, trivial to merge from and back to master, given
> you have near-total control over that area of code).

It's possible, but it's a hassle; it's outside of normal workflow,
therefore involves getting into git's execrable documentation.

> Can't you just revert the commit that broke it? 

It was three (or maybe four) successive commits.  If I revert them, it
will postpone indefinitely the bug that they've fixed.

> > OK, here goes.  Why should major modes tie themselves in knots, just so
> > that electric-pair-mode can work?  What CC Mode is doing is natural, and
> > matches the reality.

> I think you mean "mode", in the singular form :-).

No.  CC Mode comprises lots of modes, not all of them maintained by me.
But even aside from that, CC Mode has often been a pioneer, developing
new techniques, which the rest of Emacs has then followed.  Examples are
hungry deletion and electric indentation.

> Also, it doesn't "match reality": if you open a line in a string, it
> syntax highlights the remaining string as C statements, but the C parser
> doesn't see C statements.  IOW, newline doesn't *really* terminate a
> string in C.

We could argue about words like "terminate" indefinitely.  What I think
is incontrovertible is if you open a line in a string, the portion after
that opening is not part of the string opened on the line above.  The
new fontification reflects this fact.

> > electric-pair-mode's chomp facility could be more rigorously coded -
> > sometimes it is dealing with visible whitespace, sometimes it is dealing
> > with syntactic properties.  Surely it should be working with visible
> > whitespace all the time?

> No.  If it did so, it would chomp parenthesis from non-comment regions
> into comment regions, for example.

But it could use the strategy of determining the end of any comment,
then using non-syntax facilities for traversing the space up to that
end.  Or something like that.

> That doesn't make sense, not according to show-paren-mode, for example.

> By the way, after your change, very basic commands which fall completely
> outside electric-pair-mode have fundamentally changed their behaviour in
> cc-mode. Here are a few, out of Emacs -Q:

> * Open a line in a string, using C-o.  Sexp-navigation is now messed up
>   in the whole buffer, i.e. C-M-*.  Most commads error or produce
>   surprising result.  So even if the intent is to eventually add a
>   backslash escaping the newline, or make it two adjacent strings by
>   typing two quotes (something perfectly allowed by C).

I've tried this, obviously, but as far as I'm aware, the operation of
C-M-* is correct for the (now syntactically incorrect) buffer.  If you
can give me a concrete example, I can look at it and correct it.

> * Inside the string, `forward-sexp' in a parenthesis of a NL-terminated
>   string now errors where it would previously do its job of jumping to
>   the closer;

It works more or less the same as C-M-n always has from a parenthesis
inside a string, which isn't matched in that string.  Just that the
notion of "inside a string" is now more exact than it used to be.

> * Also inside the string, `blink-matching-paren', on by default, also
>   doesn't work as before: closing a paren on a NL-started string doesn't
>   match the opener.

Do you mean a NL-ENDED string?  I see matching here.  If you can be more
precise about the failure, I can look at it.

> There are no automated tests for these things, otherwise you could be
> seeing test breakage here too (and, with higher probably, you may be
> seeing breakage in user's expectations later on).

No, these things are not all intended functionality of Emacs, they're
just side effects of the way the real functionality was implemented.

> > I've attempted a bit of debugging.  In addition to
> > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction
> > ended-prematurely-fn of function electric-pair--balance-info, which
> > snagged on the end of string at EOL.

> I don't understand how this matters to the problem at hand, but
> regardless, can you make a bug report demonstrating the presumed bug and
> its impact so I can follow up?

I attempted to see how difficult it would be to modify elec-pair.el to
cope with unconstrained text properties in buffers.  This was the second
problem I came up against.

> > We are talking about a corner case in e-p-m, namely where e-p-m attempts
> > to chomp space between parens inside an invalid string.  This surely
> > won't come up in practice very much.  Is it worth fixing?  (I would say
> > yes.)

> Don't forget that the particular piece of e-p-m we're talking about is
> one of the ways (arguably the easiest way) to actually fix the specific
> C/C++ problem at hand for the user.  IOW it's not some random whimsical
> useless thing.

It's not useless, but it's rare - it's three things happening all at the
same time, namely a broken string, pseudo-matching parens and space
between them.  This isn't going to happen very often.  I'd wager that
broken strings (two "s with non-escaped NLs between them) in themselves
are quite rare.  But I still think it should be fixed.  :-)

> > The user is visually informed of the reality: that one or more
> > strings are unterminated, and where the "breakage" is (where the
> > font-lock-string-face stops).  This is an improvement over the
> > previous handling, where the opening invalid " merely got
> > warning-face, but the following unterminated string flowed on
> > indefinitely.

> I suppose that's a "yes".  In that case, the face `warning`, which
> defaults to a very bright red, would be fine for me personally (and I'm
> confident if could be made even more evident).  Also, the fact that the
> remaining string is now syntax-highlighted as C statements is extremely
> confusing.

Why?  They are now C statements, and would be handled by the compiler as
such.  Having them fontified as strings (as they previously were) was

> > The disadvantage is that e-p-m is constraining major modes in how they
> > can use syntax-table text properties.  I think this is a problem in
> > electric-pair-mode, not in CC Mode.

> Again, AFAIK, "mode", singular.

See above.  Perhaps it's worth noting that AWK-Mode has used this method
of indicating invalid strings for around 15 years, now.  There have
never been any complaints about this from users.

> And, obviously, I'm not going to special-case cc-mode in elec-pair.el:
> after doing some of my own mulling, I may open a customization point
> for cc-mode.el to use.

I think it's a general case, that of having non-neutral syntax-table
text properties on visual space characters.  What do you see as a
customisation option here?

> So at the very least, it's going to require some (potentially trivial)
> fix in cc-mode.el, for sure.


> But now that I've understood the non-e-p-m implications of your change,
> I urge to at least make this configurable (if it is already
> configurable, then don't mind me).

Make correct fontification configurable?

To sum up my viewpoint, I regard the way CC Mode now fontifies broken
strings as correct (aside from any remaining bugs, of course).  I think
elec-pair.el's assumption that whitespace always has "neutral" syntax is
unwarranted, and is the root of the current bug.

There remains the problem of making chomping parens inside a broken
string work.  I honestly think that modifying elec-pair.el is the way to
go, but I'm open to suggestions of alternative strategies that CC Mode
could follow to get the same fontification, that wouldn't require
modifying elec-pair.el.

> João

Alan Mackenzie (Nuremberg, Germany).

reply via email to

[Prev in Thread] Current Thread [Next in Thread]