Re: CC Mode and electric-pair "problem".

emacs-devel

From:	Alan Mackenzie
Subject:	Re: CC Mode and electric-pair "problem".
Date:	Mon, 18 Jun 2018 18:08:46 +0000
User-agent:	Mutt/1.9.4 (2018-02-28)
Hello again, João.

On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote:
> Alan Mackenzie <address@hidden> writes:

> > No.  CC Mode comprises lots of modes, not all of them maintained by
> > me.  But even aside from that, CC Mode has often been a pioneer,
> > developing new techniques, which the rest of Emacs has then followed.
> > Examples are hungry deletion and electric indentation.

> But they are all children of cc-mode.el right?  I meant singular as in,
> afaik, nobody else independently thought of doing that besides you.

Probably other people have thought of it.  The actual doing was quite
involved.  But maybe we'll see whether or not the idea spreads.

> > We could argue about words like "terminate" indefinitely.  What I
> > think is incontrovertible is if you open a line in a string, the
> > portion after that opening is not part of the string opened on the
> > line above.  The new fontification reflects this fact.

> OK, but now it reflects something that is also wrong (they're
> not statements either), but to a much greater degree.  And on top of
> that with many more adverse side effects, of which only one is breaking
> e-p-m mode.

How adverse are they really?  I mean, I think you are currently in the
"looking for flaws" mode, which is essential, worthwhile, and
appreciated, but if you were just using, say, C++ mode, how bad would
these side effects actually be?  That's not a rhetorical question.  It's
about deciding whether to invest the work to make the "correct" behaviour
optional.

> > I've tried this, obviously, but as far as I'm aware, the operation of
> > C-M-* is correct for the (now syntactically incorrect) buffer.  If you
> > can give me a concrete example, I can look at it and correct it.

> It's now much harder to select the whole invalid string.  It used to be a
> matter of C-M-u C-M-SPC.  To use query-replace in the region, for
> example.

OK, thanks.  But how often does this happen?

> >> * Also inside the string, `blink-matching-paren', on by default, also
> >>   doesn't work as before: closing a paren on a NL-started string doesn't
> >>   match the opener.

> > Do you mean a NL-ENDED string?  I see matching here.  If you can be more
> > precise about the failure, I can look at it.

> No, I mean the closer.  You and the mode don't consider that a string
> anymore, but you used to, and I still want to.

OK.

> >> There are no automated tests for these things, otherwise you could be
> >> seeing test breakage here too (and, with higher probability, you may be
> >> seeing breakage in users' expectations later on).
> > No, these things are not all intended functionality of Emacs, they're
> > just side effects of the way the real functionality was implemented.

> These accidents, as you have them, work just fine in just about any
> other mode I can imagine.  And they worked just fine in c-mode up until
> your change.

I suspect it is more that people have got so used to them, that any
change will appear to be bad.  Maybe.

> Well, programming is a continuous problem in general.  If I understand
> correctly, the thing you're trying to change is an implementation detail
> of electric-pair-mode, not part of its contract, right?  If, on the
> contrary, you think it is a bug, let me know.

It is what I said at the end of my previous post.  e-p-m assumes that
whitespace has "neutral" syntax.  When it doesn't (like here, with a
string-fence property), the scan-sexps doesn't work as desired.  I'm
convinced this could be changed.

> >> > We are talking about a corner case in e-p-m, namely where e-p-m attempts
> >> > to chomp space between parens inside an invalid string.  This surely
> >> > won't come up in practice very much.  Is it worth fixing?  (I would say
> >> > yes.)
> >> Don't forget that the particular piece of e-p-m we're talking about is
> >> one of the ways (arguably the easiest way) to actually fix the specific
> >> C/C++ problem at hand for the user.  IOW it's not some random whimsical
> >> useless thing.
> > It's not useless, but it's rare - it's three things happening all at the
> > same time, namely a broken string, pseudo-matching parens and space
> > between them.  This isn't going to happen very often.  I'd wager that
> > broken strings (two "s with non-escaped NLs between them) in themselves
> > are quite rare.  But I still think it should be fixed.  :-)

> Well, it's handling the rarities that makes Emacs stand out.

Indeed!  Let's carry on doing this.

> >> > The user is visually informed of the reality: that one or more
> >> > strings are unterminated, and where the "breakage" is (where the
> >> > font-lock-string-face stops).  This is an improvement over the
> >> > previous handling, where the opening invalid " merely got
> >> > warning-face, but the following unterminated string flowed on
> >> > indefinitely.

> >> I suppose that's a "yes".  In that case, the face `warning`, which
> >> defaults to a very bright red, would be fine for me personally (and I'm
> >> confident it could be made even more evident).  Also, the fact that the
> >> remaining string is now syntax-highlighted as C statements is extremely
> >> confusing.

> > Why?  They are now C statements, and would be handled by the compiler as
> > such.

> Clarify "would". Because this doesn't compile.  My compiler doesn't even
> seem to look at anything after the unterminated string:
    
>    int main () {
>       printf("foo
>              ); 
>       printf("bar");
>       return 0;
>      }

Maybe the compiler has the same bug as the old CC Mode.  ;-)

But to see my point of view, type the following into a C Mode buffer in
Emacs-26.1, the last two lines first, then type in the first line above
them:

char *foo = "foo;
int bar = 5;
char *baz = "baz";

The entire second line, and the third line, up to the first ", get string
face.  We've been used to this for so long that we've lost sight of just
how bad and amateurish it really is.

Now do the same in master.  The fontification of the last two lines
remains unaffected by typing in the first line, as it should.
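
The difference between the two behaviours can be sketched in a few lines.
What follows is an illustrative Python model only, not CC Mode's actual
implementation: the function name string_spans and its flag
newline_ends_string are made up for this example, and the model ignores
comments and character literals entirely.

```python
def string_spans(text, newline_ends_string):
    """Return (start, end) index pairs of the regions that would get string face.

    newline_ends_string=False models the old behaviour: a string runs from an
    opening " to the next unescaped ", even across newlines.
    newline_ends_string=True models the new behaviour: an unescaped newline
    ends the string at that point.
    """
    spans = []
    start = None  # index of the opening quote while inside a string
    i = 0
    while i < len(text):
        ch = text[i]
        if start is None:
            if ch == '"':
                start = i
        elif ch == "\\" and i + 1 < len(text):
            i += 1  # skip the escaped character, including an escaped newline
        elif ch == '"':
            spans.append((start, i + 1))
            start = None
        elif ch == "\n" and newline_ends_string:
            spans.append((start, i))  # the broken string stops before the newline
            start = None
        i += 1
    if start is not None:
        spans.append((start, len(text)))  # unterminated at end of buffer
    return spans


SAMPLE = 'char *foo = "foo;\nint bar = 5;\nchar *baz = "baz";\n'

for flag in (False, True):
    print(flag, [SAMPLE[a:b] for a, b in string_spans(SAMPLE, flag)])
# False ['"foo;\nint bar = 5;\nchar *baz = "', '";\n']
# True ['"foo;', '"baz"']
```

In the first model the broken string swallows the second line and the
start of the third, and the closing quote of "baz" then opens yet another
string.  In the second model only the broken literal on the first line is
affected.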

> > See above.  Perhaps it's worth noting that AWK-Mode has used this
> > method of indicating invalid strings for around 15 years, now.  There
> > have never been any complaints about this from users.

> But they weren't ever exposed to the previous behaviour, right?  And
> also, I believe that there is some discrepancy between the number of users
> of AWK and C, the complexity of the average program, etc...

Most AWK programmers will also be using C, shell-script, whatever.  And
while there aren't that many of them, they aren't as rare as all that.
And when I say no complaints, I mean none whatsoever; not a single one.

> >> But now that I've understood the non-e-p-m implications of your change,
> >> I urge you to at least make this configurable (if it is already
> >> configurable, then don't mind me).
> > Make correct fontification configurable?

> For some newfound value of "correct", surely...

Yes.  ;-)

> > There remains the problem of making chomping parens inside a broken
> > string work.  I honestly think that modifying elec-pair.el is the way to
> > go, but I'm open to suggestions of alternative strategies that CC Mode
> > could follow to get the same fontification, that wouldn't require
> > modifying elec-pair.el.

> As I said, I will look into providing an entry point in elec-pair.el for
> this.

Thanks.

> Didn't you mention earlier pike-mode and d-mode? Quoting your earlier
> message:

>     > Pike Mode has a special feature whereby a string starting with #"
>     > is a multiline string.  I think in D Mode (not maintained here),
>     > strings simply are multiline, and there is no such thing as an
>     > escaped EOL.

>     > The writer of the mode sets the CC Mode "language variable"
>     > c-multiline-string-start-char to the character # for Pike Mode, or
>     > some non-character non-nil value for D Mode (usually t, of
>     > course).

> Can't I do this to my c/c++ mode?  Wouldn't this be a way to get the old
> behaviour back?  Perhaps it could be let-bound in tests, also.

These are intended as language variables (i.e. variables which define a
language), not user configuration variables.  I can't immediately see any
adverse effects to binding them, but I can't guarantee there'll be none.

As for let-binding them for tests, that should be for a short time only.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).