RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?

From: Drew Adams
Subject: RE: Change of Lisp syntax for "fancy" quotes in Emacs 27?
Date: Fri, 2 Feb 2018 16:00:12 -0800 (PST)

> I see two main categories of users here, with different needs.
> Less-expert users are likely to run into problems with quotes
> and other characters (that's why we got bug reports), and
> appreciate diagnostics pinpointing the problems; also,
> programmers concerned about security are likely to want these
> confusing characters to be diagnosed, to prevent an attacker
> from sending code that is easily read one way but actually
> operates in a different way.
> On the other hand, programs that generate Elisp code might
> prefer not having to special-case these characters. So
> perhaps there should be a buffer-local variable that controls
> which behavior is selected. The default behavior should be
> the one that caters better to general users and is safer.

The distinction I think needs to be made is between:

1. Trying to _warn users_ (all users, less-expert or not)
   about possible misuse of particularly confusable chars.
   This just warns about possible pilot error.

2. _Changing Lisp_ reading and evaluating, to treat some
   (all?) confusable characters specially, changing their
   syntax and requiring them to be escaped in order to be
   treated normally (i.e., as they have been treated so far).

I object to #2, NOT to #1.

#1: By all means, we should try to help users.  We can
    issue byte-compilation warnings and some interactive
    warnings - provided we can helpfully and unambiguously
    distinguish the right situations.
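
To make #1 concrete, here is a minimal sketch of a warn-only check. It is written in Python purely for illustration and is not how Emacs does, or should do, anything. The set `SUSPECTS` and the function `find_suspects` are hypothetical names, and the list of characters is deliberately tiny; a real check would need a much fuller list and would have to decide which contexts to scan. The point is only that reporting positions leaves the text, and the reader, untouched.

```python
import unicodedata

# Hypothetical, deliberately incomplete set of quote look-alikes.
SUSPECTS = {
    "\u2018",  # LEFT SINGLE QUOTATION MARK
    "\u2019",  # RIGHT SINGLE QUOTATION MARK
    "\u201C",  # LEFT DOUBLE QUOTATION MARK
    "\u201D",  # RIGHT DOUBLE QUOTATION MARK
    "\uA78C",  # LATIN SMALL LETTER SALTILLO
}


def find_suspects(text):
    """Return (line, column, char, name) for each suspect character.

    Lines and columns are 1-based.  The input is only inspected,
    never modified: this warns, it does not change how code is read.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECTS:
                hits.append((lineno, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    sample = "(setq x \u2018foo)\n(message \"ok\")"
    for lineno, col, ch, name in find_suspects(sample):
        print(f"{lineno}:{col}: warning: U+{ord(ch):04X} {name}")
```

Note that this sketch scans every character, including those in strings and comments, so it sidesteps the question of context entirely.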

#2 changes Lisp in non-negligible, non-helpful ways.
   See bug #30217 for more.


There are lots more characters to which the same
non-bug "fix" of changing Lisp might be applied (which
means that users will wonder why this confusable char
is treated specially, and not that one).

Such chars include pretty much anything that could be
confused with anything that is ever used as a delimiter
in Emacs Lisp: brackets (in the British sense) of all
sorts: parens, square, angle, curly.  There are really
quite a few such bracket-confusables.

Such chars also include pretty much anything that could
be confused with any other chars that are used specially
in Lisp: period, comma, quote, backquote, colon.  Again:
there are quite a few such confusables.

They even include chars that could be confused with the
directory separators used in Emacs Lisp.

Finally (?), they include chars that could be confused
with the ASCII-digit numerals 0123456789.  There are
lots of these confusables too.

(Even with just ASCII there are confusables.  Think of
what some use in passwords or leet: zero vs uppercase
letter O, digit 1 vs lowercase letter l, etc.  We've
just gotten used to carefully distinguishing such chars.
Now there are many more, and slighter, differences to
get used to.)
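
As a rough illustration of how many digit look-alikes there are, here is a small Python sketch that flags non-ASCII characters whose NFKC normalization is a single ASCII digit. The function name `digit_lookalikes` is hypothetical, and NFKC folding is only a stand-in I am assuming for the sake of the example: it catches compatibility variants such as fullwidth and mathematical digits, but it is not the same thing as a list of visually confusable characters, and it misses look-alikes from other scripts.

```python
import unicodedata

ASCII_DIGITS = "0123456789"


def digit_lookalikes(text):
    """Return (char, ascii_digit) pairs for non-ASCII characters
    whose NFKC normalization is exactly one ASCII digit."""
    found = []
    for ch in text:
        if ord(ch) < 128:
            continue  # plain ASCII is not a look-alike of itself
        folded = unicodedata.normalize("NFKC", ch)
        if len(folded) == 1 and folded in ASCII_DIGITS:
            found.append((ch, folded))
    return found


if __name__ == "__main__":
    # FULLWIDTH DIGIT TWO and MATHEMATICAL BOLD DIGIT THREE
    for ch, digit in digit_lookalikes("1\uFF12\U0001D7D1"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} folds to {digit}")
```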


Beyond the question of which chars to treat specially,
there's the question of where - in which contexts -
to try to distinguish them.

Contexts include such places as sexps being evaluated,
doc strings, and comments.

They can also include fonts: a given character might
be confusable, or more confusable, in one font than
in another.  Even font size can make a difference
(with some fonts I find myself zooming in to see
whether a quote-thingy might really be a curly quote).

The questions of which chars and where (context) are
both relevant even if we only warn users (#1) and do
not change Lisp syntax (#2).


At the very least, I would hope that if we do anything
at all about this we would start by only warning.
I really hope we will not change Lisp syntax for this,
i.e., I hope we revert the change that has been made so
far for Emacs 27.

> While we're on the topic, I suggest using the Unicode
> confusables list ... to come up with a list of confusing
> alternatives for each character that has a special meaning
> in Emacs Lisp. This should be better than our trying to
> come up with our own, ad-hoc list.
> For example, U+A78C LATIN SMALL LETTER SALTILLO (ꞌ) looks
> almost exactly like an apostrophe on my screen and is in
> the confusables list, but is not a character that Emacs
> currently checks for.

Yup, and that's just one tiny tip of this terribly
tippy iceberg.
