Character literals for Unicode (control) characters

Lars Ingebrigtsen
Subject: Character literals for Unicode (control) characters
Date: Thu, 03 Mar 2016 05:47:56 +0000
I was implementing support for the <bdo> HTML tag the other day.  (It's
for overriding bidi directionality in text.)  This is what I ended up
with:

(defun shr-tag-bdo (dom)
  (let* ((direction (dom-attr dom 'dir))
         (char (cond
                ((equal direction "ltr")
                 #x202d)                ; LRO
                ((equal direction "rtl")
                 #x202e))))             ; RLO
    (when char
      (insert char))
    (shr-generic dom)
    (when char
      (insert #x202c))))                ; PDF
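
For comparison, here is a minimal sketch of what can be done without any
new syntax: give the code points names with defconst and write them with
the existing ?\uXXXX escape.  The my-bidi-* names and the string-based
helper are hypothetical, made up for illustration; they are not part of
shr or Emacs.

```elisp
;; Hypothetical constants; the names are assumptions, not Emacs APIs.
(defconst my-bidi-lro ?\u202d "LEFT-TO-RIGHT OVERRIDE.")
(defconst my-bidi-rlo ?\u202e "RIGHT-TO-LEFT OVERRIDE.")
(defconst my-bidi-pdf ?\u202c "POP DIRECTIONAL FORMATTING.")

(defun my-bidi-override (text direction)
  "Return TEXT wrapped in a bidi override for DIRECTION.
DIRECTION is \"ltr\" or \"rtl\"; for anything else, return TEXT unchanged."
  (let ((char (cond ((equal direction "ltr") my-bidi-lro)
                    ((equal direction "rtl") my-bidi-rlo))))
    (if char
        ;; Open the override, then close it with PDF after the text.
        (concat (string char) text (string my-bidi-pdf))
      text)))
```

This keeps the invisible characters out of the source file, but every
project still has to invent its own names for them.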

And it just struck me that it would be kinda nice if Emacs had a literal
character syntax for these things.  I mean, we have such a syntax for
some "problematic" ASCII characters already: We recommend writing ?\s
instead of ? , and we recommend writing ?\n instead of ?
, because that's just very confusing.

And then I thought -- well, if we should have a literal syntax for
Unicode control characters, why not for all of them?  We do have the
mapping already in Emacs, so it wouldn't be very difficult to
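
A rough sketch of that mapping, assuming get-char-code-property with the
`name' property is the lookup meant here.  The helper my-char-by-name is
hypothetical; it does a naive linear scan over a code point range, which
is only meant to show that the data is available.

```elisp
;; Hypothetical helper, not part of Emacs: find a character by its
;; Unicode name by scanning the code points FROM..TO.
(defun my-char-by-name (name from to)
  "Return the first character in FROM..TO whose Unicode name is NAME, or nil."
  (let ((char from)
        found)
    (while (and (not found) (<= char to))
      ;; Unassigned code points have no name property and are skipped.
      (when (equal (get-char-code-property char 'name) name)
        (setq found char))
      (setq char (1+ char)))
    found))

;; Look up RLO by name within the General Punctuation block.
(my-char-by-name "RIGHT-TO-LEFT OVERRIDE" #x2000 #x206f)
```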

So. Three options:

1) Add a new syntax, perhaps something like ?\ucRIGHT-TO-LEFT-OVERRIDE
for the Unicode control characters we care about.

2) Add a syntax for all Unicode characters, like ?\ucPILE-OF-POO.  We
can just write ?💩, so this isn't totally necessary, but perhaps it's

3) Do nothing, and continue writing code like the code above.  Or start
using the Unicode control characters directly in the code,
but ‮there lies madness‬.  (Note Unicode control characters around the
last part of the previous sentence.)

