emacs-devel
Re: emacs-28 b7d7c2d9e9: Add cross-reference to alternative syntaxes for


From: Robert Pluim
Subject: Re: emacs-28 b7d7c2d9e9: Add cross-reference to alternative syntaxes for Unicode
Date: Tue, 18 Oct 2022 17:05:44 +0200

>>>>> On Fri, 14 Oct 2022 20:42:39 +0300, Eli Zaretskii <eliz@gnu.org> said:
    >> Hmm, so move or copy "General Escape Syntax" under
    >> (emacs)International somewhere, and refer to it from the "You can
    >> insert non-ASCII characters or search for them" section of that node
    >> (since thatʼs where we talk about C-x 8)?

    Eli> Yes, something like that.

Hereʼs a rough attempt:

diff --git c/doc/emacs/custom.texi i/doc/emacs/custom.texi
index 2bc1d3820d..817501b3f8 100644
--- c/doc/emacs/custom.texi
+++ i/doc/emacs/custom.texi
@@ -2794,9 +2794,8 @@ Init Non-ASCII
 
   An alternative to using non-@acronym{ASCII} characters directly is
 to use one of the character escape syntaxes described in
-@pxref{General Escape Syntax,,, elisp, The Emacs Lisp Reference
-Manual}, as they allow all Unicode codepoints to be specified using
-only @acronym{ASCII} characters.
+@ref{Character Escape Syntax}, as they allow all Unicode codepoints
+to be specified using only @acronym{ASCII} characters.
 
   To bind non-@acronym{ASCII} keys, you must use a vector (@pxref{Init
 Rebinding}).  The string syntax cannot be used, since the
diff --git c/doc/emacs/mule.texi i/doc/emacs/mule.texi
index f87c1252d3..c202c21aa4 100644
--- c/doc/emacs/mule.texi
+++ i/doc/emacs/mule.texi
@@ -56,7 +56,9 @@ International
 your keyboard can produce non-@acronym{ASCII} characters, you can select an
 appropriate keyboard coding system (@pxref{Terminal Coding}), and Emacs
 will accept those characters.  Latin-1 characters can also be input by
-using the @kbd{C-x 8} prefix, see @ref{Unibyte Mode}.
+using the @kbd{C-x 8} prefix, see @ref{Unibyte Mode}.  It is also
+possible to write non-@acronym{ASCII} characters using various
+pure-@acronym{ASCII} escape syntaxes, see @ref{Character Escape Syntax}.
 
 With the X Window System, your locale should be set to an appropriate
 value to make sure Emacs interprets keyboard input correctly; see
@@ -67,6 +69,7 @@ International
 
 @menu
 * International Chars::     Basic concepts of multibyte characters.
+* Character Escape Syntax:: Alternative ways to write characters.
 * Language Environments::   Setting things up for the language you use.
 * Input Methods::           Entering text characters not on your keyboard.
 * Select Input Method::     Specifying your choice of input methods.
@@ -240,6 +243,63 @@ International Chars
   decomposition: (101 770) ('e' '^')
 @end smallexample
 
+@c This is (almost) verbatim from "General Escape Syntax" in the Emacs
+@c Lisp Reference Manual, please keep in sync.
+@node Character Escape Syntax
+@section Character Escape Syntax
+
+  Input methods provide ways to enter non-@acronym{ASCII} characters,
+but sometimes it is more convenient to use an @acronym{ASCII}-only
+representation, e.g., when there are several similar characters that
+are hard to visually distinguish.  Emacs provides several types of
+escape syntax that you can use to write such characters:
+
+@enumerate
+@item
+@cindex @samp{\} in character constant
+@cindex backslash in character constants
+@cindex unicode character escape
+You can specify characters by their Unicode names, if any.
+@code{?\N@{@var{NAME}@}} represents the Unicode character named
+@var{NAME}.  Thus, @samp{?\N@{LATIN SMALL LETTER A WITH GRAVE@}} is
+equivalent to @code{?à} and denotes the Unicode character U+00E0.  To
+simplify entering multi-line strings, you can replace spaces in the
+names by non-empty sequences of whitespace (e.g., newlines).
+
+@item
+You can specify characters by their Unicode values.
+@code{?\N@{U+@var{X}@}} represents a character with Unicode code point
+@var{X}, where @var{X} is a hexadecimal number.  Also,
+@code{?\u@var{xxxx}} and @code{?\U@var{xxxxxxxx}} represent code
+points @var{xxxx} and @var{xxxxxxxx}, respectively, where each @var{x}
+is a single hexadecimal digit.  For example, @code{?\N@{U+E0@}},
+@code{?\u00e0} and @code{?\U000000E0} are all equivalent to @code{?à}
+and to @samp{?\N@{LATIN SMALL LETTER A WITH GRAVE@}}.  The Unicode
+Standard defines code points only up to @samp{U+@var{10ffff}}, so if
+you specify a code point higher than that, Emacs signals an error.
+
+@item
+You can specify characters by their hexadecimal character
+codes.  A hexadecimal escape sequence consists of a backslash,
+@samp{x}, and the hexadecimal character code.  Thus, @samp{?\x41} is
+the character @kbd{A}, @samp{?\x1} is the character @kbd{C-a}, and
+@code{?\xe0} is the character @kbd{à} (@kbd{a} with grave accent).
+You can use any number of hex digits, so you can represent any
+character code in this way.
+
+@item
+@cindex octal character code
+You can specify characters by their character code in
+octal.  An octal escape sequence consists of a backslash followed by
+up to three octal digits; thus, @samp{?\101} for the character
+@kbd{A}, @samp{?\001} for the character @kbd{C-a}, and @code{?\002}
+for the character @kbd{C-b}.  Only characters up to octal code 777 can
+be specified this way.
+
+@end enumerate
+
+  These escape sequences may also be used in strings.
+
 @node Language Environments
 @section Language Environments
 @cindex language environments
diff --git c/doc/lispref/objects.texi i/doc/lispref/objects.texi
index a715b45a6c..35f413c5a5 100644
--- c/doc/lispref/objects.texi
+++ i/doc/lispref/objects.texi
@@ -440,6 +440,8 @@ Basic Char Syntax
 you should write an extra space after the character constant to
 separate it from the following text.)
 
+@c This is reproduced in "Character Escape Syntax" in the Emacs
+@c manual, please keep in sync.
 @node General Escape Syntax
 @subsubsection General Escape Syntax
 


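As a quick sanity check of the equivalences the new section describes,
something like the following can be evaluated in *scratch*.  This is
only a sketch: the two variable names are made up for illustration and
are not part of the patch, and the octal form ?\340 is my own addition
(octal 340 is decimal 224, i.e. U+00E0).

```elisp
;; Hypothetical names, for illustration only.
;; Every element reads as the same character, U+00E0 (decimal 224).
(defvar my-a-grave-chars
  (list ?\N{LATIN SMALL LETTER A WITH GRAVE} ; by Unicode name
        ?\N{U+E0}                            ; by code point, \N{U+X} form
        ?\u00e0                              ; exactly four hex digits
        ?\U000000E0                          ; exactly eight hex digits
        ?\xe0                                ; hexadecimal character code
        ?\340))                              ; octal character code
;; my-a-grave-chars => (224 224 224 224 224 224)

;; The name and code point forms also work inside strings.
(defvar my-a-grave-strings
  (list "\N{LATIN SMALL LETTER A WITH GRAVE}"
        "\N{U+E0}"
        "\u00e0"
        "\U000000E0"))
```

The string list deliberately leaves out the \x and octal forms, since
those can produce raw bytes rather than characters in a string literal.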
