From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#16216: closed (24.3.50; <control> entries in `ucs-names')
Date: Sun, 22 Dec 2013 18:11:01 +0000

Your message dated Sun, 22 Dec 2013 20:10:36 +0200
with message-id <address@hidden>
and subject line Re: bug#16216: 24.3.50; <control> entries in `ucs-names'
has caused the debbugs.gnu.org bug report #16216,
regarding 24.3.50; <control> entries in `ucs-names'
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
16216: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=16216
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: 24.3.50; <control> entries in `ucs-names'
Date: Sat, 21 Dec 2013 18:09:17 -0800 (PST)

The doc for `insert-char' and `ucs-names' is sketchy.  But it does at
least say that it is about inserting a character "using its UNICODE
name or its code point."

So what are all of those `<control>' character names about?  Many
characters are listed in `ucs-names' as having this same "character
name", `<control>':

 C-x 8 RET TAB C-g
 C-h v ucs-names
 C-s <control> C-s C-s...

And yet, AFAICT, there is no UNICODE character that has the name
`<control>', or even any name that has that as a substring.
http://www.unicode.org/charts/charindex.html

This seems like a bug.  But since the description of `ucs-names' is
so sketchy, it's hard to assert that.  If this is not a bug, then:

1. In what way is `<control>' a "CHAR-NAME" for a character with any
   code point?  What does CHAR-NAME mean in this case?

2. What is the purpose of the multiple `<control>' CHAR-NAMEs?

3. Why are different CHAR-CODE values associated with the same
   CHAR-NAME, `<control>'?  What does that mean?

4. Try `C-x 8 RET <contr TAB RET'.  You get only one particular
   character "named" <control>, the one with code point decimal
   159.  That's the character named "APPLICATION PROGRAM COMMAND".
   Why that one?


In GNU Emacs 24.3.50.1 (i686-pc-mingw32)
 of 2013-12-16 on ODIEONE
Bzr revision: 115543 address@hidden
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/c/Devel/emacs/binary --enable-checking=yes,glyphs
 'CFLAGS=-O0 -g3' LDFLAGS=-Lc:/Devel/emacs/lib
 CPPFLAGS=-Ic:/Devel/emacs/include'



--- End Message ---
--- Begin Message ---
Subject: Re: bug#16216: 24.3.50; <control> entries in `ucs-names'
Date: Sun, 22 Dec 2013 20:10:36 +0200

> Date: Sat, 21 Dec 2013 21:08:35 -0800 (PST)
> From: Drew Adams <address@hidden>
> Cc: address@hidden
> 
> > Look at UnicodeData.txt, near the beginning of the file.
> 
> I see; thanks.  And I recall now that you pointed me to that
> file once before.
> 
> Still, that does not really answer the questions I posed, AFAICT.
> At least not for a user of `ucs-names' or the other functions
> mentioned.

I looked deeper and decided that this was a bug.  The Unicode Standard
explicitly says that control characters have no 'name' property (see
Section 4.8 in the Standard), and that those "<control>" things are
just labels.  The 'name' property cannot have lower-case characters or
"<>" in it anyway.

So starting with trunk revision 115693, all control characters will
have nil as their 'name' property, and "C-x 8 RET < TAB" will say "No
match".  (Some of the control characters have 'old-name' property, so
they still can be called out by name.)
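
For anyone who wants to check the "no 'name' property" claim outside
Emacs, here is a minimal sketch in Python.  It uses only Python's
standard `unicodedata` module, which is unrelated to how Emacs builds
`ucs-names'; the helper name `unnamed_control_chars` is made up for
this illustration.  It collects every code point in general category
Cc (control) for which the module reports no character name.

```python
import unicodedata


def unnamed_control_chars():
    """Return the Cc (control) code points that have no Unicode name.

    Hypothetical helper for illustration only; it queries Python's
    bundled copy of the Unicode Character Database, not Emacs.
    """
    result = []
    for cp in range(0x110000):
        ch = chr(cp)
        # unicodedata.name() returns the given default (None here)
        # when the character has no name property.
        if unicodedata.category(ch) == "Cc" and unicodedata.name(ch, None) is None:
            result.append(cp)
    return result


unnamed = unnamed_control_chars()
print(len(unnamed), "control characters have no name")
print("U+009F included:", 0x9F in unnamed)
```

The code point 0x9F is the "decimal 159" character from the original
report.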

> If `ucs-names' essentially corresponds to UnicodeData.txt, how
> about citing that in its doc?

The exact file is an implementation detail (there's a corresponding
XML file, which could be used if we wanted); the ELisp manual
documents that the properties are derived from UCD, the Unicode
Character Database.
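
To make the UnicodeData.txt connection concrete, here is a rough
sketch in Python of how the "<control>" label and the old name sit in
different fields of that file.  The three sample lines follow the
file's semicolon-separated format (field 1 is the name or label,
field 10 is the Unicode 1.0 name).  The function `parse_names` and its
rule "anything in angle brackets is a label, not a name" are
assumptions made for this illustration; this is not the code Emacs
uses to generate its tables.

```python
# Sample lines in UnicodeData.txt format (semicolon-separated fields).
SAMPLE = """\
0000;<control>;Cc;0;BN;;;;;N;NULL;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
009F;<control>;Cc;0;BN;;;;;N;APPLICATION PROGRAM COMMAND;;;;
"""


def parse_names(text):
    """Map code point -> (name, old_name); hypothetical helper.

    A field-1 value wrapped in angle brackets is treated as a label,
    so the name becomes None.  An empty field 10 also becomes None.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split(";")
        code = int(fields[0], 16)
        name = fields[1]
        if name.startswith("<") and name.endswith(">"):
            name = None  # a label such as <control>, not a real name
        old_name = fields[10] or None
        table[code] = (name, old_name)
    return table


names = parse_names(SAMPLE)
for code, (name, old_name) in sorted(names.items()):
    print(f"U+{code:04X} name={name!r} old-name={old_name!r}")
```

Read this way, many code points can share the label "<control>"
without sharing a name, and a character such as U+009F can still be
found through its old name.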

Thanks.


--- End Message ---
