Re: X11 Compound Text vs ISO 2022

From: David De La Harpe Golden
Subject: Re: X11 Compound Text vs ISO 2022
Date: Tue, 06 Jul 2010 21:18:58 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100620 Icedove/3.0.5

On 06/07/10 17:21, James Cloos wrote:
While testing my recently applied patch, I've discovered that Emacs will
produce ISO-2022 output for COMPOUND_TEXT which other libs and apps --
notably including libX11 -- cannot decode.

As an example, (encode-coding-string "•" 'compound-text) ; U+2022 BULLET
produces "^[$(address@hidden(B".  '$(O' is ISO-IR 228¹, JIS X 0213:2000.  But
libX11 only knows about the $( charsets:  0, 1, A-D and G-M.
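
A rough way to check an encoded string for this problem is to scan it for multibyte designations and flag any final byte outside the set listed above. This is only a sketch: the set of accepted finals is taken from the message itself rather than from the libX11 sources, it looks only at the `ESC $ (` form, and the function name and sample payload bytes are made up for illustration.

```python
import re

# Final bytes that, per the message above, libX11 accepts after ESC $ (
# (0, 1, A-D and G-M). Taken from the text, not checked against libX11.
LIBX11_KNOWN_FINALS = set(b"01ABCD") | set(range(ord("G"), ord("M") + 1))

# ESC $ ( F designates a multibyte 94^n charset to G0 in ISO 2022.
DESIGNATION = re.compile(rb"\x1b\$\((.)", re.DOTALL)


def unknown_multibyte_designations(data: bytes) -> list[str]:
    """Return the final bytes of ESC $ ( F sequences not in the known set."""
    return [
        m.group(1).decode("latin-1")
        for m in DESIGNATION.finditer(data)
        if m.group(1)[0] not in LIBX11_KNOWN_FINALS
    ]


# Placeholder payload bytes ("!!"); only the escape sequences matter here.
sample = b"\x1b$(O!!\x1b(B"
print(unknown_multibyte_designations(sample))
```

A string that only designates, say, `$(B` would come back with an empty list, while the `$(O` designation above is reported.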

A number of characters are output in '^[$-1'; such as:

(encode-coding-string "ℜ" 'compound-text) ; U+211C BLACK-LETTER CAPITAL R
(encode-coding-string "ʻ" 'compound-text) ; U+02BB MODIFIER LETTER TURNED COMMA

That is encoded in mule-unicode-0100-24ff, essentially unknown outside

Other libs/apps prefer to use utf-8³ in compound_text for such chars.
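
For illustration, the UTF-8 approach amounts to bracketing the UTF-8 bytes with a pair of escape sequences. The sketch below assumes the ISO 2022 "other coding system" sequences `ESC % G` (switch to UTF-8) and `ESC % @` (return to ISO 2022); whether a given library emits or accepts exactly this form is an assumption here, and the function name is hypothetical.

```python
# Assumed escape sequences: ESC % G enters UTF-8, ESC % @ returns to ISO 2022.
UTF8_START = b"\x1b%G"
UTF8_END = b"\x1b%@"


def wrap_utf8_segment(text: str) -> bytes:
    """Wrap text as a UTF-8 segment inside an ISO 2022 style byte stream."""
    return UTF8_START + text.encode("utf-8") + UTF8_END


# U+2022 BULLET, the character from the example above.
print(wrap_utf8_segment("\u2022"))
```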

I'm not really intimately familiar with the area [compound text seems to be a bit of a horror in these days of unicode...]

But anyway, if emacs isn't using one of the character sets listed in the table in sect. 4/5 of "the" spec [1], or utf-8 as per sect. 7, presumably it's an emacs bug, unless emacs has successfully "registered the encoding with the X consortium" as per sect. 6 (and I don't see that happening...).

Conversely, if emacs is sending a charset that IS listed in the table in sect. 4/5, or utf-8 as per sect. 7, then libX11 and other apps are "at fault" if they don't recognise it.



But err... the spec on freedesktop.org seems a lot older, not even mentioning utf-8 ???

