Re: multilingual text in frame
From: Kenichi Handa
Subject: Re: multilingual text in frame
Date: Tue, 21 Jan 2003 15:19:38 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, Jason Rumney <address@hidden> writes:
> Jason Rumney <address@hidden> writes:
>> It seems from the documentation I can find about XSetWMName (), that
>> compound-text is the correct choice for encoding.
> On closer reading, maybe not. I had assumed that "Host Portable
> Character Encoding" meant compound-text, but apparently the X11R5
> spec defines it as "the same for all locales on a machine" but
> otherwise leaves it unspecified. So UTF-8 and compound-text would both
> be valid choices.
No. Please see the attached document extracted from "Xlib -
C Language X Interface". It's in the source of X,
.../xc/doc/specs/X11/. Perhaps the term "Host Portable
Character Encoding" was introduced to solve the problem of
ASCII vs EBCDIC.
> Since most X functions (including XSetWMName ())
> state in their documentation that "the result is undefined if the
> string is not in the Host Portable Character Encoding", it would seem
> to be valid for a UTF-8 based X server to not recognize compound-text
> encoding properly.
It's not an X server but a window manager that should recognize them
after executing XGetWMName. Anyway, yes, it's valid for a window
manager to ignore compound-text or even utf-8.
But, a correctly internationalized window manager will do
something like this:
    XGetWMName (display, w, &text_prop);
    XmbTextPropertyToTextList (display, &text_prop, &list, &num);
    XmbDrawString (display, title_drawable, font_set, gc, x, y,
                   list[0], strlen (list[0]));
Of course, because of the fate of internationalization, the
window manager must run in a correct locale, and what it can
display is only the characters supported in that locale.
What we really need is multilingualization.
> But I can't find any way to find out what the Host Portable Character
> Encoding is on a given system. Perhaps Handa-san knows more about the
> I18N features of X we could use to convert the string to something
> the X server is sure to recognize.
As far as I know, there's no way to know which encoding the
window manager can recognize. So, all we can do is to
expect that the window manager is correctly
internationalized. But, any window manager will at least
recognize XA_STRING. So, x_set_name (in xfns.c) does this:
    text.encoding = (stringp ? XA_STRING
                     : FRAME_X_DISPLAY_INFO (f)->Xatom_COMPOUND_TEXT);
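The choice between the two property types can be sketched roughly as follows. This is only an illustration, not the actual code in xfns.c: it assumes the title is available as an array of Unicode code points, and the names title_encoding, in_string_encoding and choose_title_encoding are made up for the example. Real code would use XA_STRING and the interned COMPOUND_TEXT atom instead of the enum.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for XA_STRING and the COMPOUND_TEXT atom.  */
enum title_encoding { TITLE_STRING, TITLE_COMPOUND_TEXT };

/* True if code point C fits the STRING encoding, i.e. it is a Latin-1
   graphic character, a tab, or a newline.  */
static bool
in_string_encoding (unsigned long c)
{
  return (c == 0x09 || c == 0x0A
          || (c >= 0x20 && c <= 0x7E)
          || (c >= 0xA0 && c <= 0xFF));
}

/* Pick STRING when every code point fits it, else COMPOUND_TEXT.  */
static enum title_encoding
choose_title_encoding (const unsigned long *cps, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!in_string_encoding (cps[i]))
      return TITLE_COMPOUND_TEXT;
  return TITLE_STRING;
}
```

With this sketch, a title made only of ASCII or Latin-1 letters maps to TITLE_STRING, while a title containing, say, U+65E5 maps to TITLE_COMPOUND_TEXT.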
---
Ken'ichi HANDA
address@hidden
1.7. Character Sets and Encodings
Some of the Xlib functions make reference to specific char-
acter sets and character encodings. The following are the
most common:
o X Portable Character Set
A basic set of 97 characters, which are assumed to
exist in all locales supported by Xlib. This set con-
tains the following characters:
a..z A..Z 0..9 !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ <space>,
<tab>, and <newline>
This set is the left/lower half of the graphic charac-
ter set of ISO8859-1 plus space, tab, and newline. It
is also the set of graphic characters in 7-bit ASCII
plus the same three control characters. The actual
encoding of these characters on the host is system
dependent.
o Host Portable Character Encoding
The encoding of the X Portable Character Set on the
host. The encoding itself is not defined by this stan-
dard, but the encoding must be the same in all locales
supported by Xlib on the host. If a string is said to
be in the Host Portable Character Encoding, then it
only contains characters from the X Portable Character
Set, in the host encoding.
o Latin-1
The coded character set defined by the ISO 8859-1 stan-
dard.
o Latin Portable Character Encoding
The encoding of the X Portable Character Set using the
Latin-1 codepoints plus ASCII control characters. If a
string is said to be in the Latin Portable Character
Encoding, then it only contains characters from the X
Portable Character Set, not all of Latin-1.
o STRING Encoding
Latin-1, plus tab and newline.
o UTF-8 Encoding
The ASCII compatible character encoding scheme defined
by the ISO 10646-1 standard.
o POSIX Portable Filename Character Set
The set of 65 characters, which can be used in naming
files on a POSIX-compliant host, that are correctly
processed in all locales. The set is:
a..z A..Z 0..9 ._-
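
As an illustration of the definitions above, here is a minimal sketch of how a program might test whether a string stays within the X Portable Character Set or the POSIX Portable Filename Character Set. The function and variable names are hypothetical and are not part of Xlib. Because the sets are written as C character literals, the compiler supplies whatever encoding the host uses, which is the point of the Host Portable Character Encoding.

```c
#include <stdbool.h>
#include <string.h>

/* The 97 characters of the X Portable Character Set.  */
static const char x_portable_chars[] =
  "abcdefghijklmnopqrstuvwxyz"
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "0123456789"
  "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
  " \t\n";

/* The 65 characters of the POSIX Portable Filename Character Set.  */
static const char posix_filename_chars[] =
  "abcdefghijklmnopqrstuvwxyz"
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "0123456789"
  "._-";

/* True if every character of S occurs in ALLOWED.  */
static bool
only_chars_from (const char *s, const char *allowed)
{
  return s[strspn (s, allowed)] == '\0';
}

/* True if S contains only X Portable Character Set characters.  */
static bool
is_host_portable (const char *s)
{
  return only_chars_from (s, x_portable_chars);
}

/* True if S contains only POSIX portable filename characters.  */
static bool
is_posix_portable_filename (const char *s)
{
  return only_chars_from (s, posix_filename_chars);
}
```

For example, is_host_portable ("emacs@host") holds, while a string containing a byte outside the set (such as a Latin-1 accented letter) is rejected.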