Re: Probably dumb question: glyph rendering on unicode-2 branch
From: Kenichi Handa
Subject: Re: Probably dumb question: glyph rendering on unicode-2 branch
Date: Tue, 25 Oct 2005 10:33:01 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)
In article <address@hidden>, Adrian Robert <address@hidden> writes:
> I didn't get any response to the below, let me try asking it in a
> different way:
Sorry for not responding on this matter. It seems that I
missed your original mail.
> unicode-2 branch:
> dispextern.h:
> struct glyph {
> ...
> /* Character code for character glyphs (type ==
> CHAR_GLYPH). */
> unsigned ch;
> ...
> }
> ...
> struct glyph_string {
> ...
> /* Characters to be drawn, and number of characters. */
> XChar2b *char2b;
> int nchars;
> ...
> }
> {x,mac,w32}term.c:
> x_encode_char(int c, XChar2b *char2b, ...)
> {
> ...
> }
> x_draw_glyph_string(struct glyph_string *s)
> {
> ...
> }
> Questions:
> 1) Is 'int c' passed to x_encode_char() the same as 'unsigned ch' in
> struct glyph?
Mostly yes. The exception is when x_encode_char is called
on an element of a composition glyph. In that case,
x_encode_char is called from get_char_face_and_encoding,
which is in turn called from the BUILD_COMPOSITE_GLYPH_STRING
macro for each element of the composition glyph.
> 2) In either case, what are they -- UCS-2? UTF-16? MULE? UCS-4?
> UTF-32? What is the byte ordering?
It is a character code used in Emacs. The value range is
0x0..0x3FFFFF. Within that range, 0x0..0x10FFFF are exactly
the same as Unicode characters. I think it makes no sense to
ask about the "byte ordering" of an int. That depends on
your hardware architecture.
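To make the two ranges concrete, here is a minimal sketch in C. The names (SKETCH_MAX_CHAR, sketch_char_valid_p and so on) are made up for this illustration and are not the ones in the Emacs source; only the numeric limits come from the explanation above.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; only the limits are taken
   from the text above.  */
#define SKETCH_MAX_CHAR    0x3FFFFF  /* largest character code */
#define SKETCH_MAX_UNICODE 0x10FFFF  /* largest code shared with Unicode */

/* Return true if C lies in the character code range 0x0..0x3FFFFF.  */
static bool
sketch_char_valid_p (int c)
{
  return c >= 0 && c <= SKETCH_MAX_CHAR;
}

/* Return true if C lies in the part of the range whose values are
   the same as Unicode code points (0x0..0x10FFFF).  */
static bool
sketch_char_unicode_p (int c)
{
  return c >= 0 && c <= SKETCH_MAX_UNICODE;
}
```

Both functions compare the value of the int, so they behave the same on any hardware. Byte order would matter only if the int were written out to memory or a file byte by byte.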
> I'll be happy to RTFM if this is documented anywhere..
The file src/character.h contains some documentation about
character codes.
>> I apologize if this is a dumb question, but I've been looking
>> through the code and can't figure this one out: on the unicode-2
>> branch, if a font specifies "iso-10646-1" for XLFD registry/
>> encoding (and then fontset.c sets 'charset' accordingly), what
>> exactly is getting passed in struct glyph_string.char2b to
>> x_draw_glyph_string()?
If a font has CHARSET_REGISTRY "iso10646" and
CHARSET_ENCODING "1", the font contains only BMP characters.
Emacs-unicode uses such a font only for BMP characters.
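As a rough sketch of what a two-byte value for a BMP character could look like, the C code below splits a code into a high and a low byte. This is only an illustration, not the actual x_encode_char. It ASSUMES that for such a font the two bytes are simply the upper and lower 8 bits of the character code. The struct sketch_char2b is a stand-in with the same two fields as Xlib's XChar2b, and sketch_encode_bmp is a hypothetical name.

```c
#include <stdbool.h>

/* Stand-in for Xlib's XChar2b, which has the same two fields.  */
struct sketch_char2b
{
  unsigned char byte1;  /* high-order byte */
  unsigned char byte2;  /* low-order byte */
};

/* If C is a BMP code (0x0..0xFFFF), store its two bytes in *OUT and
   return true.  Otherwise leave *OUT untouched and return false.
   ASSUMPTION: the glyph code equals the character code.  */
static bool
sketch_encode_bmp (int c, struct sketch_char2b *out)
{
  if (c < 0 || c > 0xFFFF)
    return false;
  out->byte1 = (unsigned char) ((c >> 8) & 0xFF);
  out->byte2 = (unsigned char) (c & 0xFF);
  return true;
}
```

The false return for codes above 0xFFFF corresponds to the point above: a two-byte value has no room for characters outside the BMP, so another font would be needed for them.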
>> Not UTF-8, since it's just 2 bytes.
>> UCS-2? UTF-16? Don't these exclude a lot of unicode characters?
Yes. But, as far as I know, there's no consensus about what
to specify in the CHARSET_REGISTRY and CHARSET_ENCODING
fields of a font supporting SMP or SIP.
>> Does emacs provide any internal facility to get UTF-8?
Do you mean a way to convert a character code to a UTF-8
byte sequence at the C level? If so, you can use the macro
CHAR_STRING (defined in character.h), because Emacs-unicode's
internal string/buffer representation is a UTF-8 byte sequence.
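For readers who want to see what such a conversion involves, here is a self-contained sketch in C. It is not the CHAR_STRING macro; sketch_utf8_encode is a hypothetical helper. It handles only the Unicode part of the range (0x0..0x10FFFF), not the codes above it, whose byte form is not described here.

```c
#include <stddef.h>

/* Write the UTF-8 form of C to BUF, which must have room for at
   least 4 bytes, and return the number of bytes written.  Return 0
   if C is outside 0x0..0x10FFFF.  Hypothetical helper; it does not
   reject surrogate code points (0xD800..0xDFFF).  */
static size_t
sketch_utf8_encode (int c, unsigned char *buf)
{
  if (c < 0 || c > 0x10FFFF)
    return 0;
  if (c < 0x80)
    {
      /* 1 byte: 0xxxxxxx */
      buf[0] = (unsigned char) c;
      return 1;
    }
  if (c < 0x800)
    {
      /* 2 bytes: 110xxxxx 10xxxxxx */
      buf[0] = (unsigned char) (0xC0 | (c >> 6));
      buf[1] = (unsigned char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
      buf[0] = (unsigned char) (0xE0 | (c >> 12));
      buf[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      buf[2] = (unsigned char) (0x80 | (c & 0x3F));
      return 3;
    }
  /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
  buf[0] = (unsigned char) (0xF0 | (c >> 18));
  buf[1] = (unsigned char) (0x80 | ((c >> 12) & 0x3F));
  buf[2] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
  buf[3] = (unsigned char) (0x80 | (c & 0x3F));
  return 4;
}
```

The caller supplies the buffer, and the return value tells it how many bytes to copy or draw.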
---
Kenichi Handa
address@hidden