bug#31315: wrong font encoding for fallback font

From: Eli Zaretskii
Subject: bug#31315: wrong font encoding for fallback font
Date: Wed, 02 May 2018 18:22:50 +0300

> Date: Tue, 01 May 2018 21:30:14 +0200 (CEST)
> Cc: address@hidden, address@hidden
> From: Werner LEMBERG <address@hidden>
> > I think we have a terminology problem here, most probably my fault.
> > What exactly do you mean when you say "font backend" in this
> > context?  And what is "the client" in this case?
> OK, sorry.  I mean the X11 font backend.  Here's my global picture.
>           gb18030               unicode
>  Emacs  ----------->   xft   ------------>  DroidSansFallback.ttf
> For me, Emacs is a client of the xft font interface.  In our
> particular case, xft provides `DroidSansFallback.ttf' to Emacs as a
> font encoded in GB18030 – Emacs obviously has requested a font in this
> encoding.  Behind the scenes, however, xft communicates with the
> `DroidSansFallback.ttf' font using Unicode (the font has no other
> cmap).

If by "xft" you mean the part of the X libraries that supports the
APIs used by xfont.c, then I think we are on the same page now.

> > If you received a GB18030 encoded email, it is expected that Emacs
> > will try to find a font that explicitly supports GB18030.
> >
> > This is a feature that AFAIU is very important to CJK users: they
> > expect Emacs to select a font that declares support for the
> > character's charset as set by the decoding machinery.
> While this is correct for other CJK encodings like GB, JIS, KSC, or
> Big5, it is *not* true for GB18030.  This is *only* an encoding and
> *not* a charset!  It is simply another representation of Unicode,
> comparable to UTF-8 or UCS4.  There doesn't exist a single font
> natively encoded in GB18030!  This encoding only exists to be
> code-wise backward compatible with GB 2312.

Maybe so, but GB18030 is a Chinese encoding, and as such Emacs treats
it like all the other Chinese encodings.
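
To illustrate the claim quoted above, that GB18030 covers all of
Unicode while staying byte-compatible with GB 2312, here is a minimal
sketch using Python's built-in codecs.  It says nothing about how Emacs
or the font backend handle the encoding; the sample characters and the
helper name `roundtrip' are chosen only for illustration.

```python
# Sketch: GB18030 as a representation of Unicode (Python codecs only;
# this is not Emacs's decoding machinery).

def roundtrip(ch: str) -> bytes:
    """Encode one character as GB18030 and check it decodes back unchanged."""
    data = ch.encode("gb18030")
    assert data.decode("gb18030") == ch
    return data

# ASCII, a CJK ideograph, a Latin letter with diacritic, and an emoji
# outside the Basic Multilingual Plane.
samples = ["A", "中", "é", "😀"]
encoded = {ch: roundtrip(ch) for ch in samples}

# A character that exists in GB 2312 keeps the same bytes in GB18030.
gb2312_bytes = "中".encode("gb2312")
gb18030_bytes = "中".encode("gb18030")

for ch, data in encoded.items():
    print(f"U+{ord(ch):04X} -> {data.hex()} ({len(data)} byte(s))")
```

The emoji has no GB 2312 code at all, yet it still round-trips through
GB18030 (as a four-byte sequence), which is the sense in which the
encoding behaves like UTF-8 or UCS4.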

Emacs employs that logic for every charset it has defined, including
Latin-2, for example: if text was decoded from an encoding which
supports a particular charset, Emacs puts the corresponding 'charset'
text property on the decoded text, and the machinery which selects the
appropriate font tries first to find a font which supports that
charset.  The idea is that users in a particular culture have certain
distinct preferences wrt fonts, and that an encoding that supports a
certain charset or culture provides a hint about those preferences.
This idea is very central in how Emacs selects fonts.
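
The selection order described above can be sketched roughly as follows.
This is a deliberately simplified model in Python, not Emacs's actual
implementation: the `Font' class, its fields, the font names, and
`select_font' are all hypothetical, and real font selection involves
fontsets, font specs, and more fallback stages than shown here.

```python
# Simplified model of charset-first font selection (hypothetical names;
# not the real Emacs code).
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Font:
    name: str
    charsets: Set[str]                              # charsets the font declares support for
    covered: Set[str] = field(default_factory=set)  # characters it has glyphs for


def select_font(ch: str, charset: Optional[str], fonts: List[Font]) -> Optional[Font]:
    """Prefer a font declaring the text's charset; otherwise take any font with a glyph."""
    if charset is not None:
        for font in fonts:
            if charset in font.charsets and ch in font.covered:
                return font
    for font in fonts:
        if ch in font.covered:
            return font
    return None


fonts = [
    Font("GenericUnicode", {"unicode"}, {"A", "中"}),
    Font("ChineseFallback", {"gb18030"}, {"中"}),
]

# Text decoded from GB18030 carries a 'charset' hint, so the font that
# declares gb18030 wins even though an earlier font also has the glyph.
with_hint = select_font("中", "gb18030", fonts)
# Without the hint, the first font that covers the character is used.
without_hint = select_font("中", None, fonts)
```

In this model the 'charset' hint only changes the order in which
candidate fonts are tried; it never prevents falling back to another
font that has the glyph.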

> To a certain extent it is valid to assume that a user of GB18030
> expects Chinese glyph representation forms for characters in the CJK
> range.  However, since full Unicode is supported, this assumption is
> rather weak.

Weak or not, Emacs tries to heed it.

> >> I don't have any fontsets customized in my `.emacs' file.
> >
> > Well, it sounds like you should.  Emacs chooses fonts using
> > techniques that prefer speed to accuracy, and if that gives
> > suboptimal results, the way to improve them is to guide Emacs by
> > tailoring your fontset to the fonts you have installed and to the
> > visual appearance you happen to like.
> For the purpose of reporting this bug I thought it would be best to
> not use further deviations of `emacs -Q'...

My comment was not in the context of the bug report (where your
assumption is absolutely correct); it was rather a response to your
broader complaint regarding an ugly font that creeps into the display
of text which was encoded in GB18030.  You can tell Emacs to use other
fonts for that charset by customizing your fontset.

> >> Both.  If I open a new Unicode-encoded file, Emacs continues
> >> to use GB18030.2000 as the charset registry/encoding for displaying
> >> fallback characters, failing to convert Unicode to GB18030 before
> >> accessing the characters from the font backend.
> >
> > The former part is not a bug at all.
> I agree.  I only wanted to tell you what I observe.

Well, you called that a "problem".  I understand that we now agree the
first part is not a problem in itself.
