bug#31315: wrong font encoding for fallback font

From: Eli Zaretskii
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 18:22:49 +0300

> Date: Tue, 01 May 2018 08:36:44 +0200 (CEST)
> Cc: address@hidden, address@hidden
> From: Werner LEMBERG <address@hidden>
> > And I think you might be mistaken in your interpretation of what
> > "gb18030.2000" in the font name means: I think it's the font registry,
> > not its encoding.
> Yes, but the font registry implies the used encoding to access the
> font.

Having said that, you seem to contradict yourself right away:

> The real encoding of the font is irrelevant (the Droid Sans Fallback
> font is a standard TrueType font that has only a Unicode cmap);

So I still think we may be miscommunicating.

> what matters is how the font backend provides the font to the
> client.  Calling `xlsfonts' I see that X11 offers access as follows.
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
>   -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

I think we have a terminology problem here, most probably my fault.
What exactly do you mean when you say "font backend" in this context?
And what is "the client" in this case?

I'm afraid using xlsfonts doesn't help me understand what I am
missing, because I have only a vague idea of what that command does,
beyond the basic fact that it lists fonts.

> > What we put in font-encoding-alist now was a deliberate change in
> > Jan 2008, in response to a bug report; see
> >
> >   http://lists.gnu.org/archive/html/emacs-devel/2008-01/msg00754.html
> >
> > If fonts like this one need to have characters encoded by gb18030,
> > then I think we need to change what the value says.
> As can be seen above, the font itself doesn't need GB18030.  It's the
> font backend that provides this encoding, and Emacs accesses it.

In my terminology, "font backend" is in Emacs (xfont.c, xftfont.c,
etc.), and the encoding happens in the backend, guided by
font-encoding-alist, among other things.  And your OP vs the
experiment with changing font-encoding-alist clearly shows that
encoding characters correctly for the xfont backend _is_ required to
display the correct glyphs with fonts handled by that backend.
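
To see which encoding Emacs currently associates with a given
registry, one can look the registry up in font-encoding-alist.  The
snippet below is only a sketch: the registry string is copied from the
xlsfonts listing above, and treating the alist keys as regexps matched
against that string is my reading of the variable's documentation.

```elisp
;; Sketch: find the font-encoding-alist entry that applies to a registry.
;; The registry string is an assumption taken from the xlsfonts output
;; quoted in this thread.
(when (boundp 'font-encoding-alist)
  (let ((registry "gb18030.2000-0"))
    ;; Each element is (REGEXP . ENCODING) or
    ;; (REGEXP . (ENCODING . REPERTORY)); string-match receives the
    ;; REGEXP first and the registry second.
    (assoc-default registry font-encoding-alist #'string-match)))
```

Evaluating this with M-: (or in *scratch*) returns the ENCODING, or the
(ENCODING . REPERTORY) pair, of the first matching entry, or nil if no
entry matches.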

> > But this area in Emacs is under-documented, so I'm not sure I've
> > got it right, in particular what is the effect of ENCODING and
> > REPERTORY in this context.  For most font back-ends, ENCODING is
> > ignored, because the back-end is capable of encoding the character
> > we hand to it.  But the xfont back-end indeed uses Emacs's encoding
> > functions to do that externally to the corresponding X APIs.  Which
> > might explain why this problem, if indeed we fail to specify the
> > correct encoding for this charset, was never reported till now:
> > xfont is rarely if ever used.
> Emacs doesn't fail to specify the correct encoding.  The problem is
> that it feeds the font backend with characters in the wrong encoding
> (namely Unicode instead of GB 18030).

"Fails to specify the correct encoding" is the reason why it uses the
wrong encoding for the characters in the font backend xfont.c.  I
believe this is again a terminology problem.

> >> It's a completely different question why on my system Emacs uses a
> >> font encoded in GB 18030 as a fallback font.  It's probably related
> >> to the fact that I use `mew' as my e-mail program, manually
> >> extended to cover GB 18030.  Unfortunately, I wasn't able yet to
> >> trigger the issue with `emacs -Q' (which by default uses iso10646
> >> for the fallback font).
> >
> > Well, we cannot try helping you to unlock this unless you tell how
> > you "manually extended" Emacs.
> Oh, I haven't extended Emacs, sorry for the bad wording.  I've simply
> added a line to mew's elisp code to make it recognize GB18030 in
> e-mails.

If you received a GB18030 encoded email, it is expected that Emacs
will try to find a font that explicitly supports GB18030.  This is a
feature that AFAIU is very important to CJK users: they expect Emacs
to select a font that declares support for the character's charset as
set by the decoding machinery.

> > In general, the way to request that Emacs uses fonts you like with
> > certain characters or charsets is by customizing your fontsets.  I
> > cannot say more without hearing the details.
> I don't have any fontsets customized in my `.emacs' file.

Well, it sounds like you should.  Emacs chooses fonts using techniques
that prefer speed to accuracy, and if that gives suboptimal results,
the way to improve them is to guide Emacs by tailoring your fontset to
the fonts you have installed and to the visual appearance you happen
to like.
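
A minimal sketch of such a customization, for an init file, could look
like the following.  The family name and the registry are assumptions
taken from the xlsfonts listing earlier in this thread, and 'han is
just one script you might want to cover; adjust all three to the fonts
you actually have installed.

```elisp
;; Sketch only: prefer the Unicode-encoded variant of the fallback font
;; for Han characters in the default fontset.
;; "Droid Sans Fallback" and "iso10646-1" are assumptions based on the
;; xlsfonts output quoted above.
(when (fboundp 'set-fontset-font)
  (set-fontset-font t        ; t means the default fontset
                    'han     ; script whose characters are affected
                    (font-spec :family "Droid Sans Fallback"
                               :registry "iso10646-1")
                    nil      ; FRAME: nil applies to all frames
                    'prepend)) ; try this font before existing entries
```

Whether this selects the font variant you want depends on which font
backends your Emacs build uses, so check the result with
C-u C-x = on a CJK character.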

> >> On the other hand, as soon as the problem happens, it happens with
> >> any buffer containing CJK characters not displayable with the
> >> current font, so it seems a genuine Emacs core bug.
> >
> > What "problem" do you allude to here?  The first (seemingly
> > incorrect encoding) or the second (fallback to this particular
> > font)?
> Both.  If I open a new Unicode-encoded file, Emacs continues to
> use GB18030.2000 as the charset registry/encoding for displaying
> fallback characters, failing to convert Unicode to GB18030 before
> accessing the characters from the font backend.

The former part is not a bug at all.  When Emacs needs to display a
character that is not supported by the frame's default font, it first
tries all the fonts it already has loaded, before it searches the rest
of the fonts on your system.  So once the GB18030.2000 font is loaded,
Emacs will use it for any character not supported by other loaded
fonts.  Or did I miss something?
