From: sand
Subject: [PATCH] Re: ftfont ISO10646-1 font bug found (was Re: 23.0.60; Heavy display problems with new font backend)
Date: Tue, 6 May 2008 21:44:15 -0700

address@hidden writes:
> I think I have found the cause for problem #1.  In ftfont_list(), the
> code gathers a list of candidate fonts that match the
> foundry/family/... requirements:
>     objset = FcObjectSetBuild (FC_FOUNDRY, FC_FAMILY, FC_WEIGHT, FC_SLANT,
>                    FC_WIDTH, FC_PIXEL_SIZE, FC_SPACING,
>                    FC_CHARSET, FC_FILE,
>   #ifdef FC_FONTFORMAT
>                    FC_FONTFORMAT,
>   #endif    /* FC_FONTFORMAT */
>                    NULL);
>   /* ... elided ... */
>   fontset = FcFontList (NULL, pattern, objset);
> Note that this doesn't include any registry restriction.
> The code loops across the returned fontsets, calling
> ftfont_pattern_entity() to generate font_entity structs.  But at no
> point does it attempt to filter the font list by compatible
> registries.  We get, for example:
> (gdb) frame
> #0  ftfont_pattern_entity (p=0x89fce40, frame=148009620, registry=138791553) 
> at /home/upham/src/emacs/Apollo/emacs-cvs/src/ftfont.c:116
> (gdb) p file
> $151 = (FcChar8 *) 0x8a70688 
> "/home/upham/.fonts/jmk/neep-alt-iso8859-1-06x11.pcf.gz"
> (gdb) p registry
> $152 = 138791553
> (gdb) xpr registry
> Lisp_Symbol
> $153 = (struct Lisp_Symbol *) 0x845ca80
> "iso10646-1"
> Emacs will think that "neep-alt-iso8859-1-06x11.pcf.gz" is a valid
> font for displaying "iso10646-1", but it isn't, and we end up with
> missing code points.
> This explains why removing the iso8859-1 fonts fixed the problem
> (except for the mode line file name): the current code also points
> iso8859-1 requesters to iso10646-1 fonts, and those always work.  I
> also think this explains why I don't see this consistently across
> hosts: depending on how the font list is ordered (maybe due to inode
> ordering on disk?), some hosts will get a correct iso10646-1 ->
> iso10646-1 mapping first at display time, while others will get an
> incorrect iso10646-1 -> iso8859-1 mapping.
> Another family that should have the same problem is misc-fixed, as it
> also has both iso8859-1 and iso10646-1 registry fonts.  There may be
> other families that I'm not aware of.

Here's a patch against today's CVS HEAD.

The ftfont_spec_pattern() function generates an FcPattern object that
can be used to list only fonts matching the spec.  For the purposes of
this discussion, there are two "interesting" ways of restricting
patterns: via charset (FcCharSet), or via langset (FcLangSet).  The
former requires the font to have each of the codepoints listed in the
FcCharSet.  The latter requires the font to support all the languages
in the FcLangSet.
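
To make the charset rule concrete, here is a minimal model in plain C.  It
does not call fontconfig: the coverage_range and font_model types and the
font_satisfies_charset() function are names invented for this sketch, and a
font's coverage is approximated as a list of inclusive codepoint ranges.  It
only illustrates the "every listed codepoint must be present" rule, under
those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT fontconfig structures.  */
typedef struct {
  uint32_t first;   /* first codepoint in the range, inclusive */
  uint32_t last;    /* last codepoint in the range, inclusive */
} coverage_range;

typedef struct {
  const char *name;
  const coverage_range *ranges;
  size_t n_ranges;
} font_model;

/* Return true if the modelled font covers CODEPOINT.  */
bool
font_has_char (const font_model *font, uint32_t codepoint)
{
  for (size_t i = 0; i < font->n_ranges; i++)
    {
      assert (font->ranges[i].first <= font->ranges[i].last);
      if (codepoint >= font->ranges[i].first
          && codepoint <= font->ranges[i].last)
        return true;
    }
  return false;
}

/* Charset restriction: the font must cover every required codepoint.
   An empty requirement (n_required == 0) is satisfied by any font.  */
bool
font_satisfies_charset (const font_model *font,
                        const uint32_t *required, size_t n_required)
{
  for (size_t i = 0; i < n_required; i++)
    if (! font_has_char (font, required[i]))
      return false;
  return true;
}
```

In this model a font covering only 0x20-0x7E and 0xA0-0xFF satisfies a
requirement of { 'A', 'z' } but fails as soon as a single codepoint outside
those ranges is added to the requirement.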

1. If we pass a font spec with registry ISO-8859 to
ftfont_spec_pattern(), then the code sets up an FcCharSet that has
every ASCII codepoint (but not Latin-1, that's commented out for some

2. If we pass a font spec with a non-ISO-8859, non-ISO-10646,
non-Unicode-BMP registry, the function immediately returns an empty

3. ISO-10646 and Unicode-BMP registries are handled in a more
complicated manner...

If the ISO-10646 font spec has an associated :script parameter (or an
OpenType spec that refers to a script), the code looks in
'script-representative-chars' for codepoints to put into a charset.
If the font spec has an associated language, the code adds the
language to the langset.

However, an ISO-10646 font spec without a special script or language
ends up with neither a charset nor a langset.  The resulting pattern
will match *any* characters and languages.  In particular, it will let
an ISO-8859 font match the ISO-10646 spec.

The fix below checks for a missing charset and missing langset.  In
that case, we create a charset with at least one ISO-10646 codepoint
outside of ISO-8859.  The charset should be as small as possible,
since a font missing any of the charset's codepoints becomes
completely invalid.  I have chosen LEFT DOUBLE QUOTATION MARK, which
is associated with English and which I believe is pervasive.

With the new charset restriction, ISO-8859 fonts are no longer
considered matches and the font mismatch problem goes away.
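
The effect of the fix can be sketched without fontconfig.  The following is
a simplified model, not the code in ftfont.c: constraint_model,
add_sentinel_if_unconstrained() and latin1_font_matches() are names made up
for illustration, the "font" is assumed to cover exactly U+0000 through
U+00FF, and langset matching is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* U+201C LEFT DOUBLE QUOTATION MARK, the codepoint chosen in the text.  */
#define SENTINEL_CODEPOINT 0x201Cu
#define MAX_REQUIRED 4

/* Hypothetical stand-in for the constraints carried by a pattern.  */
typedef struct {
  uint32_t required[MAX_REQUIRED];  /* modelled charset */
  size_t n_required;                /* 0 means "no charset" */
  bool has_langset;                 /* modelled langset presence only */
} constraint_model;

/* Model of the fix: with neither a charset nor a langset, require one
   codepoint that lies outside the ISO-8859 range.  */
void
add_sentinel_if_unconstrained (constraint_model *c)
{
  if (c->n_required == 0 && ! c->has_langset)
    {
      assert (c->n_required < MAX_REQUIRED);
      c->required[c->n_required++] = SENTINEL_CODEPOINT;
    }
}

/* Assumption: an ISO-8859-1 font covers only U+0000..U+00FF.  */
bool
latin1_font_has_char (uint32_t codepoint)
{
  return codepoint <= 0xFFu;
}

/* The modelled font matches if it covers every required codepoint;
   the langset is ignored in this sketch.  */
bool
latin1_font_matches (const constraint_model *c)
{
  for (size_t i = 0; i < c->n_required; i++)
    if (! latin1_font_has_char (c->required[i]))
      return false;
  return true;
}
```

Starting from a zero-initialized constraint_model, latin1_font_matches()
accepts the font; after add_sentinel_if_unconstrained() the same font is
rejected, which is the behaviour the patch is after.  A model that already
has a langset is left untouched.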

(We could add codepoints 32 through 127 and 192 through 255 to the
ISO-10646 charset, but it's unlikely that any font advertising itself
as ISO-10646 will be missing those codepoints.  If we do need those
extra codepoints, we can copy the implementation from


Derek Upham

------------------------------ cut here ------------------------------

Index: ftfont.c
RCS file: /sources/emacs/emacs/src/ftfont.c,v
retrieving revision 1.9
diff -u -u -r1.9 ftfont.c
--- ftfont.c    3 Apr 2008 08:16:54 -0000    1.9
+++ ftfont.c    6 May 2008 21:08:44 -0000
@@ -38,6 +38,9 @@
 #include "font.h"
 #include "ftfont.h"
+/* Codepoint in ISO-10646 that most English fonts will have. */
 /* Symbolic type of this font-driver.  */
 Lisp_Object Qfreetype;
@@ -521,6 +524,20 @@
+  /* Lack of charset and langset at this point indicates a requested
+     ISO-10646 registry with no special script or language
+     requirement.  We need a charset with some codepoint outside of
+     the ISO-8859-* range that most "English" fonts will have.
+     Otherwise the resulting pattern will also match ISO-8859 fonts.  */
+  if (! charset && ! langset)
+    {
+      charset = FcCharSetCreate ();
+      if (! charset)
+        goto err;
+      if (! FcCharSetAddChar (charset, CODEPOINT_ISO10646_ENGLISH))
+        goto err;
+    }
   pattern = FcPatternCreate ();
   if (! pattern)
     goto err;
