bug#20173: 24.4; Rendering misallocates combining marks on ligatures

From: Richard Wordingham
Subject: bug#20173: 24.4; Rendering misallocates combining marks on ligatures
Date: Tue, 24 Mar 2015 08:28:28 +0000

On Tue, 24 Mar 2015 05:42:18 +0200
Eli Zaretskii <address@hidden> wrote:

> If the setting of composition
> rules for Arabic is not the culprit, then what is?  AFAIK, there are
> no rules that guide Emacs's shaping except what's in
> composition-function-table.  Beyond that, the only other factor is the
> font backend and how it shapes glyphs given the chunks of text Emacs
> presents to it.

The font backend on Unixy systems consists of three components - m17n
(shaping control), libotf (OTL look-up implementation) and Freetype
(glyph rendering).  The glue between them is in Emacs,
most relevantly in function ftfont_drive_otf() in ftfont.c.

My analysis of the problem, which could quite easily be wrong, is as
follows.  To control the positioning of marks for the mark2ligature
lookup, it is necessary to record in some fashion which component of
the ligature a mark applies to.  I cannot see this information being
stored.  The information should be generated and used by libotf, but
needs to be stored between callbacks of ftfont_drive_otf() by m17n.
(The initial settings are implicit in the sequence of codepoints.)
Storing this information would, so far as I can see, require a change to

I may be able to change my font to work round this bug; I can certainly
change it to hide the symptom I observed.  The solution will be to
categorise the ligature NAA <U+1A36, U+1A63> as a base glyph rather
than as a ligature glyph.

There are other places where the HarfBuzz rendering system, which aims
to be compatible with Windows, uses this information.  In particular,
marks applied to a ligature are only allowed to ligate if they apply to
the same component of a ligature, and mark2mark positioning only
applies if the two marks apply to the same component.  This logic is
described as 'the most tricky part of the OpenType specification'.
Part of the trickiness may be that it seems not to have been
published externally (possibly not even internally) by Microsoft.  The
guiding principle seems to be that one should do the right things to the
marks on a ligature of Arabic consonants.
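
As a rough illustration of the bookkeeping discussed above, the following Python sketch tags each mark with the ligature component it followed and then applies a "same component" test. It is a sketch under stated assumptions, not the code of HarfBuzz, libotf or m17n: the `Glyph` fields `lig_id` and `lig_comp`, the functions `ligate` and `same_component`, and the glyph names are all hypothetical, and the real engines apply further conditions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    name: str
    is_mark: bool = False
    lig_id: int = 0    # hypothetical field: 0 = not associated with a ligature
    lig_comp: int = 0  # hypothetical field: 1-based component a mark applies to


def ligate(glyphs: List[Glyph], components: List[str],
           lig_name: str, lig_id: int) -> List[Glyph]:
    """Replace the first run matching `components` (marks are skipped while
    matching) with one ligature glyph, tagging every mark inside or directly
    after the run with the component it followed."""
    for start, g in enumerate(glyphs):
        if g.is_mark or g.name != components[0]:
            continue
        matched, marks, pos = 0, [], start
        while pos < len(glyphs):
            cur = glyphs[pos]
            if cur.is_mark:
                # Record which component this mark belongs to.
                marks.append(Glyph(cur.name, True, lig_id, matched))
            elif matched < len(components) and cur.name == components[matched]:
                matched += 1
            else:
                break
            pos += 1
        if matched == len(components):
            lig = Glyph(lig_name, False, lig_id, 0)
            return glyphs[:start] + [lig] + marks + glyphs[pos:]
    return glyphs  # no match: leave the sequence unchanged


def same_component(a: Glyph, b: Glyph) -> bool:
    """Simplified form of the rule: two marks may ligate or take
    mark2mark positioning only if they sit on the same component."""
    return (a.is_mark and b.is_mark
            and a.lig_id == b.lig_id and a.lig_comp == b.lig_comp)


# Hypothetical glyph names for a two-component ligature with a mark on each part.
seq = [Glyph("na"), Glyph("above1", True), Glyph("aa"), Glyph("above2", True)]
out = ligate(seq, ["na", "aa"], "na_aa", lig_id=1)
```

Here the two marks end up tagged with components 1 and 2, so `same_component(out[1], out[2])` is false and, under this rule, they would not be allowed to interact.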

I have become well-acquainted with this logic because the 'same
component logic' seems to be applied by HarfBuzz regardless of whether
the marks are preceded by a base glyph or a ligature glyph.  The
Windows logic seems similar, but is subtly different.  I hit problems
with the Tai Tham NAA ligature, because the marks above on its two
components do interact.  The marks below should probably also interact,
but combinations where I would expect them to have to interact seem not
to occur in natural text.

> > As to what needs fixing in the Arabic section of misc-lang.el:

> Thanks, I will look into these.

You might want to check first whether composed Arabic is usable.
Doesn't making each word a grapheme cluster make editing
unpleasant?  It might be worth restricting the clustering to
cursively connected sequences of letters within a word.
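
To make the suggestion concrete, here is a rough Python sketch that splits a word into cursively connected runs. It is only an illustration, not Emacs code: the function `connected_runs` is hypothetical, and the two character sets are a small, incomplete hand-written subset of the Unicode joining types (right-joining letters do not connect to what follows; hamza does not connect at all). A real implementation would take the joining types from the Unicode data.

```python
import unicodedata

# Assumption: incomplete subset of right-joining letters (alef forms,
# waw with hamza, teh marbuta, dal, thal, reh, zain, waw).
RIGHT_JOINING = set("\u0622\u0623\u0624\u0625\u0627\u0629"
                    "\u062F\u0630\u0631\u0632\u0648")
# Assumption: incomplete subset of non-joining letters (hamza only).
NON_JOINING = {"\u0621"}


def connected_runs(word: str) -> list:
    """Split `word` into runs of letters that join cursively."""
    runs, current = [], ""
    joins_forward = False  # does the last letter of `current` join the next one?
    for ch in word:
        if unicodedata.combining(ch):
            # Combining marks are transparent and stay with their letter.
            current += ch
            continue
        if current and (not joins_forward or ch in NON_JOINING):
            runs.append(current)
            current = ""
        current += ch
        joins_forward = ch not in RIGHT_JOINING and ch not in NON_JOINING
    if current:
        runs.append(current)
    return runs


# KAF TEH ALEF BEH: the alef does not join forward, so the final beh
# forms a run of its own.
runs = connected_runs("\u0643\u062A\u0627\u0628")
```

Each run returned would then be a candidate for a single composition, rather than the whole word.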

