Re: Arabic support

From: Kenichi Handa
Subject: Re: Arabic support
Date: Thu, 02 Sep 2010 22:01:07 +0900

In article <address@hidden>, Eli Zaretskii <address@hidden> writes:

> Where can I find the code which decides how to break text into
> LGSTRINGs?  I'd like to see such code for both Arabic and Hebrew,
> unless it's the same code.

A not-yet-shaped LGSTRING is created by autocmp_chars
(composite.c) from a character sequence that matches a
regular expression PATTERN stored in
composition-function-table.  This pattern is
"[\u0600-\u06FF]+" for Arabic (lisp/language/misc-lang.el),
and a more complicated regex for Hebrew.
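
To illustrate what the Arabic pattern selects, here is a minimal
sketch in C.  It is not the code in composite.c; the names
in_arabic_block and arabic_run_length are hypothetical, and the
text is assumed to be already decoded into an array of code points.
It only shows which span "[\u0600-\u06FF]+" would cover when
matching starts at a given position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if CP lies in U+0600..U+06FF, the
   range named by the Arabic pattern.  */
int in_arabic_block(uint32_t cp)
{
    return cp >= 0x0600 && cp <= 0x06FF;
}

/* Hypothetical helper: length of the longest run of Arabic-block
   code points that begins at START, or 0 if CPS[START] is outside
   the block.  This mirrors a greedy match of "[\u0600-\u06FF]+".  */
size_t arabic_run_length(const uint32_t *cps, size_t n, size_t start)
{
    assert(start <= n);
    size_t i = start;
    while (i < n && in_arabic_block(cps[i]))
        i++;
    return i - start;
}
```

A space or an ASCII digit ends the run, so under this pattern
alone such neutrals never become part of the candidate sequence.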

> For example, can characters like digits or other neutrals be included
> in the same LGSTRING with Arabic and Hebrew?  Or will an LGSTRING
> always include characters from one script only?

An LGSTRING always includes only characters displayed with the
same font.  So, even if you wrote PATTERN to include the other
neutrals, if a user's font setting (or environment) decides to
use a different font for those neutrals, they are not included
in the LGSTRING.  By default, Emacs tries to use the same font
for characters in the same script.
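
The same-font constraint can be pictured with another small C
sketch.  This is an assumption-laden illustration, not how Emacs
selects fonts: font_ids is a hypothetical array holding one
identifier per character for whatever font was chosen, and
same_font_run_length is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of consecutive characters, beginning
   at START, that were assigned the same font as the character at
   START.  Returns 0 when START is at the end of the text.  */
size_t same_font_run_length(const int *font_ids, size_t n, size_t start)
{
    assert(start <= n);
    if (start == n)
        return 0;
    size_t i = start + 1;
    while (i < n && font_ids[i] == font_ids[start])
        i++;
    return i - start;
}
```

If a pattern matched five characters but only the first three
share a font, only those three could go into one LGSTRING under
this model.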

In addition, even if you set up fonts so that the same font is
used for, say, Hebrew and those neutrals, the "shape" method
of the font backend may not support them.  In that case, the
composition fails anyway.

> I'm asking because it's possible that we will need to modify
> w32uniscribe.c to reorder R2L characters before we pass them to the
> Uniscribe ScriptShape API, to let it see the characters in the logical
> order it expects them.  That's if it turns out that Uniscribe cannot
> otherwise shape them correctly.

??? Currently characters and glyphs in LGSTRING are always
in logical order.  A "shape" method should also shape that
LGSTRING in logical order.

Kenichi Handa
