From: Jason Rumney
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sun, 19 Aug 2012 11:02:52 +0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)
Kenichi Handa <handa@gnu.org> writes:
> In article <83txw0aczg.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
>
>> > From: Kenichi Handa <handa@gnu.org>
>> > Cc: eliz@gnu.org, 11860@debbugs.gnu.org, smias@yandex.ru
>> > Date: Sat, 18 Aug 2012 11:45:27 +0900
>> >
>> > So, apparently Emacs on Windows and GNU/Linux uses the
>> > different metrics of glyphs.
Right, but once the offsets are added to the corresponding metrics, the
Windows and GNU/Linux results agree, except for the total height of the
font. I think that is because Windows includes inter-line spacing in
the height, while on GNU/Linux it is reported separately.
So I'm not sure that this is what is causing us problems (see Eli's
report about Hebrew). It looks like just a different reference point
being used on Windows and GNU/Linux.
> For Hebrew too, on Windows, I see the same problem as what
> Steffan <smias@yandex.ru> reported:
If you are seeing something different than Eli for Hebrew with the same
font, then I suspect the cause is linked with the version of Uniscribe
that is installed. Maybe diacritic handling for Hebrew and Arabic is a
more recent addition to Uniscribe than the basic support for those
languages.
>> > For instance, in the above case, we may have to render glyphs in
>> > this order (diacritical mark first):
>> >
>> > [0 1 1593 760 0 3 6 12 4 [1 -2 0]]
>> > [0 1 1593 969 8 1 8 12 4 nil]
I'm curious as to how we ended up with the same C entry in those
vectors. Could this be causing us problems later on? The glyph index
is correct (comparing to the GNU/Linux version), but I wonder if
Uniscribe is referring back to the character at some point and tripping
up because it has been changed.
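To make the comparison concrete, here is a small sketch that spells out the two vectors field by field. It assumes the usual LGLYPH layout, [FROM TO C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT], with ADJUSTMENT being [X-OFF Y-OFF WADJUST] or nil. The struct, the helper and the variable names are made up for illustration and are not Emacs's own types.

```c
#include <assert.h>

/* One glyph entry of an LGSTRING, in the field order
   [FROM TO C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT].
   Illustrative only: Emacs stores these as Lisp vectors.  */
struct lglyph_fields
{
  int from, to;          /* character range the glyph belongs to */
  int c;                 /* character code recorded for the glyph */
  unsigned code;         /* glyph index in the font */
  int width, lbearing, rbearing, ascent, descent;
  int has_adjustment;    /* 0 stands for a nil ADJUSTMENT */
  int xoff, yoff, wadjust;
};

/* Nonzero if two glyphs record the same source character.  */
int
same_source_char (const struct lglyph_fields *a,
                  const struct lglyph_fields *b)
{
  return a->c == b->c;
}

/* The two vectors quoted above: the zero-width glyph with an
   adjustment (taken here to be the diacritic), then the base glyph.  */
const struct lglyph_fields mark_glyph
  = { 0, 1, 1593, 760, 0, 3, 6, 12, 4, 1, 1, -2, 0 };
const struct lglyph_fields base_glyph
  = { 0, 1, 1593, 969, 8, 1, 8, 12, 4, 0, 0, 0, 0 };
```

Both entries carry C = 1593 (U+0639) and the same FROM/TO range of 0..1, while the glyph codes differ (760 and 969), which is exactly the oddity in question.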
> I've just read the function uniscribe_shape in
> w32uniscribe.c. It seems that these are the key API for
> uniscribe:
>
> * ScriptItemize -- no idea what this is
This should be a no-op in Emacs, as we have already split the string
into LGSTRING components. But if it is not called, subsequent Uniscribe
operations fail, so it must also be initializing some internal
structures.
> * ScriptShape -- perhaps for glyph substitution (GSUB features of opentype)
> * ScriptPlace -- perhaps for glyph positioning (GPOS features of opentype)
Yes, I think that is correct.
> So at first please check the documentation of ScriptShape
> and figure out how it works for bidi script; i.e. what order
> does it expect for input, and what order does it produce.
>
> Next please find the meaning of this code fragment:
>
>   /* Detect clusters, for linking codes back to
>      characters.  */
>   if (attributes[j].fClusterStart)
>     {
>       while (from < nchars_in_run && clusters[from] < j)
>         from++;
>       if (from >= nchars_in_run)
>         from = to = nchars_in_run - 1;
>       else
>         {
>           int k;
>           to = nchars_in_run - 1;
>           for (k = from + 1; k < nchars_in_run; k++)
>             {
>               if (clusters[k] > j)
>                 {
>                   to = k - 1;
>                   break;
>                 }
>             }
>         }
>     }
>
> The comment refers to "clusters". I don't know exactly what
> that means in Uniscribe, but I guess it relates to grapheme
> clusters, and if so, this part seems to relate to the
> ordering of glyphs in this kind of grapheme cluster:
>
> [0 1 1593 969 8 1 8 12 4 nil]
> [0 1 1593 760 0 3 6 12 4 [1 -2 0]]
That seems to be correct. Maybe this is the code that is changing the
character code to 1593. I seem to recall that something like this was
required for Indic languages to let Emacs know which characters had been
linked back into one glyph.
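To see what that loop computes, here is the quoted fragment lifted into a standalone function that can be run on sample data. This is only a sketch: the function name and signature are mine, and it assumes that clusters[i] holds the index of the first glyph of the cluster containing character i, with values in ascending order. That fits a left-to-right run; how the values look for a right-to-left run is not modelled here.

```c
#include <assert.h>

/* Compute the character range [*from, *to] that maps to the cluster
   starting at glyph J.  The body follows the fragment quoted above
   from uniscribe_shape; the wrapper is hypothetical.  *FROM carries
   over between calls, as the variable does in the original loop.  */
void
cluster_char_range (const unsigned short *clusters, int nchars_in_run,
                    int j, int *from, int *to)
{
  assert (nchars_in_run > 0);

  /* Skip characters whose cluster starts before glyph J.  */
  while (*from < nchars_in_run && clusters[*from] < j)
    (*from)++;

  if (*from >= nchars_in_run)
    *from = *to = nchars_in_run - 1;
  else
    {
      int k;

      /* The range ends just before the first character that belongs
         to a later cluster, or at the end of the run.  */
      *to = nchars_in_run - 1;
      for (k = *from + 1; k < nchars_in_run; k++)
        if (clusters[k] > j)
          {
            *to = k - 1;
            break;
          }
    }
}
```

With clusters = {0, 0, 1} (two characters sharing glyph 0, a third on glyph 1) and from starting at 0, the call for j = 0 gives from = 0, to = 1, and the following call for j = 1 gives from = to = 2. If the C entry of each glyph is then filled in from the character at from (an assumption on my part), every glyph in a base-plus-diacritic cluster would end up with the base character's code, which would explain the repeated 1593.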