Re: Composing Hebrew diacriticals
From: Kenichi Handa
Subject: Re: Composing Hebrew diacriticals
Date: Thu, 01 Jul 2010 14:52:23 +0900
In article <address@hidden>, Yair F <address@hidden> writes:
> Sorry about that. Please find hebrew-sample2.txt, the source file.
> Arial-anottated.png is this file displayed using Emacs with the Arial font.
> The numbers in red refer to the following comments; the general flow is
> top to bottom, right to left:
> 1. The Shin dot should be rendered near the right leg. Currently it is
> rendered above the centre leg, which is unreadable.
> 2. All points below should be horizontally centred relative to the
> base letter. Currently it seems that they are aligned to the left.
> The exception to this rule is letters that have a single downward leg,
> such as ו, ר, ד, ז: the points should be rendered directly under the
> leg for these letters.
> 3. The Shva point touches Qof's leg. The result is unreadable.
> 4. The Dagesh point is hidden within the Shin letter.
> 5. This is not Hebrew, but the combining dot above should be composed
> with the letter A.
> 6. The Holam point should be to the left of the leg, not to the right.
> The result is unreadable.
> 7. The Shuruq point should be to the left of the Vav letter, not to the
> right. The result is unreadable.
All of those are glyph positioning problems, and they can be
improved by adding more code to hebrew-shape-gstring.
> > Anyway, for fonts that don't have OpenType tables for Hebrew
> > script, we can do nothing other than artificially adjusting
> > glyph position. Have you seen any other application
> > rendering Hebrew well with that Arial font?
> Openoffice and Firefox correctly render Hebrew points.
??? When I open your hebrew-sample2.txt with oowriter and
specify the Arial font, the rendering is almost (exactly?) the
same as that of Emacs (see the attached image).
I confirmed that Firefox (and all applications using
Pango/HarfBuzz, e.g. gedit) do indeed render Hebrew better
with Arial. Reading the Pango code, I found that it has a
fallback shaping engine that is used for fonts without
Hebrew GPOS OpenType tables. Here's an excerpt from
pango/module/hebrew-shaper.c. You'll see that it checks
various character combinations and adjusts glyph offsets
accordingly. But the code has many magic numbers
(e.g. 3.5, 0.7, 0.5, 1/3, 3/5, ...). I think it's a dirty,
ad-hoc hack.
Theoretically, it is possible to do the same thing in the
function hebrew-shape-gstring. But is it really worth
doing? Isn't it enough to tell Hebrew users to use
properly designed OpenType fonts?
============================================================
void
hebrew_shaper_get_cluster_kerning(gunichar *cluster,
                                  gint cluster_length,
                                  PangoRectangle ink_rect[],
                                  /* input and output */
                                  gint width[],
                                  gint x_offset[],
                                  gint y_offset[])
{
  int i;
  int base_ink_x_offset, base_ink_y_offset, base_ink_width, base_ink_height;
  gunichar base_char = cluster[0];
  x_offset[0] = 0;
  y_offset[0] = 0;
  if (cluster_length == 1)
    {
      /* Make lone 'vav dot' have zero width */
      if (base_char == UNI_SHIN_DOT
          || base_char == UNI_SIN_DOT
          || base_char == UNI_HOLAM
          ) {
        x_offset[0] = -ink_rect[0].x - ink_rect[0].width;
        width[0] = 0;
      }
      return;
    }
  base_ink_x_offset = ink_rect[0].x;
  base_ink_y_offset = ink_rect[0].y;
  base_ink_width = ink_rect[0].width;
  base_ink_height = ink_rect[0].height;
  /* Do heuristics */
  for (i=1; i<cluster_length; i++)
    {
      int gl = cluster[i];
      x_offset[i] = 0;
      y_offset[i] = 0;
      /* Check if it is a point */
      if (gl < 0x5B0 || gl >= 0x05D0)
        continue;
      /* Center dot of VAV */
      if (gl == UNI_MAPIQ && base_char == UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;
          /* If VAV is a vertical bar without a roof, then we
             need to make room for the dot by increasing the
             cluster width. But how can I check if that is the
             case??
          */
          /* This is wild, but it does the job of differentiating
             between two M$ fonts... Base the decision on the
             aspect ratio of the vav...
          */
          if (base_ink_height > base_ink_width * 3.5)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;
              /* Shift all characters to make place for the mapiq */
              for (j=0; j<i; j++)
                x_offset[j] += ink_rect[i].width*(1+space-kern);
              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
              x_offset[i] -= ink_rect[i].width*(kern);
            }
        }
      /* Dot over SHIN */
      else if (gl == UNI_SHIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }
      /* Dot over SIN */
      else if (gl == UNI_SIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;
        }
      /* VOWEL DOT above to any other character than
         SHIN or VAV should stick out a bit to the left. */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && base_char != UNI_SHIN && base_char != UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            - ink_rect[i].width * 3/2;
        }
      /* VOWELS under resh or vav are right aligned, if they are
         narrower than the characters. Otherwise they are centered.
      */
      else if ((base_char == UNI_VAV
                || base_char == UNI_RESH
                || base_char == UNI_YOD
                || base_char == UNI_DALED
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS)
               && ink_rect[i].width < base_ink_width
               )
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }
      /* VOWELS under FINAL KAF are offset centered and offset in
         y */
      else if ((base_char == UNI_FINAL_KAF
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS))
        {
          /* x are at 1/3 to take into account the stem */
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 1/3 - ink_rect[i].width/2;
          /* Center in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height * 1/2 - ink_rect[i].height/2;
        }
      /* MAPIQ in PE or FINAL PE */
      else if (gl == UNI_MAPIQ
               && (base_char == UNI_PE || base_char == UNI_FINAL_PE))
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 2/3 - ink_rect[i].width/2;
          /* Another option is to offset the MAPIQ in y...
             glyphs->glyphs[cluster_start_idx+i].geometry.y_offset
             -= base_ink_height/5; */
        }
      /* MAPIQ in SHIN should be moved a bit to the right */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_SHIN)
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/5 - ink_rect[i].width/2;
        }
      /* MAPIQ in YUD is right aligned */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_YOD)
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x;
          /* Lower left in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height - ink_rect[i].height*1.75;
          if (base_ink_height > base_ink_width * 2)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;
              /* Shift all cluster characters to make space for mapiq */
              for (j=0; j<i; j++)
                x_offset[j] += ink_rect[i].width*(1+space-kern);
              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
            }
        }
      /* VOWEL DOT next to any other character */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && (base_char != UNI_VAV))
        {
          x_offset[i] = base_ink_x_offset -ink_rect[i].x;
        }
      /* Move nikud of taf a bit ... */
      else if (base_char == UNI_TAV && gl == UNI_MAPIQ)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 5/8 - ink_rect[i].width/2;
        }
      /* Move center dot of characters with a right stem and no
         left stem. */
      else if (gl == UNI_MAPIQ &&
               (base_char == UNI_BET
                || base_char == UNI_DALED
                || base_char == UNI_KAF
                || base_char == UNI_GIMMEL
                ))
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/8 - ink_rect[i].width/2;
        }
      /* Right align wide nikud under QOF */
      else if (base_char == UNI_QOF &&
               ( (gl >= UNI_HATAF_SEGOL
                  && gl <= UNI_HATAF_QAMATZ)
                 || (gl >= UNI_TSERE
                     && gl<= UNI_QAMATS)
                 || (gl == UNI_QUBUTS)))
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }
      /* Center by default */
      else
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width/2 - ink_rect[i].width/2;
        }
    }
}
============================================================
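As a reading aid, most branches in the excerpt above reduce to
three or four offset formulas on the ink boxes. The following is
only a sketch of that arithmetic, not Pango code: InkSpan and all
four function names are hypothetical, and it models just the x and
width fields of the ink rectangle. The fraction helper assumes the
same left-to-right integer evaluation as expressions such as
base_ink_width * 2/3 in the excerpt.

```c
/* Hypothetical stand-in for the x/width fields of an ink rectangle. */
typedef struct {
    int x;      /* left edge of the ink, relative to the glyph origin */
    int width;  /* width of the ink */
} InkSpan;

/* Align the mark's left ink edge with the base's left ink edge
   (the form used for the dot over SIN). */
static int left_align_offset(InkSpan base, InkSpan mark)
{
    return base.x - mark.x;
}

/* Align the mark's right ink edge with the base's right ink edge
   (the form used for the dot over SHIN). */
static int right_align_offset(InkSpan base, InkSpan mark)
{
    return base.x + base.width - mark.x - mark.width;
}

/* Centre the mark's ink on the base's ink (the default branch). */
static int center_offset(InkSpan base, InkSpan mark)
{
    return base.x - mark.x + base.width / 2 - mark.width / 2;
}

/* Centre the mark on the point num/den of the way across the base.
   Integer arithmetic, evaluated as (base.width * num) / den. */
static int fraction_offset(InkSpan base, InkSpan mark, int num, int den)
{
    return base.x - mark.x + base.width * num / den - mark.width / 2;
}
```

For a base with ink x = 1, width = 10 and a mark with ink x = 3,
width = 4, center_offset gives 1, which moves the mark's ink to
span 4..8 and puts both ink centres at x = 6. The per-letter
fractions (1/3, 3/5, 5/8, ...) are the font-dependent part; the
formulas themselves are the same in every branch.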
> The poetry site
> you mentioned http://www.zemer.co.il/song.asp?id=393 uses David and
> is rendered correctly.
> Kate (using Pango?) also renders better using Arial and David-CLM. It has
> some other issues, though, but the result is mostly readable.
As Kate is a KDE application, I think it's not using Pango.
But if it renders Hebrew well with Arial, it (or the rendering
module of KDE/Qt) must have similar ad-hoc code.
---
Kenichi Handa
address@hidden
oowriter-arial.png
Description: PNG image