Re: Composing Hebrew diacriticals


From: Kenichi Handa
Subject: Re: Composing Hebrew diacriticals
Date: Thu, 01 Jul 2010 14:52:23 +0900

In article <address@hidden>, Yair F <address@hidden> writes:

> Sorry about that. Please find hebrew-sample2.txt, the source file.
> Arial-anottated.png is this file displayed using emacs with Arial font.
> The numbers in red refer to the following comments; the general flow is
> top-bottom, right-left:
> 1. Shin-Dot should be rendered near the right leg. Currently it is
> rendered above the centre leg; this is unreadable.
> 2. All points below should be horizontally centred relative to the
> base letter. Currently it seems that they are aligned to the left.
> The exception to this rule is letters that have a single leg downward,
> such as ו, ר, ד, ז: the points should be rendered directly under the
> leg for these letters.
> 3. The Shva point touches Qof's leg. the result is unreadable.
> 4. The Dagesh point is hidden within the Shin letter.
> 5. This is not Hebrew, but the combining dot above should be composed
> with the letter A.
> 6. The Holam point should be to the left of the leg, not to the right.
> Result is unreadable.
> 7. Shuruq point should be to the left of the vav letter, not to the
> right. Result is unreadable.

All those are glyph positioning problems and can be improved
by adding more code to hebrew-shape-gstring.

> > Anyway, for fonts that don't have OpenType tables for Hebrew
> > script, we can do nothing other than artificially adjusting
> > glyph position.  Have you seen any other application
> > rendering Hebrew well with that Arial font?
> Openoffice and Firefox correctly render Hebrew points.

??? When I open your hebrew-sample2.txt with oowriter, and
specify Arial font, the rendering is almost (exactly?) the
same as that of Emacs (see the attached image).

I confirmed that Firefox (and all applications using
Pango/harfbuzz; e.g. gedit) do render Hebrew better
with Arial.  By reading the code of Pango, I found
that it has a fallback shaping engine that is used for a
font that has no Hebrew GPOS OpenType tables.  Here's the excerpt
from pango/module/hebrew-shaper.c.  You'll see that it
checks various character combinations and adjusts glyph
offsets accordingly.  But the code has many magic numbers
(e.g. 3.5, 0.7, 0.5, 1/3, 3/5, ...).  I think it's a dirty &
ad-hoc hack.

Theoretically, it is possible to do the same thing in the
function hebrew-shape-gstring.  But, is it really worth
doing that?  Isn't it enough to tell Hebrew users to use
properly designed OpenType fonts?

============================================================
void
hebrew_shaper_get_cluster_kerning(gunichar            *cluster,
                                  gint                cluster_length,
                                  PangoRectangle      ink_rect[],

                                  /* input and output */
                                  gint                width[],
                                  gint                x_offset[],
                                  gint                y_offset[])
{
  int i;
  int base_ink_x_offset, base_ink_y_offset, base_ink_width, base_ink_height;
  gunichar base_char = cluster[0];

  x_offset[0] = 0;
  y_offset[0] = 0;

  if (cluster_length == 1)
    {
      /* Make lone 'vav dot' have zero width */
      if (base_char == UNI_SHIN_DOT
          || base_char == UNI_SIN_DOT
          || base_char == UNI_HOLAM
          ) {
        x_offset[0] = -ink_rect[0].x - ink_rect[0].width;
        width[0] = 0;
      }

      return;
    }

  base_ink_x_offset = ink_rect[0].x;
  base_ink_y_offset = ink_rect[0].y;
  base_ink_width = ink_rect[0].width;
  base_ink_height = ink_rect[0].height;

  /* Do heuristics */
  for (i=1; i<cluster_length; i++)
    {
      int gl = cluster[i];
      x_offset[i] = 0;
      y_offset[i] = 0;

      /* Check if it is a point */
      if (gl < 0x5B0 || gl >= 0x05D0)
        continue;

      /* Center dot of VAV */
      if (gl == UNI_MAPIQ && base_char == UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;

          /* If VAV is a vertical bar without a roof, then we
             need to make room for the dot by increasing the
             cluster width. But how can I check if that is the
             case??
          */
          /* This is wild, but it does the job of differentiating
             between two M$ fonts... Base the decision on the
             aspect ratio of the vav...
          */
          if (base_ink_height > base_ink_width * 3.5)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;

              /* Shift all characters to make place for the mapiq */
              for (j=0; j<i; j++)
                  x_offset[j] += ink_rect[i].width*(1+space-kern);

              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
              x_offset[i] -= ink_rect[i].width*(kern);
            }
        }

      /* Dot over SHIN */
      else if (gl == UNI_SHIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* Dot over SIN */
      else if (gl == UNI_SIN_DOT && base_char == UNI_SHIN)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x;
        }

      /* VOWEL DOT above to any other character than
         SHIN or VAV should stick out a bit to the left. */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && base_char != UNI_SHIN && base_char != UNI_VAV)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            - ink_rect[i].width * 3/2;
        }

      /* VOWELS under resh or vav are right aligned, if they are
         narrower than the characters. Otherwise they are centered.
       */
      else if ((base_char == UNI_VAV
                || base_char == UNI_RESH
                || base_char == UNI_YOD
                || base_char == UNI_DALED
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS)
               && ink_rect[i].width < base_ink_width
               )
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* VOWELS under FINAL KAF are offset centered and offset in
         y */
      else if ((base_char == UNI_FINAL_KAF
                )
               && ((gl >= UNI_SHEVA && gl <= UNI_QAMATS) ||
                   gl == UNI_QUBUTS))
        {
          /* x are at 1/3 to take into account the stem */
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 1/3 - ink_rect[i].width/2;

          /* Center in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height * 1/2 - ink_rect[i].height/2;
        }


      /* MAPIQ in PE or FINAL PE */
      else if (gl == UNI_MAPIQ
               && (base_char == UNI_PE || base_char == UNI_FINAL_PE))
        {
          x_offset[i]= base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 2/3 - ink_rect[i].width/2;

          /* Another option is to offset the MAPIQ in y...
             glyphs->glyphs[cluster_start_idx+i].geometry.y_offset
             -= base_ink_height/5; */
        }

      /* MAPIQ in SHIN should be moved a bit to the right */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_SHIN)
        {
          x_offset[i]=  base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/5 - ink_rect[i].width/2;
        }

      /* MAPIQ in YUD is right aligned */
      else if (gl == UNI_MAPIQ
               && base_char == UNI_YOD)
        {
          x_offset[i]=  base_ink_x_offset - ink_rect[i].x;

          /* Lower left in y */
          y_offset[i] = base_ink_y_offset - ink_rect[i].y
            + base_ink_height - ink_rect[i].height*1.75;

          if (base_ink_height > base_ink_width * 2)
            {
              int j;
              double space = 0.7;
              double kern = 0.5;

              /* Shift all cluster characters to make space for mapiq */
              for (j=0; j<i; j++)
                x_offset[j] += ink_rect[i].width*(1+space-kern);

              width[cluster_length-1] += ink_rect[i].width*(1+space-kern);
            }

        }

      /* VOWEL DOT next to any other character */
      else if ((gl == UNI_SIN_DOT || gl == UNI_HOLAM)
               && (base_char != UNI_VAV))
        {
          x_offset[i] = base_ink_x_offset -ink_rect[i].x;
        }

      /* Move nikud of taf a bit ... */
      else if (base_char == UNI_TAV && gl == UNI_MAPIQ)
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 5/8 - ink_rect[i].width/2;
        }

      /* Move center dot of characters with a right stem and no
         left stem. */
      else if (gl == UNI_MAPIQ &&
               (base_char == UNI_BET
                || base_char == UNI_DALED
                || base_char == UNI_KAF
                || base_char == UNI_GIMMEL
                ))
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width * 3/8 - ink_rect[i].width/2;
        }

      /* Right align wide nikud under QOF */
      else if (base_char == UNI_QOF &&
               ( (gl >= UNI_HATAF_SEGOL
                  && gl <= UNI_HATAF_QAMATZ)
                 || (gl >= UNI_TSERE
                     && gl<= UNI_QAMATS)
                 || (gl == UNI_QUBUTS)))
        {
          x_offset[i] = base_ink_x_offset + base_ink_width
            - ink_rect[i].x - ink_rect[i].width;
        }

      /* Center by default */
      else
        {
          x_offset[i] = base_ink_x_offset - ink_rect[i].x
            + base_ink_width/2 - ink_rect[i].width/2;
        }
    }

}
============================================================
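
Most branches in the excerpt reduce to one of three x-offset formulas:
align the mark's ink box with the left edge of the base, with its right
edge, or with its centre.  The following is a minimal sketch of just that
arithmetic, not Pango's code: InkRect is a hypothetical stand-in for
PangoRectangle, and the three function names are made up for
illustration.

```c
/* Hypothetical stand-in for PangoRectangle: the ink box of one glyph,
   measured from the glyph origin.  Not part of Pango. */
typedef struct
{
  int x, y, width, height;
} InkRect;

/* Offset that puts the left edge of the mark's ink box on the left
   edge of the base's ink box (the "Dot over SIN" case). */
static int
left_align_x_offset (const InkRect *base, const InkRect *mark)
{
  return base->x - mark->x;
}

/* Offset that puts the right edge of the mark's ink box on the right
   edge of the base's ink box (the "Dot over SHIN" case). */
static int
right_align_x_offset (const InkRect *base, const InkRect *mark)
{
  return base->x + base->width - mark->x - mark->width;
}

/* Offset that centres the mark's ink box on the base's ink box
   (the "Center by default" case). */
static int
center_x_offset (const InkRect *base, const InkRect *mark)
{
  return base->x - mark->x + base->width / 2 - mark->width / 2;
}
```

For example, with a base ink box of x=100, width=600 and a mark ink box
of x=-50, width=100, the three functions give 150, 650 and 400: the mark
then starts at 100, ends at 700, or is centred on 400 respectively.  The
other branches in the excerpt use the same pattern with a fraction such
as 1/3, 3/5 or 5/8 of the base width in place of 1/2, which is where the
magic numbers come from.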

> The poetry site
> you mentioned, http://www.zemer.co.il/song.asp?id=393, uses David and
> is correctly rendered.
> Kate (using pango?) also renders better using Arial, David-CLM. It has
> some other issues though, but the result is mostly readable.

As Kate is a KDE application, I think it's not using Pango.
But, if it renders Hebrew with Arial well, it (or the rendering
module of KDE/Qt) must have similar ad-hoc code.

---
Kenichi Handa
address@hidden

Attachment: oowriter-arial.png
Description: PNG image
