bug-gnu-emacs
bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly


From: Eli Zaretskii
Subject: bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly
Date: Sat, 02 Oct 2021 15:18:28 +0300

> From: Rah Guzar <aikrahguzar@gmail.com>
> Date: Sat, 2 Oct 2021 13:43:47 +0200
> 
> Let us consider the word نہیں
> 
> It is composed of four letters.  For each of them, I give the character
> field from `describe-char` below:
> 1) ن‎ (displayed as ن‎) (codepoint 1606, #o3106, #x646)
> 2)  ہ‎ (displayed as ہ‎) (codepoint 1729, #o3301, #x6c1)
> 3)  ی‎ (displayed as ی‎) (codepoint 1740, #o3314, #x6cc)
> 4) ں‎ (displayed as ں‎) (codepoint 1722, #o3272, #x6ba)
> 
> It should be displayed with all 4 characters joined together; instead, they
> are all displayed individually.

What font displays them individually?  You should be able to tell that
if you type "C-u C-x =" on one of these characters.
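
If you prefer to get the same information from Lisp, something along
these lines should work.  This is only a sketch: the function name is
made up for this message, and it assumes a graphical frame, since
text terminals have no font objects to report.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun bug50951-font-at-point ()
  "Report the font used to display the character at point.
Return the font's XLFD name as a string, or nil when no font
information is available (text terminal, or point at end of buffer)."
  (interactive)
  (let* ((font (and (display-graphic-p)
                    (< (point) (point-max))
                    (font-at (point))))
         (name (and font (font-xlfd-name font))))
    (message "%s" (or name "No font information here"))
    name))
```

Put point on one of the letters that fails to join and invoke it with
"M-x bug50951-font-at-point".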

For me, they display joined together.

> If I change to `NotoNastaliqUrdu` this word is displayed correctly. But there
> is a problem with حرف
> 
> It consists of three letters:
> 1) ح‎ (displayed as ح‎) (codepoint 1581, #o3055, #x62d)
> 2) ر‎ (displayed as ر‎) (codepoint 1585, #o3061, #x631)
> 3) ف‎ (displayed as ف‎) (codepoint 1601, #o3101, #x641)
> 
> The first two characters should be joined and the last one should be on its 
> own. This seems to be the case.
> But the two groups are rendered on top of each other making it illegible.
> 
> > So isn't this a matter of finding a proper font, in particular given
> > the "Nastaliq vs Naskh" issues?  NotoNastaliqUrdu is not the only font
> > supporting Nastaliq, so perhaps other fonts fare better?
>  
> My knowledge here is very deficient, but my impression is that Nastaliq
> and Naskh are styles and shouldn't affect composition.
> NotoNastaliqUrdu was the only Urdu font available from my distro.
> LibreOffice, which also uses HarfBuzz, renders it correctly, so I didn't
> try another font at first.  Like Emacs, LibreOffice also uses a Naskh
> font by default, but all the characters are joined properly.
> 
> I did try some fonts from https://urdufonts.net/ after your suggestion, and
> they render correctly.  Specifically, the fonts I tried were:
> Jameel Noori Nastaleeq Regular
> Alvi Nastaleeq 
> Zohra Unicode
> Manzor Unicode
> 
> I didn't notice a problem with any of them except a very minor one for the
> last two, which have visible boundaries where glyphs are joined.

So would it be correct to say that using a proper font solves the
problem?

> > Since Urdu uses the Arabic characters, Emacs uses character
> > composition rules for Arabic when displaying this text.  Do you know
> > if the composition rules for Urdu are different?
> 
> I think using Arabic composition rules might be part of the problem.  The
> Urdu alphabet is a superset of the Arabic alphabet, and if I don't set a
> font specifically designed for Urdu, the words where some characters
> should be joined but aren't always seem to include a character like ہ,
> which is in the Urdu alphabet but not in the Arabic one.

I don't think the problem is with compositions, because in the 2
examples you described above, there are no character compositions.

Moreover, our pattern for asking HarfBuzz to shape Arabic text is
this:

   "[\u0600-\u074F\u200C\u200D]+"

which includes all of the characters, including U+06C1 which you say
causes problems.
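
You can check this yourself by matching the two words from your
report against that pattern.  The snippet below is just an
illustration: the function name is invented for this message, and the
pattern is anchored with \` and \' so that it must cover the whole
word, which the pattern in our composition rules does not do.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun bug50951-arabic-shaped-p (word)
  "Return t if every character of WORD matches the Arabic shaping pattern."
  (and (string-match-p "\\`[\u0600-\u074F\u200C\u200D]+\\'" word) t))

;; The two words from the report, built from the codepoints given above.
(list (bug50951-arabic-shaped-p (string #x646 #x6c1 #x6cc #x6ba))
      (bug50951-arabic-shaped-p (string #x62d #x631 #x641)))
;; => (t t)
```

Both words match in full, so every character in them is covered by
the pattern.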

You could try setting current-iso639-language to the symbol 'ur'
(without the quotes); that should tell HarfBuzz to shape the text as
appropriate for Urdu.  But I think the real problem is with the font,
not with shaping.
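
If you want to try the language setting, evaluating something like
the following with "M-:" or in *scratch* should be enough.  I'm
assuming here that nothing in your setup resets the variable later
(Emacs normally derives it from the locale), and text that is already
on display may need to be redisplayed before any difference shows.

```elisp
;; Ask for Urdu-specific shaping; the value is a symbol, not a string.
(setq current-iso639-language 'ur)
```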




