bug-gnu-libiconv

Re: [bug-gnu-libiconv] [PATCH] armscii8 bugfix


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] [PATCH] armscii8 bugfix
Date: Sun, 11 Jul 2010 13:04:10 +0200
User-agent: KMail/1.9.9

Hi Gayane,

Gayane Sargssian wrote:
> I've found a bug in the armscii8 to unicode mapping, the և (ARMSCII-8
> 0xA8) must be mapped not as — (unicode 0x2014) but as և (unicode
> 0x0587).
> 
> I've checked the /lib/armscii_8.h file and found more errors, here
> follows a patch to fix the issues, thank you.

Thanks for the patch. Dissecting it, I see it makes the following
changes in the ARMSCII-8 -> Unicode direction:

  Code point    Mapping before                            Mapping after

  0xA1          none                                      U+00A9 COPYRIGHT SIGN
  0xA2          U+0587 ARMENIAN SMALL LIGATURE ECH YIWN   U+00A7 SECTION SIGN
  0xA8          U+2014 EM DASH                            U+0587 ARMENIAN SMALL LIGATURE ECH YIWN
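
For readers who want to reproduce the comparison, here is a minimal sketch
in Python. The names BEFORE, AFTER, describe and changed_mappings are
hypothetical, and the values are copied by hand from the table above rather
than read from lib/armscii_8.h, so this only illustrates the diff, it does
not reflect how libiconv stores the table.

```python
import unicodedata

# Assumption: these two dicts are hand-copied from the table in this mail,
# not extracted from lib/armscii_8.h. None means "no mapping".
BEFORE = {0xA1: None, 0xA2: 0x0587, 0xA8: 0x2014}
AFTER = {0xA1: 0x00A9, 0xA2: 0x00A7, 0xA8: 0x0587}


def describe(cp):
    """Format a code point as 'U+XXXX NAME', or 'none' if unmapped."""
    if cp is None:
        return "none"
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def changed_mappings(before, after):
    """Return (byte, old, new) for every byte whose mapping differs."""
    return [
        (byte, describe(before.get(byte)), describe(after.get(byte)))
        for byte in sorted(set(before) | set(after))
        if before.get(byte) != after.get(byte)
    ]


for byte, old, new in changed_mappings(BEFORE, AFTER):
    print(f"0x{byte:02X}: {old} -> {new}")
```

Keeping the two tables as plain dictionaries makes it easy to add further
code points if the patch turns out to touch more than these three.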

My references are here:
  - Wikipedia [1]
  - Comparison of conversion tables [2]
  - Linux manual page [3]

What are your references?

The mapping of 0xA1 should, according to [1], be the ARMENIAN ETERNITY SIGN.
But this sign is not in Unicode. It does not seem appropriate to me to use
U+00A9 COPYRIGHT SIGN or (as is done in some encodings [2])
U+2741 EIGHT PETALLED OUTLINED BLACK FLORETTE for it. So I would rather leave
it as is.

About the mapping of 0xA2, [1] says: "The code value A2 was used for encoding
the Armenian ligature ew (used as a symbol), but was later replaced by the
section sign punctuation. Some Armenian fonts display this ligature at the
position of the ASCII ampersand symbol..."

"was"? "later replaced"? Do you have a copy of the AST 34.002 standard to
clear up the confusion? The standard is from 1997.

About the mapping of 0xA8, [1] says that it maps to em-dash (U+2014) but
then says U+2015 (which is HORIZONTAL BAR). In any case, I don't see a reason
to map it to U+0587 ARMENIAN SMALL LIGATURE ECH YIWN.
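
To make the three candidates for 0xA8 easier to tell apart, here is a small
sketch in Python that prints each one with its Unicode name and general
category. CANDIDATES_0XA8 and label are hypothetical names, and the list only
contains the code points mentioned in this mail; it says nothing about which
one the AST 34.002 standard intends.

```python
import unicodedata

# Assumption: the candidate list is taken from the discussion in this mail.
CANDIDATES_0XA8 = [0x2014, 0x2015, 0x0587]


def label(cp):
    """Return 'U+XXXX NAME [category]' for a single code point."""
    ch = chr(cp)
    return f"U+{cp:04X} {unicodedata.name(ch)} [{unicodedata.category(ch)}]"


for cp in CANDIDATES_0XA8:
    print(label(cp))
```

The category column separates the two dash characters (punctuation) from the
ligature (a lowercase letter), which is the crux of the disagreement.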

Bruno

[1] http://en.wikipedia.org/wiki/ArmSCII
[2] http://www.haible.de/bruno/charsets/conversion-tables/Armenian.html
[3] http://www.kernel.org/doc/man-pages/online/pages/man7/armscii-8.7.html


