
[bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv


From: jake
Subject: [bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv
Date: Wed, 2 Jul 2008 13:24:15 -0400 (EDT)

address@hidden wrote:
Having multiple Unicode code points for 0xBD and 0xDB would cause two
different Unicode strings to be translated into the same MacRoman
string, making the "return trip" ambiguous. I am curious, though:
since libiconv already makes a decisive choice when going from
MacRoman to UTF-8 (instead of rejecting those characters),
wouldn't it make sense for it to apply the same consistent
behavior when going from UTF-8 to MacRoman?

libiconv currently only implements a 1:1 conversion, exactly as listed in the
file libiconv/tests/MacRoman.TXT. I'm also not much of a fan of mapping two
different Unicode code points to the same byte value, because of the round-trip
problem you mention. It's safer to tell the user clearly that a certain Unicode
code point (such as U+20AC) is not supported in the particular character set.
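
The round-trip concern can be illustrated with a minimal Python sketch. The two-entry table below is a hypothetical excerpt, not libiconv's actual MacRoman table: which code point is treated as the primary mapping for each byte is an assumption made for illustration, and the names DECODE, ENCODE_STRICT, ENCODE_LENIENT and round_trips are invented here.

```python
# Hypothetical excerpt of a byte -> code point table (decode direction).
# Which code point is "primary" for each byte is an assumption.
DECODE = {
    0xBD: "\u2126",  # OHM SIGN
    0xDB: "\u00A4",  # CURRENCY SIGN
}

# Strict 1:1 encoder: the exact inverse of DECODE, nothing more.
ENCODE_STRICT = {ch: byte for byte, ch in DECODE.items()}

# Lenient encoder: additionally maps look-alike code points to the same bytes.
ENCODE_LENIENT = dict(ENCODE_STRICT)
ENCODE_LENIENT["\u03A9"] = 0xBD  # GREEK CAPITAL LETTER OMEGA
ENCODE_LENIENT["\u20AC"] = 0xDB  # EURO SIGN


def round_trips(text, encode_table):
    """Return True/False for whether text survives encode+decode,
    or None if the encoder rejects a character as unsupported."""
    try:
        data = bytes(encode_table[ch] for ch in text)
    except KeyError:
        return None  # the converter reports an unsupported character
    return "".join(DECODE[b] for b in data) == text
```

With the strict table, `round_trips("\u20AC", ENCODE_STRICT)` reports the character as unsupported, while the lenient table accepts it and then silently returns a different string on the way back.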

I am still unclear about the motivation behind the Apple logo,
because even though I am on a Linux system,
its private-use code point U+F8FF should still get translated into
byte 240 (0xF0), shouldn't it?

It was not done this way in the MAC-ROMAN mapping table published on
ftp.unicode.org around 2000.


Bruno, thank you for the clarification.

Which text encodings in libiconv have a
unique 1:1 conversion (forward and backward) for
all 256 byte values?
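
One way to probe this question empirically is to test each candidate encoding byte by byte. The sketch below uses Python's built-in codecs, whose tables are not guaranteed to match libiconv's, so it shows the method rather than an answer for libiconv itself; the function name is_full_bijection is made up for this example.

```python
def is_full_bijection(codec_name):
    """Return True if every byte 0x00..0xFF decodes to a distinct single
    code point and that code point encodes back to the same byte."""
    try:
        chars = [bytes([b]).decode(codec_name) for b in range(256)]
    except UnicodeDecodeError:
        return False  # at least one byte value has no mapping
    # Each byte must yield exactly one code point, and all must differ.
    if any(len(c) != 1 for c in chars) or len(set(chars)) != 256:
        return False
    try:
        return all(c.encode(codec_name) == bytes([b])
                   for b, c in enumerate(chars))
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name in ("latin-1", "cp1252", "ascii"):
        print(name, is_full_bijection(name))
```

The same loop could be run against libiconv itself by calling its converter on each single byte and treating a conversion error as "no mapping".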

Jake



