Re: FAIL's in encoding unit test, bug in Unicode.m?

From: Richard Frith-Macdonald
Subject: Re: FAIL's in encoding unit test, bug in Unicode.m?
Date: Sun, 8 Dec 2002 07:33:53 +0000

On Saturday, December 7, 2002, at 12:56 am, Willem Rein Oudshoorn wrote:

So the comment suggests that if `iconv' returns a positive integer,
the conversion was lossy.

And this is exactly the place where the conversion fails.
However, the conversion from the string "ABC" is not lossy.

Also reading the documentation of `iconv' it says:

     If all input from the input buffer is successfully converted and
     stored in the output buffer the function returns the number of
     conversions performed.  In all other cases the return value is
     `(size_t) -1' and `errno' is set appropriately.  In this case the
     value pointed to by INBYTESLEFT is nonzero.

So this suggests it will return the number of successfully converted
characters.  This is consistent with the values I see.

This suggests to me that the code in Unicode.m is wrong, or
that there are two incompatible versions of iconv and I managed
to use the wrong one.

Also, if the intention is to check for lossy conversion, the relevant
part of the iconv documentation is:

     Since the character sets selected in the `iconv_open' call can be
     almost arbitrary there can be situations where the input buffer
     contains valid characters which have no identical representation
     in the output character set.  The behavior in this situation is
     undefined.  The _current_ behavior of the GNU C library in this
     situation is to return with an error immediately.  This certainly
     is not the most desirable solution.  Therefore future versions
     will provide better ones but they are not yet finished.

It might also be convenient to know that  I am using


I couldn't find a glibc that old on any of the machines on the network
where I work (the oldest there is the three-year-old libc-2.1.3), and my
system has libc-2.3.1 ... pretty much current.

My guess is that between 2.1.1 and 2.1.3 the glibc support for iconv
improved a lot.

The later documentation states that -
'The iconv function returns the number of characters converted in a non-reversible way during this call; reversible conversions are not counted.'
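
To make that reading of the return value concrete, here is a minimal sketch in C. It is not the code from Unicode.m; the helper name `convert_count_lossy' and its parameters are hypothetical, and it assumes a recent iconv that follows the later documentation quoted above (the return value counts only non-reversible conversions, and `(size_t) -1' signals an error). It also assumes a stateless target encoding, so no final flush call is made.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only (not from Unicode.m).
 * Converts the NUL-terminated string `in` from charset `from` to
 * charset `to`, writing at most outsize-1 bytes plus a NUL into `out`.
 * Returns -1 if iconv_open() or iconv() fails; otherwise returns the
 * value reported by iconv(), which later glibc documents as the number
 * of characters converted in a non-reversible way. */
long convert_count_lossy(const char *from, const char *to,
                         const char *in, char *out, size_t outsize)
{
    assert(in != NULL && out != NULL && outsize > 0);

    iconv_t cd = iconv_open(to, from);   /* note the order: to, then from */
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;              /* iconv() wants char **, input is not modified */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;        /* keep room for the terminating NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)                /* error: errno says why (E2BIG, EILSEQ, EINVAL) */
        return -1;

    *outp = '\0';
    return (long)rc;                     /* 0 means nothing was converted lossily */
}
```

With this interpretation, converting "ABC" from ASCII to UTF-8 should report 0, since every character has an exact representation in the target; a result greater than 0 would indicate a lossy conversion. On a glibc as old as the one discussed here, the same call may instead return the number of characters converted, which is why a test that treats any positive value as lossy fails there.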

I think your best bet is to upgrade to a more recent glibc. If I remember correctly, the iconv in Red Hat 5 (I think that's around the version you have) had other problems too, like quietly failing to perform conversions of large strings. So for any serious string handling with multiple character sets, you really need more recent iconv code.