Re: Small problems with gettext 0.11 on Solaris
From: Bruno Haible
Subject: Re: Small problems with gettext 0.11 on Solaris
Date: Tue, 19 Feb 2002 15:30:28 +0100 (CET)
Drazen Kacar writes:
> The first class of problematic checks looks like this:
>
> msgcat: mcat-test2.in2: warning: Charset "UTF-8" is not supported.
> msgcat relies on iconv(),
>
> The first thing to note is that the warning message might stand some
> improvement.
The actual error message was longer than that, but the msgcat-* tests
have partially filtered it away...
> However, the meaningful message should include both charset names
> which were passed to iconv_open() and not just one.
>
> Upon investigation in the debugger, I've found that the attempted
> conversion was from UTF-8 to UTF-8.
Why not?
> Solaris iconv_open() will indeed return an error if one attempts
> this.
I'd send a bug report to the Sun people.
> but I'm wondering if you would be interested in checking for this case in
> the gettext code (function po_lex_charset_set in po-charset.c printed the
> warning). There isn't much point in the overhead caused by unnecessary
> calling iconv() in this case.
The point is verifying that the input is indeed well-formed UTF-8.
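To illustrate the point, here is a minimal sketch of how a UTF-8 to UTF-8 conversion can serve as a well-formedness check. It is not the code in po-charset.c; the function name is_wellformed_utf8 is made up for this example, and it assumes an iconv() implementation that accepts the UTF-8 to UTF-8 pair and uses the `char **` input prototype (some systems declare `const char **`).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if buf[0..len) converts cleanly from UTF-8 to UTF-8,
   0 if iconv() rejects it, and -1 if the converter cannot be opened
   (the situation reported above for Solaris). */
int is_wellformed_utf8(const char *buf, size_t len)
{
    assert(buf != NULL);

    iconv_t cd = iconv_open("UTF-8", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)buf;   /* iconv() does not modify the input bytes */
    size_t inleft = len;
    int ok = 1;

    while (inleft > 0) {
        char scratch[256];    /* converted output is discarded */
        char *out = scratch;
        size_t outleft = sizeof scratch;

        if (iconv(cd, &in, &inleft, &out, &outleft) != (size_t)-1)
            break;            /* all remaining input was accepted */
        if (errno == E2BIG)
            continue;         /* scratch buffer full: reuse it and go on */
        ok = 0;               /* EILSEQ (invalid) or EINVAL (truncated) */
        break;
    }

    iconv_close(cd);
    return ok;
}
```

A caller that gets -1 knows only that the check could not be performed, which is different from the input being malformed.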
> The second problem is that msgconv-1 invoked abort() in msgconv utility,
> but that looks like a Solaris bug.
Yes, I saw this as well. IIRC, Solaris iconv() returns success, and
increments the input pointers, but not the output pointers.
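A caller can guard against this kind of behaviour with a sanity check around the call. The following is only a sketch under assumptions: checked_iconv is a hypothetical name, the choice of EIO as the error code is arbitrary, and the check is only meaningful for stateless encodings, since a stateful encoding may legitimately consume a shift sequence without producing output.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper, for illustration only.
   Runs one iconv() call and treats it as failed if the call reports
   success after consuming input without producing any output.
   Returns 0 on success, -1 on error with errno set
   (EIO, chosen arbitrarily, for the suspicious case). */
int checked_iconv(iconv_t cd, char **in, size_t *inleft,
                  char **out, size_t *outleft)
{
    assert(in && *in && inleft && out && *out && outleft);

    const char *in_before = *in;
    const char *out_before = *out;

    if (iconv(cd, in, inleft, out, outleft) == (size_t)-1)
        return -1;            /* ordinary failure, errno set by iconv() */

    if (*in != in_before && *out == out_before) {
        errno = EIO;          /* input advanced, output did not */
        return -1;
    }
    return 0;
}
```

Reporting an error this way lets the program print a diagnostic and exit cleanly rather than trip an internal consistency check later.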
> The next problem is that msgcomm-4, msgcomm-5, msgcomm-6 and msgcomm-7
> fail because iconv() doesn't handle conversion from ASCII to ISO-8859-1.
> It does not, because ASCII is not a valid input character set name. This
> looks serious enough, so I'll do something about it. But could you tell me
> how msgcomm came to the conclusion that it needs to perform the conversion
> from ASCII to ISO-8859-1 (or anything else)? The catalogs in the test
> files specify iso-8859-1 for charset, so I'm not quite sure where ASCII
> comes into picture.
The mcomm-test4.in2 file has no header entry with charset
specification and is therefore assumed to be in ASCII.
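The fallback rule can be sketched as follows. This is not gettext's actual code; catalog_charset is a hypothetical helper that looks for the "charset=" field of the Content-Type line in the header entry and falls back to "ASCII" when there is no header entry or no such field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
   Copies the charset named in the header entry's "charset=" field
   into out, or "ASCII" when header is NULL or has no such field.
   The result is truncated to fit outsize. Returns out. */
const char *catalog_charset(const char *header, char *out, size_t outsize)
{
    assert(out != NULL && outsize > 0);

    const char *value = header ? strstr(header, "charset=") : NULL;
    size_t n = 0;

    if (value != NULL) {
        value += strlen("charset=");
        /* the name ends at whitespace, a quote, a backslash or ';' */
        n = strcspn(value, " \t\r\n\\;\"");
    }
    if (n == 0) {             /* no header entry, or empty charset name */
        value = "ASCII";
        n = strlen(value);
    }
    if (n >= outsize)
        n = outsize - 1;

    memcpy(out, value, n);
    out[n] = '\0';
    return out;
}
```

With a file like mcomm-test4.in2, header would be NULL and the result "ASCII", which is how a conversion from ASCII to ISO-8859-1 ends up being requested when the other catalogs declare iso-8859-1.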
Bruno