[Bug-wget] IRIs not encoded in UTF-8 locale
From: Magnus Holmgren
Subject: [Bug-wget] IRIs not encoded in UTF-8 locale
Date: Thu, 17 Feb 2011 21:08:54 +0100
User-agent: KMail/1.13.5 (Linux/2.6.32-5-amd64; KDE/4.4.5; x86_64; ; )
Hi!
Unless someone's fixed it since version 1.12, there's a bug in iri.c that
causes internationalized domain names not to be IDN-encoded when the current
locale is a UTF-8 one.
The problem is that idn_encode expects remote_to_utf8 to return true iff there
was something to encode, but remote_to_utf8 returns false whenever
do_conversion leaves the string unchanged, which happens both when the string
is pure ASCII *and* when it is already UTF-8 encoded. The test on line 290
needs to be changed to a check for high bits. If i->uri_encoding is "UTF-8",
the whole iconv step can of course be skipped or replaced with a check for
valid UTF-8.
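
A check for high bits could look roughly like the sketch below. The function
name contains_non_ascii is made up for illustration and is not taken from
iri.c; it only decides whether a host name has any byte outside the ASCII
range, and it does not validate that those bytes form well-formed UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of wget: returns true if any byte of the
   NUL-terminated string S has its high bit set, i.e. S is not pure ASCII. */
bool
contains_non_ascii (const char *s)
{
  const unsigned char *p;

  for (p = (const unsigned char *) s; *p != '\0'; p++)
    if (*p & 0x80)
      return true;
  return false;
}
```

With a test like this, a host that is already UTF-8 encoded would still be
recognised as needing IDN encoding, while a pure ASCII host would not.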
Alternatively, idn_encode should not return NULL immediately when
remote_to_utf8 returns false. remote_to_utf8 may need to differentiate between
"error" and "nothing to encode".
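
One way to separate "error" from "nothing to encode" is a three-valued status
instead of a boolean. The sketch below is only an illustration: the enum and
the functions classify_conversion and needs_idn_encoding are invented names,
not wget's actual API, and the conversion itself is assumed to have happened
elsewhere (a NULL result standing in for a failed conversion).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-way result of the conversion step. */
enum conv_status { CONV_ERROR, CONV_UNCHANGED, CONV_CHANGED };

/* Hypothetical helper: true if S contains any byte with the high bit set. */
bool
has_high_bit (const char *s)
{
  const unsigned char *p;

  for (p = (const unsigned char *) s; *p != '\0'; p++)
    if (*p & 0x80)
      return true;
  return false;
}

/* Classify a conversion outcome.  CONVERTED is assumed to be NULL when the
   conversion failed, and otherwise the converted string. */
enum conv_status
classify_conversion (const char *input, const char *converted)
{
  if (converted == NULL)
    return CONV_ERROR;
  return strcmp (input, converted) == 0 ? CONV_UNCHANGED : CONV_CHANGED;
}

/* Decide whether the caller should go on to IDN-encode UTF8_HOST.  Only a
   real error stops it; an unchanged string is still encoded if it holds
   non-ASCII bytes, which covers input that was already UTF-8. */
bool
needs_idn_encoding (enum conv_status status, const char *utf8_host)
{
  if (status == CONV_ERROR)
    return false;
  return has_high_bit (utf8_host);
}
```

The point of the design is that CONV_UNCHANGED no longer makes the caller
give up early: whether to encode is decided from the string itself.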
--
Magnus Holmgren address@hidden