From: Tim Rühsen
Subject: Re: [Bug-wget] Patch: Make url_file_name also convert remote path to local encoded
Date: Mon, 13 Nov 2017 16:36:39 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 11/12/2017 07:46 PM, Eli Zaretskii wrote:
>> From: Tim Rühsen <address@hidden>
>> Date: Sun, 12 Nov 2017 14:50:47 +0100
>> Cc: YX Hao <address@hidden>
>> As I understand, the second patch is still in discussion with Eli. Since I
>> do not have Windows, I can't help you here. Though what I saw from the
>> discussion, you address a portability issue that likely should be solved
>> within gnulib. Maybe you could (in parallel) send a mail to address@hidden
>> with a link to your discussion with Eli. There might be some people with
>> deeper knowledge.
> I don't think it's a Gnulib issue.  The problem is that on Windows,
> the implicit call at the beginning of Wget
>   setlocale (LC_ALL, "C");

Why is there an implicit call with "C"? Wget makes an explicit call with "".
From the man page:
"If locale is an empty string, "", each part of the locale that should
be modified is set according to the environment variables."
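
As a minimal sketch of the difference (the helper names are hypothetical and not taken from Wget): the C standard has every program start as if setlocale (LC_ALL, "C") had been called, and only an explicit call with "" picks up the environment. Passing NULL as the locale just queries the current setting.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: return the name of the locale currently in effect.
   Passing NULL as the locale argument queries without changing anything. */
const char *
current_locale_name (void)
{
  return setlocale (LC_ALL, NULL);
}

/* Hypothetical helper: switch to the locale named by the environment
   (LC_ALL, LC_*, LANG).  Returns 1 on success, 0 if the environment
   names a locale the system cannot provide. */
int
adopt_environment_locale (void)
{
  return setlocale (LC_ALL, "") != NULL;
}
```

Called before anything else, current_locale_name () reports "C"; what it reports after adopt_environment_locale () depends on the environment and the platform.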

> is not good enough to work in multibyte locales of the Far East,
> because the Windows runtime assumes a single-byte locale after that
> call.  And since Wget happens to need to display text and create files
> with non-ASCII characters, it gets hit more than other programs.

I think I understand why this doesn't work. NTFS uses UTF-16 for
filenames. If your environment specifies a single-byte encoding
(e.g. C) and at some point we use a multibyte encoding (e.g.
UTF-8), then any automatic conversion to UTF-16 filenames is likely to
fail. For me the question is: a) does wget have a bug (e.g. creating a
filename from a wrongly encoded name string), or b) does the Windows API
have a problem?
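
To make the failure mode concrete, here is a small sketch of what a single-byte interpretation does to UTF-8 input. It assumes Latin-1 as a stand-in for the single-byte encoding (the code page actually used by the Windows runtime may differ), and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: widen each input byte to one UTF-16 code
   unit, which is what a Latin-1 (single-byte) interpretation amounts to.
   Writes at most out_cap code units and returns how many were written. */
size_t
widen_as_latin1 (const char *in, size_t in_len,
                 uint16_t *out, size_t out_cap)
{
  size_t n = 0;
  for (size_t i = 0; i < in_len && n < out_cap; i++)
    out[n++] = (unsigned char) in[i];   /* one byte -> one code unit */
  return n;
}
```

Fed the two UTF-8 bytes 0xC3 0xA9 (the single character U+00E9), this yields the two code units U+00C3 and U+00A9, so the resulting filename has two wrong characters instead of one correct one.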

> The proposed solution is to add a special call to setlocale which gets
> this right on Windows.

Why can't we just convert the filename string into the correct encoding
and then create the file? What am I missing?
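
A rough sketch of that idea, using POSIX iconv as an assumed conversion mechanism (the helper name and its interface are made up for illustration, and whether iconv is the right tool on Windows is exactly the open question):

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in` from
   encoding `from` to encoding `to`, writing at most `out_cap` bytes to
   `out`.  Returns the number of bytes written, or (size_t) -1 on failure
   (unknown encoding, invalid or incomplete input, output too small). */
size_t
convert_name (const char *from, const char *to,
              const char *in, char *out, size_t out_cap)
{
  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return (size_t) -1;

  char *in_ptr = (char *) in;   /* iconv wants char **, input is not modified */
  size_t in_left = strlen (in);
  char *out_ptr = out;
  size_t out_left = out_cap;

  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  if (rc != (size_t) -1)
    /* Flush any pending shift state for stateful target encodings. */
    rc = iconv (cd, NULL, NULL, &out_ptr, &out_left);
  iconv_close (cd);

  if (rc == (size_t) -1 || in_left != 0)
    return (size_t) -1;
  return out_cap - out_left;
}
```

For example, convert_name ("UTF-8", "UTF-16LE", name, buf, sizeof buf) would produce the UTF-16 bytes for a UTF-8 name, which could then be handed to a wide-character file API. The source encoding still has to be known, which is where the locale question comes back in.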

With Best Regards, Tim

