Re: [Bug-wget] Problem with ÅÄÖ and wget

From:	Bykov Aleksey
Subject:	Re: [Bug-wget] Problem with ÅÄÖ and wget
Date:	Sun, 15 Sep 2013 01:59:23 +0300
User-agent:	Opera Mail/12.14 (Win32)

Greetings

Many thanks for pushing me in the right direction.

With the attached patch, Wget on Windows can work with UTF-8 file names, but again only with "--restrict-file-names=nocontrol"...

Windows needs a conversion for any work with wide characters. I chose MultiByteToWideChar() because it lets me force the input encoding. After the conversion, each character can be checked separately against the restrictions. As a variant, the restricted characters could be replaced and the whole string converted back to UTF-8 with WideCharToMultiByte().

Is it possible on UNIX to use mbstowcs()/wcstombs() with setlocale(LC_ALL,"UTF-8") for the same purpose? Or is there some better way to convert a narrow string to a wide string during character quoting?
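
To make the question concrete, here is a minimal sketch of per-character checking that does not depend on the locale at all. It is not taken from the attached patch. mbstowcs() only decodes UTF-8 when a UTF-8 locale is installed and selected, so this sketch decodes the bytes by hand instead. The function names and the set of restricted characters are assumptions chosen for illustration, not Wget's actual rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical restriction rule: control characters plus the
   characters Windows forbids in file names. */
static int is_restricted(uint32_t cp)
{
    if (cp < 0x20 || cp == 0x7f)
        return 1;
    return cp < 0x80 && strchr("\\|/<>?:*\"", (int)cp) != NULL;
}

/* Decode one UTF-8 sequence at s (n bytes available). Stores the code
   point in *cp and returns the sequence length, or 0 if malformed. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)                { len = 1; c = s[0]; }
    else if ((s[0] & 0xe0) == 0xc0) { len = 2; c = s[0] & 0x1f; }
    else if ((s[0] & 0xf0) == 0xe0) { len = 3; c = s[0] & 0x0f; }
    else if ((s[0] & 0xf8) == 0xf0) { len = 4; c = s[0] & 0x07; }
    else
        return 0;
    if (len > n)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xc0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3f);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (c < min_cp[len] || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
        return 0;
    *cp = c;
    return len;
}

/* Copy src to dst, replacing each restricted character and each
   malformed byte with '_'. dst must hold strlen(src) + 1 bytes.
   Returns the number of replacements made. */
size_t quote_utf8_name(const char *src, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = strlen(src), replaced = 0;

    assert(dst != NULL);
    while (n > 0) {
        uint32_t cp;
        size_t len = utf8_decode(s, n, &cp);

        if (len == 0 || is_restricted(cp)) {
            *dst++ = '_';
            replaced++;
            if (len == 0)
                len = 1;        /* skip one bad byte and resynchronise */
        } else {
            memcpy(dst, s, len); /* keep the original UTF-8 bytes */
            dst += len;
        }
        s += len;
        n -= len;
    }
    *dst = '\0';
    return replaced;
}
```

Because allowed characters are copied through as their original bytes, no conversion back to UTF-8 is needed, and the output is never longer than the input.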


--
Best regards, Alex

On Fri, 13 Sep 2013 16:13:10 +0300, Tim Ruehsen <address@hidden> wrote:

On Friday 13 September 2013 12:43:53 Bykov Aleksey wrote:

Greetings
Yes, you showed the correct Cyrillic filename.
Sorry, I don't agree that this bug is ready to be closed.
Your method is mentioned in it.
This bug is about filenames in non-UTF-8 locales.

Main quote:
> If you are using a unix-like OS where the filesystem interface uses
> utf-8, there is a workaround of using --restrict-file-names=nocontrol
> (which is still too big, as that would allow problematic control
> characters %01 or %09).
> If you are using Windows, --restrict-file-names=nocontrol still gives
> garbage (the utf-8 characters are treated as if they were in latin1).


Thanks for pointing this out. I missed it.

I tried to solve this bug by adding a new option
--local-filesystem-encoding
http://lists.gnu.org/archive/html/bug-wget/2013-05/msg00102.html
but the patch was (rejected?)/(frozen?)/(lack of demand?).

It seems there has been no discussion about it. I interpret that as a possible
lack of interest, but I am not sure.

But a quick net search reveals that NTFS uses UTF-16 (UNICODE) while fopen()
demands ASCII!?

[1] suggests passing UTF-16 (wide) strings to CreateFile() or _wfopen() when built
with UNICODE. For a non-UNICODE build, use CreateFileW() or _wfopen().

So maybe your patch used the wrong approach.
You should try to use the above mentioned functions for WINDOWS builds.
If that works, the patch will be just a few lines...
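
A rough sketch of what that suggestion could look like, assuming the file name arrives as UTF-8: convert it with MultiByteToWideChar() and hand the result to _wfopen(). The helper name fopen_utf8 is hypothetical and not part of Wget, and the fixed MAX_PATH buffer is a simplification. On non-Windows systems the sketch simply falls back to fopen(), assuming the filesystem interface accepts UTF-8 bytes.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#include <wchar.h>
#endif

/* Hypothetical helper: open a file whose name is given in UTF-8. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    wchar_t wpath[MAX_PATH], wmode[16];

    /* MB_ERR_INVALID_CHARS makes malformed UTF-8 fail instead of
       being silently replaced. A return value of 0 means failure
       (bad input or a buffer that is too small). */
    if (!MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                             path, -1, wpath, MAX_PATH))
        return NULL;
    if (!MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                             mode, -1, wmode, 16))
        return NULL;
    return _wfopen(wpath, wmode);
#else
    /* Assumption: the filesystem interface takes UTF-8 as-is. */
    return fopen(path, mode);
#endif
}
```

A caller would use it exactly like fopen(), for example fopen_utf8("\xc3\x85\xc3\x84\xc3\x96.txt", "wb").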

Sorry, I don't know how Björn Mattsson switched his Windows Vista (x64)
filesystem to UTF-8.
In Russian locales, Windows 98, XP (x86), and Vista (x86) use the filesystem
encoding CP866.

Wasn't there something like international language support even for Windows 98? Together with perhaps some new fonts, that should do it... but hey, I have been out of
the Windows business for 12 years now and I never regretted it.

[1] http://stackoverflow.com/questions/2050973/what-encoding-are-filenames-in-ntfs-stored-as

[2] http://en.wikipedia.org/wiki/Filename

win_utf-8.diff
Description: Binary data
