bug-wget

Re: [Bug-wget] Tilde issue with recursive download when IRI is enabled and a page uses Shift JIS


From: Tim Rühsen
Subject: Re: [Bug-wget] Tilde issue with recursive download when IRI is enabled and a page uses Shift JIS
Date: Mon, 06 Feb 2017 22:55:32 +0100
User-agent: KMail/5.2.3 (Linux/4.9.0-1-amd64; KDE/5.28.0; x86_64; ; )

On Monday, 6 February 2017 05:02:57 CET William Prescott wrote:
> Hello,
> 
> I'm encountering a problem when recursively downloading from a website when
> the URL contains a tilde and the page encoding claims to be Shift JIS.
> 
> I've tried both Wget 1.17.1 (from Ubuntu 16.04) and 1.19 (from source,
> with Libidn2 0.16).
> I believe my local character encoding is UTF-8.
> 
> The first page will download okay, but then most pages after it will get the
> tilde converted to "%E2%80%BE" ("‾"), which, as one would expect, doesn't
> work.

Hi William,

this is reproducible with:

$ echo '~' | iconv -f SHIFT-JIS -t utf-8
‾

$ echo -n '~' | iconv -f SHIFT-JIS -t utf-8 | od -t x1
0000000 e2 80 be

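So this does not seem to be a Wget issue, but rather a general character
conversion issue: iconv's SHIFT-JIS conversion turns the byte 0x7E into
U+203E (OVERLINE, UTF-8 bytes e2 80 be) instead of the ASCII tilde. Not
sure what Wget could do...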
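
To connect the od output above with the "%E2%80%BE" seen in the failing
URLs, here is a small Python sketch. It does not call iconv or Wget; it only
assumes that the converted character is U+203E (OVERLINE), as the iconv
output suggests, and shows how that character looks once it is UTF-8
encoded and percent-encoded. The helper names are made up for illustration.

```python
from urllib.parse import quote, unquote

# Assumption: this is the character the SHIFT-JIS -> UTF-8 conversion
# produced for the byte 0x7E (it matches the od output e2 80 be).
OVERLINE = "\u203e"


def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


def percent_encode_utf8(text: str) -> str:
    """Percent-encode text the way it would appear in a URL path (UTF-8)."""
    return quote(text, safe="")


if __name__ == "__main__":
    print(utf8_hex(OVERLINE))             # e2 80 be
    print(percent_encode_utf8(OVERLINE))  # %E2%80%BE
    # Decoding the escaped form gives the overline back, not a tilde.
    print(unquote("%E2%80%BE") == OVERLINE)  # True
```

So the "%E2%80%BE" in the requested URLs is just the percent-encoded form of
the overline that the charset conversion produced in place of '~'.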

Regards, Tim
