[Bug-wget] [PATCH] Re: --convert-links and filenames with colons
From: Tim Ruehsen
Subject: [Bug-wget] [PATCH] Re: --convert-links and filenames with colons
Date: Tue, 27 Oct 2015 13:19:22 +0100
User-agent: KMail/4.14.10 (Linux/4.2.0-1-amd64; KDE/4.14.13; x86_64; ; )

Hi Joachim,
could you please test whether the attached patch works for you?
Could anyone else review it?
Tim
On Monday 26 October 2015 13:42:41 Joachim Breitner wrote:
> Dear wget developers,
>
> it seems that "wget -r -k" is a bit careless with creating relative
> URLs that start with “something:”, which would then be mis-interpreted
> as the protocol specification of a URL.
>
> For example, downloading these two files:
>
> /tmp/wget/input $ head *
> ==> file:with:colon.html <==
> <html>
> <body>
> <a href="./file:with:colon.html">Foo</a>
> <a href="./file_without_colon.html">Bar</a>
> </body>
> </html>
>
> ==> file_without_colon.html <==
> <html>
> <body>
> <a href="./file:with:colon.html">Foo</a>
> <a href="./file_without_colon.html">Bar</a>
> </body>
> </html>
>
> with "wget -k -r" produces this output:
>
> ==> localhost:8000/file:with:colon.html <==
> <html>
> <body>
> <a href="file:with:colon.html">Foo</a>
> <a href="file_without_colon.html">Bar</a>
> </body>
> </html>
>
> ==> localhost:8000/file_without_colon.html <==
> <html>
> <body>
> <a href="file:with:colon.html">Foo</a>
> <a href="file_without_colon.html">Bar</a>
> </body>
> </html>
>
> and the browser will not be able to follow the link to Foo.
>
> This is a practical problem when trying to mirror a MediaWiki
> installation.
> I suggest avoiding the issue by prefixing relative links with "./",
> either always (why not?), or whenever the relative file name starts with
> something that looks like “foo:”.
>
>
> Thanks,
> Joachim
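
For reviewers, here is a minimal sketch of the rule Joachim proposes above. It is written in Python for brevity and is not the code from the attached patch or from wget itself; the function name protect_relative_link is hypothetical. It assumes the input is already a relative link as produced by link conversion, and it prefixes "./" whenever the first path segment contains a colon, which is the case RFC 3986 says a relative reference must avoid.

```python
def protect_relative_link(link: str) -> str:
    """Hypothetical helper: make a relative link safe against scheme parsing.

    If the first path segment (everything before the first "/") contains
    a colon, a browser may read the part before the colon as a URL scheme.
    Prefixing "./" keeps the link relative without changing its target.
    """
    first_segment = link.split("/", 1)[0]
    if ":" in first_segment:
        return "./" + link
    return link


if __name__ == "__main__":
    # The two links from the example report above.
    for href in ("file:with:colon.html", "file_without_colon.html"):
        print(href, "->", protect_relative_link(href))
```

The check is deliberately conservative: a colon in a query string with no preceding "/" also triggers the prefix, which is harmless because "./" never changes what a relative link points to.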
0001-Fix-URL-conversion-for-colons-in-filenames.patch
Description: Text Data