Re: [Bug-wget] Issue with --content-on-error and --convert-links


From: Yousong Zhou
Subject: Re: [Bug-wget] Issue with --content-on-error and --convert-links
Date: Thu, 16 Oct 2014 15:24:48 +0800

On 13 October 2014 10:25, Joe Hoyle <address@hidden> wrote:
> Hi All,
>
>
> I’m having issues using "--convert-links" in conjunction with
> "--content-on-error". Though "--content-on-error" forces wget to download
> the pages, the links to an "errored" page are not updated in the other pages
> that link to it.
>
>
> This seems to be hinted at in the man page:
>
>
> "Because of this, local browsing works reliably: if a linked file was 
> downloaded, the link will refer to its local name; if it was not downloaded, 
> the link will refer to its full Internet address rather than presenting a 
> broken link. The fact that the former links are converted to relative links 
> ensures that you can move the downloaded hierarchy to another directory."
>
>
> However, it would seem that when --content-on-error is used, wget should
> ignore this rule and do all the link substitution anyhow.
>
>
> If anyone knows if this *should* work then I’d be eager to hear it, or any 
> other way I can get any 404 pages downloaded and also linked to in the wget 
> mirror.
>

Currently, wget does not mark pages with a 404 status code as RETROKF
(retrieval was OK), even though the 404 page itself is downloaded
successfully when the `--content-on-error` option is enabled.  Links to
such a page are therefore left pointing at the original URL.  This
behaviour is mostly acceptable, I guess.  But you can try the attached
patch for the moment.  The other option would be to set up your web
server manually so that it serves the 404 page itself.
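
To make the behaviour described above concrete, here is a toy model in
Python.  It is not wget's source code and not the contents of the
attached patch: the names `Download`, `local_map`, `convert_link` and
the `treat_saved_errors_as_ok` switch are made up for illustration, and
the model assumes every response body was saved to disk (as with
`--content-on-error`).  It only shows why a saved 404 page keeps its
full URL unless it is also counted as "retrieved OK".

```python
# Toy model only: this is not wget's source code. The names below
# (Download, local_map, convert_link) are hypothetical.
from dataclasses import dataclass


@dataclass
class Download:
    url: str
    status: int      # HTTP status code of the response
    saved_as: str    # local file name the body was written to


def local_map(downloads, treat_saved_errors_as_ok=False):
    """Return {url: local file} for downloads that count as retrieved OK."""
    mapping = {}
    for d in downloads:
        ok = 200 <= d.status < 300
        # Assumption: with the switch on, a saved error page is
        # registered just like a successful download.
        if ok or treat_saved_errors_as_ok:
            mapping[d.url] = d.saved_as
    return mapping


def convert_link(href, mapping):
    """Point href at the local copy if one is registered, else keep the URL."""
    return mapping.get(href, href)


downloads = [
    Download("http://example.com/index.html", 200, "index.html"),
    Download("http://example.com/missing.html", 404, "missing.html"),
]

current = local_map(downloads)
patched = local_map(downloads, treat_saved_errors_as_ok=True)

# Prints http://example.com/missing.html (link left as a full URL)
print(convert_link("http://example.com/missing.html", current))
# Prints missing.html (link rewritten to the saved 404 page)
print(convert_link("http://example.com/missing.html", patched))
```

The real link conversion also computes relative paths between files in
the downloaded hierarchy; the sketch leaves that out and only models the
decision of whether a link is rewritten at all.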

Regards.

               yousong

Attachment: 0001-Let-convert-links-work-with-content-on-error.patch
Description: Binary data

