bug-wget

Re: [Bug-wget] [Wget v1.14 Patch] When using -nc argument


From: Tim Ruehsen
Subject: Re: [Bug-wget] [Wget v1.14 Patch] When using -nc argument
Date: Mon, 18 Mar 2013 11:41:09 +0100
User-agent: KMail/1.13.7 (Linux/3.2.0-4-amd64; KDE/4.8.4; x86_64; ; )

Hi Juan Miguel,

Your patch simply changes Wget's default behaviour, which is not backward
compatible.

To solve the issue, there are other options:

1. Introduce a new command-line option that enables the behaviour you want.
Wget would then inspect the file to see whether it likely contains HTML (e.g.
by checking the first kbyte for some keywords).

or
2. Introduce a 'database' (maybe a flat file), where information from the
server is saved together with the filename. -nc could look up the Content-Type
(and other information) there. And voila.
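
To make the two options a bit more concrete, here is a rough sketch in Python
(not Wget's C code, and not a proposed patch). All names, the keyword list,
and the choice of JSON as the flat-file format are assumptions made up for
illustration only.

```python
import json
import os

# Hypothetical keyword list; a real check would need tuning.
HTML_MARKERS = (b"<!doctype html", b"<html", b"<head", b"<body", b"<a href")


def looks_like_html(path, probe_size=1024):
    """Option 1: guess from the first kbyte whether a file contains HTML."""
    with open(path, "rb") as f:
        head = f.read(probe_size).lower()
    return any(marker in head for marker in HTML_MARKERS)


def load_db(db_path):
    """Read the flat-file 'database'; an absent file means no entries yet."""
    if not os.path.exists(db_path):
        return {}
    with open(db_path, encoding="utf-8") as f:
        return json.load(f)


def record_content_type(db_path, filename, content_type):
    """Option 2: save the server's Content-Type together with the filename."""
    db = load_db(db_path)
    db[filename] = content_type
    with open(db_path, "w", encoding="utf-8") as f:
        json.dump(db, f)


def should_parse_for_links(db_path, filename):
    """What -nc could ask before skipping a file that already exists."""
    content_type = load_db(db_path).get(filename)
    if content_type is not None:
        # Ignore parameters such as "; charset=UTF-8".
        return content_type.split(";")[0].strip().lower() == "text/html"
    # No stored information: fall back to sniffing the file itself.
    return looks_like_html(filename)
```

In this sketch the stored Content-Type takes precedence, and the keyword check
is only a fallback for files downloaded before the 'database' existed.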

Regards, Tim

On Sunday 17 March 2013, Juan Miguel Taboada Godoy wrote:
> Hello:
> 
> This patch prevents wget from stopping when the -nc argument is in use
> and the file on disk (from a previous download) has a name which doesn't
> end with htm or html.
> 
> I detected this bug when downloading a website with this URL:
> http://www.abc.com/dirdir/cgi-bin/Search.php?lng=EN&search=Query_Search_List
> 
> This was saved on disk at:
> dirdir/cgi-bin/
> as a file named:
> Search.php?lng=EN&search=Query_Search_List
> 
> When I was using the -nc argument, wget couldn't detect that this file
> could have links inside.
> 
> Because I believe wget should check most files as text/html just in case
> there are some links inside to visit, I fixed this problem by not
> checking for the htm or html suffix in the name of the file.
> 
> Sincerely,

Kind regards,

     Tim Rühsen


