Re: [Bug-wget] Enqueue logic problems

From: Micah Cowan
Subject: Re: [Bug-wget] Enqueue logic problems
Date: Thu, 2 May 2013 08:17:54 -0700

I believe you want -H -D gnu.org (that is, --span-hosts together with
--domains=gnu.org). That's what it's for. Wget doesn't know which
hostnames under a domain should be allowed and which should not be (do
you want images.gnu.org? git.gnu.org? lists.gnu.org?), so it turns 'em
all off unless you ask for them explicitly.
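
To make the difference concrete, here is a rough Python sketch of the two
checks. It is not wget's actual code: the function names are made up, and
the dot-boundary suffix match is an assumption about how a -D style filter
would reasonably behave.

```python
def same_host(parent_host, target_host):
    """Default recursion as described in this thread: exact host match only."""
    return parent_host.lower() == target_host.lower()


def in_domains(target_host, domains):
    """Assumed -H -D behaviour: accept a listed domain or any host under it."""
    host = target_host.lower()
    for domain in domains:
        d = domain.lower()
        # Require a dot boundary so "notgnu.org" does not match "gnu.org".
        if host == d or host.endswith("." + d):
            return True
    return False


if __name__ == "__main__":
    # gnu.org -> www.gnu.org is rejected by the exact-host check...
    print(same_host("gnu.org", "www.gnu.org"))
    # ...but accepted once the whole domain is allowed.
    print(in_domains("www.gnu.org", ["gnu.org"]))
    # Hosts outside the domain are still rejected.
    print(in_domains("example.com", ["gnu.org"]))
```

Note that allowing the domain also lets in lists.gnu.org, git.gnu.org and
so on, which is why it has to be asked for explicitly.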


On Thu, May 2, 2013 at 4:52 AM, Darshit Shah <address@hidden> wrote:
> I should have been more clear. --span-hosts will enqueue the other files,
> but it will also enqueue files from other hosts. I wish to recursively
> download a website but not other sites that it links to.
> Of course I could add --accept-regex / --reject-regex options to prevent
> wget from wandering onto other hosts. But shouldn't the default --recursive
> option simply handle cases where a www is either added or removed? Or is
> there any scenario that I am missing which would cause undesirable effects
> here?
> On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <address@hidden> wrote:
>> Darshit Shah <address@hidden> writes:
>> > When using the --recursive option with wget, there seems to be a small
>> > issue with the logic that decides whether to enqueue a file to the
>> > downloads list or not.
>> >
>> > By default wget downloads files only from the same host. However, this
>> > causes a problem when the target hostname changes thus:
>> > parent: gnu.org
>> > target: www.gnu.org
>> >
>> > This issue causes wget to stop after just one download on a lot of sites.
>> > I'm not sure if this exists in older releases, since I only have the
>> > development version installed.
>> does --span-hosts fix this scenario for you?
>> Cheers,
>> Giuseppe
> --
> Thanking You,
> Darshit Shah
> Research Lead, Code Innovation
> Kill Code Phobia.
> B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
