Re: [Bug-wget] Enqueue logic problems


From: Micah Cowan
Subject: Re: [Bug-wget] Enqueue logic problems
Date: Thu, 2 May 2013 08:17:54 -0700

I believe you want -H -D gnu.org. That's what it's for. Wget doesn't
know which hostnames under a domain should be allowed and which should
not be (do you want images.gnu.org? git.gnu.org? lists.gnu.org?), so
it turns 'em all off unless you ask for them explicitly.
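
To make the distinction concrete, here is a rough model of the decision in Python. It is a simplified sketch, not wget's actual C code: the function names (host_of, in_domains, should_enqueue) are made up for illustration, and matching a domain on a dot boundary is an assumption about the intended behaviour rather than a description of wget's matcher. It roughly mirrors running wget -r on its own versus wget -r -H -D gnu.org.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def in_domains(host, domains):
    """True if host equals a listed domain or is a subdomain of one.

    Assumption: match on a dot boundary, so 'www.gnu.org' matches
    'gnu.org' but 'notgnu.org' does not.
    """
    return any(host == d or host.endswith("." + d) for d in domains)


def should_enqueue(parent_url, target_url, span_hosts=False, domains=()):
    """Hypothetical model of the recursive enqueue decision."""
    parent, target = host_of(parent_url), host_of(target_url)
    if not span_hosts:
        # Default: only the exact same hostname is followed.
        return parent == target
    if domains:
        # Spanning hosts, restricted to the listed domains.
        return in_domains(target, domains)
    # Spanning hosts with no restriction: every host is followed.
    return True


# Default behaviour: gnu.org -> www.gnu.org is a different host.
print(should_enqueue("http://gnu.org/", "http://www.gnu.org/software/"))
# Spanning hosts, limited to gnu.org and its subdomains.
print(should_enqueue("http://gnu.org/", "http://www.gnu.org/software/",
                     span_hosts=True, domains=("gnu.org",)))
```

In this model the domain list only matters once host spanning is switched on, which is why the two options are given together.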

HTH,
-mjc

On Thu, May 2, 2013 at 4:52 AM, Darshit Shah <address@hidden> wrote:
> I should have been more clear. --span-hosts will enqueue the other files,
> but it will also enqueue files from other hosts. I wish to recursively
> download a website but not other sites that it links to.
>
> Of course I could add --accept-regex / --reject-regex options to prevent
> wget from wandering onto other hosts. But shouldn't the default --recursive
> option simply handle cases where a www is either added or removed? Or is
> there any scenario that I am missing which would cause undesirable effects
> here?
>
> On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <address@hidden> wrote:
>
>> Darshit Shah <address@hidden> writes:
>>
>> > When using the --recursive option with wget, there seems to be a small
>> > issue with the logic that decides whether to enqueue a file to the
>> > downloads list or not.
>> >
>> > By default wget downloads files only from the same host. However, this
>> > causes a problem when the target hostname changes thus:
>> > parent: gnu.org
>> > target: www.gnu.org
>> >
>> > This issue causes wget to stop after just one download on a lot of sites.
>> > I'm not sure if this exists in older versions or the current release,
>> > since I only have the development version installed.
>>
>> does --span-hosts fix this scenario for you?
>>
>> Cheers,
>> Giuseppe
>>
>
>
>
> --
> Thanking You,
> Darshit Shah
> Research Lead, Code Innovation
> Kill Code Phobia.
> B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani


