bug-wget

Re: [Bug-wget] Behaviour of spanning to accepted domains


From: Tony Lewis
Subject: Re: [Bug-wget] Behaviour of spanning to accepted domains
Date: Sun, 7 Jun 2015 08:19:28 -0700

On Friday, June 05, 2015 1:24 PM, Tim Rühsen wrote:

> > First, I have not dug into the source code to see how -H is implemented.
> > However, it makes sense to me that one ought to be able to specify 
> > both -H and -D together.
> -H (=all domains)
> to exclude some sites use --exclude-domains domain-list

wget --help says about -H: go to foreign hosts when recursive.

It doesn't say that when using -H one *must* take every foreign host that
exists on the Internet, and I'm arguing that such an interpretation does not
make sense.

One ought to be able to request that wget go to foreign hosts without that
implying that wget mirror the entire Internet. One obvious way to limit
which foreign hosts are mirrored is to use -H in combination with -D.
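
As a rough sketch of that combination: the wget manual documents -D as
taking a comma-separated domain list, so the two hosts are joined with a
comma here. The host names are placeholders, and whether wget really
restricts spanning this way is exactly the behaviour under discussion, so
the snippet only prints the command rather than running it.

```shell
# Placeholder hosts; substitute the real site and its image sub-domain.
start_url="http://www.somesite.com/"
domains="www.somesite.com,images.somesite.com"   # -D expects a comma-separated list

# -H allows spanning to foreign hosts; -D is meant to limit which ones.
cmd="wget --mirror -H -D $domains $start_url"

# Print the command for inspection; paste it into a shell to start the mirror.
echo "$cmd"
```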

> > Consider this scenario: I want to mirror a site including the images 
> > that are stored in a sub-domain, but I don't want to mirror every 
> > external site referenced by the site. So I would try this:
> >
> > wget --mirror http://www.somesite.com -H -D www.somesite.com 
> > images.somesite.com
>
> You can also play with:
>
>       -A acclist --accept acclist
>       -R rejlist --reject rejlist

I can play with lots of wget options, but in the scenario described I want
*all* files from two hosts, and nothing from any other foreign host that
might be referenced by one of those hosts.

What command line would you use for the scenario described?

Tony 



