Re: [Bug-wget] Behaviour of spanning to accepted domains


From: Tim Rühsen
Subject: Re: [Bug-wget] Behaviour of spanning to accepted domains
Date: Fri, 05 Jun 2015 22:24:16 +0200
User-agent: KMail/4.14.2 (Linux/4.0.0-1-amd64; KDE/4.14.2; x86_64; ; )

On Friday, 5 June 2015, 08:01:03, Tony Lewis wrote:
> On June 03, 2015 Tim Ruehsen wrote:
> > This has already been fixed to:
> > 
> > "Set domains to be followed.  domain-list is a comma-separated list of
> > domains.  Note that it does not turn on -H."
> First, I have not dug into the source code to see how -H is implemented.
> However, it makes sense to me that one ought to be able to specify both -H
> and -D together.

Hi Tony,

-H spans to all domains.
To exclude some sites, use --exclude-domains domain-list.
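
As a rough sketch of that combination (the host names are placeholders
borrowed from the scenario quoted in this thread, and the command is only
printed, not executed):

```shell
# Placeholder hosts for illustration only; substitute your own.
start_url="http://www.somesite.com/"
excluded="www.relatedsite.com"

# -H lets the recursion span to every host;
# --exclude-domains then removes the listed domains again.
cmd="wget --mirror -H --exclude-domains $excluded $start_url"

# Print the command for inspection. Run it with: eval "$cmd"
echo "$cmd"
```

--exclude-domains takes a comma-separated list, so further hosts can be
appended to the "excluded" variable.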


> 
> Consider this scenario: I want to mirror a site including the images that
> are stored in a sub-domain, but I don't want to mirror every external site
> referenced by the site. So I would try this:
> 
> wget --mirror http://www.somesite.com -H -D www.somesite.com,images.somesite.com
> 
> Which should get all the files from www.somesite.com and images.somesite.com
> without getting files from www.relatedsite.com.
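
A minimal sketch of the scenario above, assuming the placeholder hosts from
the quoted text. Since -D does not turn on -H, both are given, and the
domain list is written comma-separated, as the corrected documentation
describes. The command is only printed, not executed:

```shell
# Placeholder hosts taken from the quoted scenario.
start_url="http://www.somesite.com/"
domains="www.somesite.com,images.somesite.com"

# -H enables host spanning; -D restricts it to the listed domains.
cmd="wget --mirror -H -D $domains $start_url"

# Print the command for inspection. Run it with: eval "$cmd"
echo "$cmd"
```

With this form, links pointing to www.relatedsite.com should not be
followed, because that host is not in the -D list.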

You can also play with:

       -A acclist --accept acclist
       -R rejlist --reject rejlist
           Specify comma-separated lists of file name suffixes or patterns to
           accept or reject. Note that if any of the wildcard characters, *, ?,
           [ or ], appear in an element of acclist or rejlist, it will be
           treated as a pattern, rather than a suffix.  In this case, you have
           to enclose the pattern into quotes to prevent your shell from
           expanding it, like in -A "*.mp3" or -A '*.mp3'.

       --accept-regex urlregex
       --reject-regex urlregex
           Specify a regular expression to accept or reject the complete URL.
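
To illustrate the difference between these filters, here is a short sketch.
The start URL, the suffix list and the "/forum/" path are made-up examples,
and the commands are only printed, not executed:

```shell
# Placeholder start URL for illustration only.
start_url="http://www.somesite.com/"

# No wildcard characters: each element is treated as a file name suffix.
cmd_suffix="wget --mirror -A jpg,png $start_url"

# A wildcard makes the element a pattern; the single quotes keep the
# shell from expanding it before wget sees it.
cmd_pattern="wget --mirror -A '*.mp3' $start_url"

# The regex is matched against the complete URL, not just the file name.
cmd_regex="wget --mirror --reject-regex '/forum/' $start_url"

# Print the commands for inspection. Run one with: eval "$cmd_pattern"
echo "$cmd_suffix"
echo "$cmd_pattern"
echo "$cmd_regex"
```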

Regards, Tim
