[Bug-wget] Check external reference, but don't process further

From: Fernando Gont
Subject: [Bug-wget] Check external reference, but don't process further
Date: Tue, 27 Nov 2018 08:20:45 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1


I'm using wget in "--spider" mode in a script that checks a web site
for broken links.

I'd like wget to operate in recursive mode for pages in the target
domain, but not for pages on other hosts/sites.

That is, if I'm crawling www.example.com, I'd like wget to process all
pages in that domain recursively. However, if there's a link to an
external site, I just want wget to check that URL, but not process that
external reference recursively.
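
To make the desired behaviour concrete, here is a minimal sketch in
Python. It is not a wget feature, only an illustration of the policy I'm
after. The names crawl(), is_internal() and the fetch() callback are
hypothetical, and "same domain" is assumed to mean an exact host match.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href values of <a> tags found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_internal(url, target_host):
    # Assumption: "same domain" means exactly the same host name.
    return urlparse(url).hostname == target_host


def crawl(start_url, fetch):
    """Return the list of broken URLs reachable from start_url.

    fetch(url, want_body=...) is a hypothetical callback returning
    (ok, html). Internal pages are fetched with their body and parsed
    for links; external URLs are only checked, never parsed.
    """
    target_host = urlparse(start_url).hostname
    seen = {start_url}
    queue = deque([start_url])
    broken = []

    while queue:
        url = queue.popleft()
        internal = is_internal(url, target_host)
        ok, html = fetch(url, want_body=internal)
        if not ok:
            broken.append(url)
            continue
        if not internal or html is None:
            # External reference: checked, but not processed further.
            continue

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).scheme in ("http", "https") and link not in seen:
                seen.add(link)
                queue.append(link)

    return broken
```

The key point is the early "continue" for external URLs: they are
requested once so that a broken link is detected, but links found on
them are never queued.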

"-D" would seem to prevent checking external references, so I cannot use
it. And "--level" would mean that pages on external sites may still be
processed recursively.

Any advice on how to implement this?



Fernando Gont
SI6 Networks
e-mail: address@hidden
PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492
