[Bug-wget] [bug #45801] Allowing to configure HTML engine which links to follow
From: Tim Ruehsen
Subject: [Bug-wget] [bug #45801] Allowing to configure HTML engine which links to follow
Date: Tue, 03 Nov 2015 15:25:31 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.3.0

Follow-up Comment #1, bug #45801 (project wget):
There are --accept-regex and --reject-regex.
For your example below you could use

  wget -e robots=off -r --regex-type=pcre \
       --accept-regex '(20151027/$|Scrolling_Survival_Turn_)' \
       --reject-regex ";+" \
       http://replays.wesnoth.org/1.12/

1. --reject-regex ";+" skips the 'sorting' URLs (they contain a semicolon).
2. --accept-regex makes Wget descend only into the subdirectory 20151027 and,
   from there, download only URLs containing 'Scrolling_Survival_Turn_'.
Note that --regex-type=pcre requires a Wget built with PCRE support (just try
it out); otherwise you can use POSIX regexes instead.
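
To preview what the two patterns would keep before starting a recursive
download, they can be tried on a few URLs with grep. This is only a rough
sketch: filter_urls is a hypothetical helper, the sample URLs and file names
are made up for illustration, and grep -E only approximates Wget's own
matching. It assumes a URL is followed when it matches the accept pattern and
does not match the reject pattern.

```shell
#!/bin/sh
# Same patterns as in the wget command above.
accept='(20151027/$|Scrolling_Survival_Turn_)'
reject=';+'

# Hypothetical helper: reads URLs on stdin and prints those that match
# the accept pattern and do not match the reject pattern.
filter_urls() {
    grep -E -- "$accept" | grep -Ev -- "$reject"
}

# Made-up sample URLs, for illustration only.
printf '%s\n' \
    'http://replays.wesnoth.org/1.12/20151027/' \
    'http://replays.wesnoth.org/1.12/20151026/' \
    'http://replays.wesnoth.org/1.12/20151027/?C=N;O=D' \
    'http://replays.wesnoth.org/1.12/20151027/2p_Scrolling_Survival_Turn_12.gz' \
    | filter_urls
```

With these samples only the first and the last URL are printed. Since grep -E
takes POSIX extended regexes and the patterns use nothing PCRE-specific, the
same patterns should also work with --regex-type=posix.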
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?45801>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/