

Re: [Bug-wget] Mirror a website but no sites with special chars like "?"

From: Paul Wratt
Subject: Re: [Bug-wget] Mirror a website but no sites with special chars like "?"
Date: Mon, 19 Mar 2012 20:45:05 +1300

Because you are doing a recursive download, you can manipulate robots.txt.

The easiest way to get it going is:
wget -r url/image.png  <= or .gif, etc.

This builds the folder structure and should fetch the server's robots.txt file.
Edit that file to exclude the unwanted URLs,
then do your actual recursive mirror.
Just search for "example robots.txt" to find patterns you can adapt.
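The steps above can be sketched as follows. This is a rough sketch, not a tested recipe: the host name "intranet" and the wiki prefix "/mywiki" come from the original question, the MediaWiki-style index.php layout is an assumption about how the "?" URLs are served, and whether wget keeps a locally edited robots.txt on re-run depends on timestamping (-N) and the server's Last-Modified headers.

```shell
# Sketch of the robots.txt hack described above. Host and paths are
# assumptions; adjust "intranet" and "/mywiki" to your own setup.

# 1. Fetch a single file recursively so wget creates the host directory
#    (and, on a real run, stores the server's robots.txt next to it):
#    wget -r http://intranet/mywiki/image.png

# 2. Replace the stored robots.txt with one that blocks the script
#    endpoint behind the "?" URLs (MediaWiki-style layout assumed,
#    where edit/history actions all go through index.php):
mkdir -p intranet
cat > intranet/robots.txt <<'EOF'
User-agent: *
Disallow: /mywiki/index.php
EOF

# 3. Re-run the mirror WITHOUT -e robots=off and with -N, so wget
#    honors the exclusion rules instead of ignoring them:
#    wget -r -k -p -E -N -l inf http://intranet/mywiki/

# Confirm the exclusion rule is in place:
grep 'Disallow: /mywiki/index.php' intranet/robots.txt
```

Note that the original robots exclusion standard matches URL path prefixes only (no wildcards), so a rule like "Disallow: /*?" may not work; blocking the script path itself is the safer pattern.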


On Mon, Mar 19, 2012 at 3:25 AM, Tobias Krais <address@hidden> wrote:
> Hi together,
> I want to mirror a wiki. For this I use the command
> wget -e robots=off -r -k -p -E -N -l inf intranet/mywiki/
> The request takes a long time, because each page of the wiki has an
> edit, upload, ... function. All these "unwanted" pages have one thing
> in common: the URL contains a "?".
> My question: Is it possible to exclude such pages from the download? If
> yes, how can I do it?
> Your help is highly appreciated!
> Greetings,
> Tobias
