Re: [Bug-wget] Bug in WGET?
From: Patrick Steil
Subject: Re: [Bug-wget] Bug in WGET?
Date: Sat, 23 Jul 2011 16:26:57 -0500
Thanks!
I will try this when I get time...
One note: I am using the --spider option now, and it looks like it still
downloads and saves each file to disk and then removes it when it is
done. I don't mind that, but it doesn't match the documentation.
Also, when I use wget in spider mode, the end of the log output lists all
the broken links, but I also need to know which page each broken link
appears on (when that page is on the site I am crawling). That would help
me find the source of the 404s on my site.
I have a vision for how this should work to make it awesome...
Any way to do that, or anyone want to add this functionality?
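
To make the request concrete, here is a rough sketch in Python of the report I have in mind, done outside of wget. Nothing in it is wget code or a wget option: `LinkExtractor`, `http_fetch` and `find_broken_links` are names made up for illustration. It assumes a small site with plain `<a href>` links, checks external links but does not crawl them, and ignores robots.txt and rate limiting.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the href values of <a> tags in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def http_fetch(url):
    """Return (status, body); a status of None means the request failed outright."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        return err.code, ""
    except OSError:
        # Covers URLError, connection errors and timeouts.
        return None, ""


def find_broken_links(start_url, fetch=http_fetch):
    """Map each broken URL to the sorted list of pages that link to it."""
    site = urlparse(start_url).netloc
    referrers = defaultdict(set)   # url -> pages that link to it
    status = {}                    # url -> status code (or None)
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in status:
            continue
        code, body = fetch(url)
        status[url] = code
        # Only parse pages that loaded and belong to the starting site.
        if code != 200 or urlparse(url).netloc != site:
            continue
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).scheme not in ("http", "https"):
                continue
            referrers[target].add(url)
            queue.append(target)
    return {u: sorted(referrers[u]) for u, c in status.items()
            if c is None or c >= 400}
```

Calling `find_broken_links("http://www.domain.org/news?page=1")` would return a dictionary keyed by broken URL, with each value listing the pages that contain the link. The `fetch` argument is there so the crawl can be tried against canned pages without touching the network.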
Thanks!
On Sat, Jul 23, 2011 at 7:12 AM, Giuseppe Scrivano <address@hidden> wrote:
> Hello,
>
> Patrick Steil <address@hidden> writes:
>
> > If I run this command:
> >
> > wget www.domain.org/news?page=1 options= -r --no-clobber --html-extension
> > --convert-links -np --include-directories=news
>
> > Here is what it does today:
> >
> > 1. When --html-extension is turned on, --no-clobber is not changing the
> > name of the downloaded files, but it DOES rewrite the file, as the date/time
> > stamp changes every time I run the above command.
>
> I couldn't reproduce it. I have `strace'd but I can't see any syscall
> which could modify the time stamp. Can you please attach the strace
> and the wget debug log? You can get it by:
>
> strace -o strace.log wget <args> -d -o wget.log
>
>
>
> > 2. If I turn off --html-extension, then as soon as WGET sees that the first
> > file has already been downloaded it stops and does not continue to
> > spider/download any further pages.
>
> AFAICS, the behaviour you get using --no-clobber and -r is documented,
> and it should work exactly as you described it (a newer version is
> ignored). The old version is still traversed for links.
>
> Cheers,
> Giuseppe
>
--
Patrick Steil | ChurchBuzz.org
Church Website Optimization <http://www.churchbuzz.org/>
Like us on Facebook <http://facebook.com/churchbuzz>!
Mobile: 940-391-9250