[Bug-wget] download page-requisites with spanning hosts
From: Jake b
Subject: [Bug-wget] download page-requisites with spanning hosts
Date: Wed, 29 Apr 2009 18:50:11 -0500
I'm trying to download multiple pages from the sijun speedpaint thread
so I can use their images for my random desktop folder. I can download
each page by hand using Firefox, but this becomes unwieldy,
especially since the prev button has a bit of a delay. (So I want to
automate it, with delays and/or speed caps to be friendly to the
server.)
The wget command I am using:
wget.exe -p -k -w 15
"http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330"
It has 2 problems:
1) Rename file:
Instead of creating something like "912.html" or "index.html", the saved
file is named: "address@hidden&postdays=0&postorder=asc&start=27330"
2) Images that span hosts are failing.
I have page-requisites on, but since some images are hosted on tinypic,
imageshack, etc., it is not downloading them. The layout looks like
this:
sijun/page912.php
imageshack.com/1.png
tinypic.com/2.png
randomguyshost.com/3.png
Because of this, I cannot simply list all domains to span. I don't
know all the domains, since people have personal servers.
How do I make wget download all images on the page? I don't want to
recurse other hosts, or even sijun, just download this page, and all
images needed to display it.
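
A minimal sketch of one possible answer, assuming the stock wget options behave as documented: -H (--span-hosts) lets -p fetch requisites from other hosts, and because -r is not given, nothing is followed recursively. The helper name build_wget_command and the 100k rate cap are illustrative assumptions, not part of the original command.

```python
def build_wget_command(url, wait_seconds=15, rate_limit="100k"):
    """Return an argv list that fetches one page plus the files needed to display it."""
    return [
        "wget",
        "-p",  # --page-requisites: images, stylesheets, etc. used by the page
        "-H",  # --span-hosts: allow requisites that live on other domains
        "-k",  # --convert-links: rewrite links to point at the local copies
        "-E",  # save the page with an .html suffix so a browser opens it
        "-w", str(wait_seconds),       # pause between retrievals
        f"--limit-rate={rate_limit}",  # assumed speed cap; choose a polite value
        url,
    ]


url = ("http://forums.sijun.com/viewtopic.php"
       "?t=29807&postdays=0&postorder=asc&start=27330")
print(" ".join(build_wget_command(url)))
```

The list can be passed to subprocess.run() as is; building it as a list avoids quoting problems with the "&" characters in the URL.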
[ This one is a lower priority, but someone might already know how to
solve this ]
3) After this is done, I want to loop to download multiple pages. It
would be cool if I downloaded pages 900 to 912 and each page's "next"
link pointed correctly to the local version.
I'm not sure if I can use wget's -k option, or if that won't work
because recursion on forums can be weird?
Either way, I have a simple script that can convert 900 to 912 into
the correct URLs, and pausing in between each request.
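
For reference, a rough sketch of such a script. It assumes 30 posts per page, which is only inferred from the one example above (page 912 corresponds to start=27330, and 911 * 30 = 27330); the names page_url and iter_pages are made up for illustration.

```python
import time

BASE_URL = "http://forums.sijun.com/viewtopic.php"
TOPIC_ID = 29807
POSTS_PER_PAGE = 30  # assumption: inferred from page 912 <-> start=27330


def page_url(page):
    """Build the thread URL for a 1-based page number."""
    start = (page - 1) * POSTS_PER_PAGE
    return f"{BASE_URL}?t={TOPIC_ID}&postdays=0&postorder=asc&start={start}"


def iter_pages(first, last, delay_seconds=15, sleep=time.sleep):
    """Yield (page, url) pairs, sleeping between consecutive pages."""
    for page in range(first, last + 1):
        if page != first:
            sleep(delay_seconds)  # be friendly to the server
        yield page, page_url(page)
```

Looping over iter_pages(900, 912) gives one URL every 15 seconds; each URL can then be handed to wget.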
Maybe I will have to manually modify the links using regexes, unless you
know a shortcut?
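
A sketch of what the regex approach might look like, under several assumptions: 30 posts per page (inferred as above), saved pages renamed to a hypothetical "pageNNN.html" scheme, and thread links written as href="...viewtopic.php?...start=N..." in the HTML. It does not check the topic id, so links to other threads that carry a start parameter would be rewritten too.

```python
import re

POSTS_PER_PAGE = 30  # assumption: inferred from page 912 <-> start=27330

# Matches href="...viewtopic.php?...start=N..." and captures N.
LINK_RE = re.compile(r'href="[^"]*viewtopic\.php\?[^"]*?\bstart=(\d+)[^"]*"')


def local_name(page):
    """Hypothetical local file name for a downloaded page."""
    return f"page{page}.html"


def rewrite_links(html, first, last):
    """Point links to pages first..last at local files; leave other links alone."""
    def replace(match):
        page = int(match.group(1)) // POSTS_PER_PAGE + 1
        if first <= page <= last:
            return f'href="{local_name(page)}"'
        return match.group(0)  # outside the downloaded range: keep the original link

    return LINK_RE.sub(replace, html)
```

Links to pages outside the downloaded range are left untouched, so they still lead to the live forum.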
Thanks!
--
Jake