[Bug-wget] just download HTML content
From: Richard Baron Penman
Subject: [Bug-wget] just download HTML content
Date: Sun, 28 Jun 2009 21:31:52 +1000
Hello,
When mirroring a website, how do I download just the HTML content (whether
static, PHP, ASP, etc.) and ignore images, CSS, JS, and everything else?
At first I thought of creating an accept list, but I can't rely on the file
extension because many HTML pages do not have one (e.g.
http://en.wikipedia.org/wiki/Foo).
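
To illustrate why an extension-based accept list falls short, here is a rough Python sketch. The helper name `has_html_extension`, the extension set, and the example.com URLs are made up for illustration; this is not how wget itself decides what to accept.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical accept list of extensions, for illustration only.
HTML_EXTENSIONS = {".html", ".htm", ".php", ".asp"}

def has_html_extension(url):
    """Return True if the URL path ends in one of the accepted extensions."""
    path = urlparse(url).path
    _, ext = posixpath.splitext(path)
    return ext.lower() in HTML_EXTENSIONS

for url in ("http://example.com/index.html",
            "http://example.com/page.php",
            "http://en.wikipedia.org/wiki/Foo"):
    # The Wikipedia URL serves HTML but has no extension, so the check rejects it.
    print(url, has_html_extension(url))
```

The first two URLs pass the check and the Wikipedia one does not, even though it is an HTML page.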
Then I thought of a reject list, but there are so many different kinds of
non-HTML content.
Is there a way to do this with wget?
Thanks,
Richard