
[Bug-wget] Potential bug or something else?


From: Mike
Subject: [Bug-wget] Potential bug or something else?
Date: Thu, 20 May 2010 16:01:24 +0100

Hi,

I have been downloading some pages from one of my sites, but
sometimes two 4-digit hex codes appear in the downloaded HTML source.

Here's the start of one page:

"209b
         <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
         <html>

         <head>"

The other 4-digit code appears later on in the page.

Has anyone seen this before? It definitely doesn't appear on the
original page.  It appears in all HTML files in particular
directories, but some directories are clean.

I'm running with this wget call:
wget -A html,php,htm -b --default-page=__SLASH__.html --random-wait
http://www.whateverurl.co.uk -w 10 -r -k -l 100 -U "Botlet"

Any help much appreciated.  I can add some post-processing to remove
the codes but that feels like a hack.  Any ideas what it might be?
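
For reference, the post-processing workaround could look roughly like
the sketch below. It assumes the stray codes always sit alone on their
own line, as "209b" does in the example above; the function name and
the pattern are hypothetical, and it does nothing about whatever is
inserting the codes in the first place.

```python
import re

# Assumption: a stray code is a line holding nothing but 1-8 hex digits,
# optionally surrounded by spaces or tabs (modelled on the "209b" example).
HEX_ONLY_LINE = re.compile(r"^[ \t]*[0-9A-Fa-f]{1,8}[ \t]*\r?$")


def strip_hex_code_lines(text: str) -> str:
    """Return text with every line that contains only a short hex code removed."""
    kept = [line for line in text.split("\n") if not HEX_ONLY_LINE.match(line)]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "209b\n<!DOCTYPE HTML>\n<html>\n"
    print(strip_hex_code_lines(sample))
```

Note that this would also delete a legitimate line consisting solely of
a hex-looking word such as "cafe", which is part of why it feels like a
hack.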

Thanks,
Mike.


