RE: [Bug-wget] Problem, no getting any response

From: Tony Lewis
Subject: RE: [Bug-wget] Problem, no getting any response
Date: Sat, 21 Nov 2009 20:20:38 -0800

There are several things about the request you're asking wget to send that
don't match the browser's request.

Let's start with the most obvious: your posted data looks nothing like what
the browser is sending. According to your Firebug output, the data posted

Other things that might matter to the server:
- the user agent (many servers reject web crawling software such as wget)
- the content type (Firefox is sending application/json)
- referer
- cookies

Most of these things you can work around with appropriate settings to wget,
but I'm not aware of any way to override the content type.
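
As a rough sketch of the settings I mean (untested against your site), here is how the user agent, referer, cookies, and posted data could be passed. Every value below is a hypothetical placeholder, so copy the real ones from Firebug. The helper only prints the command, one argument per line, so you can check it before running it. Content-Length is deliberately not set by hand, since wget computes it from the posted data.

```shell
# All values are hypothetical placeholders; copy the real ones from Firebug.
URL='http://www.example.com/service'
USER_AGENT='Mozilla/5.0 (placeholder browser string)'
REFERER='http://www.example.com/page'
COOKIES='name1=value1; name2=value2'
BODY_FILE='post-body.txt'   # file holding the exact body the browser posts

# Print the wget invocation one argument per line, without running it.
# No Content-Length header is given: wget derives it from the post data.
show_wget_args() {
    printf '%s\n' wget --debug \
        --user-agent="$USER_AGENT" \
        --referer="$REFERER" \
        --header="Cookie: $COOKIES" \
        --post-file="$BODY_FILE" \
        "$URL"
}

show_wget_args
```

Once the printed arguments look right, run the same command directly.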

Run wget with --debug and compare what wget is sending to what Firebug
reports. The closer you can get wget's request to the Firefox request, the
more likely it is to work.
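
One possible way to do the comparison, assuming the debug log brackets the outgoing request with "---request begin---" and "---request end---" lines (as the wget builds I have seen do): pull out that block, sort it, and diff it against the headers copied from Firebug. The function name and the file names are made up for illustration.

```shell
# Print the request lines wget sent, taken from a --debug log file.
# Assumes the log marks the request with ---request begin--- / ---request end---.
extract_request() {
    sed -n '/^---request begin---/,/^---request end---/p' "$1" |
        sed '1d;$d' |      # drop the two marker lines
        tr -d '\r' |       # strip carriage returns, if any
        sed '/^$/d' |      # drop empty lines
        sort
}

# Example use (file names are hypothetical):
#   wget --debug -o wget.log ...your other options...
#   extract_request wget.log > wget-headers.txt
#   sort firebug-headers.txt | diff - wget-headers.txt
```

Any header that shows up on only one side of the diff is a candidate for what the server is objecting to.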

Good luck.

I'm trying to use wget to scrape some data from a page that requires
posting some data (the page itself does it via JavaScript). When I use
the command:

$ wget --header="Content-length:84"

.... I never get a response and wget hangs.

Even though I'm sending the exact same POST as the browser does when I view
the page in Firefox (I checked it in Firebug), I must be sending something
wrong. I've tried mimicking everything in the request headers, but no matter
what, wget always hangs.

Is there something else I can do? Is there something obvious I'm doing wrong?
(Am I not posting the XML properly?)


--- Here is the request, as reported by Firebug:


--- Full request headers as reported by Firebug:
Host: www.tiffany.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US;
rv: Gecko/20091102 Firefox/3.5.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Content-Length: 84
Cookie: assortmentid=101; hascookies1=1;
s_vi=[CS]v1|25842D7985010E69-4000010E8017E5DD[CE]; samebrowsersession=;
previoussid=; _UrlReferrer==http%3A//
Pragma: no-cache
Cache-Control: no-cache

