Re: [Bug-wget] request for help with wget (crawling search results of a

bug-wget

From:	Tony Lewis
Subject:	Re: [Bug-wget] request for help with wget (crawling search results of a website)
Date:	Sun, 3 Nov 2013 22:52:34 -0800

Altug Tekin wrote:

> To achieve this I created the following wget-command:
>
> wget --reject=js,txt,gif,jpeg,jpg \
>      --accept=html \
>      --user-agent=My-Browser \
>      --recursive --level=2 \
> www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article

You need to quote the URL since it contains characters, such as "&", that
are interpreted by your command shell. (Most likely nothing after the first
"&" was sent to the web server.)
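
As a rough illustration of the quoting point, the sketch below stores the URL
from the original command in single quotes and checks that it reaches a
command as one argument. The names url, count_args, and quoted_count are
made up for this example, and the wget line is left commented out so that
nothing is fetched.

```shell
#!/bin/sh
# Single quotes stop the shell from acting on "&", "?" and "#".
# The URL is the one from the original command; "url" is a hypothetical name.
url='www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article'

# count_args is a hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

# Expanding the variable inside double quotes passes it as a single argument.
quoted_count=$(count_args "$url")
echo "arguments received: $quoted_count"

# The crawl would then receive the whole URL (uncomment to run it):
# wget --recursive --level=2 --user-agent=My-Browser "$url"
```

Putting the URL directly on the command line inside single quotes should have
the same effect as going through a variable.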

I think you might also run into problems with --accept: the URL does not end
with ".html", so you might need to delete that argument to get the results
you want.
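
For what it's worth, here is a sketch of the original command with --accept
dropped and the URL quoted. I have not run it against the site, so treat it
as a starting point only. DRY_RUN, url, and argc are names invented for this
example; with DRY_RUN left at 1 the command is printed instead of executed.

```shell
#!/bin/sh
# URL from the original command, single-quoted so the shell leaves it alone.
url='www.voanews.com/search/?st=article&k=germany&df=08%2F21%2F2013&dt=09%2F20%2F2013&ob=dt#article'

# DRY_RUN is a hypothetical switch for this sketch: 1 prints, anything else runs wget.
DRY_RUN=${DRY_RUN:-1}

# Same options as the original command, minus --accept=html.
set -- --reject=js,txt,gif,jpeg,jpg \
       --user-agent=My-Browser \
       --recursive --level=2 \
       "$url"

# Remember how many arguments wget would receive (the URL counts as one).
argc=$#

if [ "$DRY_RUN" = 1 ]; then
    # Show the command for inspection; the printed form is not re-quoted.
    printf 'wget %s\n' "$*"
else
    wget "$@"
fi
```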

Tony
