bug-wget

Re: Please use gzip/gunzip when fetching webpages


From: Tim Rühsen
Subject: Re: Please use gzip/gunzip when fetching webpages
Date: Fri, 3 Feb 2023 14:40:57 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0

On 01.02.23 18:05, itstheworm--- via Primary discussion list for GNU Wget wrote:
> More often than not I try recursively downloading a webpage using wget,
> only to have it download a single `index.html.gz` then stop. Obviously
> wget can't read gzipped files so it fails to find any links for recursive
> downloading... I ended up using a wget fork[1] that was last updated 10
> years ago and it works fine, however I find it odd that such a basic
> feature never made it into mainline wget.
>
> Please add a feature for automatically detecting and uncompressing
> gzipped webpages before crawling them.

Sorry about your experience. This feature was added years ago:
--compression=TYPE choose compression, one of auto, gzip and none. (default: none)

The feature is off by default, but you can enable it permanently in your ~/.wgetrc file (see `man wget`).
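For example, a one-off recursive crawl with decompression enabled might look like the following (the URL is a placeholder; `compression = auto` is the wgetrc spelling of the long option, assuming a wget build with compression support, i.e. 1.19.2 or later linked against zlib):

```shell
# One-off: ask wget to negotiate and decompress gzip automatically
wget --compression=auto -r https://example.com/

# Permanent: add the equivalent line to ~/.wgetrc
echo 'compression = auto' >> ~/.wgetrc
```

With `auto`, wget sends `Accept-Encoding: gzip` and decompresses the response before link extraction, so recursive downloads can follow links inside pages the server compresses.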

Nonetheless, no server should serve gzip-compressed pages when the client has not explicitly asked for them via `Accept-Encoding: gzip`.
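You can check whether a server misbehaves in this way by requesting an uncompressed response and inspecting the headers (sketch using curl; the URL is a placeholder):

```shell
# Ask for an uncompressed body (HEAD request) and look at Content-Encoding.
# A conforming server should NOT answer with "Content-Encoding: gzip" here.
curl -sI -H 'Accept-Encoding: identity' https://example.com/ \
  | grep -i '^content-encoding'
```

If this prints `content-encoding: gzip`, the server compresses regardless of the request, which is the situation described in the original report.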

Regards, Tim


