Hi, we are happy to announce the release 2.1.0 of GNU Wget2. Wget2 is the successor of GNU Wget, a file and recursive website downloader. Designed and written from scratch, it wraps around libwget, which provides the basic functions needed by a web client.
URL: <https://savannah.gnu.org/bugs/?64203>
Summary: wget --warc-dedup has bogus behavior against duplicated digest
Group: GNU Wget
Submitter: None
Submitted: Wed 17 May 2023 01:03:48 AM UTC
Category
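For readers unfamiliar with the option: --warc-dedup takes a CDX index from an earlier crawl so that duplicate payloads are written as revisit records. A minimal sketch of the workflow the report exercises (the URL is a placeholder):

    wget --warc-file=crawl1 --warc-cdx https://example.com/
    wget --warc-file=crawl2 --warc-cdx --warc-dedup=crawl1.cdx https://example.com/

The first run writes crawl1.cdx via --warc-cdx; the second run consults it when deciding whether a response's digest has been seen before.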
Hi, I believe I found a bug. While downloading a large file with wget, the connection failed multiple times. Wget retried with a range request until it had the entire file downloaded. In the resulting
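A sketch of the scenario described, assuming a large download over an unreliable link: within a single run, wget retries interrupted transfers with Range requests, and every exchange ends up in the WARC (URL and filename are placeholders):

    wget --tries=20 --waitretry=5 --warc-file=large https://example.com/big.iso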
Good morning, New to wget and web archiving in general here. I've been trying to use wget to mirror a couple of my websites and output WARC files; however, I am unable to view the WARCs in webarchiveplayer.
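A minimal mirroring sketch along the lines of what the poster describes, writing a WARC alongside the local copy (the URL is a placeholder):

    wget --mirror --page-requisites --warc-file=mysite --warc-cdx https://example.com/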
Hi, we are happy to announce the release 2.0.0 of GNU Wget2. Wget2 is the successor of GNU Wget, a file and recursive website downloader. Designed and written from scratch, it wraps around libwget, which provides the basic functions needed by a web client.
Follow-up Comment #1, bug #59086 (project wget): minor correction: and links are also NOT followed
URL: <https://savannah.gnu.org/bugs/?59086>
Summary: --page-requisites not always working when creating a warc file
Project: GNU Wget
Submitted by: thomasegense
Submitted on: Wed 09 Sep 2020 08:52:02
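A reproduction sketch for the summary above, contrasting the two runs (the URL is a placeholder); per the report, page requisites are fetched in the first case but not always in the second, when a WARC is being written:

    wget --page-requisites https://example.com/page.html
    wget --page-requisites --warc-file=page https://example.com/page.html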
Two minor fixes for my previous commits: Optimize deduping expression for version info
https://github.com/lifenjoiner/wget-for-windows/commit/8efc59dffc547345168239c0b9f70ba1ffcf6e0e
How does this make
Hi gang, I did not see a reaction to this. Could a kind soul please briefly enlighten me about the status of the WARC support?

--
Cheers,
Thomas Krichel
http://openlib.org/home/krichel
skype:thomaskr
Hi gang, it seems to me that the version of wget I have

    archec@darni:~/pp$ wget -V | head -1
    GNU Wget 1.20.3 built on linux-gnu.

has a bug at the point where it generates block digests for WARC revisit records.
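For context, the digest in question is conventionally a SHA-1 over the raw record block, base32-encoded (RFC 4648) in the WARC-Block-Digest header. A shell sketch of that computation, assuming record-block.bin holds the raw block:

    sha1sum record-block.bin | cut -d' ' -f1 | xxd -r -p | base32

Prefix the result with "sha1:" to match the header value as it appears in a WARC record.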
Follow-up Comment #2, bug #56648 (project wget): Thanks a lot for the fast reply. --xattr does save the URL to a file attribute, but I couldn't find a way to get the attributes when using cat to merge all the files.
Follow-up Comment #1, bug #56648 (project wget): If you copy all files and headers into one file, did you play with the WARC options? With --xattr the original URL is saved as an extended file attribute.
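To make the thread's answer concrete: on Linux, the attributes written by --xattr can be read back with getfattr from the attr package; wget stores the origin URL under user.xdg.origin.url. A sketch (URL and filename are placeholders):

    wget --xattr https://example.com/file.html
    getfattr -n user.xdg.origin.url file.html

Note that cat copies only file contents; extended attributes live with the inode and are lost when files are merged, which matches what the commenter observed.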
Hi, Sorry for the delay, I've been a little too busy with other things. I've attached a testcase file containing two failing tests: 1. I disable the "Host" header, then set it manually, then I disable
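For readers following along, overriding the Host header from the command line looks like this with the stock client (a sketch; the address is a placeholder, and the testcase itself may drive the library API instead):

    wget --header="Host: other.example.org" http://192.0.2.1/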
Thanks for the updated patch and for sticking around through my nitpicky reviews. I'd like to spend some more time reviewing this patch and testing it out. So, a full review will likely have to wait
Hi, Thank you again Darshit for your response. The RejectHeaderField rule rejects ANY header with the given field name, while RejectHeader rejects ONLY the specified full header. Since we wanted to be sure
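A hypothetical illustration of the distinction described above; the rule names come from the thread, but the exact syntax here is assumed:

    RejectHeaderField: Cookie            (drops every Cookie header, whatever its value)
    RejectHeader: Cookie: id=123         (drops only that exact header line)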
Thank you very much Tim for your prompt reply.

Kind regards,
Mauricio

== "Mistakes are always forgivable, if one has the courage to admit them" (Bruce Lee) ==
Linux user #454569 -- Linux Mint user --
That was a problem with the perl https daemon not supporting IPv6. Here on Debian unstable, we just got a patch that fixes it. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=667738

Regards,
Tim
I'm reporting the results of make check on Linux Mint 18.3. Thanks in advance for any comment on this.

Kind regards,
Mauricio Zambrano-Bigiarini, PhD
== Department of Civil Engineering, Faculty of Engineering
URL: <https://savannah.gnu.org/bugs/?54839>
Summary: Building takes gnutls include from /usr/local/include, but libgnutls.so from system dir
Project: GNU Wget
Submitted by: rockdaboot
Submitted on: M
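One common way to avoid this header/library mismatch, assuming GnuTLS was installed under /usr/local with a pkg-config file, is to point configure at a consistent prefix for both:

    PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./configure \
        LDFLAGS="-L/usr/local/lib -Wl,-rpath,/usr/local/lib"

This makes the compiler's -I path and the linker's -L/rpath settings come from the same installation instead of mixing /usr/local headers with the system library.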
Hello all, For the past week or so, I've been attempting to mirror a website with Wget. However, after a couple of days of downloading (and approx 38 GB downloaded), Wget eventually exhausts all system memory