WARC for wget2 is on the list, maybe as an extra library project. Thanks for your feedback - I wasn't aware of WARC users out there ;-) Regards, Tim
Thank you Tim, this sounds like exactly what I was after! (It's especially important when you have wget logged in as a user, to be able to tell it not to go to the logout page.) Though if that featu
Follow-up Comment #2, bug #53968 (project wget): Thanks for fixing it. I applied it as a patch to the tar release of 1.19.5 and it appears to work correctly.
URL: <http://savannah.gnu.org/bugs/?53968> Summary: Decompressed data is written to WARC file when using --compression=gzip Project: GNU Wget Submitted by: None Submitted on: Thu 24 May 2018 12:50:50
Hi Eric, IMO the deadline for GSOC student applications is today 18:00 CEST, so you have to hurry. Between 27th March and 23rd April the organizations review and decide for or against proposals. Both, ht
To whom this may concern, My name is Eric Ngo and I am a computer science major at San Francisco State University. I was looking for open-source projects to contribute to in GSOC 2018 and came across
Hi, I'll get to a discussion about the proposal shortly, but in the meantime, may I please request everyone to avoid continuing this email thread on address@hidden That is a generic mailing list for
Since your code will likely use functions from libwget and the other way round, we should place it in libwget/. But if it makes your development easier during GSOC, feel free to put it into a separat
wget --warc-file=httpbin -qO- https://httpbin.org/get How to convert the warc format to the actual header of requests and responses? Greetings WARC is gzipped plain text. wget --warc-file=httpbin --n
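The answer above ("WARC is gzipped plain text") can be illustrated with a short sketch. It assumes the usual record layout: a "WARC/1.0" version line, named header fields, a blank line, then a block of Content-Length bytes holding the raw HTTP request or response. The helpers read_warc_records and make_record and the in-memory sample are illustrative assumptions and not part of wget; the sample stands in for a file written by --warc-file.

```python
import gzip
import io


def read_warc_records(raw: bytes):
    """Yield (version, headers, block) for each record in uncompressed WARC bytes."""
    stream = io.BytesIO(raw)
    while True:
        line = stream.readline()
        if not line:
            break  # end of data
        if not line.strip():
            continue  # blank separator lines between records
        version = line.strip().decode("utf-8")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the WARC header section
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        length = int(headers.get("Content-Length", "0"))
        block = stream.read(length)  # raw HTTP request or response
        yield version, headers, block


def make_record(warc_type: str, uri: str, payload: bytes) -> bytes:
    """Build a tiny WARC-like record for the demo (hypothetical helper)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {warc_type}\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + payload + b"\r\n\r\n"


# In-memory stand-in for a compressed WARC file: one gzip member per record.
request = b"GET /get HTTP/1.1\r\nHost: httpbin.org\r\n\r\n"
response = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{}"
sample = gzip.compress(
    make_record("request", "https://httpbin.org/get", request)
) + gzip.compress(
    make_record("response", "https://httpbin.org/get", response)
)

# gzip.decompress handles concatenated gzip members.
records = list(read_warc_records(gzip.decompress(sample)))
for version, headers, block in records:
    # The HTTP header section is everything before the first blank line.
    http_head = block.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    print(headers["WARC-Type"], headers["WARC-Target-URI"])
    print(http_head)
    print()
```

To try this on a real capture, the sample bytes could be replaced with the contents of the file wget wrote (assumed here to be named httpbin.warc.gz), read in binary mode.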
Follow-up Comment #2, bug #52705 (project wget): While MHTML was a convenient way to create snapshots of pages, sadly it was never properly standardized and most popular browsers no longer support it
For what it's worth, I confirmed that Heritrix (Internet Archive's crawling tool) produces WARC files without the angle brackets for WARC-Target-URI. Best regards, William Prescott
Hello, It seems that there may be some ambiguity in the WARC standard regarding the usage of angle brackets surrounding the URI given for a WARC-Target-URI field. In short, while the BNF grammar incl
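Given the ambiguity described above, a reader of WARC files may want to accept both spellings of the field. The following is a minimal sketch under that assumption; normalize_target_uri is a hypothetical helper, not a function from wget or any WARC library, and it only strips one surrounding pair of angle brackets.

```python
def normalize_target_uri(value: str) -> str:
    """Return the URI from a WARC-Target-URI value, with or without <...>."""
    value = value.strip()
    # Strip brackets only when both the opening and closing one are present.
    if len(value) >= 2 and value.startswith("<") and value.endswith(">"):
        return value[1:-1]
    return value


# Both forms yield the same URI.
with_brackets = normalize_target_uri("<https://www.example.com/>")
without_brackets = normalize_target_uri("https://www.example.com/")
print(with_brackets == without_brackets)
```

Tolerating both forms on input leaves the choice of output form to the writer, which is the question the thread is about.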
Hi Tim, I think that would be a nice feature to have. We are already linking to libz for the WARC support so gzip compression won't require a new dependency for wget. Regards, Giuseppe
Follow-up Comment #4, bug #51029 (project wget): Hi again, our system hit another website with the same behavior. It's the same call as in the original post but with https://www.sparkasse.at as targe
Hi Vijo, We try to be backward compatible with options (name and functionality). But it's not a must. We are free to fix bugs or change/extend behavior. That's why we call the executable 'wget2'. It
URL: <http://savannah.gnu.org/bugs/?50788> Summary: Build failure against openssl-1.1 that lacks deprecated features Project: GNU Wget Submitted by: polyc Submitted on: Wed 12 Apr 2017 11:05:51 AM CE