Hi Gisle, I guess you misunderstood the purpose of WARC. It is for archiving web sites/content, but it is not used for caching by Wget. It may have some kind of academic use, not sure what people *really
When using wget 1.14 to generate warc.gz files, e.g. wget -O tempname --warc-file="output" "http://example.com", the files this creates do not play back well using the Internet Archive's warc.gz parser
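When a warc.gz file does not play back, a first step can be to list the records it actually contains. The following is only a minimal sketch of such a check, not the Internet Archive's parser: it assumes the usual WARC record layout (a "WARC/x.y" version line, named header fields, a blank line, then Content-Length bytes of payload). The function names iter_warc_records and summarize are made up for this illustration.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC byte stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of input
        if line in (b"\r\n", b"\n"):
            continue  # blank lines separating two records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        headers = {"version": line.strip().decode("ascii")}
        # Read "Name: value" header lines up to the blank line that ends them.
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()
        length = int(headers["Content-Length"])
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("record is shorter than its Content-Length")
        yield headers, payload


def summarize(path):
    """Return a list of (WARC-Type, payload size) for a .warc.gz file."""
    # gzip.open reads through concatenated gzip members transparently.
    with gzip.open(path, "rb") as stream:
        return [(h.get("WARC-Type"), len(p)) for h, p in iter_warc_records(stream)]


if __name__ == "__main__":
    # Hypothetical in-memory sample, only to show the call.
    sample = b"WARC/1.0\r\nWARC-Type: warcinfo\r\nContent-Length: 5\r\n\r\nhello\r\n\r\n"
    for headers, payload in iter_warc_records(io.BytesIO(sample)):
        print(headers.get("WARC-Type"), len(payload))
```

Running summarize("output.warc.gz") on the file from the command above shows which record types were written and whether any record is cut short. It does not check everything a playback tool may require.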
Yes, most changes are necessary. For example, I added a method to add a WARC header that was missing; I changed the WARC version; I changed the handling of file size limits et cetera. Gijs
Follow-up Comment #1, bug #49226 (project wget): I'm new to the wget community, and I'm starting to contribute and would like to work on this bug, so here are a few things I understand - As of I can
I would prefer dynamic linking (-lrpcrt4). If rpcrt4.lib/.dll is *not* a basic Windows library, you should check for the library in configure.ac (case "$host_os" in ...). There is an example check: d
And for Windows? I guess the 'UuidCreate()' or 'UuidCreateSequential()' functions from Rpcrt4.dll could be used? I could write a patch for loading Rpcrt4.dll at run-time if there's some interest. Do
Hello list. I have been toying around with the '--warc-*' options in Wget, and they seem to work like a charm with my MinGW/Win-XP version. But the question I'm left with is: there's nothing that tells m
'--directory-prefix=PREFIX' Set directory prefix to PREFIX. The "directory prefix" is the directory where all other files and subdirectories will be saved to, i.e. the top of the retrieval tree. The
Thanks Gijs! Guess I missed that line there. Also, this does mean that we need better tests, especially for WARC file writing. -- Thanking You, Darshit Shah Research Lead, Code Innovation Kill Code Ph
What is WARC? What is WARC used for? Windows or 'nix? What are its benefits, etc.? -- Dave Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk http://www.pctipp.ch/downloads/dl/35905.asp
WOW great work! It is much better now. I wonder if it is possible to remove the dependency from libuuid, maybe provide replacement for uuid_generate and uuid_unparse when libuuid is not found? Even a
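To give an idea of how small such a replacement could be, here is a rough sketch of the logic, written in Python for brevity rather than in the C that wget would need. It assumes a random (version 4) UUID is acceptable and that the ID is written in the urn:uuid form used for WARC-Record-ID. The function names mirror the libuuid ones only for readability; warc_record_id is a made-up helper.

```python
import os


def uuid_generate_random():
    """Return 16 random bytes marked as a version 4 (random) UUID."""
    raw = bytearray(os.urandom(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # high nibble of byte 6 = version 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 = RFC 4122 variant
    return bytes(raw)


def uuid_unparse(raw):
    """Format 16 bytes as the usual 8-4-4-4-12 lowercase hex string."""
    if len(raw) != 16:
        raise ValueError("a UUID is exactly 16 bytes")
    h = raw.hex()
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))


def warc_record_id():
    """Build a record ID string in urn:uuid form (assumed format)."""
    return "<urn:uuid:%s>" % uuid_unparse(uuid_generate_random())


if __name__ == "__main__":
    print(warc_record_id())
```

A C version would need the same two bit masks and a source of random bytes; which source is appropriate on each platform is a separate question.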
Hi Giuseppe and Ángel, Thanks for looking at the patch. Yes, it's quite big. (I should mention that this was also not my intention to have this complete patch added into the wget repository; it is a
Giuseppe Scrivano wrote: the patch is huge and I think we don't want to add so many files into the wget tree. Can't we assume the user will install the warc tools by herself and let configure check
not something we want to rewrite :-) do they influence the way wget+warc works? that is great, only code which is going into wget has to follow GNU standards. Other libraries can have any style, unti
Hi gang, I did not see a reaction to this. Could a kind soul please briefly enlighten me about the status of the WARC support? -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskr