Re: [Bug-wget] WARC output
From: Gijs van Tulder
Date: Wed, 10 Aug 2011 11:38:51 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110617 Lightning/1.0b2 Thunderbird/3.1.11
Giuseppe Scrivano writes:
>> The implementation makes use of the open source WARC Tools library
>> (Apache License 2.0):
>> http://code.google.com/p/warc-tools/
>
> how much code is really needed from that library? I wonder if we can
> avoid this dependency at all.
The library comes with some utilities, an HTTrack plugin, a Java module,
etc. These extras are not needed for Wget. But of the C library itself, I
used pretty much everything. The library handles all of the WARC writing.
It can also read WARCs, but that's not needed here.
Rough estimate: 12,000 lines of code (excluding comments).
It's probably important to note that I have changed a few small things
in the warc-tools library. (The changes are recorded in Git.)
As for the other dependencies:
- I used an MIT-licenced base32 encoder (there seems to be no such
module in Gnulib), but that's quite small so could be replaced;
- it links to the UUID library.
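To give an idea of how small a replacement for the base32 encoder could
be, here is a minimal sketch of an RFC 4648 base32 encoder. This is not
the MIT-licensed code I used; the names base32_encoded_length and
base32_encode and their interface are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of output characters (without the
   terminating NUL) needed to encode LEN input bytes with padding.  */
size_t
base32_encoded_length (size_t len)
{
  return (len + 4) / 5 * 8;
}

/* Hypothetical sketch: encode LEN bytes from SRC as RFC 4648 base32
   into DST, padded with '=' to a multiple of 8 characters.  DST must
   have room for base32_encoded_length (LEN) + 1 bytes.  Returns the
   number of characters written, not counting the terminating NUL.  */
size_t
base32_encode (const unsigned char *src, size_t len, char *dst)
{
  static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  uint32_t buffer = 0;   /* bit accumulator; only the low BITS bits matter */
  int bits = 0;          /* number of pending bits in BUFFER */
  size_t out = 0;
  size_t i;

  for (i = 0; i < len; i++)
    {
      buffer = (buffer << 8) | src[i];
      bits += 8;
      /* Emit one character for every complete group of 5 bits.  */
      while (bits >= 5)
        {
          dst[out++] = alphabet[(buffer >> (bits - 5)) & 0x1f];
          bits -= 5;
        }
    }

  /* Left-align any remaining bits in a final 5-bit group.  */
  if (bits > 0)
    dst[out++] = alphabet[(buffer << (5 - bits)) & 0x1f];

  while (out % 8 != 0)
    dst[out++] = '=';

  dst[out] = '\0';
  return out;
}
```

With a 20-byte input (the size of a SHA-1 hash, for example), the
output is 32 characters with no padding, so a 33-byte buffer is enough.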
> Can you please track all contributors? Any contribution to GNU wget
> requires copyright assignments to the FSF.
Yes, it's all in the Git history, so it's easy to make a list. (There's
only one other contributor of code, others helped with testing.)
> In the meanwhile, can you check if you are following the GNU Coding
> Standards for the new code?
I tried to do that. So except for the warc-tools library, which uses a
different standard, all new code follows the GNU standards (I hope).
Thanks,
Gijs