bug-wget

Re: [Bug-wget] [PATCH] Keep fetched URLs in POSIX extended attributes


From: Darshit Shah
Subject: Re: [Bug-wget] [PATCH] Keep fetched URLs in POSIX extended attributes
Date: Fri, 22 Jul 2016 14:08:27 +0200
User-agent: Mutt/1.6.2-neo (2016-06-11)

On 07/22, Tim Rühsen wrote:
Hi Sean,

thank you very much, definitely a very nice feature!

I extended the commit message with GNU stuff and pushed it.

Has Sean signed the FSF copyright assignment? This is a major code contribution and would require the legalities to be completed.

Once that is clear, I am happy to see this code in the codebase as well.

This feature deserves to be extended :-)
I have the mime type and the content charset in mind.

BTW, what is the 'cost' of this feature in terms of disk space?

Regards, Tim

On Thursday, July 21, 2016 2:33:23 PM CEST Sean Burford wrote:
Hi,

I find it useful to keep track of where files are downloaded from.  POSIX
extended attributes provide a lightweight, portable method of keeping this
information across Linux, OS X, FreeBSD and many other platforms.

This complements wget's existing WARC support, which serves a related but
different use case closer to tcpdump or tar for web pages.  Extended
attributes can provide a quick answer to "where did I get this file from
again?"

This patch makes the following changes:
*   autoconf detects whether extended attributes are available and enables
    the code if they are.
*   The new flags --xattr and --no-xattr control whether xattr is enabled.
*   The new command "xattr = (on|off)" can be used in ~/.wgetrc or
    /etc/wgetrc.
*   The original and redirected URLs are recorded as shown below.
*   This works for both single fetches and recursive mode.

Here is an example, where http://archive.org redirects to
https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"

These attributes were chosen based on those stored by Google Chrome
(https://bugs.chromium.org/p/chromium/issues/detail?id=45903) and curl
(https://github.com/curl/curl/blob/master/src/tool_xattr.c).
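
For readers who want to consume these attributes from a script, here is a
rough Python sketch of writing and reading the same two names. It is not the
code from the patch (that is C inside wget); the helper names record_origin
and read_origin are made up for illustration, and it assumes Linux, where
Python exposes os.setxattr and os.getxattr, plus a filesystem that allows
user.* attributes. Following the example above, the final (redirected) URL
goes into user.xdg.origin.url and the originally requested URL into
user.xdg.referrer.url.

```python
import os

# Attribute names taken from the getfattr output quoted above.
ORIGIN_ATTR = "user.xdg.origin.url"
REFERRER_ATTR = "user.xdg.referrer.url"


def record_origin(path, final_url, original_url):
    """Hypothetical helper: tag path with its URLs; True if both were set."""
    if not hasattr(os, "setxattr"):  # os.setxattr only exists on Linux
        return False
    try:
        os.setxattr(path, ORIGIN_ATTR, final_url.encode("utf-8"))
        os.setxattr(path, REFERRER_ATTR, original_url.encode("utf-8"))
    except OSError:
        # Filesystem without user xattr support, or permission denied.
        return False
    return True


def read_origin(path):
    """Hypothetical helper: return whichever of the two attributes are set."""
    found = {}
    if not hasattr(os, "getxattr"):
        return found
    for name in (ORIGIN_ATTR, REFERRER_ATTR):
        try:
            found[name] = os.getxattr(path, name).decode("utf-8", "replace")
        except OSError:
            continue  # attribute absent, or xattrs unsupported here
    return found
```

Both helpers swallow OSError on purpose, so a script built on them degrades
to "origin unknown" on filesystems where the attributes cannot be stored.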




--
Thanking You,
Darshit Shah


