[Bug-wget] [PATCH] Keep fetched URLs in POSIX extended attributes
From: Sean Burford
Subject: [Bug-wget] [PATCH] Keep fetched URLs in POSIX extended attributes
Date: Thu, 21 Jul 2016 14:33:23 +1000
Hi,
I find it useful to keep track of where files are downloaded from. POSIX
extended attributes provide a lightweight, portable method of keeping this
information across Linux, OS X, FreeBSD and many other platforms.
This complements wget's existing WARC support, which serves a related but
different use case closer to tcpdump or tar for web pages. Extended
attributes can provide a quick answer to "where did I get this file from
again?"
This patch makes the following changes:
* autoconf detects whether extended attributes are available and enables
the code if they are.
* The new flags --xattr and --no-xattr control whether xattr is enabled.
* The new command "xattr = (on|off)" can be used in ~/.wgetrc or
/etc/wgetrc
* The original and redirected URLs are recorded as shown below.
* This works for both single fetches and recursive mode.
Here is an example, where http://archive.org redirects to
https://archive.org:
$ wget --xattr http://archive.org
...
$ getfattr -d index.html
user.xdg.origin.url="https://archive.org/"
user.xdg.referrer.url="http://archive.org/"
These attributes were chosen based on those stored by Google Chrome
(https://bugs.chromium.org/p/chromium/issues/detail?id=45903) and curl
(https://github.com/curl/curl/blob/master/src/tool_xattr.c).
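
For anyone who wants to read these attributes back from a script instead of
getfattr, here is a minimal Python sketch. It is not part of the patch. The
function name read_download_origin is made up for illustration, and it
assumes a Python build that provides os.getxattr (Linux only, as far as I
know); elsewhere it simply returns an empty result.

```python
import os
import sys

# Attribute names as shown in the getfattr output above.
XDG_ATTRS = ("user.xdg.origin.url", "user.xdg.referrer.url")


def read_download_origin(path):
    """Return a dict of the XDG URL attributes that are set on path.

    Hypothetical helper for illustration only; not part of wget.
    """
    found = {}
    # os.getxattr is not available on every platform, so look it up lazily.
    getxattr = getattr(os, "getxattr", None)
    if getxattr is None:
        return found
    for name in XDG_ATTRS:
        try:
            raw = getxattr(path, name)
        except OSError:
            # Attribute not set, file missing, or the filesystem does not
            # support extended attributes: skip this name.
            continue
        found[name] = raw.decode("utf-8", "replace")
    return found


if __name__ == "__main__":
    for filename in sys.argv[1:]:
        for attr, url in read_download_origin(filename).items():
            print(f"{filename}: {attr}={url}")
```

Saved to a file and run with index.html from the example above as its
argument, this should print the same two URLs that getfattr reports,
provided the filesystem kept the attributes.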
--
Sean Burford <address@hidden>
0001-xattr-Keep-fetched-URLs-in-POSIX-extended-attributes.patch
Description: Text Data