Re: [O] How do you store web pages for reference?

From: Karl Voit
Subject: Re: [O] How do you store web pages for reference?
Date: Mon, 16 Jan 2017 17:35:50 +0100
User-agent: slrn/pre1.0.0-18 (Linux)

Hi Alan,

* Alan Schmitt <address@hidden> wrote:

> On 2017-01-16 15:43, Karl Voit <address@hidden> writes:
>> I am using the Firefox plugin Shelve[1] which stores all of my web
>> pages visited. Those HTML files are written with an ISO time-stamp
>> in their file name. Therefore, my Memacs filename module (see sig)
>> is indexing all visited URLs and they appear on my agenda.
>> So I do have a direct link between my agenda and the HTML files of
>> all web pages I have visited.
>> [1] https://addons.mozilla.org/en-US/firefox/addon/shelve/
> This plugin looks interesting, but it seems to rely on the existing
> functionality of Firefox to save web pages. As I want to save a page
> with its picture and CSS, I would need to choose “Web page, complete”,
> but the FF documentation says “This choice allows you to view it as
> originally shown with pictures, but it may not keep the HTML link
> structure of the original page”, which worries me a little.

Well, this is a hard problem to solve any other way: when you save a web
page A that contains a link to B, do you want to end up with a local
copy of A that links to a local copy of B (which you might not
have at all), or with a link to the online B? The latter is easy (no
change when downloading).
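
To make the two options concrete, here is a minimal Python sketch of
the decision for a single link. It is only an illustration, not what
Firefox or Shelve do internally: the function name and the
`local_copies` dictionary (saved URL -> local file) are assumptions
made up for this example.

```python
from urllib.parse import urldefrag


def choose_link_target(url, local_copies):
    """Return the local file for `url` if one was saved, else `url` itself.

    `local_copies` is a hypothetical dict mapping saved URLs to local
    file paths; it is not part of any real tool mentioned in this thread.
    """
    base, fragment = urldefrag(url)  # split off "#section" anchors
    local = local_copies.get(base)
    if local is None:
        # No local copy of B: keep pointing at the online B (the easy case).
        return url
    # A local copy of B exists: link to it and keep the anchor, if any.
    return local + ("#" + fragment if fragment else "")
```

With `local_copies = {"https://example.org/b": "b.html"}`, a link to
`https://example.org/b#sec` would be rewritten to `b.html#sec`, while a
link to any page that was never saved stays untouched.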

> Do you only save the HTML or the pictures as well? If it's the latter,
> have you had any issues about links not being preserved?

I save everything.

My settings (with self-translated terms from German):

Settings: MIME: Webpage, complete (HTML)

My default shelve:


MIME: Standard

This way, I end up with all web pages stored in my file system. When
I open a URL, the browser shows my local copy. Sometimes, embedded
content is not loaded correctly. All links point to their original
target (of course). So in case I want to stay local, I do not click
on any link in my local copy.

I mainly navigate through my agenda and its links: agenda -> local
copy -> back to agenda -> next local copy -> back to agenda -> ...
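
The agenda -> local copy link relies on the time-stamp in the file
name. As a rough sketch of that idea (this is not the actual Memacs
code), the following Python function turns one file name into an Org
heading with an active time-stamp and a file link. The file name
pattern "2017-01-16T17.35.50 Page title.html" and the two-star heading
level are assumptions chosen for the example.

```python
import re
from datetime import datetime

# Assumed file name pattern: "2017-01-16T17.35.50 Page title.html"
# (seconds are optional; real file names may well look different).
STAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2})T(\d{2})\.(\d{2})(?:\.\d{2})?\s+(.*)\.html?$"
)


def org_entry(filename):
    """Return an Org heading linking to `filename`, or None if no time-stamp."""
    match = STAMP.match(filename)
    if match is None:
        return None
    day, hour, minute, title = match.groups()
    try:
        # Validate the date and time instead of trusting the digits.
        when = datetime.strptime(f"{day} {hour}:{minute}", "%Y-%m-%d %H:%M")
    except ValueError:
        return None
    # Active time-stamp without the day name, so the entry can show up
    # on the agenda.
    stamp = when.strftime("<%Y-%m-%d %H:%M>")
    return f"** {stamp} [[file:{filename}][{title}]]"
```

Files without a recognisable time-stamp simply yield None, so a
directory listing can be filtered by skipping those results.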

get mail|git|SVN|photos|postings|SMS|phonecalls|RSS|CSV|XML into Org-mode:
       > get Memacs from https://github.com/novoid/Memacs <
Personal Information Management > http://Karl-Voit.at/tags/pim/
Emacs-related > http://Karl-Voit.at/tags/emacs/

