Re: lynx-dev Why doesn't lynx cache HTML source?


From: Bela Lubkin
Subject: Re: lynx-dev Why doesn't lynx cache HTML source?
Date: Mon, 9 Nov 1998 23:49:13 -0800

Chuck Martin wrote:

> Sometimes I want to view the HTML source of a document using the "\"
> key, and then switch back to the rendered document when I'm through,
> or I want to switch all images to links and back again, using the "*"
> key, or maybe I want to turn pseudo-ALTs on or off, but every time I
> make a switch in how I view a document, the whole document has to be
> downloaded again.  Is there a reason why it was done this way?  It
> seems to me that caching the HTML source instead of the rendered
> document would make more sense, and would save time when making these
> changes on the fly.  Of course, moving between cached documents might
> be a little slower unless they were cached both ways (source and
> rendered), but not by much, and rendering a cached document would be
> much faster than downloading it repeatedly.  Could someone enlighten
> me?

This issue has been discussed extensively in the past; search the
Lynx-Dev archives for the details.  I used www.altavista.com's "advanced
search" to search for "host:www.flora.org AND source AND cach* AND
text:subject AND text:html", finding 59 matches.

To summarize briefly: Lynx uses a single-pass recursive descent parser,
consuming the HTML source and producing rendered output on the fly.
Only that rendered output is cached.  Since Lynx has many operations
which require it to re-parse the HTML source, many users have suggested
that Lynx cache the source as well.  This usually leads to a discussion
of pros and cons, some of which are:

  Con:

    - would add to the complexity of Lynx
    - caching rules are very difficult to get right
    - would add code, increasing the size of Lynx source and binary
    - would increase the in-core and/or on-disk storage consumed by Lynx
      during operation
    - duplicates functionality which is already provided by other
      programs, i.e. web caches such as Squid -- programs which are
      dedicated to caching functionality and thus can be expected to do
      it better than Lynx could hope to

  Pro:

    - would add to the utility of Lynx
    - would greatly speed operations which require a re-parse, including '\'
      view-source, '^V' other-DTD, '*' image-URLs, '"' soft-dquotes, '`'
      and "'" comment-parsing, '[' pseudo-alts, '@' raw-mode, and
      changes in assumed document character set.
    - easier for a regular user to install than a full web proxy
    - persistence of cache can be better tuned to the user (e.g., cached
      objects can persist only for the duration of a session, and thus not
      consume disk space while Lynx is not running)
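
To make the trade-off concrete, here is a minimal sketch of a
session-only cache that keeps the fetched source next to the rendered
output, so that a change of view re-renders instead of re-downloading.
It is written in Python for brevity, not in Lynx's C, and it is not
Lynx code: RenderOptions, SessionSourceCache, toy_render and fake_fetch
are all hypothetical names made up for this illustration.  It also
sidesteps the hard part from the Con list (expiry and validation rules)
entirely.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class RenderOptions:
    """Hypothetical stand-ins for two Lynx toggles ('\\' and '*')."""
    view_source: bool = False
    images_as_links: bool = False


class SessionSourceCache:
    """In-memory cache of source plus rendered text; gone when the process exits."""

    def __init__(self,
                 fetch: Callable[[str], str],
                 render: Callable[[str, RenderOptions], str]) -> None:
        self._fetch = fetch
        self._render = render
        self._source: Dict[str, str] = {}
        self._rendered: Dict[Tuple[str, RenderOptions], str] = {}
        self.fetch_count = 0  # how many times we actually went to the network

    def view(self, url: str, options: RenderOptions) -> str:
        key = (url, options)
        if key in self._rendered:
            # Same document, same options: reuse the rendered text.
            return self._rendered[key]
        if url not in self._source:
            # First visit in this session: download once and keep the source.
            self._source[url] = self._fetch(url)
            self.fetch_count += 1
        # Options changed (or first visit): re-parse the cached source.
        text = self._render(self._source[url], options)
        self._rendered[key] = text
        return text


def toy_render(source: str, options: RenderOptions) -> str:
    """Toy renderer for the demo only; nothing like a real HTML parser."""
    if options.view_source:
        return source
    image_text = r"[LINK: \1]" if options.images_as_links else "[IMAGE]"
    text = re.sub(r'<img src="([^"]*)">', image_text, source)
    return re.sub(r"<[^>]+>", "", text)  # drop any remaining tags


def fake_fetch(url: str) -> str:
    """Stands in for the network; always returns the same small page."""
    return '<p>Hi <img src="a.gif"></p>'


if __name__ == "__main__":
    cache = SessionSourceCache(fake_fetch, toy_render)
    url = "http://example.invalid/"
    print(cache.view(url, RenderOptions()))
    print(cache.view(url, RenderOptions(view_source=True)))
    print(cache.view(url, RenderOptions(images_as_links=True)))
    print("fetches:", cache.fetch_count)
```

The cost shows up in the two dictionaries: every document is held at
least twice (source and one rendering per option set), which is the
extra in-core storage listed under Con.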

Possible techniques have been discussed.  But the bottom line is that
Lynx is a cooperative volunteer effort, and big projects like a revamped
caching system do not happen unless someone contributes the code.

Please feel free to jump in and write it!  ;-}

>Bela<
