Re: LYNX-DEV Lynx is caching CGI script output. Why?

From: David Woolley
Subject: Re: LYNX-DEV Lynx is caching CGI script output. Why?
Date: Sat, 18 Oct 1997 14:39:34 +0100 (BST)

> I am building an intranet app that makes extensive use of dynamic HTML 
> generated by CGI scripts.  Lynx seems to be caching the output and not
> requesting a new version when the same tag is repeatedly selected.  This

This is correct behaviour for GET without an expiry time.  GETs are not
supposed to have any side effects, so simply refetching shouldn't produce
a different result.  The result is, however, allowed to change with the
passage of time, as a result of POST/PUT/DELETEs from others, or through
similar actions by other means, but that is why browsers have refresh
buttons.++

> seems to be true in spite of Pragma: no-cache and Expires: [current time].

Pragma: no-cache is a red herring, as it is an instruction to intermediate
caches, not to the browser.  There is a good case for Expires: current
time, providing both client and server have good clocks (the office
Netware servers are often 20 minutes out!).  However, I think
it would be inappropriate to apply this to history list or left arrow
selections, as I would argue they are requests for the page one was
previously viewing and working on, not a new version of it; data which
ceases to be of any value the instant it is created is worthless in
a world of finite timescales.  I think you may have a valid case for Lynx
honouring pre-expiry on normal links, although I still think it puts
an unreasonable constraint on the design of the user agent: to
really enforce this sort of policy, you would have to inhibit printing
and screen grabs, as these are all means for the overall user agent
system to cache old data.
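
As a rough illustration of the "Expires: current time" approach, here is a
minimal Python sketch of CGI output that is already expired when it is sent.
The function name cgi_response and its parameters are hypothetical, and it
assumes the server clock is reasonably accurate, which is exactly the caveat
raised above.

```python
from email.utils import formatdate


def cgi_response(body, now=None):
    """Return CGI output whose Expires header equals the generation time.

    `now` is seconds since the epoch; None means "use the current clock".
    (Hypothetical helper, for illustration only.)
    """
    # formatdate(..., usegmt=True) yields an RFC 1123 date ending in "GMT".
    stamp = formatdate(timeval=now, usegmt=True)
    headers = [
        "Content-Type: text/html",
        "Expires: " + stamp,  # expires at the moment it was generated
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    print(cgi_response("<p>generated just now</p>"), end="")
```

Whether a given browser acts on this for ordinary links, and what it does
for history navigation, remains up to the user agent.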

A caution:  MSIE 3 is religious about Expires, and if you launch a URL-aware
helper, like Office 97 Excel, the page will get fetched twice: once to
find out that it isn't HTML and once by the helper.  Worse, if the page
was password-controlled (basic authentication, at least) the helper won't
have the password, and the request will fail with an authentication failure!
I had to remove an Expires header because of this.

If the very act of accessing the URL changes the state of the server,
then you should not use GET.

> Netscape seems to flush cache on any URL beginning with /cgi-bin.

That would be a bug if it used /cgi-bin.  More valid heuristics would be
"?" in the URL, or no Last-Modified date in the response.  The "?"
one has had to be put into the HTTP 1.1 caching spec because of abuse of
GET for pages with side effects, but this is only a requirement on
proxy servers.  The CERN proxy uses the Last-Modified heuristic.
(Unfortunately, many web space sellers deliberately send HTML without
a last modified time, presumably to prevent caching and provide better
marketing statistics, although also because of text mode hit counters.)
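
The two heuristics just mentioned can be sketched in a few lines of Python.
This is only an illustration: the function name probably_dynamic is made up,
it assumes the response headers arrive as a dict, and it assumes the header
in question is the one HTTP calls Last-Modified.  It is not a description of
what Netscape, MSIE or the CERN proxy actually do.

```python
from urllib.parse import urlsplit


def probably_dynamic(url, response_headers):
    """Guess whether a GET response should be treated as uncacheable.

    Heuristic 1: the URL carries a query string ("?" followed by data).
    Heuristic 2: the response has no Last-Modified header.
    (Hypothetical helper, for illustration only.)
    """
    has_query = bool(urlsplit(url).query)
    # Header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in response_headers}
    no_last_modified = "last-modified" not in names
    return has_query or no_last_modified
```

Note that a /cgi-bin prefix plays no part in either test.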

You might want to note that a similar question is an FAQ on the Microsoft
IIS groups, so I am pretty certain that the MSIE heuristics are different
from the Netscape ones.

Incidentally, another factor with GUI browsers is that graphic buttons
are often simulated in forms by image-mapped input fields (<input
type=image..); it is almost impossible to hit the same pixel each time
and generate exactly the same URL.  Marketing departments love these, even
though they can cause unnecessary end-to-end queries (my application
could legitimately set a future expiry time, as the information doesn't
change quickly and clerical factors are likely to make changes hours to
days late, anyway), and the HTML 4.0 <button> element won't be sufficiently
universal for a few years.
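
To see why image inputs defeat URL-keyed caches, recall that a GET form
submitted through <input type=image name=NAME> sends the click position as
NAME.x and NAME.y.  The Python sketch below builds such URLs; the action
path /cgi-bin/report and the field name "go" are invented for the example.

```python
from urllib.parse import urlencode


def image_submit_url(action, field, x, y):
    """Build the GET URL produced by clicking an image input at (x, y).

    (Hypothetical helper; `action` and `field` are illustrative values.)
    """
    query = urlencode({field + ".x": x, field + ".y": y})
    return action + "?" + query


# Two clicks one pixel apart give two distinct URLs, so a cache keyed on
# the URL treats them as unrelated pages.
first = image_submit_url("/cgi-bin/report", "go", 41, 12)
second = image_submit_url("/cgi-bin/report", "go", 42, 12)
```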

++ A reasonable criticism of Lynx would be that it needs a revalidate
(If-Modified-Since) as well as a refresh (Pragma: no-cache/Cache-Control:
...) operation; the MSIE 3 strategy of offering If-Modified-Since only is
not good, as it doesn't allow one to override a broken cache entry, or one
that is out of date because of a bad guess at the expiry time.
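
The difference between the two operations comes down to which request
headers the browser sends.  A minimal Python sketch, with a hypothetical
function name and mode strings; the Cache-Control variant is left out
because its value is not spelled out above.

```python
def request_headers(mode, cached_last_modified=None):
    """Return the extra request headers for a reload of a cached page.

    mode == "revalidate": conditional GET; the server may answer
        304 Not Modified and the cached copy is kept.
    mode == "refresh": unconditional fetch that asks caches along the
        way not to answer from their own copies.
    (Hypothetical helper, for illustration only.)
    """
    if mode == "revalidate":
        if cached_last_modified is None:
            raise ValueError("revalidation needs the cached Last-Modified value")
        return {"If-Modified-Since": cached_last_modified}
    if mode == "refresh":
        return {"Pragma": "no-cache"}
    raise ValueError("unknown mode: " + repr(mode))
```

A browser offering only the first mode cannot force out a broken cache
entry, since the server may keep answering that nothing has changed.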