Re: lynx-dev HTML4.0 and default charset

From: David Woolley
Subject: Re: lynx-dev HTML4.0 and default charset
Date: Thu, 4 Mar 1999 08:00:08 +0000 (GMT)

>    Unfortunately, some older HTTP/1.0 clients did not deal properly with
>    an explicit charset parameter. HTTP/1.1 recipients MUST respect the
>    charset label provided by the sender; and those user agents that have
>    a provision to "guess" a charset MUST use the charset from the

I think "guess" is really a euphemism for assuming one of the following
(probably a compile-time choice):

- Windows character set;

- the national character set of the user.

>    content-type field if they support that charset, rather than the
>    recipient's preference, when initially displaying a document.
> The client requirement is clear for the case where there is an explicit
> charset value in the Content-Type header.  (One could quibble about
> the exact meaning of "initial[ly] displaying" though.)  There is no

I think it is fairly clear - the browser must obey the content type, and
render the page accordingly; if the user then decides that the result
is the sort of rubbish that results from a particular wrongly declared
character set, the browser may permit them to select an alternative
character set in which to re-render the page.
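As a rough illustration of that reading (not Lynx's actual code), here is a minimal Python sketch. The function names, the `supported` set and the `user_choice` argument are all hypothetical; the only library call is the standard `email.message.Message.get_content_charset()`, used here to pull the charset parameter out of a Content-Type value.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_for_render(content_type, supported, user_choice=None):
    """Pick the charset for one rendering pass.

    supported:   set of lowercase charset names the browser can display
                 (hypothetical parameter).
    user_choice: None on the initial display; set only after the user has
                 seen the page and asked for a re-render (assumption).
    """
    if user_choice is not None:
        # Re-render: the user has judged the declared charset to be wrong.
        return user_choice
    declared = declared_charset(content_type)
    if declared in supported:
        # Initial display: the sender's label wins over any local preference.
        return declared
    # No usable label; what to do here is the open question in this thread.
    return None
```

On the initial display `user_choice` is left as None, so a supported declared charset is always honoured; the override path is reached only when the user explicitly asks for a different rendering.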

> clear prescription for the case of a missing charset value, But the last
> sentence implies that user agents (at least some class of them) are
> allowed to override the default value "ISO-8859-1" defined above.
> So it remains unclear just what the default value of "ISO-8859-1" means,
> and under which circumstances it applies.  One could speculate that, by

I think the mess in the wording is a result of a total failure in
the real world to obey the standards; I think they are asking for a
best effort to use ISO 8859/1 but giving some licence to violate this
when dealing with known broken pages.  However, HTTP/1.1 servers can be
assumed never to violate the standard, and they are supposed to invoke
these clauses only if they are likely to be accessed by broken HTTP/1.0
clients which mis-parse the Content-Type when correctly told the charset.

If the status line says HTTP/1.1 and there is no charset, an HTTP/1.1
browser cannot legitimately assume that it is dealing with, say, Ukrainian;
and even for HTTP/1.0 material it must not make such an assumption until
the user has had a chance to look at the 8859/1 rendering.
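To make that rule concrete, here is a small Python sketch of how I read the missing-charset case. It is only my interpretation, and the names `charset_without_label`, `local_guess` and `user_has_seen_default` are hypothetical, not taken from any browser.

```python
DEFAULT_CHARSET = "iso-8859-1"


def charset_without_label(status_version, local_guess, user_has_seen_default):
    """Choose a charset when the response carries no charset parameter.

    status_version:        protocol version from the status line, e.g. "HTTP/1.1".
    local_guess:           the browser's assumed charset, e.g. "koi8-u"
                           (hypothetical; typically a compile-time choice).
    user_has_seen_default: True once the ISO 8859/1 rendering has been shown.
    """
    if status_version == "HTTP/1.1":
        # An HTTP/1.1 sender is assumed to label correctly, so a missing
        # charset means the default and leaves no room for guessing.
        return DEFAULT_CHARSET
    if not user_has_seen_default:
        # HTTP/1.0 material: show the ISO 8859/1 rendering first.
        return DEFAULT_CHARSET
    # Only now may the browser fall back on its assumed charset.
    return local_guess
```

This covers only the browser's own assumption; a user explicitly selecting another character set after seeing the page is a separate path.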
