Problem with national characters in XHTML

From: Lennart Borgman
Subject: Problem with national characters in XHTML
Date: Thu, 29 Sep 2005 15:52:17 +0200
Piet van Oostrum wrote:

Mathias Dahl <address@hidden> (MD) wrote:

MD> I might be wrong here, but doesn't UTF-8 encode all characters in
MD> Latin-1 (ISO 8859-1) exactly as they are *in* Latin-1 encoding?

No. ISO 8859-1 uses 1 byte for all characters, while UTF-8 uses two bytes
for those ISO 8859-1 characters outside the ASCII range (128-255); only the
ASCII characters (0-127) are encoded identically. What you probably mean is
that the Unicode value (code point) for each ISO 8859-1 character is the same
as its encoding in ISO 8859-1.

This is not easy. What you say makes it even more interesting why C-q 3 4 4 RET is stored as 2276 (or whatever it was) in the XHTML files. How can that be? (For the context, see my earlier mails.)
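
To make the byte counts concrete, here is a small Python sketch (not part of the original exchange). It assumes that C-q reads the digits 3 4 4 as octal, which is the Emacs default, so the character in question is code point 228 (0xE4, "ä"). The variable names are only illustrative, and the sketch says nothing about where the value 2276 comes from.

```python
# Assumption: C-q 3 4 4 RET is read as octal 344 (the Emacs default radix).
code_point = 0o344              # 228 decimal, 0xE4
char = chr(code_point)          # 'ä'

# ISO 8859-1 stores the character as one byte equal to its code point.
latin1_bytes = char.encode("iso-8859-1")

# UTF-8 needs two bytes for any code point in the range 128-255.
utf8_bytes = char.encode("utf-8")

# ASCII characters are encoded identically in both encodings.
ascii_same = "a".encode("iso-8859-1") == "a".encode("utf-8")

print(hex(code_point), latin1_bytes.hex(), utf8_bytes.hex(), ascii_same)
# prints: 0xe4 e4 c3a4 True
```

So the code point matches the Latin-1 byte value, but the UTF-8 byte sequence on disk does not.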

