Re: unicode in emacs 21

From: Markus Kuhn
Subject: Re: unicode in emacs 21
Date: Sat, 27 Oct 2001 19:27:51 +0100 (BST)

On Thu, 25 Oct 2001, Eli Zaretskii wrote:
> > Is the internal representation still the special MULE format?
> Yes.  But the internal representation is not the problem here; ideally,
> users and Lisp programs shouldn't be worrying about how characters are
> represented internally.  The problem is that characters are still not
> unified in Emacs 21.

Not entirely.

Internal representation does matter somewhat when it comes to the handling
of malformed UTF-8 sequences. I think it is highly desirable that the
UTF-8 -> Emacs internal -> UTF-8 conversion round trip is made 100% binary
transparent. Loading and saving a file that contains malformed UTF-8
sequences should not change them, but character encoding conversions are
prone to throw away information in the case of invalid source byte
sequences.
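
To make the information loss concrete, here is a minimal sketch in Python
(not Emacs Lisp, and not a description of what Emacs 21 does). The byte
string is a made-up example, and the "replace" error handler stands in for
any converter that substitutes U+FFFD for bytes it cannot interpret:

```python
# Hypothetical file contents: valid UTF-8 text around one malformed byte.
# The byte 0xFF can never occur in well-formed UTF-8.
original = b"caf\xc3\xa9 \xff end"

# A lossy conversion: the undecodable byte becomes U+FFFD on load.
text = original.decode("utf-8", errors="replace")

# Saving writes the UTF-8 encoding of U+FFFD, not the original byte.
saved = text.encode("utf-8")

print(saved == original)
```

After such a load and save, the original 0xFF byte can no longer be
recovered from the file.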

Using UTF-8 as the internal Emacs encoding is one way of achieving
continued guaranteed binary transparency; coming up with a tricky encoding
for malformed UTF-8 sequences is another. I favour the former approach,
which is also what other UTF-8 capable modern editors do today.
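
As a rough illustration of the second approach (a special encoding for
malformed sequences), the sketch below uses Python's "surrogateescape"
error handler, which maps each undecodable byte to a reserved code point
and maps it back on output. This is only an analogy under that assumption,
not a proposal for how Emacs should do it; the names load and save and the
sample bytes are hypothetical:

```python
def load(raw: bytes) -> str:
    # Each undecodable byte 0xNN is kept as the lone surrogate U+DCNN.
    return raw.decode("utf-8", errors="surrogateescape")


def save(text: str) -> bytes:
    # Lone surrogates U+DCNN are written back as the original byte 0xNN.
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical input: valid text, a stray 0xFF, and a truncated
# two-byte sequence (0xC3 with no continuation byte).
original = b"caf\xc3\xa9 \xff\xc3 end"

buffer_text = load(original)
round_tripped = save(buffer_text)

print(round_tripped == original)
```

The cost of this scheme is that the in-memory text contains code points
that are not valid characters, so every other part of the program has to
tolerate them. Keeping the buffer itself in UTF-8 avoids that extra layer.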


Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
