Re: Displaying bytes (was: Inadequate documentation of silly

From: Stefan Monnier
Subject: Re: Displaying bytes (was: Inadequate documentation of silly
Date: Sun, 29 Nov 2009 11:31:55 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (gnu/linux)

> If it turns out that windows-1252 files are the main cause of
> 8-bit-control characters in the buffer, here's another idea.

It may be the case for some users, but it probably isn't the case
in general.  It's clearly not the case for me (I only/mostly see such
characters in Gnus when I receive email that is improperly labelled,
where I'm happy to see them so that I can complain to their originator).

> Here's another idea.  We could employ some heuristics to see if the
> distribution of those characters seems typical for the way those
> characters are used.  For instance, some of the punctuation characters

Using such heuristics might be a good idea in general to automatically
detect which encoding is used, or which language is used.
In my experience, though, this becomes less and less important for
coding-systems as time passes (utf-8 and utf-16 seem to be slowly taking
over, and we already auto-detect them well).
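
To make the kind of heuristic under discussion concrete, here is a rough
sketch in Python (not Emacs code, and not how Emacs detects coding-systems).
The function name guess_encoding and the decision order are assumptions made
for illustration only: try the strict decoders first, then treat bytes in the
0x80-0x9F range as a hint for windows-1252, since latin-1 text would show
them as 8-bit control characters.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best-guess label for DATA; illustrative only.

    Hypothetical helper: the name and the ordering of the checks are
    assumptions, not taken from Emacs or any library.
    """
    # Pure 7-bit data needs no guessing.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass

    # UTF-8 is strict enough that a successful decode is strong evidence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Bytes 0x80-0x9F are C1 control characters in iso-8859-1 but mostly
    # printable punctuation (curly quotes, dashes) in windows-1252.
    if any(0x80 <= b <= 0x9F for b in data):
        return "windows-1252"

    return "iso-8859-1"
```

For example, guess_encoding(b"\x93quoted\x94") gives "windows-1252", while
guess_encoding(b"caf\xe9") gives "iso-8859-1".  A real detector would also
look at how the characters are distributed, as suggested above, rather than
only at whether they occur.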


