Re: Displaying bytes (was: Inadequate documentation of silly

From: Juri Linkov
Subject: Re: Displaying bytes (was: Inadequate documentation of silly
Date: Mon, 30 Nov 2009 00:01:29 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1.50 (x86_64-pc-linux-gnu)

>> Here's another idea.  We could employ some heuristics to see if the
>> distribution of those characters seems typical for the way those
>> characters are used.  For instance, some of the punctuation characters
> Using such heuristics might be a good idea in general to automatically
> detect which encoding is used, or which language is used.

Unicad (http://www.emacswiki.org/emacs/Unicad) uses statistical models
to auto-detect windows-1252 and many other coding systems
(auto-detection of windows-1252 is not advertised on the main page,
but it can be seen in the source code).  The theory is described
at http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html
I hope this will be added to Emacs sometime.
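
To make the distribution idea concrete, here is a rough sketch in Python.
It is not Unicad's or Mozilla's algorithm, only a toy illustration: the
function name, the byte sets and the 0.8 threshold are all assumptions
chosen for the example.  It looks at bytes in the 0x80-0x9F range and
asks whether they are mostly the typographic punctuation that
windows-1252 text tends to contain.

```python
# Toy heuristic, not a real detector: the sets and threshold are assumptions.

# Bytes that windows-1252 leaves undefined; seeing one argues against it.
UNDEFINED_CP1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})

# Typographic punctuation in windows-1252: ellipsis (0x85), curly single
# and double quotes (0x91-0x94), bullet (0x95), en and em dashes (0x96, 0x97).
COMMON_PUNCT = frozenset({0x85, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97})


def looks_like_windows_1252(data: bytes, threshold: float = 0.8) -> bool:
    """Guess whether the 0x80-0x9F bytes in data are windows-1252 punctuation."""
    high = [b for b in data if 0x80 <= b <= 0x9F]
    if not high:
        return False  # no bytes in the range, so no evidence either way
    if any(b in UNDEFINED_CP1252 for b in high):
        return False  # windows-1252 assigns no character to these bytes
    typical = sum(1 for b in high if b in COMMON_PUNCT)
    return typical / len(high) >= threshold
```

For example, looks_like_windows_1252(b"He said \x93hello\x94") returns
True, while a buffer containing the undefined byte 0x81 is rejected.
A real detector would also weigh the neighbouring characters and
compare several candidate coding systems against each other.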

Juri Linkov
