Re: local chars displayed as numbers

From: Eli Zaretskii
Subject: Re: local chars displayed as numbers
Date: Sat, 23 Sep 2006 13:32:15 +0300

> From: Kenichi Handa <address@hidden>
> Date: Sat, 23 Sep 2006 15:29:29 +0900
> Cc: address@hidden, address@hidden
>
> > Obviously, in the case where the file is using windows-1252 encoding, there's
> > no harm in Emacs using the windows-1252 encoding.  But what about the other
> > cases, e.g. if the file is just binary, or slightly incorrect utf-8, or ...?
>
> At least windows-1252 doesn't cover all eight-bit bytes.
> There are a few invalid bytes: 0x81, 0x8d, 0x8f...
>
> Anyway, how about thinking about the situation this way?
> When one visits a binary file and it's detected as
> windows-1252, one can usually notice easily that the
> auto-detection did the wrong thing, because a binary file tends
> to contain many 8-bit bytes in the first page.  So one can
> re-read the file with C-x C-m c binary RET C-x C-v RET.  But
> when one visits a windows-1252 file and it's read as
> raw-text, it's more difficult to notice that the file is not
> correctly decoded, because it may not contain a raw byte in
> the first page.  In this case, one notices the problem only
> after doing some editing, and by then it is too late to
> re-read the file.

I have a better idea: can we detect binary files by searching for null
characters (ASCII code zero)?  With binary files out of our way, the
dilemma of what text encoding to guess becomes much easier.
