Re: Coding problem with Euro sign

From: Reiner Steib
Subject: Re: Coding problem with Euro sign
Date: Thu, 15 Dec 2005 16:16:30 +0100
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (gnu/linux)

On Thu, Dec 15 2005, David Hansen wrote:

> (prefer-coding-system 'latin-1)
> (prefer-coding-system 'latin-9)
> (prefer-coding-system 'windows-1252)
> (prefer-coding-system 'utf-8)

I'd expect that the latin-1 line _after_ windows-1252 doesn't make
sense.  Any file that can possibly be encoded with Latin-1 can also be
encoded using windows-1252 (it is a proper superset).  So Emacs will
never choose Latin-1, I think.  Probably the same argument holds for
Latin-9, but I'm not completely sure (does windows-1252 contain _all_
chars from Latin-9?).
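
The superset question can be checked mechanically.  What follows is
only a minimal sketch in Python, not anything Emacs does: the helper
name missing_in_cp1252 is made up for illustration, it relies on
Python's codec tables (iso8859_1, iso8859_15, cp1252), and it looks
only at the printable bytes 0xA0-0xFF, leaving the C1 control range
aside.  It compares character repertoires, not byte values, so the
same character may sit at a different byte in each encoding.

```python
# Printable upper half of the ISO 8859 family: bytes 0xA0..0xFF.
# (Assumption: the C1 control range 0x80..0x9F is deliberately ignored.)
HIGH_BYTES = bytes(range(0xA0, 0x100))


def missing_in_cp1252(codec):
    """Return the characters of `codec` (bytes 0xA0-0xFF) that
    windows-1252 (Python codec "cp1252") cannot encode."""
    missing = []
    for ch in HIGH_BYTES.decode(codec):
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    for codec in ("iso8859_1", "iso8859_15", "iso8859_2"):
        print(codec, missing_in_cp1252(codec))
    # The Euro sign exists in both, but at different byte values.
    print("\u20ac".encode("iso8859_15"), "\u20ac".encode("cp1252"))
```

With Python's tables the list comes back empty for both Latin-1 and
Latin-9 and non-empty for Latin-2.  The Euro sign is byte 0xA4 in
Latin-9 but 0x80 in windows-1252, so the two are not byte-compatible
even where the characters overlap.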

Of course UTF-8 also covers Latin-* and windows-1252, but iso-8859*
encoded files are not valid UTF-8 files.  And valid UTF-8 files with
multi-byte characters are not valid iso-8859 files.  Thus Emacs (or
file(1)) is able to distinguish UTF-8 from iso-8859*.
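
A rough illustration of that validity check, again as a Python sketch
rather than what Emacs or file(1) actually implement: the function
name looks_like_utf8 is hypothetical, and it simply attempts a strict
UTF-8 decode.  Python's iso8859_1 codec accepts every byte value, so
this sketch only tests the UTF-8 direction.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as strict UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    text = "Gr\u00fc\u00dfe"                 # a word with two non-ASCII letters
    latin1_bytes = text.encode("iso8859_1")  # one byte per character
    utf8_bytes = text.encode("utf-8")        # two bytes per non-ASCII character
    print(looks_like_utf8(latin1_bytes))
    print(looks_like_utf8(utf8_bytes))
    print(looks_like_utf8(b"plain ASCII"))
```

Pure ASCII passes the test as well, since it is valid in UTF-8 and in
every iso-8859 variant; the check only says something about files
that contain bytes above 0x7F.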

[ Coming back to Ralf's question: ]
On Wed, Dec 14 2005, Ralf Angeli wrote:
> would it be possible for Emacs to figure out the right coding system
> by itself in the case at hand?  That means without me having to
> specify coding systems explicitly by means of preferred coding
> system options, coding cookies, or `C-x RET c' and similar.

No.  A program cannot distinguish iso-8859-1 from iso-8859-2 or -15
reliably.  Same for windows-1252 vs. windows-1258 (0x80 in your
example file).  Heuristic approaches[1] might be possible, though.
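
To make the ambiguity concrete, here is a small Python sketch (the
helper name decodings and the sample bytes are made up for
illustration).  It decodes the same byte string under several
iso-8859 variants; none of them rejects the input, they just disagree
about what byte 0xA4 means.

```python
def decodings(data: bytes, codecs):
    """Decode `data` under each codec and return {codec: text}.

    None of the single-byte codecs used below raises for byte 0xA4,
    which is why the bytes alone cannot settle the question.
    """
    return {codec: data.decode(codec) for codec in codecs}


if __name__ == "__main__":
    sample = b"100 \xa4"  # hypothetical file content
    for codec, text in decodings(
        sample, ("iso8859_1", "iso8859_2", "iso8859_15")
    ).items():
        print(codec, ascii(text))
```

Latin-1 and Latin-2 read the byte as the generic currency sign
(U+00A4), Latin-9 reads it as the Euro sign (U+20AC).  All three
results are valid text, so only outside knowledge or a heuristic can
pick one.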

Bye, Reiner.

[1] There was a discussion about this in the German newsreader
    group, see the monster thread starting with
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/
