Re: those funny non-ASCII characters

From: rusi
Subject: Re: those funny non-ASCII characters
Date: Fri, 1 Jun 2012 09:26:08 -0700 (PDT)
User-agent: G2/1.0

On Jun 1, 12:03 pm, Xah Lee <address@hidden> wrote:
> On May 31, 10:43 pm, rusi <address@hidden> wrote:
> > On Jun 1, 9:23 am, Jason Rumney <address@hidden> wrote:
> > > On Thursday, 31 May 2012 01:15:11 UTC+8, Buchs, Kevin  wrote:
> > > > Xah suggested I embrace Unicode. So I could use (prefer-coding-system
> > > > 'utf-8) or the file variable: -*- coding: utf-8 -*-. Are there drawbacks
> > > > to the former? What about opening an ASCII coded file? Can emacs
> > > > properly detect it or does it come up as UTF-8?
> > > ASCII is a subset of UTF-8, so the problem you are imagining does not 
> > > exist.
> > This does not exactly work that way on Windows.
> > E.g. I recently saw a description of how Notepad put a BOM in a
> > Haskell script, which made the script unrunnable.
> The Haskell compiler should probably bear the blame. Last I read (~4 years
> ago), the language spec says source code should be Unicode (I forget if it
> specified an encoding); however, no Haskell compiler at the time
> supported it. If your language spec says Unicode, you have to support the
> BOM.
> 〈Unicode BOM Byte Order Mark Hack〉
>  Xah

(pg 36) "Use of a BOM is neither required nor recommended for UTF-8,
but may be encountered in contexts where UTF-8 data is converted from
other encoding forms..."

More specifically, on the non-recommendation of the BOM:
"Note that some recipients of UTF-8 encoded data do not expect a BOM.
Where UTF-8 is used transparently in 8-bit environments, the use of a
BOM will interfere with any protocol or file format that expects
specific ASCII characters at the beginning, such as the use of "#!"
at the beginning of Unix shell scripts."
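
To make the "#!" point concrete, here is a small Python sketch (my own
illustration, not something from the standard or from any Haskell tool).
It shows that pure-ASCII text encodes to the same bytes as UTF-8, and that
a UTF-8 BOM pushes "#!" away from the start of the file. The helper names
strip_utf8_bom and starts_with_shebang are made up for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def starts_with_shebang(data: bytes) -> bool:
    """Check for '#!' in the first two bytes, which is what a Unix kernel looks for."""
    return data.startswith(b"#!")


script = "#!/bin/sh\necho hello\n"

# Pure-ASCII text gives identical bytes under ASCII and UTF-8.
plain = script.encode("utf-8")
same_as_ascii = plain == script.encode("ascii")

# The "utf-8-sig" codec prepends the BOM on encoding, much as some
# Windows editors do when saving a file as UTF-8.
with_bom = script.encode("utf-8-sig")

print(same_as_ascii)
print(starts_with_shebang(plain))
print(starts_with_shebang(with_bom))
print(starts_with_shebang(strip_utf8_bom(with_bom)))
```

So an ASCII file read as UTF-8 is harmless, but a tool that saves UTF-8
with a BOM produces a file whose first bytes are no longer "#!" until the
BOM is removed.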
