emacs-pretest-bug

Re: utf-16 not auto-detected when finding file


From: Kenichi Handa
Subject: Re: utf-16 not auto-detected when finding file
Date: Wed, 30 Mar 2005 17:57:01 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Jason Rumney <address@hidden> writes:

> Dave Love <address@hidden> writes:
>>  Yes.  Perhaps someone knows exactly what Windows does (assuming the
>>  only significant use of it is in Windows)?

> I would guess that the presence of a BOM is a sufficient
> heuristic. Detecting 0 or other low byte values in every second
> byte would work for Latin script based languages, but I don't think
> any heuristic like that would work on Asian text unless you could
> assume a specific language and use a dictionary.

I think a BOM is not that safe a signal, because there are many
charsets that have normal letters at 0xFE and 0xFF.
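
To illustrate the concern, here is a small Python sketch (not Emacs
code). It uses ISO-8859-1 as one example of such a charset; the
variable names are only for illustration.

```python
import codecs

# The UTF-16 little-endian BOM is the byte pair FF FE.
bom_le = codecs.BOM_UTF16_LE

# In ISO-8859-1 the same two bytes are ordinary letters
# (0xFF is y-with-diaeresis, 0xFE is thorn), so a Latin-1 file
# may legitimately start with the bytes of a UTF-16 BOM.
data = bom_le + b"abc"
as_latin1 = data.decode("latin-1")
print(repr(as_latin1))
```

A file like this would be misdetected as UTF-16 by a detector that
trusts the BOM alone.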

What I'm thinking of is detecting how LF (0x0A) is encoded.
Unicode has no character assigned at U+0A00, so the byte pair
for LF is valid in only one byte order.  If there's no LF, we
must give up detecting.
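
A minimal sketch of that idea in Python, assuming the data is
examined in aligned 2-byte units. The function name
guess_utf16_byte_order and the rule of giving up when both byte
orders appear are my assumptions for illustration, not what Emacs
actually implements.

```python
from typing import Optional

def guess_utf16_byte_order(data: bytes) -> Optional[str]:
    """Guess the UTF-16 byte order from how LF is encoded.

    00 0A is LF in big-endian; read as little-endian it would be
    U+0A00, which is unassigned.  0A 00 is the mirror case.
    Returns None when there is no LF or the evidence conflicts.
    """
    be_count = 0
    le_count = 0
    # Walk the data in aligned 16-bit units.
    for i in range(0, len(data) - 1, 2):
        unit = data[i:i + 2]
        if unit == b"\x00\x0a":
            be_count += 1
        elif unit == b"\x0a\x00":
            le_count += 1
    if be_count and not le_count:
        return "utf-16-be"
    if le_count and not be_count:
        return "utf-16-le"
    return None  # no LF found, or both forms seen: give up
```

For example, guess_utf16_byte_order("a\nb".encode("utf-16-le"))
yields "utf-16-le", while text without any LF yields None.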

---
Ken'ichi HANDA
address@hidden
