Re: Strange behaviour with dired and UTF8

From: Kenichi Handa
Subject: Re: Strange behaviour with dired and UTF8
Date: Wed, 7 May 2003 10:08:23 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, "Jan D." <address@hidden> writes:
>>>  I agree that this is bad, but I am not sure anything can be done
>>>  about it.
>>  How about my proposal?   Doesn't it solve this problem?

> It depends on what the file-name-coding-system-alist looks like.  If it
> contains full file name path, it could.  Maybe it is best to try it.

It should contain a regular expression matching a directory
or a file name.
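
As a rough illustration of such a lookup (not actual Emacs code), here is a minimal Python sketch. The alist contents, the paths, and the helper name `coding_for` are all hypothetical. It assumes the first matching regular expression wins, with a default standing in for file-name-coding-system.

```python
import re

# Hypothetical stand-in for the proposed file-name-coding-system-alist:
# (regular expression, coding system) pairs. The entries are made up.
FILE_NAME_CODING_ALIST = [
    (r"^/home/jan/old/", "latin-1"),   # matches a directory
    (r"\.sjis\.txt$", "shift_jis"),    # matches a file name
]

# Plays the role of file-name-coding-system (the global default).
DEFAULT_CODING = "utf-8"


def coding_for(path):
    """Return the coding system of the first matching entry, else the default."""
    for pattern, coding in FILE_NAME_CODING_ALIST:
        if re.search(pattern, path):
            return coding
    return DEFAULT_CODING
```

With these made-up entries, anything under /home/jan/old/ would be treated as latin-1, and names not matched by any entry fall back to the default.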

> I think it is bad to have multiple information sources that have to
> be consulted to figure out the original file name (the display file
> name, the buffer encoding, file system encoding, and the new alist).
> At some point Emacs must have had the original file name.  It is a
> shame to throw away that knowledge and then try to reconstruct it.

Unless we have a mechanism to always keep that knowledge, it
is not reliable.  For instance, even if we keep the original
filename as a text property of a filename string, a filename
string may be modified in various ways and make the property
value obsolete.  And, I don't know if the names listed in
the *Completion* buffer can keep that property.

So, I think keeping the information about the original
filename in an alist is the most reliable way.  In addition,
we can use that information in future Emacs sessions,
which is also an important point.

> Another approach would be to always keep file names as is (i.e.
> the original file name) and put some sort of property on it that is the
> encoding.  This would require that the display engine can display these
> with the right encoding.  That way the manipulations are always done on and
> with the original file name.

I strongly oppose that method.  Emacs should not work on
undecoded raw bytes.  A filename is a kind of text, and thus
a user should be able to handle it as text (edit,
copy&paste, etc).
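
To make the distinction between text and raw bytes concrete, here is a small Python sketch (an illustration only, unrelated to Emacs internals). The example name and the helper `try_decode` are assumptions. It shows that one filename text has different byte sequences under latin-1 and utf-8, and that decoding with the wrong coding system either fails or yields different text.

```python
# The same filename, as text.
name = "caf\u00e9.txt"  # "café.txt"

# Its on-disk byte sequence depends on the coding system used.
latin1_bytes = name.encode("latin-1")
utf8_bytes = name.encode("utf-8")


def try_decode(raw, coding):
    """Decode raw bytes with the given coding; return None if they are invalid."""
    try:
        return raw.decode(coding)
    except UnicodeDecodeError:
        return None


# Decoding with the matching coding system recovers the text.
same_text = try_decode(latin1_bytes, "latin-1")

# latin-1 bytes are not valid utf-8 here, so this decode fails.
failed = try_decode(latin1_bytes, "utf-8")

# utf-8 bytes always decode as latin-1, but to different (wrong) text.
mojibake = try_decode(utf8_bytes, "latin-1")
```

The point of the sketch is only that the text is the stable thing a user sees and edits, while the bytes vary with the coding system.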

>>>  I am not sure your case covers all cases.  If a file name was
>>>  latin-1 and then converted to UTF8 (outside Emacs), Emacs would think
>>>  it is still latin-1, no?
>>>  It involves a bit of user interaction, making it intrusive.
>>  Yes, but I think Emacs doesn't have to care about such a
>>  case.

> Why not?  I think this is about as bad as the failure of the
> *Completion*  buffer.  Maybe worse, because you can not open the file
> at all.

If that filename is recorded as latin-1 in
file-name-coding-system-alist, we can open that file by
customizing file-name-coding-system-alist.  If that filename
is not recorded in the alist, we can open that file by
switching to the utf-8 language environment, or by setting
file-name-coding-system to utf-8, or by customizing

Ken'ichi HANDA
