emacs-devel

Re: Strange behaviour with dired and UTF8


From: Jan D.
Subject: Re: Strange behaviour with dired and UTF8
Date: Fri, 2 May 2003 10:16:52 +0200

In article <address@hidden>, "Jan D." <address@hidden> writes:
Maybe I am doing this wrong, but here is what I try to do.
My language environment is ISO-8859-1.
I have a directory that contains files with file names in UTF-8.
I start dired on that directory.  I want to see the UTF-8 characters
so I do C-x RET r utf-8.  File names display OK now.

But when trying to operate on a file, say opening it, I get
"File no longer exists; type `g' to update Dired buffer"
It seems that dired does not keep the original file name around, but
tries to open with the display name representation of the file name.

When I type g, I lose the UTF-8 coding and files are now displayed
as ISO-8859-1 again.  Setting buffer coding to UTF-8 does not help.

Do I have to set file-name-coding-system to UTF-8?  This solves the
problem, but my file-name-coding-system is really ISO-8859-1, it is
just this one directory that is UTF-8.

The current Emacs doesn't have a facility to cope with such
a situation well.

How about this?

(1) Make a customizable variable
    file-name-coding-system-alist; the format is the same as
    file-coding-system-alist.

(2) Make the macros ENCODE_FILE and DECODE_FILE check that
    variable before using file-name-coding-system and
    default-file-name-coding-system.

(3) Enhance the function dired-revert to update
    file-name-coding-system-alist automatically if it is
    called with coding-system-for-read being bound to
    non-nil.  In that case, it may also have to ask the user
    to save that modification for future sessions (via
    customize).

What do people think?  Is there a better idea?
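
To make steps (1) and (2) concrete, here is a rough sketch in Python rather than in the C macros themselves.  Everything in it is an assumption for illustration: the table name, the directory pattern, and the helper functions are hypothetical, and it only mimics the idea of consulting a regexp table before falling back to the default coding system.

```python
import re

# Hypothetical stand-in for the proposed file-name-coding-system-alist:
# (regexp, coding system) pairs, where the first matching regexp wins.
# The directory below is made up for the example.
FILE_NAME_CODING_ALIST = [
    (r"^/home/jan/utf8-files/", "utf-8"),
]

# Stand-in for file-name-coding-system, the fallback.
DEFAULT_FILE_NAME_CODING = "iso-8859-1"


def coding_for(directory):
    """Return the coding system to use for file names under DIRECTORY."""
    for pattern, coding in FILE_NAME_CODING_ALIST:
        if re.search(pattern, directory):
            return coding
    return DEFAULT_FILE_NAME_CODING


def decode_file_name(raw, directory):
    """Rough analogue of DECODE_FILE: bytes from the system to text."""
    return raw.decode(coding_for(directory))


def encode_file_name(name, directory):
    """Rough analogue of ENCODE_FILE: text back to bytes for the system."""
    return name.encode(coding_for(directory))
```

Because both directions consult the same table, a name decoded for one directory is encoded again with the same coding system, so the bytes handed back to the system match the bytes that were read.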

This sounds very complicated.  As I understand it, dired first gets
the file name from ls (original representation), then converts that to
whatever encoding it shall use to show it in the buffer (view
representation).  When dired operates on the file (opening for example),
it converts back from the view representation, hoping to get the
original representation.  But this may fail, since conversion
from view back to original is not one-to-one.
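
A small Python illustration of the failure I mean, assuming the mismatch is simply that the name is decoded with one coding system for display and encoded with another when operating on the file.  The file name is made up, and this says nothing about how dired is actually implemented.

```python
# Bytes of a UTF-8 encoded file name as stored in the directory.
raw = "caf\u00e9.txt".encode("utf-8")          # b'caf\xc3\xa9.txt'

# View representation after choosing utf-8 for reading (C-x RET r utf-8).
shown = raw.decode("utf-8")                    # 'café.txt'

# Converting back with the ISO-8859-1 file name coding system
# yields different bytes, so no file with that name exists.
sent_back = shown.encode("iso-8859-1")         # b'caf\xe9.txt'
assert sent_back != raw

# With the default ISO-8859-1 view the round trip is lossless,
# but the name is displayed wrongly.
shown_latin1 = raw.decode("iso-8859-1")        # 'cafÃ©.txt'
assert shown_latin1.encode("iso-8859-1") == raw
```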

This work (original representation -> view representation -> original
representation) should not be needed, IMHO.  Why not just keep the
original representation around (some kind of text property on the file
name?) and always use that when operating on the file? That change would
be transparent to users.

I do not know how dired works, but I think a separation of original
representation and view representation would make it easier for
dired to use any encoding to view the files.
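
Here is a minimal sketch of that separation, again in Python and not based on dired's code.  The names Entry, list_entries and name_for_operation are hypothetical; the point is only that the original bytes travel with each listed name and the display string is never converted back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    raw: bytes    # original representation, exactly as read from the directory
    shown: str    # view representation, used only for display


def list_entries(raw_names, view_coding):
    """Build entries, decoding each name for display with VIEW_CODING.

    Undecodable bytes are replaced in the display string only;
    the raw bytes are kept untouched.
    """
    return [Entry(raw, raw.decode(view_coding, errors="replace"))
            for raw in raw_names]


def name_for_operation(entry):
    """Return the name to pass to the system: always the original bytes."""
    return entry.raw


names = [b"caf\xc3\xa9.txt"]
as_utf8 = list_entries(names, "utf-8")
as_latin1 = list_entries(names, "iso-8859-1")
```

Whatever coding system is picked for viewing, name_for_operation returns the same bytes, so changing the view cannot make the file unreachable.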

        Jan D.
