From: Kenichi Handa
Subject: Re: Displaying bytes (was: Inadequate documentation of silly characters on screen.)
Date: Wed, 25 Nov 2009 10:33:54 +0900

In article <address@hidden>, Richard Stallman <address@hidden> writes:

>     $ od -c euro.txt
>     0000000   T   h   a   t       c   o   s   t   s     200   1   7   .  \n
>     0000020
>     $ emacs euro.txt

>     This is really a windows-1252 file and the strange character is
>     supposed to be a Euro sign.
>     For me, with no particular setup to make Emacs expect windows-1252
>     files that shows in emacs as
>     "That costs \20017." with raw-text-unix.

> Why doesn't Emacs guess right, in this case?

Because some other coding system in the same coding category as
windows-1252 (coding-category-charset) has a higher priority, and
that coding system doesn't contain the code \200.
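
To see the mismatch from Lisp, here is a minimal sketch that rebuilds
the 16 bytes of the od dump and compares automatic detection with an
explicit decode.  The variable name `bytes' is only for illustration,
and what detect-coding-string returns depends on the priorities in
effect in your session.

```elisp
;; The bytes from the od dump; octal \200 is the single byte #x80.
(let ((bytes (concat "That costs " (unibyte-string #x80) "17.\n")))
  (list
   ;; The coding system that detection picks under the current
   ;; priorities (session-dependent, so no result is shown here).
   (detect-coding-string bytes t)
   ;; Naming windows-1252 explicitly maps #x80 to the Euro sign,
   ;; so this element is "That costs €17.\n".
   (decode-coding-string bytes 'windows-1252)))
```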

> Could we make it guess right by changing the coding system
> priorities?

Yes.
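
As a sketch of how one might try this without touching the defaults,
with-coding-priority raises the given coding systems only while its
body runs.  Whether detection then settles on windows-1252 for this
input is an assumption to check in your own session, not something
shown here.

```elisp
;; Try detection with windows-1252 temporarily at the front of the
;; priority list; the previous priorities are restored afterwards.
(let ((bytes (concat "That costs " (unibyte-string #x80) "17.\n")))
  (with-coding-priority '(windows-1252)
    (detect-coding-string bytes t)))

;; To keep the change for the rest of the session (e.g. from an init
;; file), set-coding-system-priority does the same thing permanently:
;; (set-coding-system-priority 'windows-1252)
```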

> If so, should we change the default priorities?

I'm not sure.  As it seems that windows-1252 is a superset of
iso-8859-1, it may be OK to give windows-1252 the higher priority.
What do iso-8859-1 users think?

A better solution would be to allow registering multiple coding
systems in one coding category, but I'm not sure I have time to work
on it.

> It may be that a different set of priorities would cause similar
> problems in some other cases and the current defaults are the best.
> But if we have not looked at the question in several years, it would
> be worth studying it now.

>     In that case revert-buffer-with-coding-system. Ideally I'd like Emacs
>     to ask directly when opening the file
>     in such a case, if it can't determine anything better than raw-bytes.

> Maybe so.

For that, after-insert-file-set-coding seems to be a good place to add
that facility.  Here's a sample patch.  The actual change should give
more information to the user.

--- mule.el.~1.294.~    2009-11-17 11:42:45.000000000 +0900
+++ mule.el     2009-11-25 10:17:49.000000000 +0900
@@ -1893,7 +1893,18 @@
           coding-system-for-read
           (not (eq coding-system-for-read 'auto-save-coding)))
       (setq buffer-file-coding-system-explicit
-           (cons coding-system-for-read nil)))
+           (cons coding-system-for-read nil))
+    (when (and last-coding-system-used
+              (eq (coding-system-base last-coding-system-used) 'raw-text))
+      ;; Give a chance of decoding by some coding system.
+      (let ((coding-system (read-coding-system "Actual coding system: ")))
+       (if coding-system
+           (save-restriction
+             (narrow-to-region (point) (+ (point) inserted))
+             (let ((modified (buffer-modified-p)))
+               (decode-coding-region (point-min) (point-max) coding-system)
+               (setq inserted (- (point-max) (point-min)))
+               (set-buffer-modified-p modified)))))))
   (if last-coding-system-used
       (let ((coding-system
             (find-new-buffer-file-coding-system last-coding-system-used)))
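
The core step of the patch, decoding raw bytes that are already in the
buffer while preserving the modified flag, can be tried in isolation.
This is only a sketch: the temporary buffer stands in for a file read
as raw-text, and windows-1252 is hard-coded where the patch would use
the answer from read-coding-system.

```elisp
(with-temp-buffer
  ;; Stand-in for text inserted as raw-text: raw bytes held as
  ;; eight-bit characters in a multibyte buffer.
  (insert (string-to-multibyte
           (concat "That costs " (unibyte-string #x80) "17.\n")))
  (let ((modified (buffer-modified-p)))
    ;; Re-decode the inserted text in place.
    (decode-coding-region (point-min) (point-max) 'windows-1252)
    ;; Decoding changes the text, so put the flag back as it was.
    (set-buffer-modified-p modified))
  (buffer-string))
```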

---
Kenichi Handa
address@hidden



