Re: why not use unicode if html file has charset=utf-8?

From: Kevin Rodgers
Subject: Re: why not use unicode if html file has charset=utf-8?
Date: Tue, 27 Jul 2004 09:50:56 -0600
User-agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv: Gecko/20020406 Netscape6/6.2.2

Dan Jacobson wrote:
> One would think that if some file.html had
> <META http-equiv=Content-Type content="text/html; charset=utf-8">
> near the top, emacs would show it with the unicode charset.
> Browsers get that right.

I think the first step would be to go from the (MIME) charset attribute
value to an Emacs coding system.  But for this particular example (utf-8),
the lookup below finds 8 candidate coding systems on Emacs 21.3:

(let ((mime-charset 'utf-8)     ; more generally: (intern (downcase "UTF-8"))
      (coding-systems '()))
  (mapatoms (lambda (symbol)
              (if (and symbol
                       (coding-system-p symbol)
                       (eq (coding-system-get symbol 'mime-charset)
                           mime-charset))
                  (setq coding-systems (cons symbol coding-systems)))))
  (sort coding-systems 'string-lessp)) =>
(mule-utf-8 mule-utf-8-dos mule-utf-8-mac mule-utf-8-unix utf-8 utf-8-dos 
utf-8-mac utf-8-unix)

What's the right way to choose among them?  Ah, gnus/mm-util.el has
this: (mm-charset-to-coding-system "UTF-8") => utf-8
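
For anyone who doesn't want to load Gnus just for that, here is a minimal
sketch of one way to choose.  It assumes the coding system named exactly
like the (downcased) charset is the preferred one, and otherwise falls back
to the first base coding system whose mime-charset property matches.  The
function name my-mime-charset-to-coding-system is hypothetical; it is not
part of Emacs or Gnus, and this is not how mm-charset-to-coding-system
itself is implemented.

```elisp
;; Hypothetical helper, not part of Emacs or Gnus.
(defun my-mime-charset-to-coding-system (charset)
  "Return a coding system for the MIME CHARSET string, or nil if none."
  (let ((name (intern (downcase charset))))
    ;; Guard against nil: (coding-system-p nil) is non-nil.
    (if (and name (coding-system-p name))
        ;; A coding system is named exactly like the charset: use it.
        name
      ;; Otherwise scan the base coding systems (no -dos/-mac/-unix
      ;; variants) for one that advertises this MIME charset.
      (let ((candidates (coding-system-list t))
            (found nil))
        (while (and candidates (not found))
          (if (eq (coding-system-get (car candidates) 'mime-charset) name)
              (setq found (car candidates)))
          (setq candidates (cdr candidates)))
        found))))
```

On an Emacs that defines a utf-8 coding system,
(my-mime-charset-to-coding-system "UTF-8") should take the first branch
and return utf-8, leaving end-of-line detection to Emacs.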

The next step would be to call set-buffer-file-coding-system; should
that be done via html-mode-hook, or is that too late?  What about using

Kevin Rodgers
