Re: url-retrieve and utf-8

From: William Xu
Subject: Re: url-retrieve and utf-8
Date: Thu, 07 Feb 2008 17:05:31 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.50 (darwin)

Stefan Monnier <address@hidden> writes:

> I can't remember exactly, but I think it doesn't (it just returns the
> raw undecoded bytes).  url-insert-file-contents should try and obey
> "Content-Type"'s charset info, tho.

Hmm, url-insert-file-contents' implementation appears to obey it:

| ;;;###autoload
| (defun url-insert-file-contents (url &optional visit beg end replace)
|   (let ((buffer (url-retrieve-synchronously url)))
|     (if (not buffer)
|       (error "Opening input file: No such file or directory, %s" url))
|     (if visit (setq buffer-file-name url))
|     (save-excursion
|       (let* ((start (point))
|              (size-and-charset (url-insert buffer beg end)))
|         (kill-buffer buffer)
|         (when replace
|           (delete-region (point-min) start)
|           (delete-region (point) (point-max)))
|         (unless (cadr size-and-charset)
|           ;; If the headers don't specify any particular charset, use the
|           ;; usual heuristic/rules that we apply to files.
|           (decode-coding-inserted-region start (point) url visit beg end replace))
|         (list url (car size-and-charset))))))

only it never succeeds.  For example, with a header like

| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

it only finds "text/html" and misses the "charset" value completely.
It looks like the actual header parsing falls to mm-decode.el.  Maybe
this is mm-decode.el's fault?
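
Until the charset is detected reliably, the raw bytes can be decoded by
hand.  A minimal sketch, assuming the server really does send UTF-8;
`my-url-fetch-utf-8' is a hypothetical helper, not part of url.el:

```elisp
(require 'url)

(defun my-url-fetch-utf-8 (url)
  "Return the body of URL as a string decoded as UTF-8.
Hypothetical helper: it ignores any charset the server declares and
simply assumes UTF-8."
  (let ((buffer (url-retrieve-synchronously url)))
    (unless buffer
      (error "Could not retrieve %s" url))
    (with-current-buffer buffer
      (unwind-protect
          (progn
            (goto-char (point-min))
            ;; The response headers end at the first blank line;
            ;; everything after it is the undecoded body.
            (re-search-forward "\r?\n\r?\n" nil 'move)
            (decode-coding-string
             (buffer-substring-no-properties (point) (point-max))
             'utf-8))
        ;; Always discard the temporary response buffer.
        (kill-buffer buffer)))))
```

Calling (my-url-fetch-utf-8 "http://example.com/") should return the
page text as a multibyte string; the URL here is only a placeholder.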

