emacs-devel

Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]


From: Jason Rumney
Subject: Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]
Date: Mon, 31 Jul 2006 11:46:18 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.0.4) Gecko/20060516 Thunderbird/1.5.0.4 Mnenhy/0.7.4.666

Thien-Thi Nguyen wrote:
> YAMAMOTO Mitsuharu <address@hidden> writes:
>
>> This change breaks the following case:
>>
>> (concat
>>  "file://localhost"
>>  (mapconcat 'url-hexify-string
>>             (split-string
>>              (encode-coding-string "/SOME/NONASCII/FILE/NAME"
>>                                    (or file-name-coding-system
>>                                        default-file-name-coding-system))
>>              "/")
>>             "/"))
>>
>> Maybe suppress encoding with UTF-8 for unibyte strings?
>

> if the result of this expression is to be used as a URI, then that means
> the change exposes improper use of `url-hexify-string'; according to the
> RFC (as i understand it) URIs require utf-8.

There is a recent RFC that mandates utf-8 encoding for URIs, but previous RFCs either said nothing, or specified Latin-1, so there are many implementations that do not use utf-8. We need some way to interoperate with such implementations.

> if we want `url-hexify-string' to handle "URI-like" transformations
> (i.e., not strictly produce URI-conformant results), we can add an
> optional arg MAKE-UNIBYTE that specifies a function to do the conversion
> to unibyte.  in most cases, i guess that would be `string-as-unibyte',
> but i don't know for sure.

Alternatively, we could add an optional arg ENCODING, for specifying an encoding other than utf-8. That might be a cleaner interface than requiring the user to make the string unibyte before passing it to url-hexify-string.
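
A rough sketch of what such an interface might look like. The name `my-url-hexify-string' and its ENCODING argument are hypothetical, chosen only for illustration; this is not the code in url-util.el, and it assumes that only ASCII letters, digits, and "-", "_", ".", "~" are left unescaped.

```elisp
;; Hypothetical sketch: `my-url-hexify-string' and its ENCODING argument
;; are illustrative names, not the actual url-util.el implementation.
(defun my-url-hexify-string (string &optional encoding)
  "Percent-encode STRING after encoding it with ENCODING.
ENCODING is a coding system and defaults to `utf-8'.
ASCII letters, digits, and the characters - _ . ~ are left as-is."
  (mapconcat
   (lambda (byte)
     (if (or (and (>= byte ?a) (<= byte ?z))
             (and (>= byte ?A) (<= byte ?Z))
             (and (>= byte ?0) (<= byte ?9))
             (memq byte '(?- ?_ ?. ?~)))
         ;; Safe ASCII byte: emit it unchanged.
         (char-to-string byte)
       ;; Anything else: emit %XX for the encoded byte.
       (format "%%%02X" byte)))
   ;; Convert to a unibyte string so that we iterate over bytes.
   (encode-coding-string string (or encoding 'utf-8))
   ""))

;; Default (utf-8) versus an explicit Latin-1 encoding:
(my-url-hexify-string "\u00e9")              ; => "%C3%A9"
(my-url-hexify-string "\u00e9" 'iso-8859-1)  ; => "%E9"
```

With something along these lines, a caller that has to talk to a Latin-1 implementation would pass the coding system explicitly rather than pre-encoding the string itself.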

