Thien-Thi Nguyen wrote:
YAMAMOTO Mitsuharu <address@hidden> writes:
This change breaks the following case:
(concat
 "file://localhost"
 (mapconcat 'url-hexify-string
            (split-string
             (encode-coding-string "/SOME/NONASCII/FILE/NAME"
                                   (or file-name-coding-system
                                       default-file-name-coding-system))
             "/")
            "/"))
Maybe suppress encoding with UTF-8 for unibyte strings?
if the result of this expression is to be used as a URI, then that means
the change exposes improper use of `url-hexify-string'; according to the
RFC (as i understand it) URIs require utf-8.
There is a recent RFC that mandates utf-8 encoding for URIs, but
previous RFCs either said nothing, or specified Latin-1, so there are
many implementations that do not use utf-8. We need some way to
interoperate with such implementations.
if we want `url-hexify-string' to handle "URI-like" transformations
(i.e., not strictly produce URI-conformant results), we can add an
optional arg MAKE-UNIBYTE that specifies a function to do the conversion
to unibyte. in most cases, i guess that would be `string-as-unibyte',
but i don't know for sure.
Alternatively, we could add an optional arg ENCODING, for specifying an
encoding other than utf-8. That might be a cleaner interface than
requiring the user to make the string unibyte before passing it to
url-hexify-string.
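A rough sketch of how the ENCODING idea might look, written as a standalone function rather than as a patch to url-util.el. The name `my-hexify-string' and its argument list are hypothetical, and treating a unibyte argument as already encoded is an assumption taken from the suggestion earlier in the thread:

```elisp
;; Hypothetical sketch only; `my-hexify-string' is not part of url-util.el.
(defun my-hexify-string (string &optional encoding)
  "Percent-encode STRING after encoding it with ENCODING (default `utf-8').
A unibyte STRING is assumed to be encoded already and is used as is."
  (let ((bytes (if (multibyte-string-p string)
                   (encode-coding-string string (or encoding 'utf-8))
                 string)))
    (mapconcat
     (lambda (byte)
       ;; Keep ASCII letters, digits and - _ . ~ and escape every other byte.
       (if (or (<= ?a byte ?z)
               (<= ?A byte ?Z)
               (<= ?0 byte ?9)
               (memq byte '(?- ?_ ?. ?~)))
           (char-to-string byte)
         (format "%%%02X" byte)))
     bytes "")))

;; (string #xe9) is a one-character multibyte string, e with acute accent.
(my-hexify-string (string #xe9))              ; => "%C3%A9"
(my-hexify-string (string #xe9) 'iso-8859-1)  ; => "%E9"
```

With this shape, the file-name example above would keep working unchanged, since the result of `encode-coding-string' is unibyte and is passed through without a second encoding step.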