Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]
From: Thien-Thi Nguyen
Subject: Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]
Date: 01 Aug 2006 10:47:07 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4

YAMAMOTO Mitsuharu <address@hidden> writes:
> [review]
thanks, that was very pleasant to read.
> * Rev 1.14
> The argument is assumed to be either a sequence of characters or a
> sequence of octets depending on the multibyteness of the string.
> Incompatibility still remains for a multibyte string containing
> eight-bit-control or eight-bit-graphic, but usually negligible.
>
> I'm not sure if encoding with UTF-8 is really useful, but I don't
> strongly oppose it if compatibility for the unibyte case is preserved.
conversion to utf-8 is per the RFC, which seems to be the primary context for
this function; avoiding that conversion means noncompliance w/ the RFC.
i think rev 1.14 is almost ok; anything that deviates from the RFC should be
under user control (via optional arg) and should be documented. i assume that
(a) conversion of multibyte strings to utf-8 is unconditionally desirable (a
"negligible" problem is no problem), and (b) that there exist non utf-8 unibyte
encodings that callers wish to "hexify as is". please correct me if these
assumptions do not hold. on the other hand, if they do hold, how about:

  (defun ... (string &optional unibyte-as-is-p)
    ...
    (if (or (multibyte-string-p string)
            (not unibyte-as-is-p))
        (encode-coding-string string 'utf-8 t)
      string)
    ...)

?
this way, RFC-compliance is the default, but suppressing the conversion to
utf-8 is still possible for unibyte strings by specifying UNIBYTE-AS-IS-P.
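
to make the proposal concrete, here is a minimal self-contained sketch of how the conditional could sit inside a complete function. it is not the url-util.el code: the names `my-url-hexify-string' and `my-url-unreserved-chars' are made up for illustration, and the choice of the RFC 3986 unreserved set as the characters left unescaped is an assumption.

```elisp
;; Hypothetical sketch only; not the actual url-hexify-string from url-util.el.

(defconst my-url-unreserved-chars
  (append "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.~"
          nil)
  "List of characters left unescaped (assumed: the RFC 3986 unreserved set).")

(defun my-url-hexify-string (string &optional unibyte-as-is-p)
  "Return STRING with every octet outside the unreserved set as %XX.
A multibyte STRING is always encoded as UTF-8 first.  A unibyte STRING
is also passed through the UTF-8 encoder unless UNIBYTE-AS-IS-P is
non-nil, in which case its octets are hexified unchanged."
  (let ((octets (if (or (multibyte-string-p string)
                        (not unibyte-as-is-p))
                    (encode-coding-string string 'utf-8 t)
                  string)))
    ;; OCTETS is unibyte here, so each element is an integer in 0..255.
    (mapconcat (lambda (byte)
                 (if (memq byte my-url-unreserved-chars)
                     (char-to-string byte)
                   (format "%%%02X" byte)))
               octets
               "")))

(my-url-hexify-string "a b")        ; => "a%20b"
(my-url-hexify-string "\u00e9")     ; multibyte, via UTF-8 => "%C3%A9"
(my-url-hexify-string "\351" t)     ; unibyte octet, as is => "%E9"
```

the last call shows the escape hatch: a caller holding, say, Latin-1 octets gets them hexified untouched, while every multibyte string still goes through utf-8.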
thi