Re: AW: treatment of U+002E that is produced by NFKC

From: Simon Josefsson
Subject: Re: AW: treatment of U+002E that is produced by NFKC
Date: Mon, 14 Jan 2008 16:11:14 +0100
User-agent: Gnus/5.110007 (No Gnus v0.7) Emacs/22.1 (gnu/linux)

"Erik van der Poel" <address@hidden> writes:

>> Yes, we should definitely document the problem in the manual.  Erik, do
>> you know of any good links that discuss this issue?
> The only discussion of this that I know of is in the idna-update
> archives. The Internet Drafts may soon be updated to include this
> issue too.

Ok.  Pointers to the mailing list may suffice if we can give a good
explanation of the problem.  Maybe we can develop such an explanation
here.  I'm not yet sure whether actually providing a mechanism (like the
one I proposed in the patch) to work around the problem is a good thing.
The mechanism could just as well cause other problems.

>> Fortunately, all the idna_* APIs in libidn take a 'flags' parameter.
>> It would be possible to add a new flag IDNA_TREAT_U2024_AS_DOT and have
>> the code treat U+2024 as a dot character as per RFC 3490 section 3.1 if
>> the flag is given.  I've confirmed that this makes libidn produce the
>> same output as MSIE/Firefox.
> Note that U+2024 is not the only character that produces one or more
> U+002Es in NFKC. See the Unicode 3.2 version of the UnicodeData.txt
> file.

That is worrisome, and it is why I would rather think more about
whether the fix may be worse than the disease before applying it.
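
To get a feel for how many characters are affected, a rough sketch like
the following could be used.  It is not libidn code: it assumes Python's
unicodedata.ucd_3_2_0 object (the Unicode 3.2 tables that Stringprep
refers to), and the helper name nfkc_dot_producers is made up for
illustration.  It only scans the BMP and applies plain NFKC, not the
full Nameprep profile.

```python
import unicodedata

# Unicode 3.2 database, the version referenced by Stringprep/IDNA2003.
ucd = unicodedata.ucd_3_2_0


def nfkc_dot_producers(limit=0x10000):
    """Return code points (other than U+002E itself) whose NFKC form
    contains at least one U+002E.  Hypothetical helper, BMP only by default."""
    found = []
    for cp in range(limit):
        # Skip U+002E itself and the surrogate range.
        if cp == 0x2E or 0xD800 <= cp <= 0xDFFF:
            continue
        if "." in ucd.normalize("NFKC", chr(cp)):
            found.append(cp)
    return found


producers = nfkc_dot_producers()
for cp in producers:
    normalized = ucd.normalize("NFKC", chr(cp))
    name = unicodedata.name(chr(cp), "<unnamed>")
    print("U+%04X %-40s -> %r" % (cp, name, normalized))
```

The list includes U+2024 but also, for example, U+2025, U+2026 and the
"DIGIT ... FULL STOP" characters starting at U+2488, so a flag that only
special-cases U+2024 would not cover every case.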

