
From: Simon Josefsson
Subject: Unicode 15 support - using UTC instead of IANA as table source? On U+19DA
Date: Tue, 18 Oct 2022 21:13:27 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)


I am considering switching to UTC as the source of our derived IDNA2008
tables, for simple support of Unicode > 12.  For Unicode <= 12 this makes
no difference except for U+19DA, which UTC has as PVALID and IANA as
DISALLOWED.  This means idn2 behaviour changes from:

jas@latte:~$ echo ᧚|idn2
idn2: toAscii: string contains a disallowed character

to:

jas@latte:~/src/libidn2/src$ echo ᧚|./idn2
xn--pkf

This actually goes back to libidn2 0.11 behaviour, which also resulted
in xn--pkf since it used Unicode < 6.0.0:

jas@latte:~/src/libidn2-0.11/src$ ./idn2 --version|head -1
idn2 (idn2) 0.11
jas@latte:~/src/libidn2-0.11/src$ echo ᧚|./idn2
xn--pkf

The xn--pkf output is consistent with some other IDNA2008
implementations.
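
For reference, the xn--pkf label is simply the raw Punycode (RFC 3492)
encoding of U+19DA with the ACE prefix added.  A minimal sketch using
Python's built-in punycode codec follows.  The helper name to_a_label is
made up for illustration, and the codec performs no IDNA2008 validity
checks, so this says nothing about PVALID versus DISALLOWED and is not
how idn2 works internally.

```python
# Sketch: reproduce the "xn--pkf" label with Python's built-in punycode codec.
# Raw Punycode (RFC 3492) only; no IDNA2008 property checks are performed.

ACE_PREFIX = "xn--"  # prefix that marks an IDNA A-label


def to_a_label(u_label: str) -> str:
    """Return the A-label for one non-ASCII label, without any validation.

    Hypothetical helper for illustration; not part of libidn2.
    """
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    # U+19DA NEW TAI LUE THAM DIGIT ONE, the code point discussed above
    print(to_a_label("\u19da"))  # prints: xn--pkf
```

Whether a tool accepts the input at all is decided by the derived
property table, not by the encoding step shown here.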

There may be other differences between UTC-derived and IANA-derived
values for Unicode > 12 and <= 15 once IANA gets around to publishing
tables, but we can't tell until that happens, and I'm not holding my
breath since they haven't published anything for 12.1.0 (2019-03),
13.0.0 (2019-11), 14.0.0 (2021) or 15.0.0 (2022-05).
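
Once both sources exist for a given Unicode version, finding the
differences is mechanical.  The following is a rough sketch only: it
assumes a simplified "CODEPOINT[..CODEPOINT] ; PROPERTY" line format,
which is not necessarily the layout of either the UTC or the IANA files,
and the sample rows are illustrative apart from the U+19DA values
mentioned above.  The names parse_table and diff_tables are hypothetical.

```python
# Sketch: compare two derived-property tables and list the disagreements.
# ASSUMPTION: simplified "CP[..CP] ; PROPERTY" lines, '#' starts a comment.
from typing import Dict, Iterable, Tuple


def parse_table(lines: Iterable[str]) -> Dict[int, str]:
    """Expand each line (single code point or range) into a cp -> property map."""
    table: Dict[int, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank lines and pure comments
        cps, prop = (field.strip() for field in line.split(";", 1))
        first, _, last = cps.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            table[cp] = prop
    return table


def diff_tables(a: Dict[int, str], b: Dict[int, str]) -> Dict[int, Tuple[str, str]]:
    """Return code points present in both tables whose property differs."""
    return {cp: (a[cp], b[cp]) for cp in sorted(a.keys() & b.keys()) if a[cp] != b[cp]}


# Illustrative sample rows, not copied from any published file.
UTC_SAMPLE = ["19D0..19D9 ; PVALID", "19DA ; PVALID  # differs"]
IANA_SAMPLE = ["19D0..19D9 ; PVALID", "19DA ; DISALLOWED"]

if __name__ == "__main__":
    for cp, (utc, iana) in diff_tables(parse_table(UTC_SAMPLE), parse_table(IANA_SAMPLE)).items():
        print(f"U+{cp:04X}: UTC={utc} IANA={iana}")
```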

I don't have a strong opinion on this, but some of the factors involved
are:

1) consistency with other implementations

2) importance of U+19DA (which is rare) and practical problems resulting
from this change (apparently few)

3) support for Unicode > 12 now (most important of these factors IMO)

4) domain name stability: once derived for a code point, the property
shouldn't change in the future.  Thus, the change in 0.12 could be
considered the bug here.  I believe I agreed with the approach used by
RFC 6452 at the time it was published, but revisiting this issue today I
find myself in the opposite camp.  It is a subjective judgement call,
and there are good arguments for both sides.
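
To make the stability argument in point 4 concrete, here is a minimal
sketch of a lookup that pins a previously published value so that a later
re-derivation cannot change it, in the spirit of the BackwardCompatible
rule in RFC 5892.  The function and the two dictionaries are
hypothetical; this is not how libidn2 stores or queries its tables.

```python
# Sketch: a lookup in which earlier published values override fresh derivations.
# ASSUMPTION: both tables are plain dicts; names are made up for illustration.
from typing import Dict


def stable_lookup(cp: int, derived: Dict[int, str], pinned: Dict[int, str]) -> str:
    """Return the pinned property if one exists, else the freshly derived one."""
    if cp in pinned:
        return pinned[cp]
    # Code points absent from the derived table are treated as UNASSIGNED.
    return derived.get(cp, "UNASSIGNED")


if __name__ == "__main__":
    derived = {0x19DA: "DISALLOWED"}  # value after the property change
    pinned = {0x19DA: "PVALID"}       # value as first derived
    print(stable_lookup(0x19DA, derived, pinned))  # prints: PVALID
    print(stable_lookup(0x19DA, derived, {}))      # prints: DISALLOWED
```

With an empty pinned table this reduces to following the latest
derivation, which is the other side of the judgement call.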

If you want to provide feedback on this, please respond here or to this

