bug-gnulib

Re: uninorm/nfc - Unicode version?


From: Simon Josefsson
Subject: Re: uninorm/nfc - Unicode version?
Date: Sun, 09 Jan 2011 12:00:54 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.2 (gnu/linux)

Bruno Haible <address@hidden> writes:

> Hi Simon,
>
>> There is no hurry, I'm mostly curious about what kind of
>> non-trivial changes there would be.  I know that between 5.2 and 6.0
>> there were some changes made that would affect an IDNA2008 implementation
>> for example.
>> 
>> The best would be if the process to re-generate the files were
>> documented, then I could generate them on the fly to test my code with a
>> 5.1, 5.2 and 6.0 Unicode library, which would be useful for
>> compatibility and regression testing.
>> 
>> If there were significant changes in any _algorithm_, as opposed to data
>> tables, that would be interesting to know.  I recall the NFKC algorithm
>> changed slightly between Unicode 3.2.0 and the next version but
>> hopefully 5.1/5.2/6.0 doesn't see any changes like that any more.
>
> I have now updated the Unicode related modules to Unicode 5.2.0.
> The process involves more than just regenerating data files. It also
> requires updating some functions in gen-uni-tables.c to match the updated
> Unicode Standard Annexes.

Thank you very much!  Unicode 5.2.0 is better than 6.0.0 for me, since
IDNA2008 references 5.2.0 normatively in some respects.

Once I have established a good set of self tests, I will run them
against libunistring for both 5.0.0 and 5.2.0 to see if I can find any
string that behaves differently.
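
A minimal sketch of that kind of differential test, written in Python
rather than against libunistring, so treat it as an illustration only.
It assumes the standard `unicodedata` module, which exposes only two
databases: the frozen Unicode 3.2.0 one (`unicodedata.ucd_3_2_0`) and
whatever version ships with the interpreter, not 5.0.0 and 5.2.0.  The
function name `find_differences` and the scanned range are hypothetical.

```python
import unicodedata


def find_differences(candidates, form="NFKC"):
    """Return (input, old_result, new_result) for every candidate string
    whose normalization differs between the two Unicode databases."""
    old = unicodedata.ucd_3_2_0  # frozen Unicode 3.2.0 database
    new = unicodedata            # database shipped with this Python
    diffs = []
    for s in candidates:
        a = old.normalize(form, s)
        b = new.normalize(form, s)
        if a != b:
            diffs.append((s, a, b))
    return diffs


if __name__ == "__main__":
    # Hypothetical test corpus: single code points below U+3000.
    corpus = (chr(cp) for cp in range(0x3000))
    print("new database version:", unicodedata.unidata_version)
    for s, a, b in find_differences(corpus):
        # Print code points only, to avoid terminal encoding problems.
        print("U+%04X" % ord(s),
              [hex(ord(c)) for c in a],
              [hex(ord(c)) for c in b])
```

The same loop would apply to two libunistring builds: normalize each
test string with both and report any input where the results differ.
Characters assigned after the older version are the likeliest
candidates.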

>> Btw, I'm (finally) working on an IDNA2008 implementation, and it is using
>> your libunistring.
>
> How will this work with the glibc add-on? Will it incorporate some parts
> of libunistring literally, or will it load libunistring dynamically?

I have no idea yet.  Right now, libidna doesn't even link to
libunistring dynamically because I want to make sure I get the "right"
libunistring implementation.

Given the complexities in IDNA2008 I am wondering whether it might not
make more sense to let glibc ask a system daemon to do the string
conversion rather than to do everything in glibc.  There is still a lot
of work being done on various pre- and post-IDNA2008 mappings because
IDNA2008 by itself is neither backwards- nor forwards-compatible, nor
safe to use.  This may be something you want to configure on a
per-system basis.

/Simon


