Re: uninorm/nfc - Unicode version?
From: Simon Josefsson
Subject: Re: uninorm/nfc - Unicode version?
Date: Sun, 09 Jan 2011 12:00:54 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.2 (gnu/linux)
Bruno Haible <address@hidden> writes:
> Hi Simon,
>
>> There is no hurry, I'm mostly curious about what kind of
>> non-trivial changes there would be. I know that between 5.2 and 6.0
>> there were some changes made that would affect an IDNA2008 implementation
>> for example.
>>
>> The best would be if the process to re-generate the files were
>> documented, then I could generate them on the fly to test my code with a
>> 5.1, 5.2 and 6.0 Unicode library, which would be useful for
>> compatibility and regression testing.
>>
>> If there were significant changes in any _algorithm_, as opposed to data
>> tables, that would be interesting to know. I recall the NFKC algorithm
>> changed slightly between Unicode 3.2.0 and the next version but
>> hopefully 5.1/5.2/6.0 doesn't see any changes like that any more.
>
> I have now updated the Unicode related modules to Unicode 5.2.0.
> The process involves more than just regenerating data files. It also
> requires updating some functions in gen-uni-tables.c to match the updated
> Unicode Standard Annexes.
Thank you very much! Unicode 5.2.0 is better than 6.0.0 for me, since
IDNA2008 references 5.2.0 normatively in some aspects.
Once I have established a good set of self tests, I will run them
against libunistring for both 5.0.0 and 5.2.0 to see if I can find any
string that behaves differently.
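A rough sketch of what such a cross-version comparison could look like. This is not libunistring code: it assumes Python's standard unicodedata module as a stand-in, because it happens to ship two Unicode databases side by side (the current one and a frozen 3.2.0 copy, unicodedata.ucd_3_2_0). The helper names nfc_differences and all_code_points are made up for illustration.

```python
import unicodedata


def all_code_points():
    """Yield every Unicode scalar value as a one-character string."""
    for cp in range(0x110000):
        # Surrogates are not scalar values, so skip them.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        yield chr(cp)


def nfc_differences(samples, old=unicodedata.ucd_3_2_0, new=unicodedata):
    """Return (sample, old_nfc, new_nfc) for each sample whose NFC form
    differs between the two Unicode databases.

    `old` and `new` only need a normalize(form, string) method; the
    defaults are an assumption standing in for two library builds.
    """
    diffs = []
    for s in samples:
        a = old.normalize("NFC", s)
        b = new.normalize("NFC", s)
        if a != b:
            diffs.append((s, a, b))
    return diffs


if __name__ == "__main__":
    print("comparing Unicode 3.2.0 against", unicodedata.unidata_version)
    for s, a, b in nfc_differences(all_code_points()):
        print("U+%04X: %r vs %r" % (ord(s), a, b))
```

Scanning single code points only catches data-table differences. Finding algorithmic differences would need multi-character samples as well, for instance the sequences listed in the NormalizationTest.txt file of each Unicode release.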
>> Btw, I'm (finally) working on an IDNA2008 implementation, and it is using
>> your libunistring.
>
> How will this work with the glibc add-on? Will it incorporate some parts
> of libunistring literally, or will it load libunistring dynamically?
I have no idea yet. Right now, libidna doesn't even link to
libunistring dynamically because I want to make sure I get the "right"
libunistring implementation.
Given the complexities in IDNA2008 I am wondering whether it might not
make more sense to let glibc ask a system daemon to do the string
conversion rather than to do everything in glibc. There is still a lot
of work being done on various pre- and post-IDNA2008 mappings because
IDNA2008 by itself is neither backwards/forwards compatible nor safe to
use. This may be something you want to configure on a per-system basis.
/Simon