bug-gnulib

Update to Unicode 16.0.0


From: Bruno Haible
Subject: Update to Unicode 16.0.0
Date: Fri, 13 Sep 2024 15:37:12 +0200

These two patches update Gnulib to Unicode 16.0.0.


2024-09-13  Bruno Haible  <bruno@clisp.org>

        Implement a new property, added by Unicode 16.0.0.
        * lib/gen-uni-tables.c (is_property_modifier_combining_mark): New
        function.
        (output_properties): Output also the property modifier_combining_mark.
        * lib/unictype.in.h (UC_PROPERTY_MODIFIER_COMBINING_MARK,
        uc_is_property_modifier_combining_mark): New declarations.
        * m4/unictype_h.m4 (gl_UNICTYPE_H_REQUIRE_DEFAULTS): Initialize
        GNULIB_UNICTYPE_PROPERTY_MODIFIER_COMBINING_MARK.
        * modules/unictype/base (Makefile.am): Substitute
        GNULIB_UNICTYPE_PROPERTY_MODIFIER_COMBINING_MARK.
        * lib/unictype/pr_modifier_combining_mark.c: New file.
        * lib/unictype/pr_modifier_combining_mark.h: New generated file.
        * modules/unictype/property-modifier-combining-mark: New file.
        * tests/unictype/test-pr_modifier_combining_mark.c: New generated file.
        * modules/unictype/property-modifier-combining-mark-tests: New file.
        * lib/unictype/pr_byname.gperf: Add modifier_combining_mark.
        * lib/unictype/pr_byname.c
        (UC_PROPERTY_INDEX_MODIFIER_COMBINING_MARK): New enum item.
        (uc_property_byname): Handle it.
        * modules/unictype/property-byname (Depends-on): Add
        unictype/property-modifier-combining-mark.
        * modules/unictype/property-all (Depends-on): Likewise.
        * MODULES.html.sh (func_all_modules): Add
        unictype/property-modifier-combining-mark.

2024-09-13  Bruno Haible  <bruno@clisp.org>

        Update to Unicode 16.0.0.

        * lib/gen-uni-tables.c (PROP_MODIFIER_COMBINING_MARK): New enum item.
        (fill_properties): Recognize property Modifier_Combining_Mark.
        (UC_JOINING_GROUP_KASHMIRI_YEH): New enum item.
        (fill_arabicshaping, joining_group_as_c_identifier): Handle
        UC_JOINING_GROUP_KASHMIRI_YEH.
        (LBP_*): Split LBP_AL into LBP_AL1 and LBP_AL2.
        (LBP_AL): New enum item.
        (get_lbea): New function.
        (get_lbp): Use it. Update such that unilbrk/lbrkprop.txt comes out as
        expected. Map U+25CC to LBP_AL2.
        (PROP_EA, PROP, EA): New macros.
        (debug_output_lbp): Print either LBP_AL1, LBP_AL2 as LBP_AL.
        (lbp_value_to_string): Handle LBP_AL1, LBP_AL2 instead of LBP_AL.
        (struct lbpea_table): Renamed from struct lbp_table.
        (output_lbpea): Renamed from output_lbp. Store both the line break
        property and the line break EastAsian bit in the same table entry.
        (output_lbrk_tables): Update.
        (output_lbrk_rules_as_tables): Update for LBP_AL change. Implement rules
        LB28a, LB25, LB19, LB15d, LB13 as specified by
        https://www.unicode.org/reports/tr14/tr14-53.html.
        (get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.

        * lib/unictype.in.h (UC_JOINING_GROUP_KASHMIRI_YEH): New enum item.
        * lib/unictype/joininggroup_byname.gperf: Handle it.
        * lib/unictype/joininggroup_name.h: Likewise.

        * lib/unilbrk/lbrktables.h: Split LBP_AL into LBP_AL1 and LBP_AL2.
        (LBP_AKLS_VI): New enum item, for rule LB28a.
        (PROP, EA, PROP_EA): New macros.
        (unilbrk_table): Update bounds.
        * lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop):
        Use LBP_AL1 instead of LBP_AL. Use 2 characters of lookahead, for rules
        LB15c, LB19a, LB25, LB28a. New variables prev_ea, prev2_ea, for rule
        LB19a. New variable prev_initial_hyphen, for rule LB20a. New variable
        prev_nus, for rule LB25. Implement rules LB15c, LB19a, LB20a, LB21a,
        LB25, LB28a, as specified by
        https://www.unicode.org/reports/tr14/tr14-53.html.
        * lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
        Likewise.
        * lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
        Likewise.
        * modules/unilbrk/base (Depends-on): Add stdbool.

        * tests/uninorm/test-u32-normalize-big.h
        (struct normalization_test_file): Now 6 parts.
        * tests/uninorm/test-u32-normalize-big.c (read_normalization_test_file):
        Fill in 6 parts.
        (test_specific, free_normalization_test_file): Now handle 6 parts.

        * tests/uniwidth/test-uc_width2.sh: Update expected test result.

        * All generated files under lib/uni* and tests/uni*: Regenerate.
        * tests/uniname/NameAliases.txt: Update.
        * tests/uniname/UnicodeData.txt: Update.
        * tests/uninorm/NormalizationTest.txt: Update.
        * tests/unigbrk/GraphemeBreakTest.txt: Update.
        * tests/uniwbrk/WordBreakTest.txt: Update.

        * All the affected modules: Bump required libunistring version.
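
The entry above notes that output_lbpea stores the line break property and the line break EastAsian bit in the same table entry, accessed through the macros PROP_EA, PROP and EA. As a rough illustration of that idea only, here is a minimal sketch. The bit layout (property in the low 7 bits, EastAsian flag in bit 7), the demo_* names and the numeric property values are assumptions made for this example; the actual definitions in lib/unilbrk/lbrktables.h may differ.

```c
#include <assert.h>

/* Assumed layout for this sketch, not taken from Gnulib:
   bits 0..6 hold the line break property, bit 7 holds the
   EastAsian flag.  */
#define PROP_EA(prop, ea) \
  ((unsigned char) (((ea) ? 0x80 : 0) | ((prop) & 0x7f)))
#define PROP(entry) ((entry) & 0x7f)
#define EA(entry) (((entry) >> 7) & 1)

/* Hypothetical property numbers, for illustration only.  */
enum { DEMO_LBP_AL1 = 40, DEMO_LBP_AL2 = 41, DEMO_LBP_ID = 42 };

/* Pack one character's property and EastAsian flag into a
   single table byte.  */
static unsigned char
demo_pack (int prop, int east_asian)
{
  assert (prop >= 0 && prop <= 0x7f);
  return PROP_EA (prop, east_asian);
}

/* Return 1 if two packed entries carry the same line break
   property, regardless of their EastAsian flag.  */
static int
demo_same_property (unsigned char a, unsigned char b)
{
  return PROP (a) == PROP (b);
}
```

With this layout a single table lookup yields both values: PROP (entry) feeds the rule tables, while EA (entry) is available for rules that need the East Asian context, such as LB19a.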

https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=02a0e123b2dc0265a8295be9711765afed4693ea
https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8fc1946792c94ba7bf206a4bd53aa08ae9f57f2e
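
The normalization test change above ("Now 6 parts") concerns how NormalizationTest.txt is split into sections. The following is a minimal sketch of reading such sections, assuming that the six parts correspond to section headers "@Part0" through "@Part5" and that data lines are the non-comment lines following a header. The function demo_count_parts and the constant DEMO_NPARTS are hypothetical names for this example and are not the code in tests/uninorm/test-u32-normalize-big.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NPARTS 6

/* Count the data lines in each "@PartN" section of TEXT.
   Lines starting with '#' and empty lines are skipped, as are
   data lines that precede the first header.
   Returns 0 on success, -1 if a header names a part outside
   0 .. DEMO_NPARTS-1.  */
static int
demo_count_parts (const char *text, size_t counts[DEMO_NPARTS])
{
  int part = -1;
  const char *p = text;

  assert (text != NULL);
  memset (counts, 0, DEMO_NPARTS * sizeof counts[0]);

  while (*p != '\0')
    {
      const char *eol = strchr (p, '\n');
      size_t len = (eol != NULL ? (size_t) (eol - p) : strlen (p));

      if (len >= 5 && strncmp (p, "@Part", 5) == 0)
        {
          /* Accept a single digit part number only.  */
          if (len < 6 || p[5] < '0' || p[5] >= '0' + DEMO_NPARTS)
            return -1;
          part = p[5] - '0';
        }
      else if (len > 0 && p[0] != '#' && part >= 0)
        counts[part]++;

      /* Advance past this line and its newline, if any.  */
      p += len;
      if (*p == '\n')
        p++;
    }
  return 0;
}
```

A reader sized for fewer parts would have to reject or mishandle the later headers, which is the kind of breakage that fixing the part count in the test's data structure avoids.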





