bug-gnulib

new module 'c32rtomb'


From: Bruno Haible
Subject: new module 'c32rtomb'
Date: Thu, 09 Jan 2020 02:14:58 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-170-generic; KDE/5.18.0; x86_64; ; )

The function c32rtomb() is like wcrtomb(), except that it takes a 32-bit wide
character (char32_t) as argument, not a wchar_t.
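As a minimal sketch of the calling convention (not taken from the gnulib sources), the hypothetical helper `encode_c32` below converts one char32_t to its multibyte form in the current locale, starting from the initial conversion state. It assumes a platform whose <uchar.h> declares c32rtomb().

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: store the multibyte form of C in BUF, which must
   have room for MB_LEN_MAX bytes.  Returns the number of bytes written,
   or (size_t) -1 if C has no representation in the current locale.  */
size_t
encode_c32 (char *buf, char32_t c)
{
  mbstate_t state;

  assert (buf != NULL);
  /* An all-zero mbstate_t denotes the initial conversion state.  */
  memset (&state, 0, sizeof state);
  return c32rtomb (buf, c, &state);
}
```

A caller would typically run setlocale (LC_ALL, "") first; without that, the conversion happens in the "C" locale, where only a small set of characters is guaranteed to be convertible.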

While implementing this module, I noticed a mistake in the 'mbrtoc32' module:
It assumed that when wchar_t is 32-bit and mbrtoc32() exists in libc,
mbrtoc32() is equivalent to mbrtowc(); in other words, that the char32_t
encoding and the wchar_t encoding of the same multibyte sequence are the
same. But this is not the case! On FreeBSD 12 and Solaris 11.4, the
two encodings are different. The FreeBSD 12 wchar_t encoding is apparently
based on ISO 2022 (very old).
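The assumption can be probed on a given platform with a sketch like the following. The helper name `encodings_agree` is hypothetical; it decodes the same multibyte sequence once with mbrtoc32() and once with mbrtowc() and compares the resulting numeric values. A result of 1 says something only about that one sequence in the current locale on that platform.

```c
#include <assert.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: decode the first character of S (at most N bytes)
   with both mbrtoc32 and mbrtowc.  Returns 1 if both consume the same
   number of bytes and yield the same numeric value, 0 if they differ,
   and -1 if either function fails or sees an incomplete sequence.  */
int
encodings_agree (const char *s, size_t n)
{
  mbstate_t state32, statew;
  char32_t c32;
  wchar_t wc;
  size_t r32, rw;

  assert (s != NULL);
  memset (&state32, 0, sizeof state32);
  memset (&statew, 0, sizeof statew);

  r32 = mbrtoc32 (&c32, s, n, &state32);
  rw = mbrtowc (&wc, s, n, &statew);

  /* (size_t) -1, -2 and -3 are the special return values.  */
  if (r32 >= (size_t) -3 || rw >= (size_t) -2)
    return -1;
  return r32 == rw && (unsigned long) c32 == (unsigned long) wc;
}
```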

The fix is to use mbrtoc32() on platforms where this is possible, namely
on FreeBSD.

On Solaris 11.4 and native Windows, however, it is unwise to use the
system's mbrtoc32(), because it refuses to convert some multibyte sequences
that mbrtowc() supports!

So, we end up using the system's mbrtoc32() and c32rtomb() functions on
  - glibc,
  - FreeBSD,
  - AIX,
and not using them on
  - Solaris 11.4,
  - mingw,
  - MSVC.
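
For readers who want to try a platform's mbrtoc32() and c32rtomb() pair themselves, here is a rough round-trip sketch. It is not the check that gnulib's configure macros perform; the helper name `roundtrip_ok` is hypothetical, and the comparison is only meaningful for stateless encodings in which each character has a single multibyte representation.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: decode the first character of S (at most N bytes)
   with mbrtoc32, re-encode it with c32rtomb, and compare the bytes.
   Returns 1 if the bytes are reproduced, 0 if not, and -1 if either
   conversion fails.  */
int
roundtrip_ok (const char *s, size_t n)
{
  mbstate_t state;
  char32_t c;
  char out[MB_LEN_MAX];
  size_t in_len, out_len;

  assert (s != NULL);
  memset (&state, 0, sizeof state);
  in_len = mbrtoc32 (&c, s, n, &state);
  if (in_len >= (size_t) -3)
    return -1;                  /* invalid or incomplete input */
  if (in_len == 0)
    in_len = 1;                 /* the null character occupies one byte */

  memset (&state, 0, sizeof state);
  out_len = c32rtomb (out, c, &state);
  if (out_len == (size_t) -1)
    return -1;                  /* not representable in this locale */
  return out_len == in_len && memcmp (out, s, in_len) == 0;
}
```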


2020-01-08  Bruno Haible  <address@hidden>

        mbrtoc32: Use the system's mbrtoc32 if it exists and basically works.
        * m4/mbrtoc32.m4 (gl_MBRTOC32_SANITYCHECK): New macro.
        (gl_FUNC_MBRTOC32): Require it. Set REPLACE_MBRTOC32 if mbrtoc32 exists
        but is not working.
        * lib/mbrtoc32.c: Include hard-locale.h, <locale.h>.
        (mbrtoc32): If the char32_t encoding and the wchar_t encoding may
        differ, use the system's mbrtoc32, adding workarounds.
        * modules/mbrtoc32 (Depends-on): Add hard-locale.
        * doc/posix-functions/mbrtoc32.texi: Mention the Solaris and native
        Windows problem.
        * lib/btoc32.c: Include <stdio.h>, <string.h>.
        (btoc32): If the char32_t encoding and the wchar_t encoding may differ,
        use mbrtoc32, not btowc.
        * modules/btoc32 (Depends-on): Add mbrtoc32.
        * lib/mbsrtoc32s.c (mbsrtoc32s): If the char32_t encoding and the
        wchar_t encoding may differ, use mbrtoc32, not mbsrtowcs.
        * modules/mbsrtoc32s (Depends-on): Update conditions.
        (configure.ac): Compile mbsrtoc32s-state.c unconditionally.
        * lib/mbsnrtoc32s.c (mbsnrtoc32s): If the char32_t encoding and the
        wchar_t encoding may differ, use mbrtoc32, not mbsnrtowcs.
        * modules/mbsnrtoc32s (Depends-on): Update conditions.
        (configure.ac): Compile mbsrtoc32s-state.c unconditionally.

2020-01-08  Bruno Haible  <address@hidden>

        c32rtomb: Add tests.
        * tests/test-c32rtomb.c: New file, based on tests/test-wcrtomb.c.
        * tests/test-c32rtomb.sh: New file, based on tests/test-wcrtomb.sh.
        * tests/test-c32rtomb-w32.c: New file, based on
        tests/test-wcrtomb-w32.c.
        * tests/test-c32rtomb-w32-1.sh: New file, based on
        tests/test-wcrtomb-w32-1.sh.
        * tests/test-c32rtomb-w32-2.sh: New file, based on
        tests/test-wcrtomb-w32-2.sh.
        * tests/test-c32rtomb-w32-3.sh: New file, based on
        tests/test-wcrtomb-w32-3.sh.
        * tests/test-c32rtomb-w32-4.sh: New file, based on
        tests/test-wcrtomb-w32-4.sh.
        * tests/test-c32rtomb-w32-5.sh: New file, based on
        tests/test-wcrtomb-w32-5.sh.
        * tests/test-c32rtomb-w32-6.sh: New file, based on
        tests/test-wcrtomb-w32-6.sh.
        * tests/test-c32rtomb-w32-7.sh: New file, based on
        tests/test-wcrtomb-w32-7.sh.
        * modules/c32rtomb-tests: New file.

        c32rtomb: New module.
        * lib/uchar.in.h (c32rtomb): New declaration.
        * lib/c32rtomb.c: New file, based on lib/unistr/u8-uctomb-aux.c.
        * m4/c32rtomb.m4: New file.
        * m4/uchar.m4 (gl_UCHAR_H): Test whether c32rtomb is declared.
        (gl_UCHAR_H_DEFAULTS): Initialize GNULIB_C32RTOMB, HAVE_C32RTOMB,
        REPLACE_C32RTOMB.
        * modules/uchar (Makefile.am): Substitute GNULIB_C32RTOMB,
        HAVE_C32RTOMB, REPLACE_C32RTOMB.
        * modules/c32rtomb: New file.
        * tests/test-uchar-c++.cc: Test the signature of c32rtomb.
        * doc/posix-functions/c32rtomb.texi: Document the new module.
        * doc/posix-functions/wcrtomb.texi: Mention the new module.

2020-01-08  Bruno Haible  <address@hidden>

        c32tob: Make consistent with mbrtoc32.
        * lib/c32tob.c: Include <stdio.h>, <string.h>, <wchar.h>.
        (c32tob): If the char32_t encoding and the wchar_t encoding may differ,
        use c32rtomb, not wctob.
        * modules/c32tob (Files): Add m4/mbrtoc32.m4.
        (Depends-on): Add c32rtomb.
        (configure.ac): Require gl_MBRTOC32_SANITYCHECK.

Attachment: 0001-mbrtoc32-Use-the-system-s-mbrtoc32-if-it-exists-and-.patch
Description: Text Data

Attachment: 0002-c32rtomb-New-module.patch
Description: Text Data

Attachment: 0003-c32rtomb-Add-tests.patch
Description: Text Data

Attachment: 0004-c32tob-Make-consistent-with-mbrtoc32.patch
Description: Text Data

