bug-gnulib

Re: [Grep-devel] handling of non-BMP characters


From: Corinna Vinschen
Subject: Re: [Grep-devel] handling of non-BMP characters
Date: Wed, 19 Dec 2018 15:44:14 +0100
User-agent: Mutt/1.9.2 (2017-12-15)

On Dec 19 15:41, Corinna Vinschen wrote:
> On Dec 19 08:51, Bruno Haible wrote:
> > Corinna Vinschen wrote in
> > <https://lists.gnu.org/archive/html/grep-devel/2018-12/msg00039.html>:
> > > it would be
> > > pretty nice if that code could get reverted back in to support
> > > non-BMP charsets even on Cygwin.
> > 
> > I agree that support for beyond-BMP characters should be added back to 
> > 'grep'.
> > 
> > Your earlier fix from 2013-08-16 (and the fact that the test failure is
> > occurring exactly on Windows and AIX platforms) shows that the problem is
> > with wchar_t being only 16 bits wide on these platforms.
> > 
> > The type 'char32_t' has been introduced in C11 to overcome this 
> > limitation.[1]
> > 
> > I propose to
> > 
> >   1) introduce in gnulib support for <uchar.h>, char32_t, and mbrtoc32, so
> >      that we can use these instead of <wchar.h>, wchar_t, and mbrtowc
> >      portably,
> > 
> >   2) change those gnulib modules that don't behave well with beyond-BMP
> >      characters on Windows and AIX to use char32_t instead of wchar_t.
> > 
> > Then the 'grep' code can be changed in a similar way, and this will
> > fix the bug on Cygwin and AIX (though not on native Windows [2]).
> > 
> > The advantage of this approach is that it needs only minimal code
> > changes in 'grep': just change some type and function names here and
> > there, and add code for the additional (size_t)(-3) return value of
> > mbrtoc32.
> 
> IIUC this would also drop the requirement for #ifdef CYGWIN'ed code.

  ... in grep.

> Sounds like a great idea to me!
> 
> 
> Corinna
> 
> 
> 
> > 
> > Bruno
> > 
> > [1] 
> > https://stackoverflow.com/questions/21264035/why-did-c11-introduce-the-char16-t-and-char32-t-types
> > [2] https://lists.gnu.org/archive/html/bug-gnulib/2011-02/msg00175.html
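
For illustration, the kind of loop the proposal implies could look roughly
like the sketch below. It is not code from grep or gnulib. The helper name
count_chars32 is hypothetical, and the sketch assumes a platform that
already ships a C11 <uchar.h> with mbrtoc32 (which is exactly what the
proposed gnulib module would make portable). It decodes a byte buffer in
the current locale's encoding and shows where the extra (size_t)(-3)
return value has to be handled alongside the usual (size_t)(-1) and
(size_t)(-2) cases known from mbrtowc.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: count the characters in the N bytes starting at S,
   decoding them with mbrtoc32 in the current locale's encoding.
   Returns (size_t) -1 if the input is invalid or ends in the middle of a
   multibyte sequence.  */
size_t
count_chars32 (const char *s, size_t n)
{
  mbstate_t state;
  const char *p = s;
  const char *end = s + n;
  size_t count = 0;

  memset (&state, 0, sizeof state);   /* start in the initial state */

  while (p < end)
    {
      char32_t c;
      size_t ret = mbrtoc32 (&c, p, (size_t) (end - p), &state);

      if (ret == (size_t) -1 || ret == (size_t) -2)
        return (size_t) -1;           /* invalid or incomplete sequence */

      if (ret == (size_t) -3)
        {
          /* Another char32_t was produced from bytes consumed by an
             earlier call; no input is consumed this time.  This sketch
             does not drain such pending output once P has reached END.  */
          count++;
          continue;
        }

      if (ret == 0)
        ret = 1;                      /* null character; assumed to occupy
                                         one byte in the encoding */

      p += ret;
      count++;
    }

  return count;
}
```

A caller would normally run setlocale (LC_ALL, "") first so that a UTF-8
locale is in effect. With a 32-bit char32_t, a beyond-BMP character is then
delivered as a single unit instead of the two surrogates that a 16-bit
wchar_t would need.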


