Re: UTF-16 surrogate pair handling in grep -i option
From: Jim Meyering
Subject: Re: UTF-16 surrogate pair handling in grep -i option
Date: Fri, 16 Aug 2013 07:42:27 -0700
Hi Corinna,
Thanks a lot for the patch. It is almost perfect.
[ - the git one-line summary should be readable.
- comment nit: s/ as/ as a/
- a style issue: we want curly braces around the 1-line
else body in the first #ifdef block
- please attribute the reporter (or a list URL) in the commit log
]
Do any of the existing tests trigger this malfunction?
If not, can you create a small example that triggers the
problem on cygwin? Even better would be the addition of a new
script in tests/, which is required for any bug-fix patch.
Also, it'd be great if you would add a NEWS entry that
describes your fix. That said, there's no pressure.
If you can tell me how to reproduce the failure, I'll
make time to write both the test and NEWS addition, and
amend them onto your patch.
PS. Your timing is great. I'm planning to make a release pretty soon.
On Wed, Aug 14, 2013 at 9:32 AM, Corinna Vinschen <address@hidden> wrote:
> Hi,
>
> two days ago we got a report on the Cygwin mailing list that under some
> circumstances grep on Cygwin SEGVed. I tracked this down to grep's -i
> option, which calls the function mbtolower. This function works fine on
> systems with UCS-2 or UCS-4 wchar_t's, but it doesn't handle UTF-16
> surrogates on UTF-16 wchar_t systems. It does especially not handle
> the case where wcrtomb returns 0, which is what causes the SEGV.
>
> The below patch fixes this at least for Cygwin. Actually, I don't
> know any other OS which uses UTF-16 and provides this set of functions,
> so I assume that this solution is very system-specific, unless another
> Newlib based OS uses UTF-16 wchar_t as well.
>
> I hope the patch is ok to go into mainline. I added a lot of comment
> to explain what happens. Feel free to ask any question.
>
> Please keep me CCed, I'm not subscribed to bugs-grep.
>
>
> Thanks,
> Corinna
>
> --
> Corinna Vinschen
> Cygwin Maintainer
> Red Hat