bug-grep

bug#51698: surrogate-pair test fails under Cygwin


From: Duncan Roe
Subject: bug#51698: surrogate-pair test fails under Cygwin
Date: Tue, 9 Nov 2021 13:48:02 +1100

The 3rd surrogate-pair test fails under Cygwin:
> # Also test whether a surrogate-pair in the search string works.
It fails with both grep-3.7 and the latest commit.

Reproduces easily enough from the command line:
> printf '%s\n' "$(printf '\360\220\220\205')" >in
> LANG=en_US.utf8
> locale
> src/grep --file=in in

grep reports a match under Linux but not under Cygwin. I tested Cygwin64 on
Windows 7 Home and Windows 10.
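
The same steps can be wrapped in a small script so the input bytes and the
exit status are easy to check. This is only a sketch: the GREP default, the
helper names make_input and run_repro, and the use of LC_ALL instead of LANG
(so that other LC_* settings cannot interfere) are assumptions, and mktemp is
assumed to be available.

```shell
#!/bin/sh
# Sketch of the reproducer. GREP and the locale name are assumptions;
# adjust them for your build tree and system.
GREP=${GREP:-src/grep}

# Emit U+10405 as UTF-8 (bytes F0 90 90 85) followed by a newline.
make_input() {
  printf '\360\220\220\205\n'
}

# Search the file for its own contents; exit status 0 means grep matched.
run_repro() {
  in=$(mktemp) || return 2
  make_input >"$in"
  LC_ALL=en_US.utf8 "$GREP" --file="$in" "$in" >/dev/null
  status=$?
  rm -f "$in"
  return "$status"
}

# Show the raw bytes in octal so the input can be checked without grep.
make_input | od -An -to1
```

After sourcing this, `run_repro && echo match || echo 'no match'` should print
"match" on a working build, going by the Linux result above.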

Comparing gdb sessions between the platforms, I noticed:
> linux:  sbclen = '\001' <repeats 128 times>, '\377' <repeats 66 times>, 
> '\376' <repeats 60 times>, "\377\377"
> cygwin: sbclen = '\001' <repeats 128 times>, '\377' <repeats 64 times>, 
> '\376' <repeats 53 times>, '\377' <repeats 11 times>
in `dfa` (i.e. dfa.localeinfo.sbclen).
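
To see exactly which byte values differ, the two run-length dumps above can be
expanded and compared entry by entry. The sketch below only re-reads the gdb
output quoted here; the helper name sbclen_at and the count:value notation are
made up for the illustration.

```shell
#!/bin/sh
# Print the entry at byte index $1 in a table given as count:value runs.
sbclen_at() {
  idx=$1; shift
  for run in "$@"; do
    n=${run%%:*}
    if [ "$idx" -lt "$n" ]; then echo "${run#*:}"; return 0; fi
    idx=$((idx - n))
  done
  return 1
}

# The two tables as printed by gdb (values are the octal escapes).
linux='128:001 66:377 60:376 2:377'
cygwin='128:001 64:377 53:376 11:377'

# List every byte value whose entry differs between the two dumps.
i=0
while [ "$i" -lt 256 ]; do
  l=$(sbclen_at "$i" $linux)
  c=$(sbclen_at "$i" $cygwin)
  [ "$l" = "$c" ] || printf '%d (0x%X): linux \\%s cygwin \\%s\n' "$i" "$i" "$l" "$c"
  i=$((i + 1))
done
```

The entries differ for bytes 192-193 (0xC0-0xC1) and 245-253 (0xF5-0xFD); all
others agree.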

Also this:
> linux:  enlistnew (cpp=0x, new=0x "\360\220\220\205") at dfa.c:3928
> cygwin: enlistnew (cpp=0x, new=0x "\360\355\260\205") at dfa.c:3928
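
One thing that stands out in the Cygwin string: \355\260\205 is what a
three-byte UTF-8 encoder produces for U+DC05, which is the low half of the
UTF-16 surrogate pair for U+10405 (the character in the test input). The
arithmetic below checks that. It is only an observation about the bytes and
says nothing about where in the code they are generated; the function names
are made up for the illustration.

```shell
#!/bin/sh
# Split a supplementary code point into its UTF-16 surrogate pair.
surrogates() {
  v=$(( $1 - 0x10000 ))
  printf '%04X %04X\n' $(( 0xD800 + (v >> 10) )) $(( 0xDC00 + (v & 0x3FF) ))
}

# Encode a 16-bit value as three UTF-8 style bytes, printed as octal
# escapes the way gdb shows them.
utf8_3byte_octal() {
  u=$(( $1 ))
  printf '\\%o\\%o\\%o\n' \
    $(( 0xE0 | (u >> 12) )) \
    $(( 0x80 | ((u >> 6) & 0x3F) )) \
    $(( 0x80 | (u & 0x3F) ))
}

surrogates 0x10405        # prints: D801 DC05
utf8_3byte_octal 0xDC05   # prints: \355\260\205
```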

The locale data differs between the two systems for the same locale. I
investigated this further by setting a breakpoint where the code starts to
compute sbclen[250], which is \376 under Linux but \377 under Cygwin. I captured
the gdb sessions using `script` and have attached them in the hope that they
are of some help.

If your system rejects the tar.gz attachment I'll send them plaintext in
separate emails. They compare best in a side-by-side diff highlighting changed
characters. I find `tkdiff` good for this: from View choose "Show inline
comparison (recursive)".

Uninteresting changes between the sessions are removed:
 Automatic
 - strip hex numbers (addresses usually) to plain 0x
 - remove escape sequences (colouring &c.)
 - probably other stuff
 Specifics
 - force matching locale names
 - insert blank lines at linux:72 to line up return stmt
 - split linux:100 to more easily see later args

Cheers ... Duncan.

Attachment: gdb_sessions.tar.gz
Description: application/tar-gz

