Re: [Grep-devel] handling of non-BMP characters


From: Bruno Haible
Subject: Re: [Grep-devel] handling of non-BMP characters
Date: Sun, 16 Dec 2018 21:57:02 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-139-generic; KDE/5.18.0; x86_64; ; )

Hi Jim,

> welcome any attempt to revive it, especially if the result
> includes a test that will be easy to run on a non-cygwin system.

The test also ran on glibc, *BSD, and other systems.

In fact, it was the *only* test that verified that 'grep' handles
beyond-BMP characters correctly.

Now you have a test gap: if changes in glibc, in regex, in dfa, in the
UTF-8 converters, or elsewhere cause beyond-BMP characters to stop working
on glibc or *BSD systems, no automated test will catch that.

Therefore I would suggest that you

  - revive the test,

  - rename it from 'surrogate-pair' to 'beyond-bmp' (to match what
    it does, from a user perspective),

  - change line 2 from
      # Trigger a segfault-inducing bug with -i in grep-2.14 on Cygwin.
    to
      # Check the handling of characters outside the Unicode BMP.

  - add a comment
      # Known failures: This test currently fails on Cygwin and AIX.
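
For illustration, a standalone check along these lines could look roughly
like the sketch below. It is not the actual test from the grep tree, which
relies on the test suite's own helper functions. The function name
check_beyond_bmp, the list of locale names, and the choice of U+10405 as the
sample character are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical sketch of a 'beyond-bmp' check; NOT the actual grep test.

# Look for an installed UTF-8 locale (names vary between systems).
utf8_locale=
for loc in en_US.UTF-8 en_US.utf8 C.UTF-8 C.utf8; do
  if locale -a 2>/dev/null | grep -qxF "$loc"; then
    utf8_locale=$loc
    break
  fi
done

check_beyond_bmp () {
  dir=$(mktemp -d) || return 1
  # U+10405, a character outside the BMP; UTF-8 bytes F0 90 90 85.
  printf '\360\220\220\205\n' > "$dir/in"
  status=0

  # A non-matching, case-insensitive search must exit with status 1
  # (no match), not crash or report an error.
  LC_ALL=$utf8_locale grep -i anything-else "$dir/in" > /dev/null 2>&1
  test $? -eq 1 || status=1

  # '.' should match the whole character once, so -o prints one line.
  n=$(LC_ALL=$utf8_locale grep -o . "$dir/in" | wc -l | tr -d ' ')
  test "$n" -eq 1 || status=1

  rm -rf "$dir"
  return $status
}

if test -z "$utf8_locale"; then
  result=skip          # no UTF-8 locale available on this system
elif check_beyond_bmp; then
  result=pass
else
  result=fail
fi
echo "beyond-bmp: $result"
```

On a platform where wchar_t is only 16 bits wide, such as Cygwin, the
second check is the one most likely to fail, which is what the
"Known failures" comment would document.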

Bruno



