bug-grep

bug#16481: dfa.c and Rational Range Interpretation


From: arnold
Subject: bug#16481: dfa.c and Rational Range Interpretation
Date: Mon, 10 Feb 2014 02:00:07 -0700
User-agent: Heirloom mailx 12.4 7/29/08

Paolo Bonzini <address@hidden> wrote:

> On 10/02/2014 03:35, Paul Eggert wrote:
> > Paolo Bonzini wrote:
> >> The correct course of action for grep is to defer range interpretation
> >> to regex, because otherwise you can get mismatches between regexes with
> >> backreferences and those without.
> >
> > It depends on what one means by "correct".  POSIX doesn't say what to do
> > in this situation, so it's OK as far as POSIX is concerned for grep to
> > use RRI in the typical case (i.e., without backreferences), and for grep
> > to use some other interpretation in the rare cases when backreferences
> > are used.
> >
> > The documentation for 'grep' attempts to address this issue, perhaps not
> > as clearly as it could.  Maybe the installation instructions should talk
> > about it as well, and suggest --with-included-regex for people who care
> > about this sort of thing.
>
> Yeah, that makes sense.  I will revert the commit.

I think this is the wrong course of action. Paul suggested updating the
doc to be more clear, not reverting the code.

Personally, I think grep should always use the included regex so that
its behavior is consistent across all platforms; this is why gawk
always uses its own regex.
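
To illustrate why the choice of regex matters here, the following is a
minimal sketch (not grep's or GLIBC's actual code) of how a range such as
[a-z] can match different characters under RRI, which compares code points,
and under a collation-based interpretation. The COLLATION table and both
function names are hypothetical and exist only for this example; real
locales define their own orderings.

```python
import string

# Hypothetical dictionary-style collation order "aAbB...zZ".
# This is an assumption for illustration, not any real locale's table.
COLLATION = "".join(c + c.upper() for c in string.ascii_lowercase)


def in_range_rri(ch, lo, hi):
    """Rational Range Interpretation: compare raw code points."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collated(ch, lo, hi):
    """Collation-based interpretation: compare positions in COLLATION.

    Raises ValueError for characters missing from the toy table.
    """
    rank = COLLATION.index
    return rank(lo) <= rank(ch) <= rank(hi)


if __name__ == "__main__":
    for ch in "bBZ":
        print(ch, in_range_rri(ch, "a", "z"), in_range_collated(ch, "a", "z"))
```

Under the toy ordering, [a-z] picks up "B" but not "Z", while RRI rejects
both. A grep that sometimes defers to the system regex and sometimes uses
its own can therefore give different answers for the same pattern.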

If the only way to use collating sequences and equivalence classes is
with GLIBC, then I think it'd be better to pull the __LIBC bits out into
the standalone regex somehow.

In response to another question: making GLIBC's regex support RRI isn't
hard; getting the GLIBC maintainers to accept the patch is. :-(

My two cents: Jim & Paul will have to decide.

Thanks,

Arnold




