bug-grep

Re: Question about "-m1 -A99" tests in foad1.sh


From: Charles Levert
Subject: Re: Question about "-m1 -A99" tests in foad1.sh
Date: Thu, 3 Nov 2005 15:27:11 -0500
User-agent: Mutt/1.4.1i

* On Thursday 2005-11-03 at 16:54:02 +0000, Julian Foad wrote:
> Charles Levert wrote:
> >grep_test "4/40/"  "4/40/"  "^4"  -m1 -A99
> >grep_test "4/04/"  "4/04/"  "^4"  -m1 -A99
> 
> >As I understand the intent, the second line of
> >input should be printed whether it would match or
> >not, solely because it is part of the specified
> >number of context lines.
> >
> >However, if other options such as
> >--with-filename, --line-number, --byte-offset,
> >and/or --color=always are used, and if any
> >line that is part of the after-context would
> >also have matched on its own given the chance,
> >should it be marked as matching or not?  I.e.,
> >what separator should be used for such a line
> >(':' or '-') and how should it be colorized?
> 
> That's a good question.  I can't yet think of a reason to prefer one way 
> over the other.
> 
> My gut feeling is that we should display such lines as context lines 
> regardless of whether they happen to match.

Ok.  I'll try to see what this implies for the code.


> (One thing is certain: if we choose to display a context line that happens 
> to match as a matching line, we must not re-start the counting of context 
> lines.)

Absolutely.
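
To make the behaviour under discussion concrete, here is a minimal
sketch in Python.  It is a toy model, not grep's actual code: the
function name grep_m_A and its parameters are hypothetical, and it
assumes that "/" in the grep_test arguments stands for a newline.  It
follows the suggestion above: once the -m limit is reached, later
lines are shown as context ('-') even if they would match, and they
never restart the context count.

```python
import re

def grep_m_A(lines, pattern, max_count, after, line_numbers=False):
    """Toy model of `grep -m MAX -A AFTER` (hypothetical, not grep's code)."""
    regex = re.compile(pattern)
    out = []
    matches = 0      # selected (matching) lines printed so far
    remaining = 0    # after-context lines still owed
    for number, line in enumerate(lines, start=1):
        if matches < max_count and regex.search(line):
            # A genuine selected line: ':' separator, context count reset.
            matches += 1
            remaining = after
            sep = ":"
        elif remaining > 0:
            # After-context: printed with '-' whether or not it would
            # match, and the count is only ever decremented here.
            remaining -= 1
            sep = "-"
        else:
            if matches >= max_count:
                break        # -m limit reached and context exhausted
            continue
        out.append(f"{number}{sep}{line}" if line_numbers else line)
    return out

print(grep_m_A(["4", "40"], "^4", 1, 99))
print(grep_m_A(["4", "40", "41", "5"], "^4", 1, 1, line_numbers=True))
```

Under this model both test inputs are echoed in full, and with line
numbers the second input line carries the '-' separator even though
it also matches "^4".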


The reason I ask is this.  I now have a personal
version of grep in which just about all the tests
that used to fail now pass.  These two do not.

The other tests that still fail involve the
hairy combination of case-insensitive matching
and Unicode characters such as

   LATIN CAPITAL LETTER I WITH DOT ABOVE
   LATIN SMALL LETTER DOTLESS I
   LATIN SMALL LETTER LONG S
   GREEK PROSGEGRAMMENI
   OHM SIGN
   KELVIN SIGN
   ANGSTROM SIGN

It's not clear that these are worth fussing too
much over.
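
One reason these characters are awkward is that their case mappings
are not symmetric, so "compare after lowercasing" and "compare after
uppercasing" can disagree.  The sketch below illustrates this with
Python's str.lower() and str.upper() as stand-ins; they are not the
C towlower()/towupper() that grep uses, and the helper names are
hypothetical.

```python
def same_lower(a, b):
    """True if a and b are equal after lowercasing both."""
    return a.lower() == b.lower()

def same_upper(a, b):
    """True if a and b are equal after uppercasing both."""
    return a.upper() == b.upper()

PAIRS = [
    ("K", "\u212a"),   # LATIN CAPITAL LETTER K vs KELVIN SIGN
    ("s", "\u017f"),   # LATIN SMALL LETTER S vs LATIN SMALL LETTER LONG S
    ("i", "\u0131"),   # LATIN SMALL LETTER I vs LATIN SMALL LETTER DOTLESS I
]

for a, b in PAIRS:
    # Each pair is "equal" under one mapping but not under the other.
    print(repr(a), repr(b), same_lower(a, b), same_upper(a, b))
```

KELVIN SIGN lowercases to "k" but has no uppercase mapping of its
own, while LONG S and DOTLESS I uppercase to "S" and "I" without
being reachable by lowercasing those letters.  Which direction a
matcher folds in therefore changes what "case-less" means.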

What I have solved:

   -- The "egrep" and "fgrep" script vs. program
      situation.  I have a solution with programs
      (no scripts or argv[0]-recognition)
      where those two are smaller than the
      full-featured "grep", yet still standalone
      (much smaller for "fgrep", which doesn't
      use the dfa or regex stuff).

   -- All situations with various anchor
      types (^, \<) not working properly
      with --color and --only-matching.

   -- An obscure bug about towlower()ing
      multi-octet characters that change
      octet-length while doing so, in main().
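
The last item can be illustrated with a small sketch.  It does not
reproduce the code in main(); it only shows the underlying
phenomenon, using Python's str.lower() as a stand-in for towlower()
and UTF-8 as the assumed encoding.  The helper names are
hypothetical.

```python
def utf8_len_before_after(ch):
    """Return (octets before, octets after) lowercasing one character."""
    return len(ch.encode("utf-8")), len(ch.lower().encode("utf-8"))

def lower_utf8(data):
    """Lowercase UTF-8 bytes into a NEW buffer; the length may differ."""
    return data.decode("utf-8").lower().encode("utf-8")

for name, ch in [("KELVIN SIGN", "\u212a"),
                 ("OHM SIGN", "\u2126"),
                 ("ANGSTROM SIGN", "\u212b")]:
    print(name, utf8_len_before_after(ch))

original = "1 \u212a".encode("utf-8")
lowered = lower_utf8(original)
print(len(original), len(lowered))
```

All three signs shrink when lowercased (KELVIN SIGN goes from three
octets to one), so any code that lowercases a buffer in place, or
that reuses octet offsets computed on the original text, has to
account for the change in length.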

If I prepare patches for these, would they stand
a good chance of being accepted for CVS commit at
this point in time?  I ask because preparing clean
patches can take as much effort as solving the
problems in the first place, and the work has to
be redone every time another check-in touches
the same lines.


What I rely on to get other things right:

   -- The remaining stuff from the
      Red Hat/Fedora Core patches.

   -- The regex stuff that's in gnulib CVS.

What I haven't fixed:

   -- Command line problems such as the one with
      -11 being expressed as -1 -1.



