Re: grep: obnoxious egrep vs i18n bug
From: Aharon Robbins
Subject: Re: grep: obnoxious egrep vs i18n bug
Date: Thu, 15 Jul 2004 12:43:05 +0300
Greetings. Re this:
In article <address@hidden> you write:
>In egrep version 2.5.1, I'm seeing this behavior. Note particularly the
>output of the third command:
>
>address@hidden echo ABC | egrep -i -e 'abc'
>ABC
>address@hidden echo ABC | egrep -i -e 'abc|xxx'
>ABC
>address@hidden echo ABC | egrep -i -e 'a[b]c|xxx'
>address@hidden echo ABC | egrep -i -e 'a[b]c'
>ABC
>address@hidden echo abc | egrep -i -e 'a[b]c|xxx'
>abc
>address@hidden echo $LANG
>en_US.UTF-8
>address@hidden echo ABC | LANG= egrep -i -e 'a[b]c|xxx'
>ABC
>
>I'm only slightly familiar with i18n, but I can't believe this output is
>correct. I ran into this on a new RedHat machine, where $LANG is
>apparently being set by default.
>
>Here are a few particulars:
>address@hidden ldd $(type -p grep)
> libpcre.so.0 => /lib/libpcre.so.0 (0x40024000)
> libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
>address@hidden ls -l /lib/tls/libc.so.6
>lrwxrwxrwx 1 root root 13 Nov 17 2003 /lib/tls/libc.so.6 -> libc-2.3.2.so*
>address@hidden ls -l /lib/libpcre.so.0
>lrwxrwxrwx 1 root root 16 Nov 17 2003 /lib/libpcre.so.0 -> libpcre.so.0.0.1*
>address@hidden egrep --version
>egrep (GNU grep) 2.5.1
>
>Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
>This is free software; see the source for copying conditions. There is NO
>warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
>Mike
>
>Mike Coleman, Scientific Programmer, +1 816 926 4419
>Stowers Institute for Biomedical Research
>1000 E. 50th St., Kansas City, MO 64110
This is an interesting test case. It turns out that gawk, which shares
the dfa matcher, has the same problem. The following patch seems to
fix the problem. Your line numbers may vary.
Hope this helps.
Arnold Robbins
-----------------------
Thu Jul 15 12:36:25 2004 Arnold D. Robbins <address@hidden>
* dfa.c (parse_bracket_exp_mb): If doing case folding,
include the other case for regular characters inside [...].
--- dfa.c.save 2004-06-01 19:08:26.000000000 +0300
+++ dfa.c 2004-07-15 12:06:53.000000000 +0300
@@ -689,6 +689,19 @@
REALLOC_IF_NECESSARY(work_mbc->chars, wchar_t, chars_al,
work_mbc->nchars + 1);
work_mbc->chars[work_mbc->nchars++] = (wchar_t)wc;
+ if (case_fold)
+ {
+ wint_t altcase = (wint_t) wc; /* stays wc if there is no other case */
+
+ if (iswlower((wint_t) wc))
+ altcase = towupper((wint_t) wc);
+ else if (iswupper((wint_t) wc))
+ altcase = towlower((wint_t) wc);
+
+ REALLOC_IF_NECESSARY(work_mbc->chars, wchar_t, chars_al,
+ work_mbc->nchars + 1);
+ work_mbc->chars[work_mbc->nchars++] = (wchar_t) altcase;
+ }
}
}
while ((wc = wc1) != L']');
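
For anyone who wants to see the idea outside of dfa.c, here is a minimal
standalone sketch of it: when case folding is on, each ordinary character
added to a bracket set also gets its other-case counterpart added. The
struct wc_set type and the wc_set_* helpers are hypothetical names made up
for this illustration; they are not part of grep or gawk, and the real code
uses REALLOC_IF_NECESSARY on work_mbc->chars instead. This version appends
the counterpart only when the character actually has one.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for the character list kept per bracket
   expression; not the actual dfa.c data structure.  */
struct wc_set
{
  wchar_t *chars;
  size_t nchars;
  size_t alloc;
};

/* Append WC to S, growing the array as needed.  Returns 0 on success,
   -1 if memory could not be allocated.  */
static int
wc_set_add (struct wc_set *s, wchar_t wc)
{
  if (s->nchars == s->alloc)
    {
      size_t new_alloc = s->alloc ? 2 * s->alloc : 8;
      wchar_t *p = realloc (s->chars, new_alloc * sizeof *p);
      if (p == NULL)
        return -1;
      s->chars = p;
      s->alloc = new_alloc;
    }
  s->chars[s->nchars++] = wc;
  return 0;
}

/* Append WC to S; if CASE_FOLD is nonzero and WC has an other-case
   counterpart in the current locale, append that as well.  */
static int
wc_set_add_folded (struct wc_set *s, wchar_t wc, int case_fold)
{
  if (wc_set_add (s, wc) != 0)
    return -1;

  if (case_fold)
    {
      wint_t altcase = (wint_t) wc;

      if (iswlower (altcase))
        altcase = towupper (altcase);
      else if (iswupper (altcase))
        altcase = towlower (altcase);

      /* Digits, punctuation and the like have no other case; skip them.  */
      if (altcase != (wint_t) wc && wc_set_add (s, (wchar_t) altcase) != 0)
        return -1;
    }
  return 0;
}

/* Return nonzero if WC is a member of S.  */
static int
wc_set_contains (const struct wc_set *s, wchar_t wc)
{
  size_t i;

  for (i = 0; i < s->nchars; i++)
    if (s->chars[i] == wc)
      return 1;
  return 0;
}
```

With a zero-initialized struct wc_set, calling wc_set_add_folded (&s, L'b', 1)
leaves both L'b' and L'B' in the set, which is what lets 'a[b]c' match ABC
under -i. Free s.chars when done.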
--
Aharon (Arnold) Robbins --- Pioneer Consulting Ltd. arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381 Fax: +1 530 688 5518
Nof Ayalon Cell Phone: +972 50 729-7545
D.N. Shimshon 99785 ISRAEL