bug-gnu-utils

grep does not process non-ASCII characters correctly


From: Bruno Haible
Subject: grep does not process non-ASCII characters correctly
Date: Tue, 8 May 2001 15:43:08 +0200 (CEST)

Hi,

grep-2.5a has severe problems with multibyte character encodings.
According to SUSv2, the LANG, LC_CTYPE and LC_ALL environment variables
should determine what grep treats as a character. In grep-2.5a they have
no effect.

A test script is appended below, to be executed in a UTF-8 locale (e.g.
the glibc-2.2.2 ko_KR.UTF-8 locale). The regexp engine in glibc-2.2.2 now
has full i18n support. The remaining problems in grep appear to be located
in dfa.h and dfa.c.
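
For readers who cannot decode the attachments, the two checks can be
sketched roughly as follows. This is only an approximation of the sample
runs, not the attached script itself. The function names find_utf8_locale,
check_alpha and check_interval are made up for illustration, and the
locale lookup via "locale -a" is an assumption (the runs here used
ko_KR.UTF-8). The non-ASCII letters are written as octal UTF-8 bytes so
that the script itself stays pure ASCII.

```shell
#!/bin/sh
# Hypothetical helper: print the name of some installed UTF-8 locale,
# or nothing if none is found (locale names vary between systems).
find_utf8_locale() {
    locale -a 2>/dev/null | grep -i -E 'utf-?8$' | head -n 1
}

# check_alpha LOCALE
# Feed the line "a-umlaut o-umlaut u-umlaut" to grep and ask for any
# alphabetic character. A locale-aware grep should print the line.
check_alpha() {
    printf '\303\244\303\266\303\274\n' | LC_ALL="$1" grep '[[:alpha:]]'
}

# check_interval LOCALE
# Feed "a o-umlaut o-umlaut u-umlaut" to grep and ask for o-umlaut
# repeated twice. The interval \{2\} must apply to the whole two-byte
# character, not just to its last byte.
check_interval() {
    o=$(printf '\303\266')
    printf 'a\303\266\303\266\303\274\n' | LC_ALL="$1" grep "$o"'\{2\}'
}

loc=$(find_utf8_locale)
if [ -n "$loc" ]; then
    check_alpha "$loc" || echo "[[:alpha:]]: no match in $loc"
    check_interval "$loc" || echo "interval: no match in $loc"
else
    echo "no UTF-8 locale installed, nothing checked"
fi
```

In the good run both commands print their input line. In the bad run
neither prints anything, which is what one would expect if the input is
matched byte by byte instead of character by character.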

Bruno


begin 644 grep-sample-run-good
M)"!E8VAO("?#I,.VP[PG('address@hidden)E<"`G6ULZ86QP:&$Z75TG"L.DP[;#O`HD
I(&5C:&\@)V'#ML.VP[PG('address@hidden)E<"`GP[9<>S)<?2<*8<.VP[;#O`H`
`
end
begin 644 grep-sample-run-bad
M)"!E8VAO("?#I,.VP[PG('address@hidden)E<"`G6ULZ86QP:&$Z75TG"address@hidden;R`G
:8<.VP[;#O"<@?"!G<F5P("?#MEQ[,EQ])PH`
`
end



