Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8


From: Jim Meyering
Subject: Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8
Date: Fri, 01 Jun 2012 23:13:11 +0200

Paul Eggert wrote:
> On 06/01/2012 08:18 AM, Jim Meyering wrote:
>> I've implemented the above, and have begun testing.
>
> Thanks.  One thing I noticed when looking at the code is
> that the mbtolower API causes the compiler to refuse to
> put the 'size' local into a register.  The following patch
> fixes this inefficiency.  This undoubtedly interacts with
> what you're doing so I'm just putting it into this email
> now as a heads-up.

Odd... that seems like a transformation that gcc should be able to perform.
If gcc someday learns to recognize and optimize this sort of code,
is it really worth making this change?

In any case, thanks for holding off.
I expect to push the Turkish-I fix tomorrow.

> Subject: [PATCH] grep: tune by allowing compiler to put size in register
>
> * src/dfasearch.c (EGexecute):
> * src/kwsearch.c (Fcompile, Fexecute):
> Pass address of otherwise-unused variable to mbtolower, so that
> the compiler can keep the often-used size variable in a register.
> ---
>  src/dfasearch.c |    4 +++-
>  src/kwsearch.c  |    8 +++++---
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/dfasearch.c b/src/dfasearch.c
> index bd09aa6..b7c0da7 100644
> --- a/src/dfasearch.c
> +++ b/src/dfasearch.c
> @@ -223,10 +223,12 @@ EGexecute (char const *buf, size_t size, size_t *match_size,
>          {
>            /* mbtolower adds a NUL byte at the end.  That will provide
>               space for the sentinel byte dfaexec may add.  */
> -          char *case_buf = mbtolower (buf, &size);
> +          size_t case_size = size;
> +          char *case_buf = mbtolower (buf, &case_size);
>            if (start_ptr)
>              start_ptr = case_buf + (start_ptr - buf);
>            buf = case_buf;
> +          size = case_size;
>          }
>      }
>
> diff --git a/src/kwsearch.c b/src/kwsearch.c
> index f1a802e..829fe12 100644
> --- a/src/kwsearch.c
> +++ b/src/kwsearch.c
> @@ -33,10 +33,10 @@ void
>  Fcompile (char const *pattern, size_t size)
>  {
>    char const *err;
> -  size_t psize = size;
>    char const *pat = (match_icase && MB_CUR_MAX > 1
> -                     ? mbtolower (pattern, &psize)
> +                     ? mbtolower (pattern, &size)
>                       : pattern);
> +  size_t psize = size;
>
>    kwsinit (&kwset);
>
> @@ -87,10 +87,12 @@ Fexecute (char const *buf, size_t size, size_t *match_size,
>      {
>        if (match_icase)
>          {
> -          char *case_buf = mbtolower (buf, &size);
> +          size_t case_size = size;
> +          char *case_buf = mbtolower (buf, &case_size);
>            if (start_ptr)
>              start_ptr = case_buf + (start_ptr - buf);
>            buf = case_buf;
> +          size = case_size;
>          }
>      }
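
For readers following the thread, the pattern in the patch can be sketched
in isolation. Once the address of a local is passed to a function the
compiler cannot see into, the compiler generally has to assume that later
opaque calls may modify the local through that pointer, so it keeps the
local in memory. Passing the address of a short-lived copy instead leaves
the heavily used variable free to live in a register. The sketch below is
only an illustration: lower_copy is a hypothetical, ASCII-only stand-in for
mbtolower (it mallocs its result, which is an assumption of this sketch and
not a statement about grep's function), and count_lower is likewise made up.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for mbtolower: return a NUL-terminated,
   lowercased copy of the *N bytes at S and store the copy's length
   in *N.  ASCII only; the caller frees the result.  */
static char *
lower_copy (char const *s, size_t *n)
{
  size_t len = *n;
  char *out = malloc (len + 1);
  if (!out)
    return NULL;
  for (size_t i = 0; i < len; i++)
    out[i] = (char) tolower ((unsigned char) s[i]);
  out[len] = '\0';
  *n = len;
  return out;
}

/* Hypothetical caller: count occurrences of byte C in the lowercased
   form of BUF.  Only case_size has its address taken, so SIZE never
   escapes and can stay in a register throughout the loop.  */
static size_t
count_lower (char const *buf, size_t size, char c)
{
  size_t case_size = size;
  char *case_buf = lower_copy (buf, &case_size);
  if (!case_buf)
    return 0;
  size = case_size;   /* copy the result back, as the patch does */

  size_t hits = 0;
  for (size_t i = 0; i < size; i++)
    hits += case_buf[i] == c;
  free (case_buf);
  return hits;
}
```

The cost is one extra assignment after the call; whether that is a
measurable win depends on the compiler and on how hot the surrounding
code is.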


