bug-gnu-utils

Re: major gawk bug


From: Aharon Robbins
Subject: Re: major gawk bug
Date: Wed, 9 Jun 2004 15:08:49 +0300

I'm glad my patches work.  I may send you some further patches
for testing.

Code using tolower() is marginally slower for things like

        BEGIN {
                IGNORECASE = 1
                for (i = 1; i < 10000000; i++)
                        val += ("ONE STRING" == "one string")
                print val
        }

I have a fast machine, making it hard for me to judge whether the difference
is worth keeping the current code.  I need to think about it some more.
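
For reference, the two approaches being weighed can be sketched roughly as
follows. This is only an illustration, not the gawk source: the names
fold_table, init_fold_table, cmp_icase_table and cmp_icase_tolower are
hypothetical, and the sketch assumes single-byte characters. The first
comparison does one array lookup per byte in a table filled once at startup;
the second calls tolower() for every byte.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a precomputed case-folding table. */
static unsigned char fold_table[UCHAR_MAX + 1];

/* Fill the table once from the C library, using the current locale. */
void init_fold_table(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
        fold_table[c] = (unsigned char) tolower(c);
}

/* Approach 1: one table lookup per byte. */
int cmp_icase_table(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = fold_table[(unsigned char) a[i]]
              - fold_table[(unsigned char) b[i]];
        if (d != 0)
            return d;
    }
    return 0;
}

/* Approach 2: one tolower() call per byte, no private table. */
int cmp_icase_tolower(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = tolower((unsigned char) a[i])
              - tolower((unsigned char) b[i]);
        if (d != 0)
            return d;
    }
    return 0;
}
```

Both functions return zero for equal strings and a negative or positive
difference otherwise. Whether the per-byte call costs noticeably more than
the lookup depends on the C library, which is the portability worry.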

I do believe that just using RE_ICASE will work, and I will probably make
that the main solution for re.c.
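
As a rough illustration of letting the regex library do the case folding,
here is a sketch using the POSIX interface, where the corresponding flag is
REG_ICASE. RE_ICASE itself is a syntax bit of the GNU regex interface, so
this is an analogue rather than what re.c would do, and the helper name
matches_icase is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if text matches pattern regardless of
 * case, 0 if it does not, and -1 if the pattern fails to compile.
 */
int matches_icase(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_ICASE asks the library to ignore case; no translate table. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this approach the caller never touches a case-folding table; the
matcher handles it according to the current locale.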

I am also concerned about portability issues; while GLIBC tolower() is
highly functional etc, GLIBC and Linux are not my entire customer base. :-)

Arnold

> Date: Wed, 9 Jun 2004 15:20:54 +0400
> From: Stanislav Ievlev <address@hidden>
> To: Aharon Robbins <address@hidden>
> Cc: Stepan Kasal <address@hidden>, address@hidden
> Subject: Re: major gawk bug
>
> Hello,
>
> On Tue, Jun 08, 2004 at 06:59:48PM +0300, Aharon Robbins wrote:
> > > I believe the right fix for regexes is to use the RE_ICASE flag instead
> > > of the translate table.
> > > The hard-coded table is also used in gawk for various case-insensitive
> > > comparisons; these should be replaced by a call to tolower().
> > > The hard-coded table should be then removed.
> > 
> > I have some tentative changes in place that work this way.  It passes
> > `make check'.  I am still concerned about performance, especially
> > the use of tolower().
> > 
> > If you or Mr. Ievlev can test them and give me some feedback, let
> > me know and I'll send them to you.
> Arnold, your patch works well.
> (little improvement:
> -     if (strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0)
> +       if (!cp || strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0)
> )
>
> As I understand it, we also have a solution using the toupper()/tolower()
> functions.
>
> I agree with Stepan that these functions are already well optimized in
> glibc. The toupper()/tolower() solution is better, because currently we
> have two translation tables (one in glibc and a second in gawk) and copy
> one to the other during initialization (load_ignorecase), which looks
> strange.
>
> If the contents of these two tables are interpreted identically in gawk's
> algorithms, it's easy to replace one with the other.
>
> --
> With best regards
> Stanislav Ievlev
>
> ALT Linux Team.



