bug-bash
Re: Optimize bash string handling?


From: Chet Ramey
Subject: Re: Optimize bash string handling?
Date: Wed, 31 Jul 2019 15:15:37 -0400

On 7/26/19 5:55 AM, Alkis Georgopoulos wrote:
> While handling some big strings, I noticed that bash is a lot slower
> than other shells like dash, posh and busybox ash.
> I came up with the following little benchmark and results.
> While the specific benchmark isn't important, maybe some developer
> would like to use it to pinpoint and optimize some internal bash
> function that is a lot slower than in other shells?

Thanks for the report. There are places in bash where it copies and
re-processes strings too many times, and you uncovered a couple.

> 
> # Avoid UTF-8 complications
> export LANG=C
> 
> # Run the following COMMANDs with `time bash -c`
> # or `time busybox ash -c`
> # The time columns are in seconds, on an i5-4440 CPU
> 
> ASH BASH  COMMAND
> 0.1  0.1  printf "%100000000s" "." >/dev/null
> 0.7  1.1  x=$(printf "%100000000s" ".")

The first assignment is dominated by the command substitution and
reading the data through a pipe.
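
A rough way to see this, assuming bash's `time' keyword is available: time
the same printf once with its output discarded and once captured by a
command substitution. The size `n' below is my own, smaller stand-in for
the 100000000 in the report, so absolute numbers will differ; only the gap
between the two timings matters.

```shell
#!/usr/bin/env bash
# Sketch only: n is an assumed, reduced size (the report used 100000000).
export LANG=C
n=1000000

# Formatting alone; nothing is captured by the shell.
time printf "%${n}s" "." >/dev/null

# Same output, but read back through a pipe and stored in x.
time x=$(printf "%${n}s" ".")

# Sanity check that the whole string arrived.
echo "${#x}"
```

The final echo should print the value of `n', since printf pads the single
"." to that field width.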

> 0.8  2.4  x=$(printf "%100000000s" "."); echo ${#x}
> 0.9  3.7  x=$(printf "%100000000s" "."); echo ${#x}; echo ${#x}

The length function was too general, and didn't optimize for the common
case. Bash would expand the parameter name following the `#' as if the
`#' were not present, then take the length of the results. Most uses don't
need that generality, or the common error handling if `set -u' is enabled.
Factoring out the common case provides substantial improvement:

$ time ./bash ./x1a

real    0m1.215s
user    0m0.959s
sys     0m0.248s
$ time ./bash ./x1b
100000000

real    0m1.242s
user    0m0.982s
sys     0m0.256s
$ time ./bash ./x1c
100000000
100000000

real    0m1.290s
user    0m1.020s
sys     0m0.265s

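where x1a, x1b, and x1c are the three command-substitution cases quoted
above: the assignment alone, the assignment followed by one `echo ${#x}',
and the assignment followed by two.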
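
To illustrate the generality mentioned above, here is a small sketch of the
different things that can follow the `#' in a length expansion, plus the
`set -u' error case. The variable names are made up for the example, and
this shows only the user-visible behavior, not how bash implements it
internally.

```shell
#!/usr/bin/env bash
# Illustrative names only; none of these come from the original report.
x="hello"
arr=(one three)

len_scalar=${#x}        # length of a plain variable (the common case)
len_elem=${#arr[1]}     # length of one array element
count=${#arr[@]}        # number of elements, not a string length

# With set -u, taking the length of an unset variable is an error.
# Run it in a subshell so the failure does not end this script.
if ( set -u; unset nosuch; : "${#nosuch}" ) 2>/dev/null; then
    unset_status=ok
else
    unset_status=error
fi

echo "$len_scalar $len_elem $count $unset_status"
```

Only the first form, a plain variable name, is what the benchmark exercises.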

There's always more work to do, though.

Chet


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    address@hidden    http://tiswww.cwru.edu/~chet/
