help-bash

Re: The difference between `X=x f | cat` and `{ X=x; f; } | cat`


From: Peng Yu
Subject: Re: The difference between `X=x f | cat` and `{ X=x; f; } | cat`
Date: Fri, 20 Jan 2023 17:25:10 -0600

I wrote a script to test much longer variable names. When the
variable name is very long (100000 characters in the script below),
there is a significant and reproducible difference: 1000 iterations
take 2.322s real with `{ X=x; f; } | true` versus 2.668s real with
`X=x f | true`. So it seems that `{ X=x; f; } | true` is preferable
to `X=x f | true` in terms of runtime, as long as the two produce the
same results.
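
The condition "as long as they produce the same results" can be
checked directly. The following is a minimal sketch, not part of the
benchmark: the body of f here is a hypothetical stand-in that reports
the attributes of X and whether a child process (printenv, assumed to
be installed) can see it.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for f: shows how X looks inside the function
# and whether an external command started by the function inherits it.
f() {
    declare -p X
    printenv X || echo 'child: X not in environment'
}

unset X
echo '--- X=x f | cat'
X=x f | cat          # X is a temporary, exported variable for this call

unset X
echo '--- { X=x; f; } | cat'
{ X=x; f; } | cat    # X is an ordinary, non-exported shell variable
```

With the prefix form, declare -p reports X with the -x attribute and
printenv prints its value. With the group form, X is a plain variable
and printenv does not find it. If f never starts an external command
that reads X, the two forms behave the same from f's point of view.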

Does anyone know the underlying mechanism that causes the difference?

$ cat main.sh
#!/usr/bin/env bash
# vim: set noexpandtab tabstop=2:

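# Build a variable name consisting of 100000 'X' characters.
varname=$(builtin printf 'X%.0s' {1..100000})
# f prints the value of the long-named variable.
eval "function f {
        echo \"\$$varname\"
}"

# f1: plain assignment inside a group command, then call f.
eval "function f1 {
        for ((i=0;i<1000;++i)); do { $varname=x; f; } | true; done
}"
# f2: assignment given as a prefix to the call of f.
eval "function f2 {
        for ((i=0;i<1000;++i)); do $varname=x f | true; done
}"
# Untimed runs before the timed ones.
f1
f2
set -v
time f1
time f2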
$ ./main.sh
time f1

real    0m2.322s
user    0m1.618s
sys     0m1.395s
time f2

real    0m2.668s
user    0m1.893s
sys     0m1.474s
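
A single pair of timings can be affected by whatever else the machine
is doing. The sketch below repeats both variants for a few rounds,
alternating between them, so that a stable gap is easier to tell from
a one-off fluctuation. The names group_form and prefix_form and the
values of iterations and rounds are my own choices for illustration;
only the 100000-character name length is taken from main.sh above.

```bash
#!/usr/bin/env bash
# Assumed parameters: iterations and rounds are arbitrary and can be
# raised for a longer, steadier measurement.
name_len=100000 iterations=200 rounds=3

# Build a name of name_len 'X' characters: pad with spaces, then
# replace every space with 'X'.
printf -v varname '%*s' "$name_len" ''
varname=${varname// /X}

eval "f() { echo \"\$$varname\"; }"
eval "group_form() {
    for ((i = 0; i < iterations; ++i)); do { $varname=x; f; } | true; done
}"
eval "prefix_form() {
    for ((i = 0; i < iterations; ++i)); do $varname=x f | true; done
}"

# One line per measurement; time reports on stderr, so the labels
# are sent there as well.
TIMEFORMAT='%R real  %U user  %S sys'
for ((r = 1; r <= rounds; ++r)); do
    printf 'round %d  { X=x; f; } | true : ' "$r" >&2
    time group_form
    printf 'round %d  X=x f | true       : ' "$r" >&2
    time prefix_form
done
```

If the gap keeps the same sign and a similar size in every round, it
is less likely to be measurement noise. This only measures the
difference; it does not explain what causes it.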

On 1/20/23, Peng Yu <pengyu.ut@gmail.com> wrote:
> On 1/19/23, Kerin Millar <kfm@plushkava.net> wrote:
>> On Thu, 19 Jan 2023 09:10:19 -0600
>> Peng Yu <pengyu.ut@gmail.com> wrote:
>>
>>> On 1/19/23, Kerin Millar <kfm@plushkava.net> wrote:
>>> > On Thu, 19 Jan 2023 08:02:07 -0600
>>> > Peng Yu <pengyu.ut@gmail.com> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> f is a function that uses a variable called X, which is not
>>> >> declared in f.
>>> >>
>>> >> As far as I can tell, the following two ways produce the same
>>> >> results.
>>> >>
>>> >> - `X=x f | cat`
>>> >> - `{ X=x; f; } | cat`
>>> >
>>> > They are not equivalent.
>>> >
>>> > $ bash -c 'f() { declare -p X; }; X=x f; declare -p X'
>>> > declare -x X="x"
>>> > bash: line 1: declare: X: not found
>>> >
>>> > $ bash -c 'f() { declare -p X; }; X=x; f; declare -p X'
>>> > declare -- X="x"
>>> > declare -- X="x"
>>> >
>>> > The first version defines X in such a way that it affects only
>>> > the operating environment in which f is run. Note, also, that
>>> > the variable is marked as being exportable, meaning that it
>>> > would be present in the environment of any subprocesses that
>>> > might be launched by f. For a more rigorous explanation, refer
>>> > to the SIMPLE COMMAND EXPANSION section of the manual.
>>> >
>>> > The second version just defines X as an ordinary shell variable
>>> > then proceeds to run f. The use of a { list; } ensures that both
>>> > of those things happen within the same subshell - the one
>>> > implied by the left hand side of |.
>>>
>>> In my question, X is never accessed after calling f.
>>>
>>> Let's put my question this way to make it clearer.
>>>
>>> Suppose that in `X=x f | cat`, f is a function that only reads X,
>>> and no external command called by f, directly or indirectly, uses
>>> X. Suppose also that `{ X=x; f; } | cat` produces the same results
>>> as far as I care (for example, I don't care whether X=x is still
>>> set after f returns).
>>>
>>> In this case, is `{ X=x; f; } | cat` faster than `X=x f | cat`?
>>
>> I would consider it highly unlikely but would also suggest that you
>> conduct your own benchmarks.
>>
>> time for ((i=0; i < 1000; i++)); do X=x f | cat >/dev/null; done
>> time for ((i=0; i < 1000; i++)); do { X=x; f; } | cat >/dev/null; done
>>
>> Obviously, you may need to tune the number of iterations in order to
>> incur an acceptably substantive wall time.
>
> I see a slight difference (the long XX... name is there to make the
> assignment take more time). It seems that `{ X=x; f; } | true` is
> faster. Is that always true? What is done under the hood that leads
> to the timing difference?
>
> time for ((i=0; i < 1000; i++)); do
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=x
> f | true; done
>
> real  0m0.952s
> user  0m0.572s
> sys   0m0.951s
> time for ((i=0; i < 1000; i++)); do {
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=x;
> f; } | true; done
>
> real  0m0.932s
> user  0m0.540s
> sys   0m0.912s
>
> --
> Regards,
> Peng
>


-- 
Regards,
Peng


