help-gawk
[Top][All Lists]

## Re: How to Generate a Long String of the Same Character

 From: Neil R. Ormos Subject: Re: How to Generate a Long String of the Same Character Date: Mon, 19 Jul 2021 12:34:33 -0500 (CDT)

```Wolfgang Laun wrote:

> [...] Neil points out that doubling in the while
> loop may overshoot the desired length by almost
> 100%, potentially causing the algorithm to
> fail. However, it is quite simple to avoid this:

> function srep(n, s){   # *dbl*
>     while( length(s)*2 <= n )
>         s = s s;
>     return s substr( s, 1, n - length(s) );
> }
> [...]

> I [Wolfgang] contributed:
> function srep(n, s,   h){  # *rec*
>      if( n == 0 ) return "";
>      h = srep( int(n/2), s )
>      return n % 2 == 1 ? h h s : h h;
> }
>
> I have used this code together with /usr/bin/time:
> BEGIN {
>     for( j = 1; j <= 300000; ++j ){
>         srep( j%1000, "a" );
>         srep( j%1000, "abcde" );
>     }
> } [...]

> The results for the four versions:
>  *rec*   0m1,436s
>  *dbl*   0m2.322s
>  *rpt*  0m13.543s
>  *sub*  0m27.290s

I was a little surprised that the recursive
algorithm was so much faster in Wolfgang's tests.

My original suggestion was intended to be easy to
type and easy to understand.  Since those
constraints have been relaxed, I offer a
replacement function.  (In order to work with
Wolfgang's test fixture, it adopts his
generalization allowing the repeated item to be a
string of arbitrary length.)

function neil(n, s,      l, s0l){
l=1;
s0l=length(s);
while (l*2<=n) {
l=l+l;
s=s s;
};
if (l<n) s=s substr(s, 1, (n-l)*s0l);
return s;
};

Testing 20 successive runs, using Wolfgang's
(j=1; j <= 300000; i++) calling loop to generate
shorter strings, my replacement function seems to
be about twice as fast as Wolfgang's recursive
function (mean elapsed time):

Wolgang:  1.659 s
Neil:     0.886 s

Returning to the original problem, using the
functions to generate strings of 10^8 "x"s produced
elapsed times that varied so much as to be
unilluminating.  I tried the functions on 10^9
"x"s, and they seemed to be about equally fast
(again, mean elapsed time of 20 successive runs):

Wolfgang: 0.833 s
Neil:     0.828 s

```