help-gawk

How to Generate a Long String of the Same Character


From: Neil R. Ormos
Subject: How to Generate a Long String of the Same Character
Date: Wed, 14 Jul 2021 17:31:46 -0500 (CDT)

In a message on the bug-gawk list, Ed Mortin wrote:

> On an online forum someone asked how to generate a
> string of 100,000,000 "x"s. They had tried this in
> a BEGIN section:
> 
>    for(i=1;i<=100000000;i++) s = s "x"
> 
> and wanted to know if there was a better
> approach. Someone suggested:
> 
>    s=sprintf("%*s",1000000000,""); gsub(/ /,"x",s)}
> 
> which is also what I'd have also suggested, but
> upon testing that they found that the sprintf+gsub
> approach was slower than the loop in gawk 5.1.0
> and while I couldn't reproduce that exactly on
> cygwin, I can confirm that the sprintf+gsub
> solution is much slower than I expected. [...]

I am posting here to reply to the original
question because my comment does not relate to the
apparent speed-of-gsub() bug Ed was reporting.

Building a big string by iterating in tiny chunks
would seem to invite poor performance.

Instead, why not append the string to itself,
doubling its size with each iteration?  For
example:

time ~/.local/bin/gawk-5.1.0 \
  'BEGIN { sizelim = 100000000; a = "x"
           while (length(a) < sizelim) a = a a   # double the string each pass
           a = substr(a, 1, sizelim)             # trim any overshoot
           print length(a) }'

On my not-very-fast machine, according to the time
built-in, that takes 0.17 seconds of elapsed time.
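
To put a number on the doubling, here is a small sketch
that counts the passes without building any string.
The function name count_doublings is made up for this
illustration; it prints the number of doublings, the
length reached, and the overshoot that substr() would
then trim.

```shell
# Hypothetical helper: count how many doublings are needed to reach
# at least n characters, starting from a one-character string.
count_doublings() {
  awk -v n="$1" 'BEGIN {
    len = 1
    while (len < n) { len *= 2; k++ }   # same loop shape as the example above
    print k + 0, len, len - n           # doublings, final length, overshoot
  }'
}

count_doublings 100000000
# prints: 27 134217728 34217728
```

So the 100,000,000-character string takes 27
concatenations instead of 100,000,000 single-character
appends, at the cost of about 34 million characters of
overshoot.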

Yes, in the worst case, if the intended string has
length (2^N)+1, you wastefully build a string of
length 2^(N+1) and trim off almost half.  So on a
machine with limited memory, building the string
one character at a time might succeed where the
doubling approach fails.
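
One possible way around the overshoot, sketched here
as an assumption rather than anything tested in this
thread: stop doubling while the next doubling would
still fit, then top up with a substr() of what has
already been built.  The names repeat_char and rep are
hypothetical, and the sketch assumes a single-character
argument.

```shell
# Hypothetical helper: print a string of n copies of the character c,
# never building a string longer than n.
repeat_char() {
  awk -v c="$1" -v len="$2" '
    function rep(ch, n,    s) {                # s is a local variable
      if (n < 1) return ""
      s = ch
      while (2 * length(s) <= n) s = s s       # double only while it still fits
      return s substr(s, 1, n - length(s))     # top up with the remainder
    }
    BEGIN { print rep(c, len) }'
}

repeat_char x 10
# prints: xxxxxxxxxx
```

This never constructs a string longer than the target.
Whether it actually lowers peak memory depends on how
the awk implementation handles the final
concatenation, which is not measured here.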


