gsub() is very slow in gawk 5.1.0

bug-gawk

From:	Ed Morton
Subject:	gsub() is very slow in gawk 5.1.0
Date:	Wed, 14 Jul 2021 08:20:57 -0500
User-agent:	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On an online forum someone asked how to generate a string of 100,000,000 "x"s. They had tried this in a BEGIN section:


   for(i=1;i<=100000000;i++) s = s "x"

and wanted to know if there was a better approach. Someone suggested:

   s=sprintf("%*s",100000000,""); gsub(/ /,"x",s)

which is what I'd have suggested too, but upon testing they found that the sprintf+gsub approach was slower than the loop in gawk 5.1.0. While I couldn't reproduce that exactly on cygwin, I can confirm that the sprintf+gsub solution is much slower than I expected:


   $ time awk 'BEGIN{for(i=1;i<=100000000;i++) s = s "x"}'

   real    1m19.439s
   user    0m28.562s
   sys     0m50.811s

   $ time awk 'BEGIN{s=sprintf("%*s",100000000,""); gsub(/ /,"x",s)}'

   real    0m36.604s
   user    0m36.093s
   sys     0m0.390s
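
For reference, the sprintf+gsub approach relies on "%*s" taking its field width from the next argument, so sprintf() returns a string of n spaces that gsub() then rewrites one character at a time. A small sketch of that mechanism at a tiny size, assuming a hypothetical shell wrapper named fill_x (not part of the forum thread):

```shell
# fill_x N: build N spaces with sprintf(), then turn each space into "x".
# The wrapper name fill_x is hypothetical, used only for illustration.
fill_x() {
    awk -v n="$1" 'BEGIN {
        s = sprintf("%*s", n, "")    # "%*s" reads the field width from n
        count = gsub(/ /, "x", s)    # gsub() returns the number of substitutions
        print count, s
    }'
}

fill_x 5
```

With n = 5 this prints "5 xxxxx", i.e. one substitution per character of the padded string.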

If I remove the gsub() then it runs in half a second:

   $ time awk 'BEGIN{s=sprintf("%*s",100000000,"")}'

   real    0m0.423s
   user    0m0.171s
   sys     0m0.202s

so the gsub() itself is taking over 36 seconds to run. Someone else ran the script on a Mac with BSD awk 20070501 and got:


   $ time awk 'BEGIN {s = sprintf("%*s", 100000000, ""); gsub(/ /, "x", s)}'

   real    0m1.744s
   user    0m1.645s
   sys     0m0.098s

i.e. it ran in under 2 seconds, and yet another person said the gawk solution took 23.5 seconds on their Mac.

So, something is causing gsub() in gawk 5.1.0 to run very slowly for this case.
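
As a possible way to sidestep both the per-character loop and the gsub() pass, the string can be built by repeated doubling, which needs roughly log2(n) concatenations rather than n. This is only a sketch and is untimed here; the wrapper name repeat_x is hypothetical and not something proposed in the forum thread:

```shell
# repeat_x N: build a string of N "x" characters by repeated doubling
# and print its length. The name repeat_x is hypothetical.
repeat_x() {
    awk -v n="$1" 'BEGIN {
        s = ""
        chunk = "x"
        # Walk the binary digits of n: append the current chunk whenever
        # the low bit is set, then double the chunk for the next bit.
        while (n > 0) {
            if (n % 2 == 1) s = s chunk
            n = int(n / 2)
            if (n > 0) chunk = chunk chunk
        }
        print length(s)
    }'
}

repeat_x 1000
```

Calling repeat_x 100000000 would build the same 100,000,000-character string discussed above; whether it is actually faster than the other two approaches in gawk 5.1.0 would need to be measured.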

Ed.


