Re: How to Generate a Long String of the Same Character

From: Bob Proulx
Subject: Re: How to Generate a Long String of the Same Character
Date: Mon, 19 Jul 2021 11:55:30 -0600

Neil R. Ormos wrote:
> That seems really odd.  It takes under 0.5 seconds
> of elapsed time on a machine with a 25-watt mobile
> Core 2 Duo CPU that maxes out at 2.26 GHz.

I have no idea.  This system also has 8GB of RAM.  So maybe there is
memory stress happening?  Linux 5.10.0-6-amd64 kernel here.  Debian.

    $ /usr/bin/time awk 'BEGIN{sizelim=100000000; a="x"; while (length(a) < sizelim) {a=a a}; a=substr(a, 1, sizelim); print length(a);}'
    0.96user 0.49system 0:01.45elapsed 99%CPU (0avgtext+0avgdata 
    0inputs+0outputs (0major+249625minor)pagefaults 0swaps
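
For anyone following along, here is the same doubling idea at a size
small enough to check by eye.  This is only a sketch: repeat_char is
a name made up for illustration, and the character and count are
passed in as awk variables.  Doubling reaches 100000000 characters in
27 concatenations, so the loop itself is cheap; the cost is in the
memory being copied.

```shell
# repeat_char CHAR COUNT: print CHAR repeated COUNT times.
# Hypothetical helper name, not part of any package.
repeat_char() {
    awk -v ch="$1" -v n="$2" 'BEGIN {
        s = ch
        # Double the string until it is at least n characters long.
        while (length(s) < n) s = s s
        # Trim the overshoot left by the final doubling.
        print substr(s, 1, n)
    }'
}

repeat_char x 10
```

The final substr() is needed because doubling only lands exactly on
powers of two times the starting length.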

But I am not particularly interested in pursuing the performance
differences on this system of mine.  It's not an itch for me.  In the
end it would probably come down to choices made by the package
maintainer, the libraries used, and the compile flags used when
building the Debian package, coupled with particularities of my
system here that are different from your system.
> I tried your dd | tr | gawk solution and found the
> times vary bizarrely on machines where the pure
> gawk solution has run-times roughly in-line with
> what I'd expect.  Even the elapsed times of
> consecutive individual runs of the dd | tr | gawk
> solution vary strangely.

Hmm...  I rather expected that the large amount of data being passed
through the pipeline would cause an I/O bottleneck.  It's all
character I/O, such as using tr for single-character translation.
Handling individual characters is often an inefficient bottleneck.  I
expected it to be slower.  But then as I timed the pipeline it turned
out to be quite fast for me, so I just kept moving forward with it.

On multi-core CPUs I would expect the pipeline to split naturally
into parallel processes.  But that adds complexity.  Perhaps cache
hits and misses vary depending upon other processes perturbing the
flow, and that is causing the differences?

For me the times for the pipeline case seem fairly repeatable and

This is such a small synthetic case.  It's a fun diversion.  But it
isn't an itch for me to try to figure it out further.  Too much else
to do!  But I appreciate your discussion about it.

> Also, I think the blocksize parameter should be
> bs=1MB to get blocks of 10^6 bytes and not 2^20
> bytes.

Oh!  You are correct.  I introduced a bug there.  My bad.  I am so
much in the habit of always using binary powers of two and I threw
that together so quickly that I did not notice that I had added that
bug.  Thank you for pointing it out! :-)
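
The size of the discrepancy is easy to work out with shell
arithmetic.  This is just a sketch of the arithmetic, assuming the
GNU dd suffix meanings (M is 1024*1024 bytes, MB is 1000*1000 bytes);
the variable names are only for illustration.

```shell
# Total bytes written by count=100 under each block-size suffix.
binary_bytes=$((100 * 1024 * 1024))    # bs=1M:  100 blocks of 2^20 bytes
decimal_bytes=$((100 * 1000 * 1000))   # bs=1MB: 100 blocks of 10^6 bytes

echo "bs=1M  count=100: $binary_bytes bytes"
echo "bs=1MB count=100: $decimal_bytes bytes"
echo "difference:       $((binary_bytes - decimal_bytes)) bytes"
```

So the bs=1M version overshoots the intended 100000000 characters by
a little under five percent.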

    $ dd status=none if=/dev/zero bs=1M count=100 | tr "\0" "x" | wc -c

    $ dd status=none if=/dev/zero bs=1MB count=100 | tr "\0" "x" | wc -c

I am reminded of this modified misremembered saying.

    There are two hard things in computer science: cache invalidation,
    naming things, and off-by-one errors.
    --With apologies to Phil Karlton, Netscape


