help-bash

Re: Counting words, fast!


From: Dennis Williamson
Subject: Re: Counting words, fast!
Date: Tue, 16 Mar 2021 22:30:08 -0500

On Tue, Mar 16, 2021, 5:32 PM Jesse Hathaway <jesse@mbuki-mvuki.org> wrote:

> On Tue, Mar 16, 2021 at 5:19 PM Leonid Isaev (ifax)
> <leonid.isaev@ifax.com> wrote:
> > Nice, but I don't understand why you chose to implement insertion sort
> > in bash. Replacing it with GNU sort(1) significantly cuts down the
> > execution time:
>
> I was following what I took as the spirit of the constraints:
>
> https://benhoyt.com/writings/count-words/#problem-statement-and-constraints
>
> which don't allow anything but a language's standard library. Surprisingly,
> the bulk of the processing time is in reading the file and constructing
> the frequency associative array, not in the sorting.
>
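The frequency-counting step Jesse describes can be sketched roughly like this. This is my reconstruction for illustration, not his actual program: read word by word with read -a and tally counts in a bash associative array (bash 4 or later).

```shell
#!/usr/bin/env bash
# Rough sketch (not the original program) of the frequency-counting step:
# read word by word and tally counts in an associative array (bash 4+).
count_words() {
    local -A freq=()
    local -a words
    local w
    while read -r -a words; do
        for w in "${words[@]}"; do
            w=${w,,}            # fold case so "Foo" and "foo" count together
            ((freq[$w]++))
        done
    done
    for w in "${!freq[@]}"; do
        printf '%s %d\n' "$w" "${freq[$w]}"
    done
}
```

It is this read-and-tally loop, not the final sort of the array, that dominates the run time in the quoted measurement.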


I've been playing with your optimized code, changing the read to grab data
in chunks the way some of the other optimized programs do, thus taking your
move from by-word to by-line reading one step further: reading a specified,
larger number of characters at a time.

IFS= read -r -N 4096 var

Then I append the result of a regular read so that each chunk ends at a
newline. This seemed to cut about 20% off the time, but I get different
counts than your code produces. I've also tried using read without
specifying a variable, so that the resulting $REPLY preserves leading and
trailing whitespace, but the counts still didn't match.

In any case, this points to reading larger chunks being more efficient.

