bug-coreutils

Re: feature request: gzip/bzip support for sort


From: Jim Meyering
Subject: Re: feature request: gzip/bzip support for sort
Date: Thu, 18 Jan 2007 23:49:27 +0100

Philip Rowlands <address@hidden> wrote:

> On Thu, 18 Jan 2007, Jim Meyering wrote:
>
>> I've done some more timings, but with two more sizes of input.
>> Here's the summary, comparing straight sort with sort --comp=gzip:
>>
>>  2.7GB:   6.6% speed-up
>>  10.0GB: 17.8% speed-up
>
> It would be interesting to see the individual stats returned by wait4(2)
> from the child, to separate CPU seconds spent in sort itself, and in the
> compression/decompression forks.
>
> I think allowing an environment variable to define the compressor is a
> good idea, so long as there's a corresponding --nocompress override
> available from the command line.
>
>>  $ seq 9999999 > k
>>  $ cat k k k k k k k k k > j
>>  $ cat j j j j > sort-in
>>  $ wc -c sort-in
>>  2839999968 sort-in
>
> I had to use "seq -f %.0f" to get this filesize.

Odd.
Here's what those generate for me:

  $ seq 9999999 > k
  $ wc -c < k
  78888888

  $ tail -1 k
  9999999

The remaining "cat" commands merely write 36 (9 * 4) copies of that data to sort-in:

  $ (wc -c < k|tr -d '\n'; echo '* 36')|bc
  2839999968
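
As a cross-check that doesn't need the 2.7GB file on disk, here is a rough
sketch that computes the expected sizes arithmetically.  It assumes seq prints
each number as plain decimal digits followed by a newline; seq_bytes is a
hypothetical helper written for this message, not part of coreutils, and it
relies on the shell doing 64-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: bytes written by `seq N`, assuming every number is
# printed as plain decimal digits plus one trailing newline.
seq_bytes() {
    n=$1
    total=0
    digits=1
    lo=1
    # Walk the ranges 1-9, 10-99, 100-999, ... and count bytes per range.
    while [ "$lo" -le "$n" ]; do
        hi=$((lo * 10 - 1))
        if [ "$hi" -gt "$n" ]; then
            hi=$n
        fi
        # Each number in this range costs `digits` bytes plus the newline.
        total=$((total + (hi - lo + 1) * (digits + 1)))
        lo=$((lo * 10))
        digits=$((digits + 1))
    done
    echo "$total"
}

k=$(seq_bytes 9999999)
echo "k:       $k"                # k:       78888888
echo "sort-in: $((k * 9 * 4))"    # sort-in: 2839999968
```

If your "wc -c < k" differs from the first number, then the seq output itself
is where things diverge, rather than the "cat" steps.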

What happens differently for you?
