bug#22357: grep -f not only huge memory usage, but also huge time cost
From: Jim Meyering
Subject: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Thu, 10 Mar 2016 09:26:43 -0800
On Thu, Mar 10, 2016 at 3:00 AM, JQK <address@hidden> wrote:
> If in the following situation,
>
> ===========
> file1 contains the numbers 1 to 200000, one per line (200000 lines)
> file2 has about 200 to 300 lines of random numbers in the
> range 1-200000
> ===========
>
> The following command can take over 15 minutes to finish on
> Linux, which seems excessive.
>
> $ grep -v -f file1 file2
>
> (FYI, on AIX it takes less than 1 second)
>
> Maybe there is room for optimization not only in memory usage
> but also in run time.
What version of grep are you using?
With the latest (grep-2.23), this takes
less than 1.5s on a core-i7-4770S-based system:
$ env time grep -v -f <(seq 200000) <(shuf -i 1-200000 -n 250)
1.27user 0.16system 0:01.43elapsed 100%CPU (0avgtext+0avgdata 839448maxresident)k
0inputs+0outputs (0major+233108minor)pagefaults 0swaps
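
For anyone who wants to repeat the measurement without process substitution, the same test can be sketched with regular files. The temporary directory, file names, and variable names below are illustrative assumptions, not taken from the report. The second grep call is a variant that is not part of the timing above: -F treats each pattern as a fixed string and -x requires a whole-line match, which may be closer to what is intended when the patterns are plain numbers. Timings will depend on the grep version and the machine.

```shell
# Reproduction sketch; paths and variable names are hypothetical.
tmpdir=$(mktemp -d)
seq 200000 > "$tmpdir/file1"                 # patterns: 1..200000, one per line
shuf -i 1-200000 -n 250 > "$tmpdir/file2"    # 250 distinct random numbers from the same range

# Form from the report: every line of file1 is a regular expression.
# grep exits with status 1 when it selects no lines, hence "|| true".
regex_count=$(grep -c -v -f "$tmpdir/file1" "$tmpdir/file2" || true)

# Variant (an assumption about intent): fixed strings, whole-line matches only.
fixed_count=$(grep -c -v -x -F -f "$tmpdir/file1" "$tmpdir/file2" || true)

printf 'unmatched lines: regex=%s fixed=%s\n' "$regex_count" "$fixed_count"
rm -rf "$tmpdir"
```

Both counts are 0 with this input, because every number in file2 also appears in file1. To compare run times, prefix either grep invocation with "env time" as in the command above.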