Re: [PATCH] Speedup wc -l
From: Sami Kerola
Subject: Re: [PATCH] Speedup wc -l
Date: Sun, 15 Mar 2015 23:11:06 +0000

On 15 March 2015 at 22:18, Pádraig Brady <address@hidden> wrote:
> On 15/03/15 21:14, Kristoffer Brånemyr wrote:
>>
>>
>>
>>
>>>On Sunday, 15 March 2015 20:13, Pádraig Brady <address@hidden> wrote:
>>>
>>>
>>>>On 15/03/15 08:33, Kristoffer Brånemyr wrote:
>>>>
>>>> Hi,
>>>>
>>>> I did some tests and found out you can actually beat memchr with a simple
>>>> loop. Tests were done on an Intel Xeon E3-1231v3 (4*3.4GHz), on a 4GB
>>>> file that was already cached in memory. Benchmarking was done simply
>>>> with the 'time' command. I don't know how this code would run on other
>>>> architectures, but I guess you could put it in an #ifdef?
>>>>
>>>> Coreutils 8.23 version, compiled with -O3:
>>>> 507755520 /home/ztion/words
>>>>
>>>> real 0m3.126s
>>>> user 0m2.699s
>>>> sys 0m0.429s
>>>>
>>>>
>>>> Improved version compiled with -O2:
>>>> 507755520 /home/ztion/words
>>>>
>>>> real 0m2.857s
>>>> user 0m2.461s
>>>> sys 0m0.396s
>>>>
>>>> Improved version compiled with -O3:
>>>> 507755520 /home/ztion/words
>>>>
>>>> real 0m1.518s
>>>> user 0m1.157s
>>>> sys 0m0.361s
>>>>
>>>> I studied the generated assembly and with -O3 gcc generates some fancy SSE
>>>> code, getting some nice speedups. memchr is also SSE optimized as far as I
>>>> know, so it's interesting that this is so much faster, twice as fast
>>>> actually.
>>>>
>>>> In case you don't like turning -O3 on for some reason (the default in
>>>> coreutils is -O2 I think), the best version I could put together for -O2
>>>> was this:
>>>>
>>>> Improved version 2, compiled with -O2:
>>>> 507755520 /home/ztion/words
>>>>
>>>> real 0m2.206s
>>>> user 0m1.827s
>>>> sys 0m0.379s
>>
>>
>>>Interesting. Thanks for the results.
>>>I use 'gcc -march=native -g -O3' locally, and with that can't see a
>>>difference in performance.
>>>
>>>What version of glibc and gcc are you using?
>>>gcc-4.9.2-1.fc21.x86_64 and glibc-2.20-7.fc21.x86_64 here.
>>>
>>>thanks,
>>>Pádraig.
>>
>>
>> Hi,
>>
>> This is with gcc 4.9.2-7 and glibc 2.19-17 on Debian amd64. The difference
>> is still there for me when compiling with your CFLAGS. Have they improved
>> memchr in glibc 2.20? I don't think they have that yet in Debian
>> unfortunately.
>>
>> What CPU do you have?
>
>
> i3-2310M
>
> I was doing a very quick test with _short_ lines,
> specifically /usr/share/dict/words.
>
> Note GCC should be using __builtin_memchr here, so it should not be
> hitting the function call overhead.
>
> I'll look in more detail later.

Built from coreutils and gnulib git checkouts at v8.23-149-gd95cdcc:
real 0m0.824s
real 0m0.828s
real 0m0.830s
real 0m0.831s
real 0m0.875s
After Kristoffer's change:
real 0m0.774s
real 0m0.776s
real 0m0.778s
real 0m0.779s
real 0m0.780s
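
One way to summarize the two sets of five runs above is to compare their medians, which are less affected by the single 0.875s run. The following is only a sketch for doing that arithmetic; the names `before_s`, `after_s`, `median` and `relative_saving` are hypothetical and are not part of coreutils or of the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* 'real' times in seconds, copied from the runs listed above. */
double before_s[] = {0.824, 0.828, 0.830, 0.831, 0.875};
double after_s[]  = {0.774, 0.776, 0.778, 0.779, 0.780};

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples (n > 0); sorts the array in place. */
double median(double *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_double);
    return (n % 2) ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

/* Fraction of wall-clock time saved going from 'before' to 'after'. */
double relative_saving(double before, double after)
{
    return (before - after) / before;
}
```

With these inputs the medians are 0.830s and 0.778s, so `relative_saving` gives roughly 0.06, i.e. about 6% less wall-clock time on this machine and input.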

I'm using an up-to-date Arch Linux (testing).
$ pacman -Q gcc glibc linux
gcc 4.9.2-4
glibc 2.21-2
linux 3.19.1-1
Built with: gcc -O3 -Ofast
CPU: AMD E1-1200

For reference, my test input had the following data:
$ time wc test-input
1141570 8211600 49489140 test-input
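
For readers following the thread, the two counting strategies being compared can be sketched roughly as below. This is not Kristoffer's patch nor the coreutils source, only a minimal illustration under the assumption that the buffer is already in memory; the function names `count_lines_memchr` and `count_lines_loop` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Count newlines by calling memchr repeatedly, once per newline found.
   With short lines, each call only scans a few bytes before returning. */
size_t count_lines_memchr(const char *buf, size_t len)
{
    size_t lines = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (p < end && (p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        lines++;
        p++;  /* step past the newline just found */
    }
    return lines;
}

/* Count newlines with a plain byte loop. The loop body has no branch
   on the data, so a compiler may be able to auto-vectorize it at
   higher optimization levels. */
size_t count_lines_loop(const char *buf, size_t len)
{
    size_t lines = 0;

    for (size_t i = 0; i < len; i++)
        lines += (buf[i] == '\n');
    return lines;
}
```

Both functions return the same count for any buffer; which one is faster is likely to depend on line length, compiler flags, the glibc version and the CPU, as the differing results in this thread suggest.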
--
Sami Kerola
http://www.iki.fi/kerolasa/