coreutils

Re: Improve sha1sum speed


From: Pádraig Brady
Subject: Re: Improve sha1sum speed
Date: Tue, 06 Sep 2011 14:44:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20110707 Thunderbird/5.0

On 09/06/2011 02:25 PM, Loïc Le Loarer wrote:
> Hi Pádraig,
> 
> Thank you for your answer.
> 
> 2011/9/6 Pádraig Brady <address@hidden>
> 
>     A few general points.
>     You essentially used Linus' code (albeit by
>     very helpfully isolating the significant differences).
>     It might be easier/required to just include it in gnulib?
>     There are a few files in gnulib that are not copyright of the FSF,
>     so would Nicolas and Linus need to assign copyright?
> 
> 
> Yes, this is what I did. I don't think that including Linus' code is easier,
> as the functions have a different prototype. Also, sha1, sha256 and sha512
> share the same structure in gnulib, so changing one without changing the
> others would be weird. But if you think it is required, I have no problem
> with that.

OK, let's just use your patches to gnulib then.
The techniques were fairly generic anyway.

> 
> By the way, I have done a test on sha512 and improved the time on the
> same 1GB zero file from 4.5s to 3.9s. Please find the patch attached. So I
> think that using the same techniques, we could improve the speed of all the
> sha functions.
> 
>     For performance testing I've found gcc generates
>     much more deterministic results with a -march
>     as close to native as possible or otherwise
>     the code is very susceptible to alignment issues etc.
>     Your compiler supports -march=native.
>     Note also gcc 4.6 has much better support for your Sandy Bridge CPU,
>     either with -march=native or -march=corei7-avx
> 
> 
> I tried using gcc-4.6.1 (I recompiled it on my Ubuntu 10.10 system) but I
> couldn't see any difference. For me, no combination of gcc 4.4.5 or 4.6.1,
> with or without -march=native, makes a difference; all the times are within
> the measurement margin.

OK, that at least confirms the improvement is fairly deterministic.

> 
>     As for the SSE version, I would also like to see that included,
>     given the proportion of hardware supporting that these days.
>     I previously noticed a coreutils SSE2 patch here:
>     http://www.arctic.org/~dean/crypto/sha1.html
>     Though we'd probably need some runtime SSE detection to include that.
> 
>  
> OK, I could try to work on this. The real problem is testing that compilation
> and SSE detection are done correctly on several platforms. I only have access
> to a few x86 machines; what is the usual way to test on more platforms?

It would probably be best to get an account on the GCC compile farm.
http://gcc.gnu.org/wiki/CompileFarm
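
For illustration only, the runtime SSE detection mentioned above could be
sketched roughly as follows. This is not code from the patch or from gnulib.
It assumes a GCC-compatible compiler that provides <cpuid.h> on x86, and the
names sha1_block_generic, sha1_block_sse2, have_sse2 and select_sha1_block
are hypothetical placeholders. On any other platform the sketch simply falls
back to the portable routine.

```c
#include <stddef.h>

/* <cpuid.h> is only assumed to exist with GCC-compatible compilers on x86.  */
#if (defined __GNUC__ || defined __clang__) \
    && (defined __i386__ || defined __x86_64__)
# include <cpuid.h>
# define SKETCH_HAVE_CPUID 1
#endif

/* Hypothetical signature for a routine that hashes whole 64-byte blocks.  */
typedef void (*sha1_block_fn) (const void *buf, size_t len, void *ctx);

/* Placeholder bodies: a real build would put the portable C code and the
   SSE2 code here.  */
static void
sha1_block_generic (const void *buf, size_t len, void *ctx)
{
  (void) buf; (void) len; (void) ctx;
}

static void
sha1_block_sse2 (const void *buf, size_t len, void *ctx)
{
  (void) buf; (void) len; (void) ctx;
}

/* Return 1 if the CPU reports SSE2 at run time, 0 otherwise.  */
static int
have_sse2 (void)
{
#ifdef SKETCH_HAVE_CPUID
  unsigned int eax, ebx, ecx, edx;
  /* __get_cpuid returns 0 when the requested leaf is unsupported.  */
  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0)
    return 0;
  /* CPUID leaf 1 reports SSE2 in bit 26 of EDX.  */
  return (edx & (1u << 26)) != 0;
#else
  return 0;
#endif
}

/* Choose the implementation once, e.g. at start-up, and call through the
   returned pointer afterwards.  */
static sha1_block_fn
select_sha1_block (void)
{
  return have_sse2 () ? sha1_block_sse2 : sha1_block_generic;
}
```

A caller would do something like `sha1_block_fn f = select_sha1_block ();`
once and then use `f` for every block, so the detection cost is paid a single
time and non-SSE2 machines keep working with the same binary.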

cheers,
Pádraig.


