Re: Improve sha1sum speed
From: Pádraig Brady
Subject: Re: Improve sha1sum speed
Date: Tue, 06 Sep 2011 14:44:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:5.0) Gecko/20110707 Thunderbird/5.0
On 09/06/2011 02:25 PM, Loïc Le Loarer wrote:
> Hi Pádraig,
>
> Thank you for your answer.
>
> 2011/9/6 Pádraig Brady <address@hidden>
>
> A few general points.
> You essentially used Linus' code (albeit by
> very helpfully isolating the significant differences).
> It might be easier/required to just include it in gnulib?
> There are a few files in gnulib that are not copyright of the FSF,
> so would Nicolas and Linus need to assign copyright?
>
>
> Yes, this is what I did. I don't think that including Linus' code is easier, as
> the functions have a different prototype. Also, sha1, sha256 and sha512 share
> the same structure in gnulib, so changing one without changing the others would
> be weird. But if you think it is required, I have no problem with that.
Ok, let's just use your patches to gnulib so.
The techniques were fairly generic anyway.
>
> By the way, I have done a test on sha512 and I have improved the speed on the
> same 1GB zero file from 4.5s to 3.9s. Please find the patch attached. So I
> think that using the same techniques, we could improve the speed of all the
> SHA variants.
>
> For performance testing I've found gcc generates
> much more deterministic results with a -march
> as close to native as possible or otherwise
> the code is very susceptible to alignment issues etc.
> Your compiler supports -march=native.
> Note also gcc 4.6 has much better support for your Sandy Bridge CPU,
> either with -march=native or -march=corei7-avx
>
>
> I tried using gcc-4.6.1 (I recompiled it on my Ubuntu 10.10 system) but I
> couldn't see any difference. For me, no combination of -march=native (or not)
> and gcc 4.4.5 or 4.6.1 makes a difference; all the times are within the
> measurement margin.
OK that at least confirms the improvement is fairly deterministic.
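
For anyone wanting to quantify that measurement margin, here is a minimal
sketch of timing the same work several times and recording the fastest and
slowest run. Everything in it is an assumption for illustration: toy_digest is
a stand-in FNV-1a loop, not sha1 or anything from the patches, and time_runs is
a hypothetical helper name.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict -std= */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Stand-in for the function under test (hypothetical; this is FNV-1a,
   not sha1).  */
static uint32_t
toy_digest (const unsigned char *buf, size_t len)
{
  uint32_t h = 2166136261u;             /* FNV-1a offset basis */
  for (size_t i = 0; i < len; i++)
    h = (h ^ buf[i]) * 16777619u;       /* FNV-1a prime */
  return h;
}

/* Monotonic wall-clock time in seconds.  */
static double
now_seconds (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Time RUNS passes over a zero-filled buffer of LEN bytes.  Store the
   fastest and slowest pass in *BEST and *WORST.  Return 0 on success,
   -1 if the buffer cannot be allocated.  */
static int
time_runs (size_t len, int runs, double *best, double *worst)
{
  unsigned char *buf = calloc (len, 1);
  if (!buf)
    return -1;

  volatile uint32_t sink = 0;   /* keeps the digest calls from being dropped */
  *best = 1e300;
  *worst = 0.0;
  for (int i = 0; i < runs; i++)
    {
      double t0 = now_seconds ();
      sink ^= toy_digest (buf, len);
      double dt = now_seconds () - t0;
      if (dt < *best)
        *best = dt;
      if (dt > *worst)
        *worst = dt;
    }
  (void) sink;
  free (buf);
  return 0;
}
```

If the gap between *worst and *best is comparable to the speedup being
claimed, the comparison is inside the noise; if the speedup is well outside
that gap for every compiler and flag combination, it is probably real.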
>
> As for the SSE version, I would also like to see that included,
> given the proportion of hardware supporting that these days.
> I previously noticed a coreutils SSE2 patch here:
> http://www.arctic.org/~dean/crypto/sha1.html
> Though we'd probably need some runtime SSE detection to include that.
>
>
> Ok, I could try to work on this. The real problem is testing that compilation
> and SSE detection are done correctly on several platforms. I only have access
> to a few x86 machines; what is the usual way to test on more platforms?
It would probably be best to get an account on the GCC compile farm.
http://gcc.gnu.org/wiki/CompileFarm
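
As a starting point for the runtime detection, here is a rough sketch of one
way it could look with GCC's <cpuid.h>. It is only an illustration, not what
the arctic.org patch or gnulib does: the sha1_block_fn type and the
sha1_block_generic, sha1_block_sse2, have_sse2 and select_sha1_block names are
all hypothetical, and the two block functions are empty placeholders. On
non-x86 targets the check simply reports no SSE2.

```c
#include <stddef.h>
#if defined(__i386__) || defined(__x86_64__)
# include <cpuid.h>     /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical block-processing signature; not the gnulib prototype.  */
typedef void (*sha1_block_fn) (const void *data, size_t len, void *ctx);

/* Placeholders: the real functions would hold the portable C code and
   the SSE2 code respectively.  */
static void
sha1_block_generic (const void *data, size_t len, void *ctx)
{
  (void) data; (void) len; (void) ctx;
}

static void
sha1_block_sse2 (const void *data, size_t len, void *ctx)
{
  (void) data; (void) len; (void) ctx;
}

/* Return 1 if the CPU reports SSE2 support, 0 otherwise.  */
static int
have_sse2 (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;                   /* CPUID leaf 1 not available */
  return (edx & (1u << 26)) != 0;       /* leaf 1, EDX bit 26 = SSE2 */
#else
  return 0;                     /* not an x86 target */
#endif
}

/* Pick the implementation once, at run time.  */
static sha1_block_fn
select_sha1_block (void)
{
  return have_sse2 () ? sha1_block_sse2 : sha1_block_generic;
}
```

The compile-time guard is the part that needs testing on the compile farm:
the file has to build on non-x86 hosts and with compilers lacking <cpuid.h>,
which would need a configure check rather than the simple #if used here.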
cheers,
Pádraig.