[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Reed-Solomon optimised

From: Vladimir 'φ-coder/phcoder' Serbinenko
Subject: Re: Reed-Solomon optimised
Date: Mon, 14 Nov 2011 11:26:02 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20111010 Iceowl/1.0b2 Icedove/3.1.15

On 13.11.2011 23:41, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Hello, all. When I was implementing raidz, I thought of a more
> convenient, compact and faster representation of GF(256). Using it I
> could make raid6_recover more compact and Reed-Solomon more fast. Later
> is very important since we have to run Reed-Solomon check (but not
> necessarily much slower recovery) on every boot. I expected 8-fold
> speedup but got 16-18-fold one. The idea was the following:
> - In standard representation of GF(256) it's fast to add 2 numbers: it's
> just a XOR.
> - We also know that every non-zero number in GF(256) is represented as a
> power of X: a=X^s
> So if we know that a=X^s and b=X^t then
> c=ab=X^sX^t=X^(s+t)
> Since there are only 255 such elements for each one of them we can
> precompute s s.t. X^s=a and we can also precompute powers of X. By
> choosing s in interval 0...254 we ensure that 0<=s+t<=508, so we need to
> store only 508 elements and then we can multiply any 2 elements by first
> checking that both are non-zero (and output 0 if any of them is), one
> addition and 3 table lookups (which are probably in cache given their
> tiny size) which is much faster than the loop with shifts.
> Same idea is usable for GF(2^n) up to n=16. Currently in the code we
> have GF(256), GF(2^32), GF(2^64) and GF(2^128).
> GF(2^32) and GF(2^64) are used for CRC-32/CRC-64 where they are implicit
> with all 256 values precomputed.
Correction,only CRC with irreducible polynomial is connected with GF.
Other CRC are connected with other rings which aren't even integral
domains. And it looks like the polynoms in the CRCs used by GRUB aren't
irreducible. Sorry for this mistake.
> GF(2^128) is used in cryptodisk and ZFS-crypto. In cryptodisk it's used
> in LRW and XTS. In LRW thanks to the trick from LRW we need only 2
> multiplications per sector. XTS does only multiplications with X which
> are fast (although the version in GRUB could be optimised by shifting
> uint64 and not uint8 at time but it needs care to avoid endianness
> problems).
> ZFS-crypto in GCM on the other hand probably has the problem of slow
> GF(2^128) multiplication. It's possible to optimise it somehow using
> some precomputation but it hasn't been done.
> Also if you notice any other ways to speed up the boot or make critical
> modules or kernel more compact, please share your ideas.

Vladimir 'φ-coder/phcoder' Serbinenko

Attachment: signature.asc
Description: OpenPGP digital signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]