Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication


From: Benoît Canet
Subject: Re: [Qemu-devel] [RFC V4 00/30] QCOW2 deduplication
Date: Wed, 2 Jan 2013 19:40:53 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Wednesday 02 Jan 2013 at 12:26:37 (-0600), Troy Benjegerdes wrote:
> The probability may be 'low' but it is not zero. Just because it's
> hard to calculate the hash doesn't mean you can't do it. If your
> input data is not random the probability of a hash collision is
> going to get skewed.
> 
> Read about how Bitcoin uses hashes.
> 
> I need a budget of around $10,000 or so for some FPGAs and/or GPU cards,
> and I can make a regression test that will create deduplication hash
> collisions on purpose.

It's not a problem. As Eric pointed out while reviewing the previous patchset,
there is a small zero-filled area left in the deduplication block.
A bit could be set there when a collision is detected, and an offset could point
to a cluster used to resolve collisions.
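
A minimal sketch of that idea, in Python rather than QEMU's C and not taken
from the patchset. It assumes a collision is detected by comparing bytes on a
hash hit, which this thread does not state. DedupEntry, DedupStore, weak_hash
and all field names are hypothetical; the overflow list stands in for the
cluster the offset would point to. The hash function is injectable, so a
regression test can force collisions with a deliberately weak hash instead of
searching for real SHA-256 collisions.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupEntry:
    """Hypothetical dedup table entry; names are illustrative only."""
    cluster_index: int            # first cluster stored under this hash
    has_collision: bool = False   # the "collision bit" described above
    overflow: list = field(default_factory=list)  # stand-in for the collision cluster


class DedupStore:
    def __init__(self, hash_fn=None):
        # Injectable hash so tests can provoke collisions on purpose.
        self.hash_fn = hash_fn or (lambda data: hashlib.sha256(data).digest())
        self.clusters = []   # physical cluster contents
        self.table = {}      # hash -> DedupEntry

    def _store(self, data):
        self.clusters.append(data)
        return len(self.clusters) - 1

    def write(self, data):
        """Store data; return the index of the physical cluster holding it."""
        key = self.hash_fn(data)
        entry = self.table.get(key)
        if entry is None:
            index = self._store(data)
            self.table[key] = DedupEntry(index)
            return index
        # Hash hit: only share the cluster if the bytes really match.
        if self.clusters[entry.cluster_index] == data:
            return entry.cluster_index
        # Same hash, different bytes: look in the overflow list first.
        for index in entry.overflow:
            if self.clusters[index] == data:
                return index
        # New colliding content: flag the entry and keep a separate copy.
        entry.has_collision = True
        index = self._store(data)
        entry.overflow.append(index)
        return index

    def read(self, index):
        return self.clusters[index]


def weak_hash(data):
    """Test-only hash: the first byte of the data, so collisions are trivial."""
    return data[:1]
```

With weak_hash, two clusters that share a first byte but differ elsewhere
collide on the first write, which exercises the collision path without special
hardware. The cost of this design is one extra read and compare on every hash
hit.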

> 
> 
> On Wed, Jan 02, 2013 at 06:33:24PM +0100, Benoît Canet wrote:
> > > How does this code handle hash collisions, and do you have some regression
> > > tests that purposefully create a dedup hash collision, and verify that the
> > > 'right thing' happens?
> > 
> > The two hash functions that can be used are cryptographic and not broken yet,
> > so nobody knows how to generate a collision.
> > 
> > You can do the math to calculate the probability of a collision using a
> > 256-bit hash while processing 1 EiB of data: the result is so low that you
> > can consider it won't happen.
> > The sha256 ZFS deduplication works the same way regarding collisions.
> > 
> > I currently use qemu-io-test for testing purposes, and iozone with the -w
> > flag in the guest.
> > I would like to find a good deduplication stress test to run in a guest.
> > 
> > Regards
> > 
> > Benoît
> > 
> > > It's great that this almost works, but it seems rather dangerous to put
> > > something like this into the mainline code without some regression tests.
> > > 
> > > (I'm also suspecting the regression test will be a great way to find 
> > > flakey hardware)
> > > 
> > > --------------------------------------------------------------------------
> > > Troy Benjegerdes                'da hozer'                 address@hidden
> > > 
> > > Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
> > > software & hardware (http://q3u.be) stuff and not get a real job.
> > > Charles Shultz had the best answer:
> > > 
> > > "Why do musicians compose symphonies and poets write poems? They do it
> > > because life wouldn't have any meaning for them if they didn't. That's why
> > > I draw cartoons. It's my life." -- Charles Shultz
> 
