Re: [Qemu-devel] [RFC 1/1] qcow2: add ZSTD compression feature


From: Kevin Wolf
Subject: Re: [Qemu-devel] [RFC 1/1] qcow2: add ZSTD compression feature
Date: Thu, 23 Mar 2017 22:20:01 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 23.03.2017 at 16:35, Denis V. Lunev wrote:
> On 03/23/2017 06:04 PM, Kevin Wolf wrote:
> > On 23.03.2017 at 15:17, Eric Blake wrote:
> >> On 03/23/2017 08:28 AM, Denis V. Lunev wrote:
> >>> ZSDT compression algorithm consumes 3-5 times less CPU power with a
> >> s/ZSDT/ZSTD/
> >>
> >>> comparable comression ratio with zlib. It would be wise to use it for
> >> s/comression/compression/
> >>
> >>> data compression f.e. for backups.
> > Note that we don't really care that much about fast compression because
> > that's a one-time offline operation. Maybe a better compression ratio
> > while maintaining decent decompression performance would be the more
> > important feature?
> >
> > Or are you planning to extend the qcow2 driver so that compressed
> > clusters are used even for writes after the initial conversion? I think
> > it would be doable, and then I can see that better compression speed
> > becomes important, too.
> We should care about backups :) They can be done using compression
> even right now, and this is done in real time while the VM is online.
> Thus any additional CPU overhead counts, even if compressed data is
> written only once.

Good point. I have no idea about ZSTD, but maybe compression speed vs.
ratio can even be configurable?

Anyway, I was mostly trying to get people to discuss the compression
algorithm. I'm not against this one, but I haven't checked whether it's
the best option for our case.

So I'd be interested in which algorithms you considered, and what was
the reason to decide for ZSTD?

> >>> The patch adds incompatible ZSTD feature into QCOW2 header that indicates
> >>> that compressed clusters must be decoded using ZSTD.
> >>>
> >>> Signed-off-by: Denis V. Lunev <address@hidden>
> >>> CC: Kevin Wolf <address@hidden>
> >>> CC: Max Reitz <address@hidden>
> >>> CC: Stefan Hajnoczi <address@hidden>
> >>> CC: Fam Zheng <address@hidden>
> >>> ---
> >>> Actually this is very straightforward. Maybe we should implement a 2-stage
> >>> scheme, i.e. add a bit that indicates the presence of the "compression
> >>> extension", which will actually define the compression algorithm. Though
> >>> in my opinion we will not have too many compression algorithms, and the
> >>> proposed one-tier scheme is good enough.
> >> I wouldn't bet on NEVER changing compression algorithms again, and while
> >> I suspect that we won't necessarily run out of bits, it's safer to not
> >> require burning another bit every time we change our minds.  Having a
> >> two-level scheme means we only have to burn 1 bit for the use of a
> >> compression extension header, where we can then flip algorithms in the
> >> extension header without having to burn a top-level incompatible feature
> >> bit every time.
> > Header extensions make sense for compatible features or for variable
> > size data. In this specific case I would simply increase the header size
> > if we want another field to store the compression algorithm. And I think
> > having such a field is a good idea.
> >
> >>>  docs/specs/qcow2.txt | 5 ++++-
> >>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> >>> index 80cdfd0..eb5c41b 100644
> >>> --- a/docs/specs/qcow2.txt
> >>> +++ b/docs/specs/qcow2.txt
> >>> @@ -85,7 +85,10 @@ in the description of a field.
> >>>                                  be written to (unless for regaining
> >>>                                  consistency).
> >>>  
> >>> -                    Bits 2-63:  Reserved (set to 0)
> >>> +                    Bits 2:     ZSDT compression bit. ZSDT algorithm is used
> >> s/ZSDT/ZSTD/
> >>
> >> Another reason I think you should add a compression extension header:
> >> compression algorithms are probably best treated as mutually-exclusive
> >> (the entire image should be compressed with exactly one compressor).
> >> Even if we only ever add one more type (say 'xz') in addition to the
> >> existing gzip and your proposed zstd, then we do NOT want someone
> >> specifying both xz and zstd at the same time.  Having a single
> >> incompatible feature bit that states that a compression header must be
> >> present and honored to understand the image, where the compression
> >> header then chooses exactly one compression algorithm, seems safer than
> >> having two separate incompatible feature bits for two opposing algorithms
> > Actually, if we used compression after the initial convert, having
> > mixed-format images would make a lot of sense because after an update
> > you could then start using a new compression format on an image that
> > already has some compressed clusters.
> >
> > But we have neither L2 table bits left for this nor do we use
> > compression for later writes, so I agree that we'll have to make them
> > mutually exclusive in this reality.
> >
> > Kevin
> There are compression magics, which could be put into the data at the cost
> of some additional bytes. In this case the compression header must report
> all supported compression algorithms, and these are indeed incompatible
> header bits. The image cannot be opened if some of the compression
> algorithms used are not available.

Hmm... I don't think it's really necessary, but it could be an option.

Kevin
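
To make the scheme discussed in this thread concrete, here is a minimal
sketch of a single incompatible feature bit that says "a compression type
field is present", plus a header field that selects exactly one algorithm.
It is only an illustration under stated assumptions: the struct layout,
the choice of bit 2 for this purpose, and every identifier below are
hypothetical and are not taken from the patch, the qcow2 spec, or the QEMU
source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and values below are hypothetical, for illustration only. */
#define INCOMPAT_DIRTY            (UINT64_C(1) << 0)
#define INCOMPAT_CORRUPT          (UINT64_C(1) << 1)
#define INCOMPAT_COMPRESSION_TYPE (UINT64_C(1) << 2) /* assumed bit number */

#define KNOWN_INCOMPAT \
    (INCOMPAT_DIRTY | INCOMPAT_CORRUPT | INCOMPAT_COMPRESSION_TYPE)

/* Exactly one algorithm per image; new algorithms extend this enum
 * instead of consuming another incompatible feature bit. */
enum compression_type {
    COMPRESSION_ZLIB = 0,   /* default when the feature bit is clear */
    COMPRESSION_ZSTD = 1,
    COMPRESSION_MAX
};

/* Reduced stand-in for an image header; a real header has more fields. */
struct image_header {
    uint64_t incompatible_features;
    uint8_t  compression_type;  /* only meaningful if the bit is set */
};

/*
 * Decide which decompressor an image needs.
 * Returns 0 and stores the algorithm in *out on success, or -1 if the
 * image uses an incompatible feature or algorithm we do not know.
 */
static int select_compression(const struct image_header *h,
                              enum compression_type *out)
{
    assert(h != NULL && out != NULL);

    /* Any unknown incompatible bit means we must refuse to open. */
    if (h->incompatible_features & ~KNOWN_INCOMPAT) {
        return -1;
    }

    /* Bit clear: old-style image, compressed clusters use the default. */
    if (!(h->incompatible_features & INCOMPAT_COMPRESSION_TYPE)) {
        *out = COMPRESSION_ZLIB;
        return 0;
    }

    /* Bit set: the field names one algorithm, which we must support. */
    if (h->compression_type >= COMPRESSION_MAX) {
        return -1;
    }
    *out = (enum compression_type)h->compression_type;
    return 0;
}
```

With this shape, an image can never claim two algorithms at once, and
adding a third one later only means adding an enum value. The variant with
per-cluster magics would instead need a bitmask of algorithms in the
header, and opening would fail if any bit in the mask is unsupported.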


