
Re: [Qemu-block] QCOW2 support for LZO compression


From: Peter Lieven
Subject: Re: [Qemu-block] QCOW2 support for LZO compression
Date: Mon, 26 Jun 2017 22:54:35 +0200


> On 26.06.2017 at 22:30, Denis V. Lunev <address@hidden> wrote:
> 
>> On 06/26/2017 11:28 AM, Kevin Wolf wrote:
>> [ Cc: qemu-devel; don't post to qemu-block only! ]
>> 
>> On 26.06.2017 at 09:57, Peter Lieven wrote:
>>> Hi,
>>> 
>>> I am currently working on optimizing speed for compressed QCOW2
>>> images. We use them for templates and would also like to use them for
>>> backups, but the latter is almost infeasible because using gzip for
>>> compression is horribly slow. I tried to experiment with different
>>> options to deflate, but in the end I think it's better to use a
>>> different compression algorithm for cases where speed matters. As we
>>> already have probing for it in configure and as it is widely used, I
>>> would like to use LZO for that purpose. I think it would be best to
>>> have a flag to indicate that compressed blocks use LZO compression,
>>> but I would need a little explanation of which of the feature fields I
>>> have to use to prevent an older (incompatible) qemu from opening
>>> LZO-compressed QCOW2 images.
>>> 
>>> I also have some numbers already. I converted a fresh Debian 9 install,
>>> which has an uncompressed QCOW2 size of 1158 MB, with qemu-img to a
>>> compressed QCOW2. With GZIP compression the result is 356 MB, whereas
>>> the LZO version is 452 MB. However, the current GZIP variant takes 35
>>> seconds for this operation where LZO only needs 4 seconds. I think that
>>> is a good trade-off, especially as it is optional so the user can
>>> choose.
>>> 
>>> What are your thoughts?
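The speed-for-ratio trade-off described above can be illustrated with the Python standard library. LZO has no stdlib binding, so zlib level 1 stands in here for a fast, lower-ratio codec against zlib level 9; this is only an illustration of the trade-off, not the qemu-img measurement quoted in the mail:

```python
import time
import zlib

# Highly repetitive sample data, standing in for compressible image contents.
data = b"Debian 9 root filesystem blocks tend to compress well. " * 20000

for level in (1, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"zlib level {level}: {len(data)} -> {len(out)} bytes "
          f"in {elapsed * 1000:.1f} ms")
```

On typical hardware, level 1 finishes several times faster than level 9 at a modestly worse ratio, which is the same shape of trade-off as gzip vs. LZO above.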
>> We had a related RFC patch by Den earlier this year, which never
>> received many comments and never got out of RFC:
>> 
>> https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg04682.html
>> 
>> So he chose a different algorithm (zstd). When I asked, he posted a
>> comparison of algorithms (a generic one, however, not measured in the
>> context of qemu) that suggests that LZO would be slightly faster, but
>> have a considerably worse compression ratio with the settings that were
>> benchmarked.
>> 
>> I think it's clear that if there is any serious interest in compression,
>> we'll want to support at least one more algorithm. What we still need to
>> evaluate is which one(s) to take, and whether a simple incompatible flag
>> in the header like in Den's patch is enough or whether we should add a
>> whole new header field for the compression algorithm (like we already
>> have for encryption).
>> 
>> Kevin
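The incompatible-flag option Kevin mentions can be made concrete against the qcow2 v3 header layout, where the incompatible_features bitmap sits at byte offset 72. The flag bit used below is purely hypothetical (no such bit is assigned in the qcow2 spec); a minimal sketch:

```python
import struct

QCOW2_MAGIC = 0x514649FB            # the on-disk magic "QFI\xfb"
HYPOTHETICAL_COMPRESS_BIT = 1 << 3  # assumption only, not an assigned flag

def wants_alt_compression(header: bytes) -> bool:
    """Check a hypothetical 'non-deflate compression' incompatible bit."""
    magic, version = struct.unpack_from(">II", header, 0)
    if magic != QCOW2_MAGIC or version < 3:
        return False
    # qcow2 v3: incompatible_features is a big-endian u64 at offset 72.
    (incompatible_features,) = struct.unpack_from(">Q", header, 72)
    return bool(incompatible_features & HYPOTHETICAL_COMPRESS_BIT)

# A synthetic v3 header with the hypothetical bit set: an older qemu
# would refuse to open the image, since it must reject any unknown
# incompatible bit.
hdr = bytearray(104)
struct.pack_into(">II", hdr, 0, QCOW2_MAGIC, 3)
struct.pack_into(">Q", hdr, 72, HYPOTHETICAL_COMPRESS_BIT)
print(wants_alt_compression(bytes(hdr)))  # True
```

A dedicated header field (as for encryption) would instead store the algorithm identifier explicitly rather than burning one incompatible bit per algorithm.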
> I was contacted today by Yann Collet, the zstd maintainer. He shared
> the current status of zstd, which could be useful for the discussion:
> 
> "1. zstd package availability
> 
> We have been tracking distribution availability since Zstandard's official
> release in September 2016:
> https://github.com/facebook/zstd/issues/320
> There is also this tool, which tracks availability of packages:
> https://repology.org/metapackage/zstd/versions
> 
> zstd seems now available as a package in most recent distributions.
> It’s even part of “core” for recent BSD releases.
> Zstandard v1.0 is still less than 1 year old, so older distributions
> typically do not have it (or support a development version).
> That’s the main limitation currently. We expect things to improve over time.
> 
> 2. Compression speed is good but does not matter
> 
>    For such scenarios, it's possible to trade speed for more compression.
>    At its maximum compression level (--ultra -22), zstd's compression
>    ratio (and speed) is close to lzma's.
>    A nice property, though, is that decompression speed remains roughly
>    the same at all compression levels:
>    about 10x faster than lzma decompression (about 1 GB/s on a
>    modern CPU).
> 
> 3. zstd is multi-threaded, and it's dangerous
> 
>    libzstd is single-threaded.
>    There is a multi-thread extension, which is enabled in the CLI, but not
>    in the library.
>    There is also an experimental target which makes it possible to produce
>    an MT-enabled library.
>    Even in this case, the API remains single-threaded by default.
>    It's necessary to use dedicated entry points to enable multi-threading.
>    TL;DR: zstd supports multi-threading, but is single-threaded by default.
> 
> 4. How to identify the gz format from the zstd one?
> 
>    Many implementations assume they need to add some custom header
>    in order to distinguish gz from zstd.
>    That's not the case: well-formed compression formats already provide
>    a header with enough information to guarantee their identity.
>    Such "good" compression formats include gz, zstd, xz and lz4-frame, to
>    name a few.
>    For zstd, the identifier is a 4-byte value, documented in the
>    compression format:
>    https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#zstandard-frames
>    As an example, the zstd project provides a zlib-wrapper which can
>    dynamically recognize an input as gz or zstd and route it to the
>    appropriate decoder, without any special header:
>    https://github.com/facebook/zstd/tree/dev/zlibWrapper
> 
> 
> Unfortunately, not all compression algorithms provide an unambiguous
> standard header.
> LZO, for example, does not by default.
> Behind a single name, lzo effectively groups multiple incompatible
> variants, which must be correctly identified for proper decoding."
> 
> Den
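The magic-number identification from point 4 above can be sketched in a few lines; the gzip magic comes from RFC 1952 and the zstd magic from the frame format document linked above:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"          # RFC 1952, bytes ID1/ID2
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # 0xFD2FB528, little-endian on disk

def sniff_codec(blob: bytes) -> str:
    """Identify a compressed stream by its standard header bytes."""
    if blob.startswith(ZSTD_MAGIC):
        return "zstd"
    if blob.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"

print(sniff_codec(gzip.compress(b"hello")))   # prints gzip
print(sniff_codec(ZSTD_MAGIC + b"payload"))   # prints zstd
```

This is the approach the zlibWrapper linked above takes: no extra container header, just the formats' own magic values.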
> 

Hi Den,

thanks for the update.

I am about to have an RFC patchset ready that adds a compression algorithm 
header field. It will be easy to add support for zstd on top of that.

I would prefer to have the same algorithm for all compressed clusters and avoid 
detection for each cluster. If you want to change the algorithm, you would have 
to recode the image. This way you can also easily detect, when opening the 
image, whether you support its compression algorithm and fail directly if 
necessary.

Peter


