qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Block alignment of qcow2 compress driver


From: Richard W.M. Jones
Subject: Re: Block alignment of qcow2 compress driver
Date: Fri, 28 Jan 2022 13:36:31 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Jan 28, 2022 at 02:19:44PM +0100, Kevin Wolf wrote:
> Am 28.01.2022 um 13:30 hat Hanna Reitz geschrieben:
> > > > I just changed that line of code [2], as shown in [4].  I suppose
> > > > the better thing to do would be to have an option for the NBD server
> > > > to force-change the announced request alignment, because it can
> > > > expect the qemu block layer code to auto-align requests through
> > > > RMW.  Doing it in the client is wrong, because the NBD server might
> > > > want to detect that the client sends unaligned requests and reject
> > > > them (though ours doesn’t, it just traces such events[5] – note that
> > > > it’s explicitly noted there that qemu will auto-align requests).
> > > I know I said I didn't care about performance (in this case), but is
> > > there in fact a penalty to sending unaligned requests to the qcow2
> > > layer?  Or perhaps it cannot compress them?
> > 
> > In qcow2, only the whole cluster can be compressed, so writing compressed
> > data means having to write the whole cluster.  qcow2 could implement the
> > padding by itself, but we decided to just leave the burden of only writing
> > full clusters (with the COMPRESSED write flag) on the callers.
> > 
> > Things like qemu-img convert and blockdev-backup just adhere to that by
> > design; and the compress driver makes sure to set its request alignment
> > accordingly so that requests to it will always be aligned to the cluster
> > size (either by its user, or by the qemu block layer which performs the
> > padding automatically).
> 
> I thought the more limiting factor would be that after auto-aligning the
> first request by padding with zeros, the second request to the same
> cluster would fail because compression doesn't allow using an already
> allocated cluster:
> 
>     /* Compression can't overwrite anything. Fail if the cluster was already
>      * allocated. */
>     cluster_offset = get_l2_entry(s, l2_slice, l2_index);
>     if (cluster_offset & L2E_OFFSET_MASK) {
>         qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
>         return -EIO;
>     }
> 
> Did you always just test a single request or why don't you run into
> this?

I didn't test that one specifically and yes it does fail:

$ qemu-img create -f qcow2 output.qcow2 1M
Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16
$ qemu-nbd -t --image-opts 
driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2
 &
[1] 2069037

$ nbdsh -u nbd://localhost
nbd> h.set_strict_mode(h.get_strict_mode() & ~nbd.STRICT_ALIGN)
nbd> buf = b'1' * 1024
nbd> h.pwrite(buf, 0)
nbd> h.pwrite(buf, 1024)
Traceback (most recent call last):
  File "/usr/lib64/python3.10/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
  File "/usr/lib64/python3.10/site-packages/nbd.py", line 1631, in pwrite
    return libnbdmod.pwrite(self._o, buf, offset, flags)
nbd.Error: nbd_pwrite: write: command failed: Input/output error (EIO)

So what I said in the previous email about about minimum vs preferred
is wrong :-(

What's more interesting is that nbdcopy still appeared to work.
Simulating what that was doing would be something like which
also fails when I do it directly:

nbd> h.pwrite(buf, 0)
nbd> h.zero(1024, 1024)
Traceback (most recent call last):
  File "/usr/lib64/python3.10/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>
  File "/usr/lib64/python3.10/site-packages/nbd.py", line 1782, in zero
    return libnbdmod.zero(self._o, count, offset, flags)
nbd.Error: nbd_zero: write-zeroes: command failed: Input/output error (EIO)

Anyway back to poking at nbdcopy to make it support block sizes ...

> I guess checking L2E_OFFSET_MASK is strictly speaking wrong because it's
> invalid for compressed clusters (qcow2_get_cluster_type() feels more
> appropriate), but in practice, you will always have non-zero data there,
> so it should error out here.
> 
> Kevin

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]