qemu-devel

Re: [PATCH] qemu-{img,nbd}: Don't report zeroed cluster as a hole


From: Eric Blake
Subject: Re: [PATCH] qemu-{img,nbd}: Don't report zeroed cluster as a hole
Date: Mon, 7 Jun 2021 17:04:54 -0500
User-agent: NeoMutt/20210205

On Mon, Jun 07, 2021 at 04:22:27PM -0500, Eric Blake wrote:

[replying to myself]

> > Here is a simpler reproducer:
> > 
> >     # Create a qcow2 image with a raw backing file:
> >     $ qemu-img create base.raw $((4*64*1024))
> >     $ qemu-img create -f qcow2 -b base.raw -F raw top.qcow2
> > 
> >     # Write to first 3 clusters of base:
> >     $ qemu-io -f raw -c "write -P 65 0 64k" base.raw
> >     $ qemu-io -f raw -c "write -P 66 64k 64k" base.raw
> >     $ qemu-io -f raw -c "write -P 67 128k 64k" base.raw
> > 
> >     # Write to second cluster of top, hiding second cluster of base:
> >     $ qemu-io -f qcow2 -c "write -P 69 64k 64k" top.qcow2
> > 
> >     # Write zeroes to third cluster of top, hiding third cluster of base:
> >     $ qemu-io -f qcow2 -c "write -z 128k 64k" top.qcow2

Aha. While reproducing this locally, I typoed this as 'write -z 12k
64k', which absolutely changes the map produced...

> 
> $ ./qemu-nbd -r -t -f qcow2 top.qcow2 -A
> $ nbdinfo --map=qemu:allocation-depth nbd://localhost
>          0      131072    1  local
>     131072      131072    2  backing depth 2
> 
> However, _that_ output looks odd - it claims that clusters 0 and 1 are
> local, and 2 and 3 come from a backing file.  Without reading code, I
> would have expected something closer to the qcow2 view, claiming that
> clusters 1 and 2 are local, while 0 and 3 come from a backing file (3
> could also be reported as unallocated, but only if you use a qcow2 as
> the backing file instead of raw, since we have no easy way to
> determine which holes map to file system allocations in raw files).

and totally explains my confusion here.
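
To spell out the arithmetic behind that, here is a quick Python sketch
(not anything taken from qemu).  It assumes the default 64k qcow2
cluster size used by the reproducer, and that any cluster a write
touches, even partially, ends up allocated in top.qcow2; the helper
name clusters_touched is made up for illustration.

```python
CLUSTER_SIZE = 64 * 1024  # assumed qcow2 cluster size, matching the 64k writes above


def clusters_touched(offset, length, cluster_size=CLUSTER_SIZE):
    """Return the index of every cluster overlapped by [offset, offset + length)."""
    first = offset // cluster_size
    last = (offset + length - 1) // cluster_size
    return list(range(first, last + 1))


# The write the reproducer asked for: 'write -z 128k 64k'
intended = clusters_touched(128 * 1024, 64 * 1024)
# The write I actually typed: 'write -z 12k 64k'
mistyped = clusters_touched(12 * 1024, 64 * 1024)

print(intended)  # [2]
print(mistyped)  # [0, 1]
```

With the mistyped offset, the zero write lands on clusters 0 and 1
rather than cluster 2, which is consistent with the map above reporting
the first 131072 bytes as local and the rest as coming from the backing
file.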

> 
> /me goes to debug...  I'll need to reply in a later email when I've
> spent more time on that.
> 

After recreating the file properly, by writing the zeroes at 128k
instead of 12k, I now see:

$ nbdinfo --map=qemu:allocation-depth nbd://localhost
         0       65536    2  backing depth 2
     65536      131072    1  local
    196608       65536    2  backing depth 2
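
If your tooling wants this per cluster rather than per extent, a
minimal Python sketch along these lines may help.  It only parses the
text shown above (offset, length, numeric value, description); the
function names parse_map and per_cluster are hypothetical, and it
assumes 64k clusters and cluster-aligned extents.

```python
CLUSTER_SIZE = 64 * 1024  # assumed qcow2 cluster size

# The map output quoted above, copied verbatim.
MAP_OUTPUT = """\
         0       65536    2  backing depth 2
     65536      131072    1  local
    196608       65536    2  backing depth 2
"""


def parse_map(text):
    """Turn 'offset length value description' lines into (offset, length, value) tuples."""
    extents = []
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        offset, length, value = (int(f) for f in fields[:3])
        extents.append((offset, length, value))
    return extents


def per_cluster(extents, cluster_size=CLUSTER_SIZE):
    """Expand extents into a {cluster index: value} mapping (assumes aligned extents)."""
    values = {}
    for offset, length, value in extents:
        for start in range(offset, offset + length, cluster_size):
            values[start // cluster_size] = value
    return values


depths = per_cluster(parse_map(MAP_OUTPUT))
print(depths)  # {0: 2, 1: 1, 2: 1, 3: 2}
```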

which is EXACTLY what I expected.  And sufficient for you to recreate
your backing chain:

- Cluster 0 is backing depth 2 + allocated, so it comes from the backing
  file; nothing to write in your replacement top.qcow2.
- Cluster 1 is local + allocated, so it comes from top.qcow2 and
  consists of actual data; definitely write that one.
- Cluster 2 is local + hole,zero, so it reads as zero, but comes from
  top.qcow2 without any allocation; when building your replacement
  .qcow2 file, you MUST write this cluster to match the local
  allocation and override anything being inherited from the backing
  file, but it is up to you whether you write it as allocated zeroes
  or as an unallocated but reads-as-zero cluster.
- Cluster 3 is backing depth 2 + hole,zero, which means it was read
  from the backing file, and you can safely omit it from your
  replacement top.qcow2.
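
The same per-cluster reasoning can be sketched in Python.  This is only
an illustration of the decision, not code from qemu or libnbd: the flag
values are the NBD base:allocation bits (hole = bit 0, zero = bit 1),
depth 1 means "local" in qemu:allocation-depth, and the function name
plan_cluster and the returned strings are invented for the example.

```python
# base:allocation status bits as defined by the NBD protocol
NBD_STATE_HOLE = 1 << 0
NBD_STATE_ZERO = 1 << 1


def plan_cluster(depth, alloc_flags):
    """Decide what a replacement top.qcow2 needs for one cluster.

    depth:       qemu:allocation-depth value (0 unallocated, 1 local, 2+ backing)
    alloc_flags: base:allocation flags for the same cluster
    """
    if depth != 1:
        # Not provided by the top layer: leave it to the backing file.
        return "skip"
    if alloc_flags & NBD_STATE_ZERO:
        # Local and reads as zero: must be written, either as allocated
        # zeroes or as an unallocated reads-as-zero cluster.
        return "write zeroes"
    return "write data"


# (depth, base:allocation flags) for clusters 0..3, as described above
clusters = [
    (2, 0),
    (1, 0),
    (1, NBD_STATE_HOLE | NBD_STATE_ZERO),
    (2, NBD_STATE_HOLE | NBD_STATE_ZERO),
]

plan = [plan_cluster(depth, flags) for depth, flags in clusters]
print(plan)  # ['skip', 'write data', 'write zeroes', 'skip']
```

Note that the base:allocation flags alone cannot tell clusters 2 and 3
apart (both are hole,zero); it is the allocation depth that says only
cluster 2 has to be written.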

> In short, I agree that the current situation is awkward, but I'm not
> sure that this patch is right.  Rather, I'm wondering if we have a bug
> in qemu:allocation-depth, and whether, once that is fixed, you should
> be using it alongside base:allocation when deciding how to
> reconstruct a qcow2 backing chain using only information learned
> over NBD.

And since the problem was in my command line transcription skills, and
not in qemu proper, I don't think we want this patch.  Rather, I think
we are better served if you fix your downstream tooling to use
qemu:allocation-depth alongside base:allocation when you need to
determine which portions of a qcow2 file MUST be written in order for
that qcow2 file, backed by a different image, to provide the same data
as seen over NBD.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
