[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] qcow2: add discard-no-unref option
|
From: |
Hanna Czenczek |
|
Subject: |
Re: [PATCH] qcow2: add discard-no-unref option |
|
Date: |
Wed, 31 May 2023 14:48:15 +0200 |
|
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0 |
On 26.05.23 16:30, Jean-Louis Dupond wrote:
On 26/05/2023 15:31, Hanna Czenczek wrote:
[...]
I don’t follow how this patch is cleaner than the “block: Add zeroes
discard option” patch. That patch’s diff stat is +14/-5, this one’s
is 120+/57-.
Multiple reasons :)
- It's made for this use-case only, as there might be no other
use-cases except this one so the scope for discard=zeroes might be to big
That sounds either like a disadvantage (only usable for qcow2), or like
it makes no practical difference because even with the other approach,
only qcow2 would support this kind of zeroing anyway. In terms of
complexity, I don’t really see the difference whether this is a global
flag for all formats or just for qcow2.
- This one still handles the discards and passes it to the lower
layer, which might be an important reason for the fact you enable
discards
Indeed this is an important benefit that probably conflicts with and
outweighs point 1 above, though not really a reason why this patch would
be “cleaner”.
- The diffstat is mostly bigger because of indention changes in
update_refcount
The fact that update_refcount() is updated at all is reason for caution
to me. Changing how refcounts are updated is generally more on the
dangerous side of things.
Furthermore, going from a net total of +9 to +63 can’t be explained
solely through indentation changes. I understand some of it is also
comments and documentation, but, well.
Now that I understand this patch better, I can see that it probably
better fulfills the intended purpose, but it still seems less clean to
me. Not that it really matters; I was just wondering because you
advertised it as cleaner. O:)
As for better, I don’t want to discount that, but at the same time I
don’t know the reasoning for it. As far as I understand, this patch
makes qcow2 retain the cluster mapping as-is, and only discards on
the lower layer. So effectively, instead of marking the cluster as
zero on the qcow2 level, we do so on the filesystem level. I’m not
sure I see all the implications of that difference.
We want to keep the cluster mapping to avoid creating holes in the
qcow2 cluster.
But we still want to do discards for 2 reasons:
- Mark the cluster as ZERO, so that incremental backups using dirty
bitmaps can just skip this block
- Discard the data on the lower layer for efficiency reasons (for
example if the lower layer has some dedup/compression/whatever), we
still want the lower layer to know the block has been emptied
Agreed, I missed that the cluster is stilled marked as ZERO.
The advantage I see is that this free up disk space, which the
discard=zeroes wouldn’t do. I wonder whether that couldn’t be solved
with an orthogonal qcow2-only option, though, which would have the
qcow2 driver always discard zeroed clusters.
This is an option also indeed. But we will end up with a similar patch
(also in size).
On the other hand, one thing to note is that we’ve had performance
problems in the past with holes in the filesystem level (on tmpfs at
least). qemu generally can get the information on which clusters are
zero quicker from qcow2 than from the filesystem, which suggests that
even if we want to discard something on the filesystem layer, we
probably also want to mark it as zero on the qcow2 layer.
This is what we do in discard_in_l2_slice, we mark the entry as
QCOW_OFLAG_ZERO when we receive a discard request.
Thanks! I missed that.
I also have a small concern about fragmentation on the filesystem
layer – if you use these options to prevent fragmentation, with this
patch, you’re only doing so on the qcow2 layer. Because the cluster
is then discarded in the filesystem, it’s possible to get
fragmentation there, which might not be desirable. I don’t think
that’s too important, but I think it’d just be nice to have a
configuration in which the guest can tell qemu what areas it doesn’t
care about, qemu marks these as zero, so that backups are more
efficient; but at the same time, everything stays allocated, so no
fragmentatoin is introduced.
That would be an additional option/improvement indeed. But it's not
that this patch makes the fragmentation worse then it's already the
case when you enable discard.
It doesn’t, but then again, it is supposed to reduce fragmentation. Only
qcow2 fragmentation, right, but the direction is there. On the other
hand...
If you really want this you might even want your lower level storage
to just ignore discards instead of fixing it in qcow2.
...this makes much more sense! Agreed.
I missed the fact that this patch still marks the cluster as zero on the
qcow2 level. I agree that it’s better to keep the discard on the lower
layer, which the other patch variant can’t achieve, so there is merit to
this approach. I’ll go about reviewing the patch in detail!
Hanna